public class SimpleXMLFileIndexingWriter extends XMLFileIndexingWriter
Creates a Document from any valid XML file by stripping the XML tags to extract and
index the content. The full content of all Elements and Attributes is indexed in the default and
admindefault fields, and is stemmed and indexed in the stems field. The reader for this type of Document is
XMLDocReader.
See Also: FileIndexingService, XMLDocReader
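The tag-stripping step described above can be illustrated with a minimal sketch that uses only the JDK's DOM parser. It is not the DLESE implementation: the class name `XmlContentExtractor` and its methods are hypothetical, and the sketch assumes the extracted element text and attribute values are simply joined with single spaces before being handed to an indexer.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.NamedNodeMap;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

// Hypothetical helper, not part of the DLESE API.
final class XmlContentExtractor {

    /** Parses the XML and returns attribute values and element text joined by single spaces. */
    static String extractContent(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            Document doc = factory.newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            StringBuilder out = new StringBuilder();
            collect(doc.getDocumentElement(), out);
            return out.toString();
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalArgumentException("Input is not well-formed XML", e);
        }
    }

    /** Walks the tree depth-first, keeping text and attribute values and dropping the tags. */
    private static void collect(Node node, StringBuilder out) {
        short type = node.getNodeType();
        if (type == Node.TEXT_NODE || type == Node.CDATA_SECTION_NODE) {
            append(node.getNodeValue(), out);
            return;
        }
        NamedNodeMap attrs = node.getAttributes(); // null for non-element nodes
        if (attrs != null) {
            for (int i = 0; i < attrs.getLength(); i++) {
                append(attrs.item(i).getNodeValue(), out);
            }
        }
        NodeList children = node.getChildNodes();
        for (int i = 0; i < children.getLength(); i++) {
            collect(children.item(i), out);
        }
    }

    /** Appends non-blank text, separated from earlier content by one space. */
    private static void append(String text, StringBuilder out) {
        String trimmed = text.trim();
        if (trimmed.isEmpty()) {
            return;
        }
        if (out.length() > 0) {
            out.append(' ');
        }
        out.append(trimmed);
    }

    public static void main(String[] args) {
        String xml = "<record id=\"r1\"><title>Plate tectonics</title>"
                + "<note lang=\"en\">Intro unit</note></record>";
        System.out.println(extractContent(xml));
    }
}
```

In the real writer the resulting text would go into the default, admindefault, and stems fields. How those fields are analyzed and stemmed is up to the indexing framework and is not shown here.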
Constructor and Description |
---|
SimpleXMLFileIndexingWriter() Constructor for the SimpleXMLFileIndexingWriter object. |
Modifier and Type | Method and Description |
---|---|
protected java.lang.String[] | _getIds() Returns null so that the IDs are handled by the superclass. |
protected void | addFields(org.apache.lucene.document.Document newDoc, org.apache.lucene.document.Document existingDoc, java.io.File sourceFile) Nothing to do here. |
protected void | destroy() Does nothing. |
java.lang.String | getDescription() Gets the description attribute of the SimpleXMLFileIndexingWriter object. |
java.lang.String | getDocType() Gets the XML format for this document, for example "oai_dc", "adn", "dlese_ims", or "dlese_anno". |
java.lang.String | getReaderClass() Gets the name of the concrete DocReader class that is used to read this type of Document, which is "org.dlese.dpc.index.reader.XMLDocReader". |
java.lang.String | getTitle() Gets the title attribute of the SimpleXMLFileIndexingWriter object. |
java.lang.String[] | getUrls() Gets the urls attribute of the SimpleXMLFileIndexingWriter object. |
protected java.lang.String | getValidationReport() Gets a report detailing any errors found in the validation of the data, or null if no error was found. |
protected java.util.Date | getWhatsNewDate() Returns the date used to determine "What's new" in the library, which is null (unknown). |
protected java.lang.String | getWhatsNewType() Returns null (unknown). |
boolean | indexFullContentInDefaultAndStems() Places the entire XML content into the default and stems search fields. |
void | init(java.io.File sourceFile, org.apache.lucene.document.Document existingDoc) This method is called prior to processing and may be used for any necessary set-up. |
Methods inherited from class XMLFileIndexingWriter: addCustomFields, getBoundingBox, getCollections, getDeletedDoc, getDocGroup, getDom4jDoc, getFieldContent, getFieldContent, getFieldName, getIds, getIndex, getMyAnnoResultDocs, getMyCollectionDoc, getOaiModtime, getPrimaryId, getRecordDataService, getRelatedIds, getRelatedIdsMap, getRelatedUrls, getRelatedUrlsMap, getTermStringFromStringArray, getXmlIndexer, getXmlIndexerFieldsConfig
Methods inherited from class FileIndexingServiceWriter: abortIndexing, addDocToRemove, addToAdminDefaultField, addToDefaultField, create, getConfigAttributes, getDocsource, getFileContent, getFileIndexingPlugin, getFileIndexingService, getLuceneDoc, getPreviousRecordDoc, getSessionAttributes, getSourceDir, getSourceFile, isMakingDeletedDoc, isValidationEnabled, prtln, prtlnErr, setConfigAttributes, setDebug, setFileIndexingPlugin, setFileIndexingService, setIsMakingDeletedDoc, setValidationEnabled
public SimpleXMLFileIndexingWriter()

public java.lang.String getDocType() throws java.lang.Exception
getDocType in interface DocWriter
getDocType in class FileIndexingServiceWriter
Throws:
java.lang.Exception - If an error occurs.

public java.lang.String getReaderClass()
Gets the name of the concrete DocReader class that is used to read this type of Document, which is "org.dlese.dpc.index.reader.XMLDocReader".
getReaderClass in interface DocWriter
getReaderClass in class FileIndexingServiceWriter
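
Because getReaderClass() returns a fully qualified class name rather than a Class object, a caller would typically resolve it reflectively. The sketch below shows one way that could look. The helper `ReaderClassLookup.newInstanceByName` is hypothetical, and the sketch assumes the named reader class has a public no-argument constructor, which this page does not confirm. A JDK class name stands in for "org.dlese.dpc.index.reader.XMLDocReader" so the example runs without the DLESE library on the classpath.

```java
// Hypothetical helper, not part of the DLESE API.
final class ReaderClassLookup {

    /** Loads the named class and instantiates it through its no-argument constructor. */
    static Object newInstanceByName(String className) {
        try {
            return Class.forName(className).getDeclaredConstructor().newInstance();
        } catch (ReflectiveOperationException e) {
            // Covers a missing class, a missing constructor, and constructor failures.
            throw new IllegalStateException("Cannot instantiate " + className, e);
        }
    }

    public static void main(String[] args) {
        // Stand-in for the value returned by getReaderClass(); with the DLESE jar
        // available this would be "org.dlese.dpc.index.reader.XMLDocReader".
        String readerClassName = "java.util.ArrayList";
        Object reader = newInstanceByName(readerClassName);
        System.out.println(reader.getClass().getName());
    }
}
```

Returning a name instead of an instance lets the class name be stored and resolved later, for example when a document is read back from the index. That motivation is an inference from the method's signature, not something this page states.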

public void init(java.io.File sourceFile, org.apache.lucene.document.Document existingDoc) throws java.lang.Exception
init in class XMLFileIndexingWriter
Parameters:
sourceFile - The source file being indexed.
existingDoc - An existing Document for this source file in the index.
Throws:
java.lang.Exception - If an error occurs.

protected java.util.Date getWhatsNewDate() throws java.lang.Exception
getWhatsNewDate in class XMLFileIndexingWriter
Throws:
java.lang.Exception - This method should throw an Exception with an appropriate error message if an error occurs.

protected java.lang.String getWhatsNewType() throws java.lang.Exception
getWhatsNewType in class XMLFileIndexingWriter
Throws:
java.lang.Exception - This method should throw an Exception with an appropriate error message if an error occurs.

protected void destroy()
destroy in class FileIndexingServiceWriter

protected java.lang.String getValidationReport() throws java.lang.Exception
getValidationReport in class FileIndexingServiceWriter
Throws:
java.lang.Exception - If an error occurs while performing the validation.

protected java.lang.String[] _getIds()
_getIds in class XMLFileIndexingWriter

public java.lang.String[] getUrls()
getUrls in class XMLFileIndexingWriter

public java.lang.String getDescription()
getDescription in class XMLFileIndexingWriter

public java.lang.String getTitle()
getTitle in class XMLFileIndexingWriter

public boolean indexFullContentInDefaultAndStems()
indexFullContentInDefaultAndStems in class XMLFileIndexingWriter

protected void addFields(org.apache.lucene.document.Document newDoc, org.apache.lucene.document.Document existingDoc, java.io.File sourceFile) throws java.lang.Exception
addFields in class XMLFileIndexingWriter
Parameters:
newDoc - The new Document that is being created for this resource.
existingDoc - An existing Document that currently resides in the index for the given resource, or null if none was previously present.
sourceFile - The source file being indexed.
Throws:
java.lang.Exception - This method should throw an Exception with an appropriate error message if an error occurs.