public abstract class ItemFileIndexingWriter extends XMLFileIndexingWriter
Document for a collection of item-level metadata records of a specific format (DLESE IMS, ADN-Item, ADN-Collection, etc.). The reader for this type of Document is XMLDocReader or ItemDocReader.
The Lucene Document fields that are created by this class (in addition to the ones listed for FileIndexingServiceWriter) are:
title - The title for the resource. Stored.
description - The description for the resource. Stored.
url - The URL to the resource. Stored.
Stored. Prefixed with a '0' to support wildcard searching.
metadatapfx - The metadata prefix (format) for this record, for example 'adn' or 'oai_dc'. Stored. Prefixed with a '0' to support wildcard searching.
accessionstatus - The accession status for this record. Stored. Prefixed with a '0' to support wildcard searching.
annotypes - Annotation types that refer to this record. Keyword.
annopathways - Annotation pathways that refer to this record. Keyword.
associatedids - A list of record IDs that refer to the same resource. Keyword.
valid - Indicates whether the record is valid [true | false]. Not stored.
validationreport - Text describing an error in the validation of the data for this record. Stored. Only indexed if there was a validation error, indicated by the valid field containing false.
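The storage rules in the field list can be illustrated with a small, self-contained sketch. It uses a plain Map as a stand-in for a Lucene Document; the class ItemFieldSketch, its methods, and the sample values are hypothetical and are not part of the DLESE API. It shows two rules from the list: metadatapfx and accessionstatus carry a leading '0', and validationreport is only added when the record is not valid. That the shared first character lets a query such as 0* match every value without a leading wildcard is an assumption about why the prefix helps.

```java
import java.util.LinkedHashMap;
import java.util.Map;

class ItemFieldSketch {

    // Prefix a value with '0' so that every indexed term shares a known first character.
    static String withWildcardPrefix(String value) {
        return "0" + value;
    }

    // Build the fields for one record. The Map is a stand-in for a Lucene Document.
    // validationReport is null when the record validated cleanly.
    static Map<String, String> buildFields(String metadataPrefix,
                                           String accessionStatus,
                                           String validationReport) {
        Map<String, String> fields = new LinkedHashMap<>();
        fields.put("metadatapfx", withWildcardPrefix(metadataPrefix));
        fields.put("accessionstatus", withWildcardPrefix(accessionStatus));

        boolean valid = (validationReport == null);
        fields.put("valid", String.valueOf(valid));
        if (!valid) {
            // The report is only present for records that failed validation.
            fields.put("validationreport", validationReport);
        }
        return fields;
    }

    public static void main(String[] args) {
        System.out.println(buildFields("adn", "accessioned", null));
        System.out.println(buildFields("oai_dc", "accessioned", "Missing required element: title"));
    }
}
```

In this sketch a valid record never gets a validationreport entry, so a lookup on that field only ever finds records with errors.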
See Also: ItemDocReader, XMLDocReader, RecordDataService, FileIndexingServiceWriter
Constructor and Description |
---|
ItemFileIndexingWriter() |
Modifier and Type | Method and Description |
---|---|
protected void | addFields(org.apache.lucene.document.Document newDoc, org.apache.lucene.document.Document existingDoc, java.io.File sourceFile) - Adds fields to the index that are common to all item-level documents. |
protected abstract void | addFrameworkFields(org.apache.lucene.document.Document newDoc, org.apache.lucene.document.Document existingDoc) - Adds fields to the index that are unique to the given framework. |
protected abstract void | destroy() - This method is called at the conclusion of processing and may be used for tear-down. |
protected abstract java.util.Date | getAccessionDate() - Returns the accession date for the item, or null if this item is not accessioned. |
protected abstract java.lang.String | getAccessionStatus() - Returns the accession status of this record, for example 'accessioned'. |
protected abstract MmdRec[] | getAllMmdRecs() - Returns the MmdRecs for all records associated with this resource, including myMmdRec. |
protected abstract MmdRec[] | getAssociatedMmdRecs() - Returns the MmdRecs for records in other collections that catalog the same resource. |
protected abstract java.lang.String | getContent() - Returns the content of the item this record catalogs, or null if not available. |
protected abstract java.lang.String | getContentType() - Returns the content type of the item this record catalogs, or null if not available. |
protected abstract java.util.Date | getCreationDate() - Returns the date this item was first created, or null if not available. |
protected abstract java.lang.String | getCreator() - Returns the item creator's full name. |
protected abstract java.lang.String | getCreatorLastName() - Returns the item creator's last name. |
abstract java.lang.String | getDocType() - Returns a unique document type key for this kind of record, corresponding to the format type. |
protected abstract boolean | getHasRelatedResource() - Returns true if the item has one or more related resources, false otherwise. |
protected abstract java.lang.String | getKeywords() - Returns the item's keywords sorted and separated by the '+' symbol. |
protected ResultDocList | getMyAnnoResultDocs() - Gets the annotations for this record, null or zero length if none available. |
protected abstract MmdRec | getMyMmdRec() - Returns the MmdRec for this record only. |
abstract java.lang.String | getReaderClass() - Gets the fully qualified name of the concrete DocReader class that is used to read this type of Document, for example "org.dlese.dpc.index.reader.ItemDocReader". |
protected abstract java.lang.String[] | getRelatedResourceIds() - Returns the IDs of related resources that are cataloged by ID, or null if none are present. |
protected abstract java.lang.String[] | getRelatedResourceUrls() - Returns the URLs of related resources that are cataloged by URL, or null if none are present. |
protected abstract java.lang.String | getValidationReport() - Gets a report detailing any errors found in the validation of the data, or null if no error was found. |
protected java.util.Date | getWhatsNewDate() - Returns the date used to determine "What's new" in the library, which is the item's accession date. |
protected java.lang.String | getWhatsNewType() - Returns 'itemnew', 'itemannoinprogress', or 'itemannocomplete', whichever came most recently. |
void | init(java.io.File source, org.apache.lucene.document.Document existingDoc) - Initialize the subclasses and record data service data. |
abstract void | initItem(java.io.File source, org.apache.lucene.document.Document existingDoc) - This method is called prior to processing and may be used for any necessary set-up. |
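The getKeywords() summary describes a return value that is sorted and '+'-separated. A minimal sketch of how an implementation might produce such a string is shown here; the class KeywordJoinSketch and the method joinKeywords are hypothetical, and both the natural String ordering and the empty-string result for missing keywords are assumptions rather than documented behavior.

```java
import java.util.Arrays;

class KeywordJoinSketch {

    // Sort the keywords and join them with '+'.
    // Returns an empty string when there are no keywords (an assumption).
    static String joinKeywords(String[] keywords) {
        if (keywords == null || keywords.length == 0) {
            return "";
        }
        String[] sorted = keywords.clone(); // leave the caller's array untouched
        Arrays.sort(sorted);                // natural (case-sensitive) String order
        return String.join("+", sorted);
    }

    public static void main(String[] args) {
        System.out.println(joinKeywords(new String[] {"ocean", "climate", "weather"}));
    }
}
```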
Methods inherited from class XMLFileIndexingWriter: _getIds, addCustomFields, getBoundingBox, getCollections, getDeletedDoc, getDescription, getDocGroup, getDom4jDoc, getFieldContent, getFieldContent, getFieldName, getIds, getIndex, getMyCollectionDoc, getOaiModtime, getPrimaryId, getRecordDataService, getRelatedIds, getRelatedIdsMap, getRelatedUrls, getRelatedUrlsMap, getTermStringFromStringArray, getTitle, getUrls, getXmlIndexer, getXmlIndexerFieldsConfig, indexFullContentInDefaultAndStems
Methods inherited from class FileIndexingServiceWriter: abortIndexing, addDocToRemove, addToAdminDefaultField, addToDefaultField, create, getConfigAttributes, getDocsource, getFileContent, getFileIndexingPlugin, getFileIndexingService, getLuceneDoc, getPreviousRecordDoc, getSessionAttributes, getSourceDir, getSourceFile, isMakingDeletedDoc, isValidationEnabled, prtln, prtlnErr, setConfigAttributes, setDebug, setFileIndexingPlugin, setFileIndexingService, setIsMakingDeletedDoc, setValidationEnabled
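The method summaries describe a set-up hook (initItem), methods that supply field data, a validation report, and a tear-down hook (destroy). The following is a minimal sketch of that template-method arrangement. SketchWriter, DemoWriter, and process are hypothetical names, a plain list stands in for the Lucene Document, and the exact call order inside the real init and addFields implementations is an assumption.

```java
import java.util.ArrayList;
import java.util.List;

abstract class SketchWriter {
    protected abstract void initItem(String source) throws Exception;
    protected abstract String getTitle() throws Exception;
    protected abstract String getValidationReport() throws Exception;
    protected abstract void destroy();

    // Final, like addFields: subclasses supply the data, the base class fixes the order.
    final List<String> process(String source) throws Exception {
        List<String> fields = new ArrayList<>();
        initItem(source);
        try {
            fields.add("title=" + getTitle());
            // The report is requested last so earlier calls can record problems.
            String report = getValidationReport();
            fields.add("valid=" + (report == null));
            if (report != null) {
                fields.add("validationreport=" + report);
            }
        } finally {
            destroy(); // tear-down runs even if a getter throws
        }
        return fields;
    }
}

class DemoWriter extends SketchWriter {
    private String title;
    private String problem;
    final List<String> calls = new ArrayList<>();

    protected void initItem(String source) {
        calls.add("initItem");
        title = source.trim();
    }

    protected String getTitle() {
        calls.add("getTitle");
        if (title.isEmpty()) {
            problem = "Title is empty"; // verification done during the getter call
        }
        return title;
    }

    protected String getValidationReport() {
        calls.add("getValidationReport");
        return problem;
    }

    protected void destroy() {
        calls.add("destroy");
    }

    public static void main(String[] args) throws Exception {
        DemoWriter writer = new DemoWriter();
        System.out.println(writer.process("  "));
        System.out.println(writer.calls);
    }
}
```

Keeping the ordering in one final method means each format-specific subclass only has to answer questions about its own record.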
protected abstract java.lang.String getKeywords() throws java.lang.Exception
java.lang.Exception - This method should throw an Exception with an appropriate error message if an error occurs.

protected abstract java.lang.String getCreatorLastName() throws java.lang.Exception
java.lang.Exception - This method should throw an Exception with an appropriate error message if an error occurs.

protected abstract java.lang.String getCreator() throws java.lang.Exception
java.lang.Exception - This method should throw an Exception with an appropriate error message if an error occurs.

protected abstract java.lang.String getAccessionStatus() throws java.lang.Exception
java.lang.Exception - This method should throw an Exception with an appropriate error message if an error occurs.

protected abstract java.util.Date getAccessionDate() throws java.lang.Exception
java.lang.Exception - This method should throw an Exception with an appropriate error message if an error occurs.

protected abstract java.util.Date getCreationDate() throws java.lang.Exception
java.lang.Exception - This method should throw an Exception with an appropriate error message if an error occurs.

protected abstract java.lang.String getContent()
protected abstract MmdRec[] getAssociatedMmdRecs()
protected abstract MmdRec[] getAllMmdRecs()
protected abstract MmdRec getMyMmdRec()
protected abstract java.lang.String getContentType()
protected abstract boolean getHasRelatedResource() throws java.lang.Exception
java.lang.Exception - This method should throw an Exception with an appropriate error message if an error occurs.

protected abstract java.lang.String[] getRelatedResourceIds() throws java.lang.Exception
java.lang.Exception - This method should throw an Exception with an appropriate error message if an error occurs.

protected abstract java.lang.String[] getRelatedResourceUrls() throws java.lang.Exception
java.lang.Exception - This method should throw an Exception with an appropriate error message if an error occurs.

protected abstract void addFrameworkFields(org.apache.lucene.document.Document newDoc, org.apache.lucene.document.Document existingDoc) throws java.lang.Exception
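Example code:
protected void addFrameworkFields(Document newDoc, Document existingDoc) throws Exception {
    String customContent = "Some content";
    // The store and index options are assumptions; this constructor form is the Lucene 2.4-3.x one.
    newDoc.add(new Field("mycustomfield", customContent, Field.Store.YES, Field.Index.ANALYZED));
}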
newDoc - The new Document that is being created for this resource
existingDoc - An existing Document that currently resides in the index for the given resource, or null if none was previously present
java.lang.Exception - This method should throw an Exception with an appropriate error message if an error occurs.

public abstract java.lang.String getDocType() throws java.lang.Exception
Returns a unique document type key for this kind of record, corresponding to the format type. The key is processed with StandardAnalyzer, so it must be lowercase and should not contain any stop words.
getDocType in interface DocWriter
getDocType in class FileIndexingServiceWriter
java.lang.Exception - This method should throw an Exception with an appropriate error message if an error occurs.

public abstract java.lang.String getReaderClass()
Gets the fully qualified name of the concrete DocReader class that is used to read this type of Document, for example "org.dlese.dpc.index.reader.ItemDocReader".
getReaderClass in interface DocWriter
getReaderClass in class FileIndexingServiceWriter
DocReader.

public abstract void initItem(java.io.File source, org.apache.lucene.document.Document existingDoc) throws java.lang.Exception
source - The source file being indexed
existingDoc - An existing Document that currently resides in the index for the given resource, or null if none was previously present
java.lang.Exception - If an error occurred during set-up.

protected abstract void destroy()
destroy in class FileIndexingServiceWriter

protected abstract java.lang.String getValidationReport() throws java.lang.Exception
Gets a report detailing any errors found in the validation of the data, or null if no error was found. This method is called after the other methods (XMLFileIndexingWriter.getTitle(), addFrameworkFields(Document, Document), etc.) so that data verification can be done during those calls, if needed.
getValidationReport in class FileIndexingServiceWriter
java.lang.Exception - If an error occurs while performing the validation.

public void init(java.io.File source, org.apache.lucene.document.Document existingDoc) throws java.lang.Exception
init in class XMLFileIndexingWriter
source - The source file being indexed.
existingDoc - A Document that previously existed in the index for this item, if present
java.lang.Exception - Thrown if there is an error reading the XML map

protected ResultDocList getMyAnnoResultDocs() throws java.lang.Exception
getMyAnnoResultDocs in class XMLFileIndexingWriter
java.lang.Exception - If an error occurs

protected final void addFields(org.apache.lucene.document.Document newDoc, org.apache.lucene.document.Document existingDoc, java.io.File sourceFile) throws java.lang.Exception
addFields in class XMLFileIndexingWriter
newDoc - The new Document that is being created for this resource
existingDoc - An existing Document that currently resides in the index for the given resource, or null if none was previously present
sourceFile - The source file that is being indexed.
java.lang.Exception - If an error occurs

protected java.util.Date getWhatsNewDate() throws java.lang.Exception
getWhatsNewDate in class XMLFileIndexingWriter
java.lang.Exception - This method should throw an Exception with an appropriate error message if an error occurs.

protected java.lang.String getWhatsNewType() throws java.lang.Exception
getWhatsNewType in class XMLFileIndexingWriter
java.lang.Exception - If an error occurs while getting the "What's new" type.
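
The getWhatsNewType() summary says the result is whichever of 'itemnew', 'itemannoinprogress', or 'itemannocomplete' came most recently. One way such a choice could be made is sketched here. The class WhatsNewSketch, the method mostRecentType, and the idea that each type has its own nullable date are assumptions; the source does not say how the annotation dates are obtained.

```java
import java.util.Date;
import java.util.LinkedHashMap;
import java.util.Map;

class WhatsNewSketch {

    // Return the type whose date is the latest, or null if no date is available.
    // A null date means that event has not happened for this item.
    static String mostRecentType(Date accessioned, Date annoInProgress, Date annoComplete) {
        Map<String, Date> candidates = new LinkedHashMap<>();
        candidates.put("itemnew", accessioned);
        candidates.put("itemannoinprogress", annoInProgress);
        candidates.put("itemannocomplete", annoComplete);

        String bestType = null;
        Date bestDate = null;
        for (Map.Entry<String, Date> entry : candidates.entrySet()) {
            Date date = entry.getValue();
            if (date != null && (bestDate == null || date.after(bestDate))) {
                bestType = entry.getKey();
                bestDate = date;
            }
        }
        return bestType;
    }

    public static void main(String[] args) {
        System.out.println(mostRecentType(new Date(1000L), new Date(3000L), new Date(2000L)));
    }
}
```

When two dates are equal, this sketch keeps the earlier entry in the map, which is an arbitrary tie-breaking choice.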