Class XMLDocumentEnum


public final class XMLDocumentEnum
extends DocumentEnum

An implementation of DocumentEnum that parses XML files for indexing by the Lucene search engine. There is no need to instantiate this class directly; XMLBackEndLSP uses it automatically. It reads the doc_style.xsl file to learn how to extract fields from each document. The SDARTS Design Document describes the process in more detail.

XSL processing is currently carried out by the Apache Xalan XSL processor, and all Xalan-related code is confined to this class. A future version could hide the Xalan code behind a separate interface to make it easier to switch to another XSL processor.
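As an illustration of stylesheet-driven field extraction, here is a minimal sketch that uses the standard JAXP javax.xml.transform API rather than calling Xalan directly. It is not the SDARTS implementation: the class name XslFieldExtractorSketch, the sample stylesheet, and the name=value output format are all assumptions made for this example, and the real doc_style.xsl format is described only in the SDARTS Design Document.

```java
import java.io.StringReader;
import java.io.StringWriter;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

// Hypothetical helper, not part of SDARTS.
final class XslFieldExtractorSketch {

    // Hypothetical stylesheet: emits one "name=value" line per field.
    // This is NOT the format of the real doc_style.xsl.
    static final String STYLE =
        "<xsl:stylesheet version=\"1.0\""
      + " xmlns:xsl=\"http://www.w3.org/1999/XSL/Transform\">"
      + "<xsl:output method=\"text\"/>"
      + "<xsl:template match=\"/article\">"
      + "title=<xsl:value-of select=\"title\"/>"
      + "&#10;author=<xsl:value-of select=\"author\"/>"
      + "</xsl:template>"
      + "</xsl:stylesheet>";

    // Applies the stylesheet to the XML text and parses the resulting
    // "name=value" lines into an ordered map of field names to values.
    static Map<String, String> extractFields(String xml, String stylesheet) {
        try {
            // JAXP selects the XSL processor, so no processor-specific
            // classes appear in this code.
            Transformer transformer = TransformerFactory.newInstance()
                .newTransformer(new StreamSource(new StringReader(stylesheet)));
            StringWriter out = new StringWriter();
            transformer.transform(
                new StreamSource(new StringReader(xml)), new StreamResult(out));

            Map<String, String> fields = new LinkedHashMap<>();
            for (String line : out.toString().split("\\R")) {
                int eq = line.indexOf('=');
                if (eq > 0) {
                    fields.put(line.substring(0, eq).trim(),
                               line.substring(eq + 1).trim());
                }
            }
            return fields;
        } catch (TransformerException e) {
            throw new IllegalStateException("XSL transformation failed", e);
        }
    }
}
```

Calling extractFields with the XML text <article><title>Hello</title><author>Ann</author></article> and STYLE yields a map with the entries title and author. Because the code depends only on JAXP interfaces, a different XSL processor could in principle be substituted without changing it, which is one possible way to approach the decoupling mentioned above.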

Fields inherited from class edu.columbia.cs.sdarts.backend.doc.lucene.DocumentEnum
Constructor Summary
Method Summary
 com.lucene.document.Document createDocument(File f, org.omg.CORBA.IntHolder storeTokenCountHere)
          Builds a Lucene Document from an XML file
Methods inherited from class edu.columbia.cs.sdarts.backend.doc.lucene.DocumentEnum
getDocConfig, getDocuments, initialize, isEmpty, makeValue, parseDate
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait

Constructor Detail


public XMLDocumentEnum()
Method Detail


public com.lucene.document.Document createDocument(File f,
                                                   org.omg.CORBA.IntHolder storeTokenCountHere)
                                            throws BackEndException
Builds a Lucene Document from an XML file.
Overrides:
createDocument in class DocumentEnum
Following copied from class: edu.columbia.cs.sdarts.backend.doc.lucene.DocumentEnum
Parameters:
file - the File to turn into a Lucene Document
storeTokenCountHere - an OUT parameter; an implementor of this method should write the number of tokens in the file into the value field of this IntHolder
Returns:
a Lucene Document generated from the file
Throws:
BackEndException - if something goes wrong
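
The OUT-parameter convention for storeTokenCountHere can be sketched as follows. This is an illustration only: IntHolderSketch is a hypothetical stand-in for org.omg.CORBA.IntHolder (the CORBA packages were removed from the JDK in Java 11), and TokenCountSketch.process is an invented method that takes plain text instead of a File and returns a String instead of a Lucene Document. The whitespace-based token count is also an assumption; the source does not say how SDARTS counts tokens.

```java
import java.util.StringTokenizer;

// Hypothetical stand-in for org.omg.CORBA.IntHolder: a mutable int wrapper
// with a public value field that a callee can write into.
final class IntHolderSketch {
    int value;
}

// Hypothetical example class, not part of SDARTS.
final class TokenCountSketch {

    // Returns the trimmed text as the main result and reports the number of
    // whitespace-separated tokens through the holder, mirroring how an
    // implementor of createDocument is expected to fill in the token count.
    static String process(String text, IntHolderSketch storeTokenCountHere) {
        StringTokenizer tokens = new StringTokenizer(text);
        storeTokenCountHere.value = tokens.countTokens(); // the OUT value
        return text.trim();
    }
}
```

A caller creates the holder, passes it in, and reads holder.value after the call returns; the holder is how the method hands back a second result alongside its return value.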

