Package com.kohlschutter.boilerpipe.sax
Class BoilerpipeSAXInput
java.lang.Object
com.kohlschutter.boilerpipe.sax.BoilerpipeSAXInput
All Implemented Interfaces:
BoilerpipeInput
Parses an InputSource using SAX and returns a TextDocument.
Field Summary
Fields
Constructor Summary
Constructors
Method Summary
Modifier and Type    Method    Description
TextDocument    getTextDocument()    Retrieves the TextDocument using a default HTML parser.
TextDocument    getTextDocument(BoilerpipeHTMLParser parser)    Retrieves the TextDocument using the given HTML parser.
Field Details
is
Constructor Details
BoilerpipeSAXInput
Creates a new instance of BoilerpipeSAXInput for the given InputSource.
Parameters:
is - the InputSource to parse
Throws:
SAXException
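The constructor takes a SAX InputSource, so the caller has to build one first. The following is a minimal sketch, using only JDK classes, of wrapping an in-memory HTML string in an InputSource. The class name InputSourceSketch, the helper fromHtml, and the example URL are hypothetical names chosen for illustration; they are not part of boilerpipe.

```java
import java.io.StringReader;

import org.xml.sax.InputSource;

// Hypothetical helper class for illustration; not part of boilerpipe.
final class InputSourceSketch {

    /**
     * Wraps an HTML string in a SAX InputSource backed by a character stream.
     * The system ID is optional metadata that SAX parsers can use to resolve
     * relative URIs and to report error locations.
     */
    static InputSource fromHtml(String html, String systemId) {
        InputSource is = new InputSource(new StringReader(html));
        is.setSystemId(systemId);
        return is;
    }

    public static void main(String[] args) {
        InputSource is = fromHtml(
                "<html><body><p>Hello</p></body></html>",
                "http://example.com/");
        // The resulting object is what would be handed to
        // new BoilerpipeSAXInput(is).
        System.out.println(is.getSystemId());
    }
}
```

An InputSource built on a Reader can be consumed only once, so a fresh instance is needed for each parse.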
Method Details
getTextDocument
Retrieves the TextDocument using a default HTML parser.
Specified by:
getTextDocument in interface BoilerpipeInput
Returns:
A TextDocument.
Throws:
BoilerpipeProcessingException
getTextDocument
public TextDocument getTextDocument(BoilerpipeHTMLParser parser) throws BoilerpipeProcessingException
Retrieves the TextDocument using the given HTML parser.
Parameters:
parser - The parser used to transform the input into boilerpipe's internal representation.
Returns:
The retrieved TextDocument
Throws:
BoilerpipeProcessingException
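The two getTextDocument overloads might be used as in the following sketch. It requires the boilerpipe library on the classpath. Several details are assumptions not stated on this page: the package of TextDocument (com.kohlschutter.boilerpipe.document), the no-argument BoilerpipeHTMLParser constructor, and the TextDocument accessors getTitle() and getTextBlocks(). The class name SaxInputSketch is hypothetical.

```java
import java.io.StringReader;

import org.xml.sax.InputSource;

// Assumed package locations; adjust to the boilerpipe version in use.
import com.kohlschutter.boilerpipe.document.TextDocument;
import com.kohlschutter.boilerpipe.sax.BoilerpipeHTMLParser;
import com.kohlschutter.boilerpipe.sax.BoilerpipeSAXInput;

// Hypothetical demo class for illustration.
final class SaxInputSketch {

    public static void main(String[] args) throws Exception {
        String html = "<html><head><title>Demo</title></head>"
                + "<body><p>Some paragraph text.</p></body></html>";

        // Overload 1: let BoilerpipeSAXInput use its default HTML parser.
        TextDocument doc =
                new BoilerpipeSAXInput(new InputSource(new StringReader(html)))
                        .getTextDocument();
        System.out.println("Title: " + doc.getTitle());
        System.out.println("Blocks: " + doc.getTextBlocks().size());

        // Overload 2: supply the parser explicitly. A fresh InputSource is
        // created because the first reader has already been consumed.
        BoilerpipeHTMLParser parser = new BoilerpipeHTMLParser();
        TextDocument doc2 =
                new BoilerpipeSAXInput(new InputSource(new StringReader(html)))
                        .getTextDocument(parser);
        System.out.println("Blocks: " + doc2.getTextBlocks().size());
    }
}
```

The main method declares throws Exception to cover both the SAXException from the constructor and the BoilerpipeProcessingException from getTextDocument; production code would likely handle them separately.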