Uses of Class
com.kohlschutter.boilerpipe.document.TextDocument
Packages that use TextDocument
Package    Description
com.kohlschutter.boilerpipe    The Boilerpipe top-level package.
com.kohlschutter.boilerpipe.document    The Boilerpipe document model.
com.kohlschutter.boilerpipe.extractors    Some standard extractors (i.e., completely piped BoilerpipeFilters).
com.kohlschutter.boilerpipe.filters.debug
com.kohlschutter.boilerpipe.filters.english    These BoilerpipeFilters have only been tested on English text.
com.kohlschutter.boilerpipe.filters.heuristics    These BoilerpipeFilters are pure heuristics.
com.kohlschutter.boilerpipe.filters.simple    These BoilerpipeFilters are straightforward and probably not really specific to English.
com.kohlschutter.boilerpipe.sax    Classes related to parsing and producing HTML from/to Boilerpipe TextDocuments.
Uses of TextDocument in com.kohlschutter.boilerpipe
Methods in com.kohlschutter.boilerpipe that return TextDocument
Modifier and Type    Method    Description
TextDocument    BoilerpipeInput.getTextDocument()    Returns (somehow) a TextDocument.
TextDocument    BoilerpipeDocumentSource.toTextDocument()

Methods in com.kohlschutter.boilerpipe with parameters of type TextDocument
Modifier and Type    Method    Description
java.lang.String    BoilerpipeExtractor.getText(TextDocument doc)    Extracts text from the given TextDocument object.
boolean    BoilerpipeFilter.process(TextDocument doc)    Processes the given document doc.
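The boolean-returning process(TextDocument doc) method listed above is the shape shared by every filter and extractor on this page. The following is a minimal, stand-alone sketch of how such filters could be chained. It does not use the Boilerpipe classes: SketchDocument, SketchFilter, dropBlank, minWords and runAll are hypothetical stand-ins invented for illustration, and the sketch assumes that a return value of true means "the document was changed".

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-in for TextDocument: just a mutable list of text blocks.
class SketchDocument {
    final List<String> blocks = new ArrayList<>();

    SketchDocument(List<String> initial) {
        blocks.addAll(initial);
    }
}

// Hypothetical stand-in for BoilerpipeFilter, with the same method shape.
// Assumption: the return value is true if the document was changed.
interface SketchFilter {
    boolean process(SketchDocument doc);
}

class FilterChainSketch {
    // Counts whitespace-separated words; a blank block has zero words.
    static int countWords(String block) {
        String trimmed = block.trim();
        return trimmed.isEmpty() ? 0 : trimmed.split("\\s+").length;
    }

    // Illustrative filter: removes blocks that contain only whitespace.
    static SketchFilter dropBlank() {
        return doc -> doc.blocks.removeIf(b -> b.trim().isEmpty());
    }

    // Illustrative filter: removes blocks with fewer than minWords words.
    static SketchFilter minWords(int minWords) {
        return doc -> doc.blocks.removeIf(b -> countWords(b) < minWords);
    }

    // Runs every filter in order. The non-short-circuit |= makes sure each
    // filter runs even after an earlier one has already reported a change.
    static boolean runAll(SketchDocument doc, List<SketchFilter> filters) {
        boolean changed = false;
        for (SketchFilter filter : filters) {
            changed |= filter.process(doc);
        }
        return changed;
    }

    public static void main(String[] args) {
        SketchDocument doc = new SketchDocument(Arrays.asList(
                "Home About Login",
                "   ",
                "This is the main article text of the page."));
        List<SketchFilter> chain = Arrays.asList(dropBlank(), minWords(5));

        System.out.println("changed: " + runAll(doc, chain));
        System.out.println("blocks:  " + doc.blocks);
    }
}
```

In this sketch a second call to runAll on the same document reports no change, which is what makes a boolean "changed" flag useful for deciding whether another pass is worthwhile.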
Uses of TextDocument in com.kohlschutter.boilerpipe.document
Methods in com.kohlschutter.boilerpipe.document that return TextDocument
Modifier and Type    Method    Description
TextDocument    TextDocument.clone()

Constructors in com.kohlschutter.boilerpipe.document with parameters of type TextDocument
Constructor    Description
TextDocumentStatistics(TextDocument doc, boolean contentOnly)    Computes statistics on a given TextDocument.
Uses of TextDocument in com.kohlschutter.boilerpipe.extractors
Methods in com.kohlschutter.boilerpipe.extractors with parameters of type TextDocument
Modifier and Type    Method    Description
java.lang.String    ExtractorBase.getText(TextDocument doc)    Extracts text from the given TextDocument object.
boolean    ArticleExtractor.process(TextDocument doc)
boolean    ArticleSentencesExtractor.process(TextDocument doc)
boolean    CanolaExtractor.process(TextDocument doc)
boolean    DefaultExtractor.process(TextDocument doc)
boolean    KeepEverythingExtractor.process(TextDocument doc)
boolean    KeepEverythingWithMinKWordsExtractor.process(TextDocument doc)
boolean    LargestContentExtractor.process(TextDocument doc)
boolean    NumWordsRulesExtractor.process(TextDocument doc)
Uses of TextDocument in com.kohlschutter.boilerpipe.filters.debug
Methods in com.kohlschutter.boilerpipe.filters.debug with parameters of type TextDocument
Modifier and Type    Method    Description
boolean    PrintDebugFilter.process(TextDocument doc)
Uses of TextDocument in com.kohlschutter.boilerpipe.filters.english
Methods in com.kohlschutter.boilerpipe.filters.english with parameters of type TextDocument
Modifier and Type    Method    Description
boolean    DensityRulesClassifier.process(TextDocument doc)
boolean    IgnoreBlocksAfterContentFilter.process(TextDocument doc)
boolean    IgnoreBlocksAfterContentFromEndFilter.process(TextDocument doc)
boolean    KeepLargestFulltextBlockFilter.process(TextDocument doc)
boolean    MinFulltextWordsFilter.process(TextDocument doc)
boolean    NumWordsRulesClassifier.process(TextDocument doc)
boolean    TerminatingBlocksFinder.process(TextDocument doc)
Uses of TextDocument in com.kohlschutter.boilerpipe.filters.heuristics
Methods in com.kohlschutter.boilerpipe.filters.heuristics with parameters of type TextDocument
Modifier and Type    Method    Description
boolean    AddPrecedingLabelsFilter.process(TextDocument doc)
boolean    ArticleMetadataFilter.process(TextDocument doc)
boolean    BlockProximityFusion.process(TextDocument doc)
boolean    ContentFusion.process(TextDocument doc)
boolean    DocumentTitleMatchClassifier.process(TextDocument doc)
boolean    ExpandTitleToContentFilter.process(TextDocument doc)
boolean    KeepLargestBlockFilter.process(TextDocument doc)
boolean    LabelFusion.process(TextDocument doc)
boolean    LargeBlockSameTagLevelToContentFilter.process(TextDocument doc)
boolean    ListAtEndFilter.process(TextDocument doc)
boolean    SimpleBlockFusionProcessor.process(TextDocument doc)
boolean    TrailingHeadlineToBoilerplateFilter.process(TextDocument doc)
Uses of TextDocument in com.kohlschutter.boilerpipe.filters.simple
Methods in com.kohlschutter.boilerpipe.filters.simple with parameters of type TextDocument
Modifier and Type    Method    Description
boolean    BoilerplateBlockFilter.process(TextDocument doc)
boolean    InvertedFilter.process(TextDocument doc)
boolean    LabelToBoilerplateFilter.process(TextDocument doc)
boolean    LabelToContentFilter.process(TextDocument doc)
boolean    MarkEverythingBoilerplateFilter.process(TextDocument doc)
boolean    MarkEverythingContentFilter.process(TextDocument doc)
boolean    MinClauseWordsFilter.process(TextDocument doc)
boolean    MinWordsFilter.process(TextDocument doc)
boolean    SplitParagraphBlocksFilter.process(TextDocument doc)
boolean    SurroundingToContentFilter.process(TextDocument doc)
Uses of TextDocument in com.kohlschutter.boilerpipe.sax
Methods in com.kohlschutter.boilerpipe.sax that return TextDocument
Modifier and Type    Method    Description
TextDocument    BoilerpipeSAXInput.getTextDocument()    Retrieves the TextDocument using a default HTML parser.
TextDocument    BoilerpipeSAXInput.getTextDocument(BoilerpipeHTMLParser parser)    Retrieves the TextDocument using the given HTML parser.
TextDocument    BoilerpipeHTMLContentHandler.toTextDocument()    Returns a TextDocument containing the extracted TextBlocks.
TextDocument    BoilerpipeHTMLParser.toTextDocument()    Returns a TextDocument containing the extracted TextBlocks.

Methods in com.kohlschutter.boilerpipe.sax with parameters of type TextDocument
Modifier and Type    Method    Description
(package private) void    HTMLHighlighter.Implementation.process(TextDocument doc, org.xml.sax.InputSource is)
java.lang.String    HTMLHighlighter.process(TextDocument doc, java.lang.String origHTML)    Processes the given TextDocument and the original HTML text (as a String).
java.lang.String    HTMLHighlighter.process(TextDocument doc, org.xml.sax.InputSource is)    Processes the given TextDocument and the original HTML text (as an InputSource).
(package private) void    ImageExtractor.Implementation.process(TextDocument doc, org.xml.sax.InputSource is)
java.util.List<Image>    ImageExtractor.process(TextDocument doc, java.lang.String origHTML)    Processes the given TextDocument and the original HTML text (as a String).
java.util.List<Image>    ImageExtractor.process(TextDocument doc, org.xml.sax.InputSource is)    Processes the given TextDocument and the original HTML text (as an InputSource).