class |
ArticleExtractor |
A full-text extractor which is tuned towards news articles.
|
class |
ArticleSentencesExtractor |
A full-text extractor which is tuned towards extracting sentences from news articles.
|
class |
CanolaExtractor |
|
class |
DefaultExtractor |
A quite generic full-text extractor.
|
class |
KeepEverythingExtractor |
Marks everything as content.
|
class |
KeepEverythingWithMinKWordsExtractor |
A full-text extractor which marks all text blocks with at least k words as content.
|
class |
LargestContentExtractor |
A full-text extractor which extracts the largest text component of a page.
|
class |
NumWordsRulesExtractor |
A quite generic full-text extractor based solely on the number of words per block (the current, the previous and the next block).
|
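The NumWordsRulesExtractor entry above describes a decision that looks only at word counts in the current, previous and next block. The idea can be illustrated with a minimal, self-contained sketch. Everything in it is an assumption made for illustration: the class name `WordCountRuleSketch`, both thresholds and the rule itself are hypothetical and are not the rules the library actually applies.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only; not part of the library.
final class WordCountRuleSketch {

    // Assumed thresholds, chosen only to make the example concrete.
    static final int MIN_WORDS_CURRENT = 15;
    static final int MIN_WORDS_NEIGHBOR = 10;

    /** Counts whitespace-separated tokens in a block of text. */
    static int countWords(String block) {
        String trimmed = block.trim();
        return trimmed.isEmpty() ? 0 : trimmed.split("\\s+").length;
    }

    /**
     * Marks each block as content (true) or boilerplate (false) using only
     * the word counts of the current, the previous and the next block.
     */
    static List<Boolean> classify(List<String> blocks) {
        List<Boolean> isContent = new ArrayList<>();
        for (int i = 0; i < blocks.size(); i++) {
            // Missing neighbours at either end count as zero words.
            int prev = i > 0 ? countWords(blocks.get(i - 1)) : 0;
            int curr = countWords(blocks.get(i));
            int next = i + 1 < blocks.size() ? countWords(blocks.get(i + 1)) : 0;

            // A long block is content on its own; a shorter, non-empty block
            // is kept only when both neighbours are reasonably long.
            boolean content = curr >= MIN_WORDS_CURRENT
                    || (curr > 0
                        && prev >= MIN_WORDS_NEIGHBOR
                        && next >= MIN_WORDS_NEIGHBOR);
            isContent.add(content);
        }
        return isContent;
    }
}
```

In this sketch a short block between two long paragraphs is kept, while a short navigation or footer line at the edge of the page is dropped, because one of its neighbours is missing.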