Class MinFulltextWordsFilter
java.lang.Object
com.kohlschutter.boilerpipe.filters.english.HeuristicFilterBase
com.kohlschutter.boilerpipe.filters.english.MinFulltextWordsFilter
All Implemented Interfaces:
BoilerpipeFilter
Keeps only those content blocks that contain at least k full-text words (as measured by
HeuristicFilterBase.getNumFullTextWords(TextBlock)). By default, k is 30.
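The rule described above can be sketched in a few lines. This is a minimal, self-contained illustration and not the library's source: `MinWordsSketch`, `Block`, and `DEFAULT_MIN_WORDS` are hypothetical stand-ins, and each block's full-text word count is supplied directly because this page does not say how HeuristicFilterBase.getNumFullTextWords(TextBlock) computes it. The sketch also assumes that blocks already marked as non-content are left untouched.

```java
import java.util.ArrayList;
import java.util.List;

class MinWordsSketch {

    /** Hypothetical stand-in for a text block; not the boilerpipe TextBlock class. */
    static final class Block {
        final String label;
        final int fullTextWords; // assumed to be precomputed by some heuristic
        boolean isContent;

        Block(String label, int fullTextWords, boolean isContent) {
            this.label = label;
            this.fullTextWords = fullTextWords;
            this.isContent = isContent;
        }
    }

    /** Mirrors the documented default of k = 30. */
    static final int DEFAULT_MIN_WORDS = 30;

    /**
     * Demotes every content block with fewer than minWords full-text words.
     * Returns true if at least one block was changed, following the
     * "true if changes have been made" contract documented for process.
     */
    static boolean process(List<Block> blocks, int minWords) {
        boolean changed = false;
        for (Block b : blocks) {
            // Assumption: only blocks currently marked as content are examined.
            if (b.isContent && b.fullTextWords < minWords) {
                b.isContent = false;
                changed = true;
            }
        }
        return changed;
    }

    public static void main(String[] args) {
        List<Block> blocks = new ArrayList<>();
        blocks.add(new Block("article body", 120, true));
        blocks.add(new Block("caption", 8, true));
        blocks.add(new Block("navigation", 45, false));

        boolean changed = process(blocks, DEFAULT_MIN_WORDS);
        System.out.println("changed = " + changed);
        for (Block b : blocks) {
            System.out.println(b.label + " -> " + (b.isContent ? "content" : "boilerplate"));
        }
    }
}
```

In this sketch the 8-word caption is demoted while the 120-word body stays content. A second call over the same list returns false, since no block remains to be changed.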
Field Summary
Fields: DEFAULT_INSTANCE, minWords
Constructor Summary
Constructors: MinFulltextWordsFilter(int minWords)
Method Summary
Modifier and Type | Method | Description
static MinFulltextWordsFilter | getDefaultInstance() |
boolean | process(TextDocument doc) | Processes the given document doc.
Methods inherited from class HeuristicFilterBase
getNumFullTextWords, getNumFullTextWords
Field Details
DEFAULT_INSTANCE
minWords
private final int minWords
Constructor Details
MinFulltextWordsFilter
public MinFulltextWordsFilter(int minWords)
Method Details
getDefaultInstance
process
Description copied from interface: BoilerpipeFilter
Processes the given document doc.
Specified by:
process in interface BoilerpipeFilter
Parameters:
doc - The TextDocument that is to be processed.
Returns:
true if changes have been made to the TextDocument.
Throws:
BoilerpipeProcessingException