Class MinClauseWordsFilter
java.lang.Object
com.kohlschutter.boilerpipe.filters.simple.MinClauseWordsFilter
- All Implemented Interfaces:
BoilerpipeFilter
Keeps only blocks that have at least one segment fragment ("clause") with at least k
words (default: 5).
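The rule above can be illustrated with a small standalone sketch. It does not use the boilerpipe classes: `ClauseSketch`, `countWords`, `hasClause` and the delimiter pattern are all hypothetical names and assumptions made for illustration. In particular, the sketch assumes that a clause ends at a run of `, . : ; ! ?` following a letter or digit, that words are separated by whitespace, and that the `acceptWithoutDelimiter` flag decides whether trailing text with no closing delimiter may count as a clause.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class ClauseSketch {
    // Assumption: a clause ends at one or more of , . : ; ! ? that directly follow
    // a letter or digit and are themselves followed by whitespace or end of text.
    private static final Pattern CLAUSE_DELIMITER =
            Pattern.compile("[\\p{L}\\d][,.:;!?]+(\\s+|$)");

    /** Counts whitespace-separated tokens; an empty or blank fragment has zero words. */
    public static int countWords(CharSequence text) {
        String trimmed = text.toString().trim();
        return trimmed.isEmpty() ? 0 : trimmed.split("\\s+").length;
    }

    /** Returns true if at least one clause in the text has at least minWords words. */
    public static boolean hasClause(String text, int minWords, boolean acceptWithoutDelimiter) {
        Matcher m = CLAUSE_DELIMITER.matcher(text);
        int start = 0;
        while (m.find()) {
            // m.start() points at the last letter or digit before the delimiter,
            // so the clause runs up to and including that character.
            if (countWords(text.subSequence(start, m.start() + 1)) >= minWords) {
                return true;
            }
            start = m.end();
        }
        // Trailing text without a closing delimiter only counts if explicitly allowed.
        return acceptWithoutDelimiter
                && countWords(text.subSequence(start, text.length())) >= minWords;
    }

    public static void main(String[] args) {
        System.out.println(hasClause("The quick brown fox jumps over the dog.", 5, false));
        System.out.println(hasClause("Read more.", 5, false));
        System.out.println(hasClause("Copyright 2011, all rights reserved by us", 5, true));
    }
}
```

Under these assumptions, short navigation-style fragments such as "Read more." fail the check, while ordinary sentences pass it.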
NOTE: You might consider using the SplitParagraphBlocksFilter upstream.
- See Also:
-
Field Summary
Fields (Modifier and Type, Field):
private final boolean acceptClausesWithoutDelimiter
static final MinClauseWordsFilter INSTANCE
private int minWords
private final Pattern PAT_CLAUSE_DELIMITER
private final Pattern PAT_WHITESPACE
-
Constructor Summary
Constructors:
MinClauseWordsFilter(int minWords)
MinClauseWordsFilter(int minWords, boolean acceptClausesWithoutDelimiter)
-
Method Summary
Methods (Modifier and Type, Method, Description):
private boolean isClause(CharSequence text)
boolean process(TextDocument doc): Processes the given document doc.
-
Field Details
-
INSTANCE
-
minWords
private int minWords
-
acceptClausesWithoutDelimiter
private final boolean acceptClausesWithoutDelimiter
-
PAT_CLAUSE_DELIMITER
-
PAT_WHITESPACE
-
-
Constructor Details
-
MinClauseWordsFilter
public MinClauseWordsFilter(int minWords)
-
MinClauseWordsFilter
public MinClauseWordsFilter(int minWords, boolean acceptClausesWithoutDelimiter)
-
-
Method Details
-
process
Description copied from interface: BoilerpipeFilter
Processes the given document doc.
- Specified by:
process in interface BoilerpipeFilter
- Parameters:
doc - The TextDocument that is to be processed.
- Returns:
true if changes have been made to the TextDocument.
- Throws:
BoilerpipeProcessingException
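The return contract described above (true only if the document was changed) can be sketched without the library. Everything in this example is hypothetical: `ProcessContractSketch` and `Block` are stand-ins invented for illustration, not the boilerpipe `TextDocument` or `TextBlock` API, and the sketch assumes that a rejected block is flagged as non-content rather than deleted.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Predicate;

public class ProcessContractSketch {
    /** Hypothetical stand-in for a text block; not the boilerpipe API. */
    public static final class Block {
        public final String text;
        public boolean content = true;

        public Block(String text) {
            this.text = text;
        }
    }

    /**
     * Flags every content block that fails the predicate as non-content.
     * Returns true only if at least one block was changed.
     */
    public static boolean process(List<Block> blocks, Predicate<String> keep) {
        boolean changed = false;
        for (Block b : blocks) {
            if (b.content && !keep.test(b.text)) {
                b.content = false;
                changed = true;
            }
        }
        return changed;
    }

    public static void main(String[] args) {
        List<Block> blocks = new ArrayList<>();
        blocks.add(new Block("Read more."));
        blocks.add(new Block("This paragraph clearly has more than five words."));

        // Simplified stand-in for the clause check: at least five whitespace-separated words.
        Predicate<String> atLeastFiveWords = t -> t.trim().split("\\s+").length >= 5;

        System.out.println(process(blocks, atLeastFiveWords)); // first pass flags a block
        System.out.println(process(blocks, atLeastFiveWords)); // second pass finds nothing to change
    }
}
```

A boolean result of this kind lets a caller that chains several filters tell whether a given pass had any effect on the document.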
-
isClause
-