Class MinClauseWordsFilter

java.lang.Object
com.kohlschutter.boilerpipe.filters.simple.MinClauseWordsFilter
All Implemented Interfaces:
BoilerpipeFilter

public final class MinClauseWordsFilter extends Object implements BoilerpipeFilter
Keeps only blocks that contain at least one text fragment ("clause") with at least k words (default: 5). NOTE: Consider applying SplitParagraphBlocksFilter upstream of this filter.
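
The clause test described above can be illustrated with a small standalone sketch. It does not use boilerpipe and is not the library's implementation: the class name ClauseWordsSketch, the method names, and both regular expressions are assumptions made for illustration. It assumes a clause ends at punctuation followed by whitespace or end of text, and that trailing text without a delimiter counts only when a flag such as acceptClausesWithoutDelimiter is set.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Illustrative sketch only; names and patterns are hypothetical, not boilerpipe's code.
final class ClauseWordsSketch {
  // Assumed delimiter: a letter or digit, then punctuation, then whitespace or end of text.
  private static final Pattern CLAUSE_DELIMITER =
      Pattern.compile("[\\p{L}\\d][,.:;!?]+(\\s+|$)");
  private static final Pattern WHITESPACE = Pattern.compile("\\s+");

  // Counts whitespace-separated words in a fragment.
  static int countWords(CharSequence text) {
    String trimmed = text.toString().trim();
    if (trimmed.isEmpty()) {
      return 0;
    }
    return WHITESPACE.split(trimmed).length;
  }

  // Returns true if any delimited clause has at least minWords words.
  // Trailing text without a delimiter is checked only if acceptWithoutDelimiter is true.
  static boolean hasLongClause(String text, int minWords, boolean acceptWithoutDelimiter) {
    Matcher m = CLAUSE_DELIMITER.matcher(text);
    int start = 0;
    while (m.find()) {
      int end = m.start() + 1; // keep the character just before the punctuation
      if (countWords(text.subSequence(start, end)) >= minWords) {
        return true;
      }
      start = m.end();
    }
    return acceptWithoutDelimiter
        && countWords(text.subSequence(start, text.length())) >= minWords;
  }

  public static void main(String[] args) {
    // Navigation-style text: every clause is shorter than 5 words.
    System.out.println(hasLongClause("Home, News, Contact us.", 5, false)); // false
    // A full sentence forms one clause of 8 words.
    System.out.println(hasLongClause("The quick brown fox jumps over the dog.", 5, false)); // true
    // No delimiter at all: the result depends on the flag.
    System.out.println(hasLongClause("Read the full story on our site", 5, false)); // false
    System.out.println(hasLongClause("Read the full story on our site", 5, true)); // true
  }
}
```

Under these assumptions, link lists and short captions are rejected while ordinary sentences pass, which is the kind of separation the filter description suggests.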
See Also:
SplitParagraphBlocksFilter
  • Field Details

    • INSTANCE

      public static final MinClauseWordsFilter INSTANCE
    • minWords

      private int minWords
    • acceptClausesWithoutDelimiter

      private final boolean acceptClausesWithoutDelimiter
    • PAT_CLAUSE_DELIMITER

      private final Pattern PAT_CLAUSE_DELIMITER
    • PAT_WHITESPACE

      private final Pattern PAT_WHITESPACE
  • Constructor Details

    • MinClauseWordsFilter

      public MinClauseWordsFilter(int minWords)
    • MinClauseWordsFilter

      public MinClauseWordsFilter(int minWords, boolean acceptClausesWithoutDelimiter)
  • Method Details