Class TerminatingBlocksFinder
java.lang.Object
com.kohlschutter.boilerpipe.filters.english.TerminatingBlocksFinder
- All Implemented Interfaces:
BoilerpipeFilter
Finds blocks that potentially indicate the end of an article's text and marks them with
DefaultLabels.INDICATES_END_OF_TEXT. This can be used in conjunction with a downstream
IgnoreBlocksAfterContentFilter.
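The two-stage pattern described above (label first, truncate downstream) could be sketched without the library as follows. Everything in this block is a stand-in invented for illustration: the `Block` class, the `markTerminatingBlocks` and `ignoreBlocksAfterEnd` methods, and the "comments" prefix heuristic are assumptions, not boilerpipe's actual types or rules.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

final class EndOfTextSketch {

    // Hypothetical stand-in for a text block; not the library's own block type.
    static final class Block {
        final String text;
        boolean endOfText;      // stand-in for the INDICATES_END_OF_TEXT label
        boolean content = true; // whether the block still counts as article content

        Block(String text) {
            this.text = text;
        }
    }

    // Stage 1: label blocks that look like the end of the article.
    // The "comments" prefix is an illustrative heuristic chosen for this sketch.
    static boolean markTerminatingBlocks(List<Block> blocks) {
        boolean changed = false;
        for (Block b : blocks) {
            String normalized = b.text.trim().toLowerCase(Locale.ROOT);
            if (normalized.startsWith("comments")) {
                b.endOfText = true;
                changed = true;
            }
        }
        return changed;
    }

    // Stage 2: treat the first labeled block and everything after it as non-content.
    static boolean ignoreBlocksAfterEnd(List<Block> blocks) {
        boolean changed = false;
        boolean seenEnd = false;
        for (Block b : blocks) {
            if (b.endOfText) {
                seenEnd = true;
            }
            if (seenEnd && b.content) {
                b.content = false;
                changed = true;
            }
        }
        return changed;
    }

    // Collects the text of all blocks that are still marked as content.
    static List<String> contentOf(List<Block> blocks) {
        List<String> result = new ArrayList<>();
        for (Block b : blocks) {
            if (b.content) {
                result.add(b.text);
            }
        }
        return result;
    }
}
```

Keeping the labeling and the truncation in separate stages means other filters can run between them and inspect or override the label before any block is discarded.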
Field Summary
Fields -
Constructor Summary
Constructors -
Method Summary
Modifier and Type    Method    Description
static TerminatingBlocksFinder    getInstance()    Returns the singleton instance for TerminatingBlocksFinder.
private static boolean    isDigit(char c)
boolean    process(TextDocument doc)    Processes the given document doc.
private static boolean    startsWithNumber(String t, int len, String... str)    Checks whether the given text t starts with a sequence of digits, followed by one of the given strings.
-
Field Details
-
INSTANCE
-
-
Constructor Details
-
TerminatingBlocksFinder
public TerminatingBlocksFinder()
-
-
Method Details
-
getInstance
Returns the singleton instance for TerminatingBlocksFinder.
process
Description copied from interface: BoilerpipeFilter
Processes the given document doc.
- Specified by:
process in interface BoilerpipeFilter
- Parameters:
doc - The TextDocument that is to be processed.
- Returns:
true if changes have been made to the TextDocument.
- Throws:
BoilerpipeProcessingException
-
startsWithNumber
Checks whether the given text t starts with a sequence of digits, followed by one of the given strings.
- Parameters:
t - The text to examine
len - The length of the text to examine
str - Any strings that may follow the digits.
- Returns:
true if at least one combination matches
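One way a check matching this description might be written is sketched below. It is an illustration of the documented contract, not the library's source: the class name `StartsWithNumberSketch` is hypothetical, and the sketch assumes that "digit" means an ASCII digit, that `len` is no greater than `t.length()`, and that `len` only bounds the digit scan.

```java
final class StartsWithNumberSketch {

    // Assumption: only the ASCII digits '0' through '9' count as digits.
    static boolean isDigit(char c) {
        return c >= '0' && c <= '9';
    }

    // Returns true if t begins with one or more digits that are immediately
    // followed by any of the candidate strings in str.
    // Assumption: len <= t.length(); len only limits how far digits are scanned.
    static boolean startsWithNumber(String t, int len, String... str) {
        int j = 0;
        while (j < len && isDigit(t.charAt(j))) {
            j++;
        }
        if (j == 0) {
            return false; // the text does not start with a digit
        }
        for (String s : str) {
            if (t.startsWith(s, j)) {
                return true; // digits followed by this candidate
            }
        }
        return false;
    }

    public static void main(String[] args) {
        String text = "25 comments";
        // prints "true": two digits followed by " comments"
        System.out.println(startsWithNumber(text, text.length(), " comments", " users responded in"));
        // prints "false": no leading digits
        System.out.println(startsWithNumber("comments", 8, " comments"));
    }
}
```

Because the candidate strings are compared starting at the first non-digit position, a leading space in a candidate such as `" comments"` has to be present in the text as well.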
-
isDigit
private static boolean isDigit(char c)
-