Class TerminatingBlocksFinder
java.lang.Object
com.kohlschutter.boilerpipe.filters.english.TerminatingBlocksFinder
All Implemented Interfaces:
BoilerpipeFilter
public class TerminatingBlocksFinder extends java.lang.Object implements BoilerpipeFilter
Finds blocks that potentially indicate the end of an article's text and marks them with DefaultLabels.INDICATES_END_OF_TEXT. This can be used in conjunction with a downstream IgnoreBlocksAfterContentFilter.
See Also:
IgnoreBlocksAfterContentFilter
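The interplay between marking and the downstream filter can be illustrated without the library. The following is a minimal, self-contained model and not boilerpipe's code: `Block`, `markTerminatingBlocks`, `ignoreBlocksAfterEnd`, the string label, and the two trigger phrases are hypothetical stand-ins chosen for illustration. The real downstream filter may apply further conditions before it discards blocks.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

final class EndOfTextSketch {
    // Stand-in for DefaultLabels.INDICATES_END_OF_TEXT (assumption: modeled as a plain string).
    static final String END_OF_TEXT = "INDICATES_END_OF_TEXT";

    // Hypothetical, simplified text block: text, a content flag, and a set of labels.
    static final class Block {
        final String text;
        boolean content;
        final Set<String> labels = new HashSet<>();

        Block(String text, boolean content) {
            this.text = text;
            this.content = content;
        }
    }

    // Stage 1: label blocks that look like the end of an article.
    // The phrases below are illustrative assumptions, not the library's list.
    static boolean markTerminatingBlocks(List<Block> blocks) {
        boolean changed = false;
        for (Block b : blocks) {
            String t = b.text.trim().toLowerCase(Locale.ROOT);
            if (t.startsWith("post a comment") || t.startsWith("comments")) {
                changed |= b.labels.add(END_OF_TEXT);
            }
        }
        return changed;
    }

    // Stage 2: once a labeled block is seen, treat it and everything after it as non-content.
    static boolean ignoreBlocksAfterEnd(List<Block> blocks) {
        boolean changed = false;
        boolean ended = false;
        for (Block b : blocks) {
            if (b.labels.contains(END_OF_TEXT)) {
                ended = true;
            }
            if (ended && b.content) {
                b.content = false;
                changed = true;
            }
        }
        return changed;
    }

    public static void main(String[] args) {
        List<Block> doc = new ArrayList<>();
        doc.add(new Block("The article body goes here.", true));
        doc.add(new Block("Post a comment", true));
        doc.add(new Block("Great read, thanks!", true));

        // Non-short-circuit OR so that both stages always run, in order.
        boolean changed = markTerminatingBlocks(doc) | ignoreBlocksAfterEnd(doc);

        for (Block b : doc) {
            System.out.println(b.content + "  " + b.text);
        }
        System.out.println("changed = " + changed);
    }
}
```

Splitting the work this way keeps the detection heuristic separate from the decision about what to do with the marked blocks, which is why the finder only attaches a label and leaves the content flag untouched.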
-
Field Summary
Fields
static TerminatingBlocksFinder INSTANCE
-
Constructor Summary
Constructors
TerminatingBlocksFinder()
-
Method Summary
Methods
static TerminatingBlocksFinder getInstance()
Returns the singleton instance for TerminatingBlocksFinder.
private static boolean isDigit(char c)
boolean process(TextDocument doc)
Processes the given document doc.
private static boolean startsWithNumber(java.lang.String t, int len, java.lang.String... str)
Checks whether the given text t starts with a sequence of digits, followed by one of the given strings.
-
Field Detail
-
INSTANCE
public static final TerminatingBlocksFinder INSTANCE
-
Method Detail
-
getInstance
public static TerminatingBlocksFinder getInstance()
Returns the singleton instance for TerminatingBlocksFinder.
-
process
public boolean process(TextDocument doc) throws BoilerpipeProcessingException
Description copied from interface: BoilerpipeFilter
Processes the given document doc.
Specified by:
process in interface BoilerpipeFilter
Parameters:
doc - The TextDocument that is to be processed.
Returns:
true if changes have been made to the TextDocument.
Throws:
BoilerpipeProcessingException
-
startsWithNumber
private static boolean startsWithNumber(java.lang.String t, int len, java.lang.String... str)
Checks whether the given text t starts with a sequence of digits, followed by one of the given strings.
Parameters:
t - The text to examine
len - The length of the text to examine
str - Any strings that may follow the digits.
Returns:
true if at least one combination matches
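Because startsWithNumber is private, it cannot be called from outside the class. The documented contract can still be sketched as a standalone method. This is a reimplementation written from the description above, not the library's source. It assumes that at least one leading digit is required, that isDigit accepts only ASCII '0' to '9', and that len is clamped to the length of t; the class name `StartsWithNumberSketch` and the sample strings are hypothetical.

```java
final class StartsWithNumberSketch {
    // Assumption: only ASCII digits count.
    static boolean isDigit(char c) {
        return c >= '0' && c <= '9';
    }

    // Returns true if t begins with one or more digits (scanning at most len chars)
    // and one of the candidate strings follows immediately after the digits.
    static boolean startsWithNumber(String t, int len, String... str) {
        int limit = Math.min(len, t.length()); // guard against len > t.length()
        int j = 0;
        while (j < limit && isDigit(t.charAt(j))) {
            j++;
        }
        if (j == 0) {
            return false; // assumption: no leading digits means no match
        }
        for (String s : str) {
            if (t.startsWith(s, j)) { // compare the candidate at offset j
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        String a = "12 comments";
        String b = "comments";
        System.out.println(startsWithNumber(a, a.length(), " comments", " users responded in")); // true
        System.out.println(startsWithNumber(b, b.length(), " comments", " users responded in")); // false
    }
}
```

The varargs parameter lets a single scan over the digit prefix be shared across all candidate suffixes, which matches the "at least one combination matches" wording of the return value.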
-
isDigit
private static boolean isDigit(char c)