Class DensityRulesClassifier
java.lang.Object
com.kohlschutter.boilerpipe.filters.english.DensityRulesClassifier
- All Implemented Interfaces:
BoilerpipeFilter
Classifies TextBlocks as content or not-content using rules that were derived with the C4.8 machine learning algorithm, as described in the paper "Boilerplate Detection using Shallow Text Features". The rules rely in particular on text densities and link densities.
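To make the idea of density-based rules concrete, the following is a minimal, self-contained sketch. It does not use the boilerpipe API: SketchBlock and DensityRuleSketch are hypothetical names, the thresholds are illustrative rather than the values learned for DensityRulesClassifier, and text density is assumed here to mean words per wrapped line while link density is assumed to mean the fraction of words inside anchor text.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a text block; NOT the boilerpipe TextBlock class.
final class SketchBlock {
    final String text;
    final double textDensity; // assumed: words per wrapped line
    final double linkDensity; // assumed: fraction of words inside links, 0..1
    boolean content;          // label assigned by the rule below

    SketchBlock(String text, double textDensity, double linkDensity) {
        this.text = text;
        this.textDensity = textDensity;
        this.linkDensity = linkDensity;
    }
}

final class DensityRuleSketch {
    // Illustrative thresholds only; not the values used by the library.
    static final double MAX_LINK_DENSITY = 0.33;
    static final double MIN_TEXT_DENSITY = 10.0;

    // A block full of links is treated as boilerplate; otherwise a
    // sufficiently dense block is treated as content.
    static boolean looksLikeContent(SketchBlock block) {
        if (block.linkDensity > MAX_LINK_DENSITY) {
            return false;
        }
        return block.textDensity >= MIN_TEXT_DENSITY;
    }

    // Labels every block and reports whether any label changed.
    static boolean classifyAll(List<SketchBlock> blocks) {
        boolean changed = false;
        for (SketchBlock block : blocks) {
            boolean label = looksLikeContent(block);
            if (label != block.content) {
                block.content = label;
                changed = true;
            }
        }
        return changed;
    }

    public static void main(String[] args) {
        List<SketchBlock> blocks = new ArrayList<>();
        blocks.add(new SketchBlock("Home | About | Contact", 3.0, 1.0));
        blocks.add(new SketchBlock("A long article paragraph ...", 14.5, 0.05));
        classifyAll(blocks);
        for (SketchBlock block : blocks) {
            System.out.println(block.content + "\t" + block.text);
        }
    }
}
```

A decision-tree learner would typically produce nested comparisons of this kind, possibly also looking at neighbouring blocks; the sketch keeps to a single block for brevity.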
Field Summary
Fields
INSTANCE
Constructor Summary
Constructors
DensityRulesClassifier()
Method Summary
Modifier and Type | Method | Description
protected boolean | classify |
static DensityRulesClassifier | getInstance | Returns the singleton instance of DensityRulesClassifier.
boolean | process(TextDocument doc) | Processes the given document doc.
Field Details
INSTANCE
Constructor Details
DensityRulesClassifier
public DensityRulesClassifier()
Method Details
getInstance
Returns the singleton instance of DensityRulesClassifier.
process
Description copied from interface: BoilerpipeFilter
Processes the given document doc.
- Specified by:
process in interface BoilerpipeFilter
- Parameters:
doc - The TextDocument that is to be processed.
- Returns:
true if changes have been made to the TextDocument.
- Throws:
BoilerpipeProcessingException
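The contract described above (a shared instance obtained through getInstance, and a process method that returns true only when the document was changed) can be sketched without the library as follows. SketchDocument, SketchFilter and LinkDensitySketchFilter are hypothetical names introduced for illustration, and the 0.33 threshold is an assumption, not a value taken from boilerpipe.

```java
// Hypothetical minimal document model: one link density and one
// "is content" flag per block. NOT the boilerpipe TextDocument class.
final class SketchDocument {
    final double[] linkDensities;
    final boolean[] isContent;

    SketchDocument(double... linkDensities) {
        this.linkDensities = linkDensities;
        this.isContent = new boolean[linkDensities.length];
    }
}

// Hypothetical interface mirroring the documented shape of process(doc).
interface SketchFilter {
    boolean process(SketchDocument doc) throws Exception;
}

final class LinkDensitySketchFilter implements SketchFilter {
    // Shared instance, exposed both as a field and through getInstance().
    public static final LinkDensitySketchFilter INSTANCE = new LinkDensitySketchFilter();

    static LinkDensitySketchFilter getInstance() {
        return INSTANCE;
    }

    // Relabels each block and returns true only if some label changed.
    @Override
    public boolean process(SketchDocument doc) {
        boolean changed = false;
        for (int i = 0; i < doc.linkDensities.length; i++) {
            boolean label = doc.linkDensities[i] <= 0.33; // illustrative threshold
            if (doc.isContent[i] != label) {
                doc.isContent[i] = label;
                changed = true;
            }
        }
        return changed;
    }

    public static void main(String[] args) {
        SketchDocument doc = new SketchDocument(0.1, 0.9);
        System.out.println(getInstance().process(doc)); // labels change on the first pass
        System.out.println(getInstance().process(doc)); // nothing left to change
    }
}
```

Returning a change flag lets a caller that chains several filters tell whether a pass had any effect, which is one plausible reason for the boolean return type.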
classify