Uses of Package
com.kohlschutter.boilerpipe.filters.heuristics
Packages that use com.kohlschutter.boilerpipe.filters.heuristics
Package
Description
These BoilerpipeFilters are pure heuristics.
Classes in com.kohlschutter.boilerpipe.filters.heuristics used by com.kohlschutter.boilerpipe.filters.heuristics
Class
Description
Adds the labels of the preceding block to the current block, optionally adding a prefix.
Tries to find TextBlocks that consist of "article metadata".
Fuses adjacent blocks if their distance (in blocks) does not exceed a certain limit.
Merges two blocks using some heuristics.
Marks as "content" all TextBlocks that lie between the headline and the part that has already been marked content, if they are marked DefaultLabels.MIGHT_BE_CONTENT.
Keeps the largest TextBlock only (by the number of words).
Fuses adjacent blocks if their labels are equal.
Marks all blocks as content that are on the same tag-level as very likely main content (usually the level of the largest block) and have a significant number of words (currently: at least 100).
Marks nested list-item blocks after the end of the main content.
Merges two subsequent blocks if their text densities are equal.
Marks trailing headlines (TextBlocks that have the label DefaultLabels.HEADING) as boilerplate.
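
To illustrate the kind of heuristic described above, here is a minimal, self-contained sketch of the "keep the largest block only (by the number of words)" idea. It does not use boilerpipe's classes: `KeepLargestBlockSketch`, `Block`, `numWords` and `keepLargestOnly` are hypothetical names made up for this example, and the assumption that only blocks already flagged as content compete for "largest" is ours, not something stated on this page.

```java
import java.util.ArrayList;
import java.util.List;

public class KeepLargestBlockSketch {

    /** Hypothetical stand-in for a text block; NOT boilerpipe's TextBlock. */
    public static final class Block {
        public final String text;
        public boolean isContent;

        public Block(String text, boolean isContent) {
            this.text = text;
            this.isContent = isContent;
        }

        /** Counts whitespace-separated words; an empty or blank text has 0 words. */
        public int numWords() {
            String trimmed = text.trim();
            return trimmed.isEmpty() ? 0 : trimmed.split("\\s+").length;
        }
    }

    /**
     * Keeps the content flag only on the content block with the most words
     * (the first one wins on ties) and clears it on every other block.
     * Returns true if any flag was changed.
     */
    public static boolean keepLargestOnly(List<Block> blocks) {
        Block largest = null;
        for (Block b : blocks) {
            // Assumption: only blocks already marked as content are candidates.
            if (b.isContent && (largest == null || b.numWords() > largest.numWords())) {
                largest = b;
            }
        }
        boolean changed = false;
        for (Block b : blocks) {
            if (b != largest && b.isContent) {
                b.isContent = false;
                changed = true;
            }
        }
        return changed;
    }

    public static void main(String[] args) {
        List<Block> blocks = new ArrayList<>();
        blocks.add(new Block("Home | About | Contact", true));
        blocks.add(new Block("This is the long main article text with many more words in it", true));
        blocks.add(new Block("Copyright notice", false));

        keepLargestOnly(blocks);
        for (Block b : blocks) {
            System.out.println(b.isContent + " : " + b.text);
        }
    }
}
```

Running `main` should leave only the long article block flagged as content. A filter of this kind is usually most useful late in a processing chain, after other heuristics have already marked candidate content blocks.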