Package com.kohlschutter.boilerpipe.filters.simple
These BoilerpipeFilters are straightforward and probably not specific to English.
Classes

Removes TextBlocks which have explicitly been marked as "not content".
Reverts the "isContent" flag for all TextBlocks.
Marks all blocks that contain a given label as "boilerplate".
Marks all blocks that contain a given label as "content".
Marks all blocks as boilerplate.
Marks all blocks as content.
Keeps only blocks that have at least one segment fragment ("clause") with at least k words (default: 5).
Keeps only those content blocks which contain at least k words.
Splits TextBlocks at paragraph boundaries.
Marks blocks as "content" if their preceding and following blocks are both already marked "content", and the given TextBlockCondition is met.
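To make three of the behaviours described above concrete (the minimum-word rule, the "surrounded by content" rule, and the removal of non-content blocks), here is a minimal sketch. It does not use the boilerpipe API: SimpleFiltersSketch, Block, and all method names are hypothetical stand-ins invented for illustration. It also assumes that "keeping" a block means leaving its content flag set and that a block failing the check is demoted rather than deleted.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Predicate;

/** Hypothetical stand-ins for illustration only; these are NOT boilerpipe classes. */
final class SimpleFiltersSketch {

    /** Minimal stand-in for a text block: its text plus an "is content" flag. */
    static final class Block {
        final String text;
        boolean isContent;

        Block(String text, boolean isContent) {
            this.text = text;
            this.isContent = isContent;
        }

        /** Counts whitespace-separated tokens (a simplification of word counting). */
        int wordCount() {
            String trimmed = text.trim();
            return trimmed.isEmpty() ? 0 : trimmed.split("\\s+").length;
        }
    }

    /** Demotes content blocks with fewer than minWords words; returns true if anything changed. */
    static boolean keepMinWords(List<Block> blocks, int minWords) {
        boolean changed = false;
        for (Block b : blocks) {
            if (b.isContent && b.wordCount() < minWords) {
                b.isContent = false;
                changed = true;
            }
        }
        return changed;
    }

    /** Promotes a non-content block when both neighbours are content and the condition holds. */
    static boolean surroundingToContent(List<Block> blocks, Predicate<Block> condition) {
        boolean changed = false;
        for (int i = 1; i < blocks.size() - 1; i++) {
            Block prev = blocks.get(i - 1);
            Block cur = blocks.get(i);
            Block next = blocks.get(i + 1);
            if (!cur.isContent && prev.isContent && next.isContent && condition.test(cur)) {
                cur.isContent = true;
                changed = true;
            }
        }
        return changed;
    }

    /** Drops every block not marked as content; returns true if anything was removed. */
    static boolean removeNonContent(List<Block> blocks) {
        return blocks.removeIf(b -> !b.isContent);
    }

    public static void main(String[] args) {
        List<Block> blocks = new ArrayList<>();
        blocks.add(new Block("Menu", true));
        blocks.add(new Block("This paragraph has clearly more than five words.", true));
        blocks.add(new Block("Short caption", false));
        blocks.add(new Block("Another long paragraph that easily passes the word threshold.", true));

        keepMinWords(blocks, 5);
        surroundingToContent(blocks, b -> b.wordCount() >= 2);
        removeNonContent(blocks);

        for (Block b : blocks) {
            System.out.println(b.text);
        }
    }
}
```

Each method returns a boolean saying whether it changed anything, so a caller can chain several steps and tell whether the chain had any effect. That return convention is a design choice of this sketch, not a statement about the library.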