Package org.languagetool.tokenizers
Class SRXSentenceTokenizer
java.lang.Object
org.languagetool.tokenizers.SRXSentenceTokenizer
- All Implemented Interfaces:
SentenceTokenizer, Tokenizer
- Direct Known Subclasses:
SimpleSentenceTokenizer
Class to tokenize sentences using rules from an SRX file.
-
Field Summary
Fields -
Constructor Summary
Constructors
SRXSentenceTokenizer(Language language): Build a sentence tokenizer based on the rules in the segment.srx file that comes with LanguageTool.
SRXSentenceTokenizer(Language language, String srxInClassPath)
-
Method Summary
Methods
final void setSingleLineBreaksMarksParagraph(boolean lineBreakParagraphs)
final boolean singleLineBreaksMarksPara()
tokenize: Tokenize the given string to sentences.
-
Field Details
-
srxDocument
private final net.loomchild.segment.srx.SrxDocument srxDocument -
language
-
parCode
-
-
Constructor Details
-
SRXSentenceTokenizer
Build a sentence tokenizer based on the rules in the segment.srx file that comes with LanguageTool. -
SRXSentenceTokenizer
- Parameters:
srxInClassPath - the path to an SRX file in the classpath
- Since:
3.2
-
-
Method Details
-
tokenize
Description copied from interface: SentenceTokenizer
Tokenize the given string to sentences.
- Specified by:
tokenize in interface SentenceTokenizer
- Specified by:
tokenize in interface Tokenizer
-
singleLineBreaksMarksPara
public final boolean singleLineBreaksMarksPara()
- Specified by:
singleLineBreaksMarksPara in interface SentenceTokenizer
-
setSingleLineBreaksMarksParagraph
public final void setSingleLineBreaksMarksParagraph(boolean lineBreakParagraphs)
- Specified by:
setSingleLineBreaksMarksParagraph in interface SentenceTokenizer
- Parameters:
lineBreakParagraphs - if true, single line breaks are assumed to end a paragraph; if false, only two or more consecutive line breaks end a paragraph
-
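The two paragraph modes described for lineBreakParagraphs can be illustrated with a small standalone sketch. It does not use LanguageTool and is not taken from the implementation of SRXSentenceTokenizer: the class ParagraphBreakSketch and its method splitParagraphs are hypothetical names introduced here, and the sketch assumes "\n" line endings. It only shows where paragraph boundaries would fall under each setting.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical illustration only; not part of LanguageTool.
final class ParagraphBreakSketch {

    // One or more line breaks end a paragraph (lineBreakParagraphs == true).
    private static final Pattern SINGLE_BREAK = Pattern.compile("\\n+");

    // Only two or more consecutive line breaks end a paragraph (lineBreakParagraphs == false).
    private static final Pattern DOUBLE_BREAK = Pattern.compile("\\n{2,}");

    // Splits text into paragraphs according to the chosen mode.
    // Assumes "\n" line endings; "\r\n" is not handled in this sketch.
    static List<String> splitParagraphs(String text, boolean lineBreakParagraphs) {
        Pattern boundary = lineBreakParagraphs ? SINGLE_BREAK : DOUBLE_BREAK;
        List<String> paragraphs = new ArrayList<>();
        for (String part : boundary.split(text)) {
            if (!part.isEmpty()) {
                paragraphs.add(part); // skip empty pieces caused by leading breaks
            }
        }
        return paragraphs;
    }
}
```

For the input "a\nb\n\nc", the sketch yields three paragraphs when lineBreakParagraphs is true and two when it is false, because the single break between "a" and "b" then stays inside the first paragraph.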