Class DutchWordTokenizer
java.lang.Object
org.languagetool.tokenizers.WordTokenizer
org.languagetool.tokenizers.nl.DutchWordTokenizer
- All Implemented Interfaces:
org.languagetool.tokenizers.Tokenizer
public class DutchWordTokenizer
extends org.languagetool.tokenizers.WordTokenizer
Field Summary
Fields: QUOTES, nlTokenizingChars
Constructor Summary
Constructors: DutchWordTokenizer()
Method Summary
Modifier and Type    Method                           Description
private boolean      endsWithQuote(String token)
String               getTokenizingCharacters()
private boolean      startsWithQuote(String token)
List<String>         tokenize(String text)            Tokenizes like WordTokenizer, except for words such as "oma's" that contain an apostrophe in the middle.
Methods inherited from class org.languagetool.tokenizers.WordTokenizer
getProtocols, isEMail, isUrl, joinEMails, joinEMailsAndUrls, joinUrls
Field Details
QUOTES
nlTokenizingChars
Constructor Details
DutchWordTokenizer
public DutchWordTokenizer()
Method Details
tokenize
Tokenizes like WordTokenizer, except that words such as "oma's", which contain an apostrophe in the middle, are not split at the apostrophe.
- Specified by:
tokenize in interface org.languagetool.tokenizers.Tokenizer
- Overrides:
tokenize in class org.languagetool.tokenizers.WordTokenizer
- Parameters:
text - Text to tokenize
- Returns:
List of tokens
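The behaviour described for tokenize can be illustrated with a small standalone sketch. This is not the LanguageTool implementation: the class name ApostropheTokenizerSketch and the DELIMITERS set are hypothetical, and the handling of apostrophes at the token edges is an assumption made for illustration. The idea is that the apostrophe is left out of the delimiter set, so a word like "oma's" survives as one token, while apostrophes at the very start or end of a token are treated as quotes and split off.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.StringTokenizer;

// Illustrative sketch only; not the LanguageTool implementation.
class ApostropheTokenizerSketch {

    // Hypothetical delimiter set: whitespace and common punctuation, but no apostrophe.
    private static final String DELIMITERS = " \t\n\r.,;:!?()\"";

    public static List<String> tokenize(String text) {
        List<String> tokens = new ArrayList<>();
        // returnDelims = true: each delimiter character is returned as its own token.
        StringTokenizer st = new StringTokenizer(text, DELIMITERS, true);
        while (st.hasMoreTokens()) {
            String token = st.nextToken();
            int start = 0;
            int end = token.length();
            // Split off apostrophes at the start of the token (assumed to be quotes).
            while (start < end && token.charAt(start) == '\'') {
                tokens.add("'");
                start++;
            }
            // Count apostrophes at the end of the token (assumed to be quotes).
            int trailingQuotes = 0;
            while (end > start && token.charAt(end - 1) == '\'') {
                trailingQuotes++;
                end--;
            }
            // Whatever remains keeps any apostrophe in its middle, e.g. "oma's".
            if (start < end) {
                tokens.add(token.substring(start, end));
            }
            for (int i = 0; i < trailingQuotes; i++) {
                tokens.add("'");
            }
        }
        return tokens;
    }

    public static void main(String[] args) {
        System.out.println(tokenize("Dit is oma's huis."));
        System.out.println(tokenize("'oma's'"));
    }
}
```

In this sketch, "oma's" stays a single token in both inputs, and the surrounding apostrophes in the second input become separate tokens. The edge handling is deliberately crude: it would also split Dutch forms that legitimately begin with an apostrophe, such as "'s".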
startsWithQuote
private boolean startsWithQuote(String token)
endsWithQuote
private boolean endsWithQuote(String token)
getTokenizingCharacters
- Overrides:
getTokenizingCharacters in class org.languagetool.tokenizers.WordTokenizer
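The override pattern can be sketched with stand-in classes: a subclass changes how text is split simply by returning a different delimiter string. BaseTokenizerSketch and DutchTokenizerSketch are hypothetical names, and the delimiter strings are assumptions chosen for illustration; they are not the values used by WordTokenizer or DutchWordTokenizer.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.StringTokenizer;

// Hypothetical stand-in for a base word tokenizer; not LanguageTool's WordTokenizer.
class BaseTokenizerSketch {

    // Characters at which text is split; subclasses may override this.
    public String getTokenizingCharacters() {
        return " .,;:!?'";
    }

    public List<String> tokenize(String text) {
        List<String> tokens = new ArrayList<>();
        // Delimiters are returned as tokens too (returnDelims = true).
        StringTokenizer st = new StringTokenizer(text, getTokenizingCharacters(), true);
        while (st.hasMoreTokens()) {
            tokens.add(st.nextToken());
        }
        return tokens;
    }
}

// Hypothetical subclass that narrows the delimiter set.
class DutchTokenizerSketch extends BaseTokenizerSketch {

    @Override
    public String getTokenizingCharacters() {
        // Remove the apostrophe so that words like "oma's" are not split.
        return super.getTokenizingCharacters().replace("'", "");
    }
}
```

With these stand-ins, the base class splits "oma's" into three tokens, while the subclass returns it as one, because the inherited tokenize method picks up the overridden delimiter string.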