Class MultiWordChunker
java.lang.Object
org.languagetool.tagging.disambiguation.AbstractDisambiguator
org.languagetool.tagging.disambiguation.MultiWordChunker
All Implemented Interfaces:
Disambiguator
Multiword tagger-chunker.
Field Summary
Fields: filename, allowFirstCapitalized, mStartSpace, mStartNoSpace, mFull
Constructor Summary
Constructors
MultiWordChunker(String filename)
MultiWordChunker(String filename, boolean allowFirstCapitalized)
Method Summary
Methods
final AnalyzedSentence disambiguate(AnalyzedSentence input)
    Implements multiword POS tags, e.g., <ELLIPSIS> for the start of an ellipsis (...) and </ELLIPSIS> for its end.
private void lazyInit()
loadWords(InputStream stream)
private AnalyzedTokenReadings prepareNewReading(String tokens, String tok, AnalyzedTokenReadings token, boolean isLast)
private AnalyzedTokenReadings setAndAnnotate(AnalyzedTokenReadings oldReading, AnalyzedToken newReading)
Methods inherited from class org.languagetool.tagging.disambiguation.AbstractDisambiguator
preDisambiguate
Field Details
filename
allowFirstCapitalized
private final boolean allowFirstCapitalized
mStartSpace
mStartNoSpace
mFull
Constructor Details
MultiWordChunker
Parameters:
filename - text file containing multiwords and their tags
MultiWordChunker
Parameters:
filename - text file containing multiwords and their tags
allowFirstCapitalized - if set to true, the first word of the multiword can be capitalized
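
The effect of allowFirstCapitalized can be pictured with a small, self-contained sketch. It is not LanguageTool's code: the class FirstCapitalizedSketch, the method expand, and the idea of storing entries in a Map from phrase to tag are all assumptions made for illustration. It shows one plausible reading of the flag, namely that each lowercase-initial multiword also matches with its first letter capitalized.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical illustration only; not part of LanguageTool. */
final class FirstCapitalizedSketch {

  /**
   * Returns a copy of the phrase-to-tag entries. When allowFirstCapitalized
   * is true, every phrase that starts with a lowercase letter is also
   * registered with its first letter uppercased, under the same tag.
   */
  static Map<String, String> expand(Map<String, String> entries,
                                    boolean allowFirstCapitalized) {
    Map<String, String> result = new LinkedHashMap<>(entries);
    if (allowFirstCapitalized) {
      for (Map.Entry<String, String> entry : entries.entrySet()) {
        String phrase = entry.getKey();
        if (!phrase.isEmpty() && Character.isLowerCase(phrase.charAt(0))) {
          String capitalized =
              Character.toUpperCase(phrase.charAt(0)) + phrase.substring(1);
          // Keep an explicit entry if the capitalized form already exists.
          result.putIfAbsent(capitalized, entry.getValue());
        }
      }
    }
    return result;
  }

  public static void main(String[] args) {
    Map<String, String> entries = new LinkedHashMap<>();
    entries.put("ad hoc", "NN"); // assumed sample entry, not taken from a real file
    System.out.println(expand(entries, true).keySet());
    System.out.println(expand(entries, false).keySet());
  }
}
```

With the flag set, the sample entry is available as both "ad hoc" and "Ad hoc"; without it, only the form listed in the file matches.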
Method Details
lazyInit
private void lazyInit()
disambiguate
Implements multiword POS tags, e.g., <ELLIPSIS> for the start of an ellipsis (...) and </ELLIPSIS> for its end.
Parameters:
input - the tokens to be chunked
Returns:
AnalyzedSentence with additional markers
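
The start and end markers described for disambiguate can be illustrated with a simplified sketch. It does not use LanguageTool types such as AnalyzedSentence or AnalyzedTokenReadings: the class MultiWordTagSketch, the method tag, the plain string tokens, and the splitting of "..." into three "." tokens are all assumptions chosen to keep the example self-contained. The sketch only shows the idea that the first token of a matched multiword receives <TAG> and the last token receives </TAG>.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical, simplified stand-in; not the LanguageTool implementation. */
final class MultiWordTagSketch {

  /**
   * Returns one list of tags per input token. For every occurrence of a
   * multiword (words separated by single spaces), the first token gets
   * "<TAG>" and the last token gets "</TAG>"; other tokens stay untagged.
   */
  static List<List<String>> tag(List<String> tokens, Map<String, String> multiwords) {
    List<List<String>> tags = new ArrayList<>();
    for (int i = 0; i < tokens.size(); i++) {
      tags.add(new ArrayList<>());
    }
    for (Map.Entry<String, String> entry : multiwords.entrySet()) {
      List<String> words = Arrays.asList(entry.getKey().split(" "));
      int length = words.size();
      for (int start = 0; start + length <= tokens.size(); start++) {
        if (tokens.subList(start, start + length).equals(words)) {
          tags.get(start).add("<" + entry.getValue() + ">");
          tags.get(start + length - 1).add("</" + entry.getValue() + ">");
        }
      }
    }
    return tags;
  }

  public static void main(String[] args) {
    Map<String, String> multiwords = new LinkedHashMap<>();
    multiwords.put(". . .", "ELLIPSIS"); // assumed tokenization of "..."
    List<String> tokens = List.of("Wait", ".", ".", ".", "now");
    System.out.println(tag(tokens, multiwords));
  }
}
```

In this sketch the second token carries <ELLIPSIS> and the fourth carries </ELLIPSIS>, while the tokens in between and around the match stay untagged. The real method adds such markers to the token readings of an AnalyzedSentence rather than returning separate lists.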
prepareNewReading
private AnalyzedTokenReadings prepareNewReading(String tokens, String tok, AnalyzedTokenReadings token, boolean isLast)
setAndAnnotate
private AnalyzedTokenReadings setAndAnnotate(AnalyzedTokenReadings oldReading, AnalyzedToken newReading)
loadWords