Package org.languagetool
Class AnalyzedTokenReadings
- java.lang.Object
-
- org.languagetool.AnalyzedTokenReadings
-
- All Implemented Interfaces:
java.lang.Iterable<AnalyzedToken>
public final class AnalyzedTokenReadings extends java.lang.Object implements java.lang.Iterable<AnalyzedToken>
An array of AnalyzedTokens used to store multiple POS tags and lemmas for a single token.
-
-
Field Summary
Fields (modifier and type, then field name):
private AnalyzedToken[] anTokReadings
private java.util.List<ChunkTag> chunkTags
private boolean hasSameLemmas
private java.lang.String historicalAnnotations
private boolean isIgnoredBySpeller
private boolean isImmunized
private boolean isLinebreak
private boolean isParaEnd
private boolean isPosTagUnknown
private boolean isSentEnd
private boolean isSentStart
private boolean isWhitespace
private boolean isWhitespaceBefore
private static java.util.regex.Pattern NON_WORD_REGEX
private int startPos
private java.lang.String token
private java.lang.String whitespaceBeforeChar
-
Constructor Summary
Constructors:
AnalyzedTokenReadings(java.util.List<AnalyzedToken> tokens, int startPos)
AnalyzedTokenReadings(AnalyzedToken token)
AnalyzedTokenReadings(AnalyzedToken[] tokens, int startPos)
AnalyzedTokenReadings(AnalyzedToken token, int startPos)
AnalyzedTokenReadings(AnalyzedTokenReadings oldAtr, java.util.List<AnalyzedToken> newReadings, java.lang.String ruleApplied)
-
Method Summary
Methods (modifier and type, then method; description on the following line where one exists):
private void addHistoricalAnnotations(java.lang.String oldValue, java.lang.String ruleApplied)
void addReading(AnalyzedToken token, java.lang.String ruleApplied)
    Add a new reading.
private boolean areLemmasSame()
    Used to configure the internal variable for lemma equality.
boolean equals(java.lang.Object obj)
AnalyzedToken getAnalyzedToken(int idx)
    Get a token reading.
java.util.List<ChunkTag> getChunkTags()
int getEndPos()
java.lang.String getHistoricalAnnotations()
    Used to track disambiguator actions.
java.util.List<AnalyzedToken> getReadings()
int getReadingsLength()
    Number of readings.
int getStartPos()
java.lang.String getToken()
java.lang.String getWhitespaceBefore()
boolean hasAnyLemma(java.lang.String... lemmas)
    Checks if one of the token's readings has one of the given lemmas.
boolean hasAnyPartialPosTag(java.lang.String... posTags)
    Checks if the token has any of the given POS tags (only a part of each given POS tag needs to match).
int hashCode()
boolean hasLemma(java.lang.String lemma)
    Checks if one of the token's readings has a particular lemma.
boolean hasPartialPosTag(java.lang.String posTag)
    Checks if the token has a particular POS tag, where only a part of the given POS tag needs to match.
boolean hasPosTag(java.lang.String posTag)
    Checks if the token has a particular POS tag.
boolean hasPosTagAndLemma(java.lang.String posTag, java.lang.String lemma)
    Checks if the token has a particular POS tag and lemma.
boolean hasPosTagStartingWith(java.lang.String posTag)
    Checks if the token has a POS tag starting with the given string.
boolean hasReading()
    Checks if there is at least one POS tag.
boolean hasSameLemmas()
    Used to optimize pattern matching.
void ignoreSpelling()
    Make the token ignored by all spelling rules.
void immunize()
boolean isFieldCode()
boolean isIgnoredBySpeller()
    Tests if the token can be ignored by spelling rules.
boolean isImmunized()
boolean isLinebreak()
    Returns true if the token equals \n, \r, \n\r, or \r\n.
boolean isNonWord()
boolean isParagraphEnd()
boolean isPosTagUnknown()
    Tests if the token's POS tag equals null.
boolean isSentenceEnd()
boolean isSentenceStart()
boolean isTagged()
boolean isWhitespace()
boolean isWhitespaceBefore()
java.util.Iterator<AnalyzedToken> iterator()
void leaveReading(AnalyzedToken token)
    Removes all readings but the one that matches the token given.
boolean matchesPosTagRegex(java.lang.String posTagRegex)
    Checks if at least one of the readings matches a given POS tag regex.
void removeReading(AnalyzedToken token, java.lang.String ruleApplied)
    Removes a reading from the list of readings.
void setChunkTags(java.util.List<ChunkTag> chunkTags)
private void setHistoricalAnnotations(java.lang.String historicalAnnotations)
    Used to track disambiguator actions.
private void setNoRealPOStag()
    Sets the flag on AnalyzedTokens so that matching on the UNKNOWN POS tag works correctly in the Element class.
void setParagraphEnd()
    Add a reading with a paragraph end token unless this is already a paragraph end.
void setSentEnd()
    Add a SENT_END tag.
void setStartPos(int position)
void setWhitespaceBefore(java.lang.String prevToken)
java.lang.String toString()
-
-
-
Field Detail
-
NON_WORD_REGEX
private static final java.util.regex.Pattern NON_WORD_REGEX
-
isWhitespace
private final boolean isWhitespace
-
isLinebreak
private final boolean isLinebreak
-
isSentStart
private final boolean isSentStart
-
anTokReadings
private AnalyzedToken[] anTokReadings
-
startPos
private int startPos
-
token
private java.lang.String token
-
chunkTags
private java.util.List<ChunkTag> chunkTags
-
isSentEnd
private boolean isSentEnd
-
isParaEnd
private boolean isParaEnd
-
isWhitespaceBefore
private boolean isWhitespaceBefore
-
isPosTagUnknown
private boolean isPosTagUnknown
-
whitespaceBeforeChar
private java.lang.String whitespaceBeforeChar
-
isImmunized
private boolean isImmunized
-
isIgnoredBySpeller
private boolean isIgnoredBySpeller
-
historicalAnnotations
private java.lang.String historicalAnnotations
-
hasSameLemmas
private boolean hasSameLemmas
-
-
Constructor Detail
-
AnalyzedTokenReadings
public AnalyzedTokenReadings(AnalyzedToken[] tokens, int startPos)
-
AnalyzedTokenReadings
public AnalyzedTokenReadings(AnalyzedToken token, int startPos)
-
AnalyzedTokenReadings
public AnalyzedTokenReadings(java.util.List<AnalyzedToken> tokens, int startPos)
-
AnalyzedTokenReadings
public AnalyzedTokenReadings(AnalyzedTokenReadings oldAtr, java.util.List<AnalyzedToken> newReadings, java.lang.String ruleApplied)
-
AnalyzedTokenReadings
AnalyzedTokenReadings(AnalyzedToken token)
-
-
Method Detail
-
getReadings
public java.util.List<AnalyzedToken> getReadings()
-
getAnalyzedToken
public AnalyzedToken getAnalyzedToken(int idx)
Get a token reading.
-
hasPosTag
public boolean hasPosTag(java.lang.String posTag)
Checks if the token has a particular POS tag.
- Parameters:
posTag - POS tag to look for
-
hasPosTagAndLemma
public boolean hasPosTagAndLemma(java.lang.String posTag, java.lang.String lemma)
Checks if the token has a particular POS tag and lemma.
- Parameters:
posTag - POS tag to look for
lemma - lemma to look for
-
hasReading
public boolean hasReading()
Checks if there is at least one POS tag.
- Since:
- 4.7
-
hasLemma
public boolean hasLemma(java.lang.String lemma)
Checks if one of the token's readings has a particular lemma.
- Parameters:
lemma - lemma to look for
-
hasAnyLemma
public boolean hasAnyLemma(java.lang.String... lemmas)
Checks if one of the token's readings has one of the given lemmas.
- Parameters:
lemmas - lemmas to look for
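The two lemma checks above can be pictured as a scan over the token's readings. The following is a minimal, self-contained sketch rather than LanguageTool's implementation: the class `LemmaCheckSketch` and its list-of-lemmas model are hypothetical stand-ins for an AnalyzedTokenReadings object, and exact string equality is an assumption.

```java
import java.util.Arrays;
import java.util.List;

class LemmaCheckSketch {

    // Hypothetical stand-in: each entry is the lemma of one reading (may be null).
    private final List<String> lemmas;

    LemmaCheckSketch(String... lemmas) {
        this.lemmas = Arrays.asList(lemmas);
    }

    // True if at least one reading carries exactly this lemma.
    boolean hasLemma(String lemma) {
        for (String candidate : lemmas) {
            if (lemma.equals(candidate)) {
                return true;
            }
        }
        return false;
    }

    // True if at least one reading carries at least one of the given lemmas.
    boolean hasAnyLemma(String... wanted) {
        for (String lemma : wanted) {
            if (hasLemma(lemma)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        // "saw" read as a noun (lemma "saw") or as the past tense of "see".
        LemmaCheckSketch saw = new LemmaCheckSketch("saw", "see");
        System.out.println(saw.hasLemma("see"));
        System.out.println(saw.hasAnyLemma("cut", "saw"));
        System.out.println(saw.hasAnyLemma("cut", "chop"));
    }
}
```

In this model, a token whose readings carry the lemmas "saw" and "see" satisfies `hasAnyLemma("cut", "saw")` but not `hasAnyLemma("cut", "chop")`.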
-
hasPartialPosTag
public boolean hasPartialPosTag(java.lang.String posTag)
Checks if the token has a particular POS tag, where only a part of the given POS tag needs to match.
- Parameters:
posTag - POS tag substring to look for
- Since:
- 1.8
-
hasAnyPartialPosTag
public boolean hasAnyPartialPosTag(java.lang.String... posTags)
Checks if the token has any of the given POS tags (only a part of each given POS tag needs to match).
- Parameters:
posTags - POS tag substrings to look for
- Since:
- 4.0
-
hasPosTagStartingWith
public boolean hasPosTagStartingWith(java.lang.String posTag)
Checks if the token has a POS tag starting with the given string.
- Parameters:
posTag - POS tag prefix to look for
- Since:
- 4.0
-
matchesPosTagRegex
public boolean matchesPosTagRegex(java.lang.String posTagRegex)
Checks if at least one of the readings matches a given POS tag regex.
- Parameters:
posTagRegex - POS tag regular expression to look for
- Since:
- 2.9
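The POS tag checks described above differ only in how a reading's tag is compared with the argument. The sketch below contrasts the four modes using plain strings. It is an illustration under stated assumptions, not the library's code: `PosTagMatchSketch` is a hypothetical stand-in, substring matching is modelled with `String.contains`, and the regex variant is assumed to require a match of the whole tag.

```java
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

class PosTagMatchSketch {

    // Hypothetical stand-in: the POS tags of a token's readings (null = no tag).
    private final List<String> posTags;

    PosTagMatchSketch(String... posTags) {
        this.posTags = Arrays.asList(posTags);
    }

    // Exact comparison of the whole tag.
    boolean hasPosTag(String posTag) {
        return posTags.stream().anyMatch(tag -> posTag.equals(tag));
    }

    // The argument only needs to occur somewhere inside a tag.
    boolean hasPartialPosTag(String part) {
        return posTags.stream().anyMatch(tag -> tag != null && tag.contains(part));
    }

    // The argument must be a prefix of a tag.
    boolean hasPosTagStartingWith(String prefix) {
        return posTags.stream().anyMatch(tag -> tag != null && tag.startsWith(prefix));
    }

    // Assumption: the regular expression has to match the entire tag.
    boolean matchesPosTagRegex(String regex) {
        Pattern pattern = Pattern.compile(regex);
        return posTags.stream().anyMatch(tag -> tag != null && pattern.matcher(tag).matches());
    }

    public static void main(String[] args) {
        PosTagMatchSketch saw = new PosTagMatchSketch("NN", "VBD");
        System.out.println(saw.hasPosTag("VB"));             // exact: no reading is tagged "VB"
        System.out.println(saw.hasPartialPosTag("BD"));      // substring of "VBD"
        System.out.println(saw.hasPosTagStartingWith("VB")); // prefix of "VBD"
        System.out.println(saw.matchesPosTagRegex("VB[DN]"));
    }
}
```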
-
addReading
public void addReading(AnalyzedToken token, java.lang.String ruleApplied)
Add a new reading.
- Parameters:
token - new reading, given as AnalyzedToken
-
removeReading
public void removeReading(AnalyzedToken token, java.lang.String ruleApplied)
Removes a reading from the list of readings. Note: if the token has only one reading, then a new reading with an empty POS tag and an empty lemma is created.
- Parameters:
token - reading to be removed
-
leaveReading
public void leaveReading(AnalyzedToken token)
Removes all readings but the one that matches the token given.
- Parameters:
token - token to be matched
- Since:
- 1.5
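The effect of removeReading and leaveReading on the list of readings can be sketched with a small stand-in model. Everything here is hypothetical (`ReadingEditSketch`, its nested `Reading` class, and the comparison in `sameAs`), and the "empty" POS tag and lemma from the removeReading note are modelled as null, which is an assumption. The case where leaveReading finds no matching reading is not described above, and this sketch does not try to model it.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Objects;

class ReadingEditSketch {

    // Hypothetical stand-in for one reading: token text, POS tag, lemma.
    static final class Reading {
        final String token;
        final String posTag;
        final String lemma;

        Reading(String token, String posTag, String lemma) {
            this.token = token;
            this.posTag = posTag;
            this.lemma = lemma;
        }

        boolean sameAs(Reading other) {
            return Objects.equals(token, other.token)
                && Objects.equals(posTag, other.posTag)
                && Objects.equals(lemma, other.lemma);
        }
    }

    private final String token;
    private final List<Reading> readings;

    ReadingEditSketch(String token, Reading... initial) {
        this.token = token;
        this.readings = new ArrayList<>(Arrays.asList(initial));
    }

    // Drops the matching reading; the token never ends up with zero readings.
    void removeReading(Reading toRemove) {
        readings.removeIf(r -> r.sameAs(toRemove));
        if (readings.isEmpty()) {
            // Assumption: the placeholder reading has a null POS tag and a null lemma.
            readings.add(new Reading(token, null, null));
        }
    }

    // Keeps only the reading that matches the given one.
    void leaveReading(Reading toKeep) {
        readings.removeIf(r -> !r.sameAs(toKeep));
    }

    List<Reading> getReadings() {
        return readings;
    }

    public static void main(String[] args) {
        Reading noun = new Reading("saw", "NN", "saw");
        Reading verb = new Reading("saw", "VBD", "see");
        ReadingEditSketch saw = new ReadingEditSketch("saw", noun, verb);

        saw.leaveReading(verb);   // only the verb reading remains
        System.out.println(saw.getReadings().size());

        saw.removeReading(verb);  // last reading removed, placeholder takes its place
        System.out.println(saw.getReadings().get(0).posTag);
    }
}
```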
-
getReadingsLength
public int getReadingsLength()
Number of readings.
-
isWhitespace
public boolean isWhitespace()
-
isLinebreak
public boolean isLinebreak()
Returns true if the token equals \n, \r, \n\r, or \r\n.
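As a quick illustration of the four accepted values, the check can be written as a set lookup. This is a sketch of the stated rule only; `LinebreakSketch` is a hypothetical helper, not part of LanguageTool.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

class LinebreakSketch {

    // The four token values named in the description of isLinebreak().
    private static final Set<String> LINEBREAKS =
        new HashSet<>(Arrays.asList("\n", "\r", "\n\r", "\r\n"));

    // True only if the whole token is one of the four line break sequences.
    static boolean isLinebreak(String token) {
        return LINEBREAKS.contains(token);
    }

    public static void main(String[] args) {
        System.out.println(isLinebreak("\r\n")); // one of the four sequences
        System.out.println(isLinebreak(" "));    // ordinary whitespace, not a line break
        System.out.println(isLinebreak("\n\n")); // two line feeds are not in the list
    }
}
```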
-
isSentenceStart
public boolean isSentenceStart()
- Since:
- 2.3
-
isParagraphEnd
public boolean isParagraphEnd()
- Returns:
- true when the token is the last token in a paragraph.
- Since:
- 2.3
-
setParagraphEnd
public void setParagraphEnd()
Add a reading with a paragraph end token unless this is already a paragraph end.
- Since:
- 2.3
-
isSentenceEnd
public boolean isSentenceEnd()
- Returns:
- true when the token is the last token in a sentence.
- Since:
- 2.3
-
isFieldCode
public boolean isFieldCode()
- Returns:
- true if the token is a LibreOffice/OpenOffice field code.
- Since:
- 0.9.9
-
setSentEnd
public void setSentEnd()
Add a SENT_END tag.
-
getStartPos
public int getStartPos()
-
getEndPos
public int getEndPos()
- Since:
- 2.9
-
setStartPos
public void setStartPos(int position)
-
getToken
public java.lang.String getToken()
-
setWhitespaceBefore
public void setWhitespaceBefore(java.lang.String prevToken)
-
getWhitespaceBefore
public java.lang.String getWhitespaceBefore()
-
isWhitespaceBefore
public boolean isWhitespaceBefore()
-
immunize
public void immunize()
-
isImmunized
public boolean isImmunized()
-
ignoreSpelling
public void ignoreSpelling()
Make the token ignored by all spelling rules.
- Since:
- 2.5
-
isIgnoredBySpeller
public boolean isIgnoredBySpeller()
Tests if the token can be ignored by spelling rules.
- Returns:
- true if the token should be ignored.
- Since:
- 2.5
-
isPosTagUnknown
public boolean isPosTagUnknown()
Tests if the token's POS tag equals null.
- Returns:
- true if the token does not have a POS tag
- Since:
- 3.9
-
setNoRealPOStag
private void setNoRealPOStag()
Sets the flag on AnalyzedTokens so that matching on the UNKNOWN POS tag works correctly in the Element class.
-
getHistoricalAnnotations
public java.lang.String getHistoricalAnnotations()
Used to track disambiguator actions.
- Returns:
- the historicalAnnotations
-
setHistoricalAnnotations
private void setHistoricalAnnotations(java.lang.String historicalAnnotations)
Used to track disambiguator actions.
- Parameters:
historicalAnnotations - the historicalAnnotations to set
-
addHistoricalAnnotations
private void addHistoricalAnnotations(java.lang.String oldValue, java.lang.String ruleApplied)
-
setChunkTags
public void setChunkTags(java.util.List<ChunkTag> chunkTags)
- Since:
- 2.3
-
getChunkTags
public java.util.List<ChunkTag> getChunkTags()
- Since:
- 2.3
-
toString
public java.lang.String toString()
- Overrides:
toString in class java.lang.Object
-
isTagged
public boolean isTagged()
- Returns:
- true if this AnalyzedTokenReadings has some real POS tag (that is, a tag that is neither null nor a special tag)
- Since:
- 2.3
-
areLemmasSame
private boolean areLemmasSame()
Used to configure the internal variable for lemma equality.
- Returns:
- true if all AnalyzedToken lemmas are the same.
- Since:
- 2.5
-
hasSameLemmas
public boolean hasSameLemmas()
Used to optimize pattern matching.
- Returns:
- true if all AnalyzedToken lemmas are the same.
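The condition reported by hasSameLemmas can be sketched as a comparison of every reading's lemma against the first one. This is a hedged illustration, not the class's actual code: `SameLemmaSketch` is a hypothetical helper, lemmas are modelled as plain strings, and treating two null lemmas as equal is an assumption.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Objects;

class SameLemmaSketch {

    // True if every reading's lemma equals the lemma of the first reading.
    // Assumption: two null lemmas count as equal; an empty list counts as "same".
    static boolean allLemmasSame(List<String> lemmas) {
        if (lemmas.isEmpty()) {
            return true;
        }
        String first = lemmas.get(0);
        for (String lemma : lemmas) {
            if (!Objects.equals(first, lemma)) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        // "walks" as plural noun and as verb form: both readings share the lemma "walk".
        System.out.println(allLemmasSame(Arrays.asList("walk", "walk")));
        // "saw" as noun and as past tense of "see": the lemmas differ.
        System.out.println(allLemmasSame(Arrays.asList("saw", "see")));
    }
}
```

When such a flag is true, a caller that only cares about the lemma could in principle inspect a single reading instead of all of them, which is one plausible reading of "used to optimize pattern matching".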
-
isNonWord
public boolean isNonWord()
- Returns:
- true if this AnalyzedTokenReadings is a punctuation mark, bracket, etc.
- Since:
- 4.4
-
hashCode
public int hashCode()
- Overrides:
hashCode in class java.lang.Object
-
equals
public boolean equals(java.lang.Object obj)
- Overrides:
equals in class java.lang.Object
-
iterator
public java.util.Iterator<AnalyzedToken> iterator()
- Specified by:
iterator in interface java.lang.Iterable<AnalyzedToken>
- Since:
- 2.3
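Because the class implements java.lang.Iterable, its readings can be walked with an enhanced for loop. The sketch below shows the pattern with a hypothetical stand-in (`IterableReadingsSketch`, which iterates over strings instead of AnalyzedToken objects) so that it runs without LanguageTool on the classpath.

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

class IterableReadingsSketch implements Iterable<String> {

    // Hypothetical stand-in: each reading is rendered as "lemma/POS tag".
    private final List<String> readings;

    IterableReadingsSketch(String... readings) {
        this.readings = Arrays.asList(readings);
    }

    // Implementing iterator() is all that is needed for for-each support.
    @Override
    public Iterator<String> iterator() {
        return readings.iterator();
    }

    // Joins all readings with ", " using only the Iterable contract.
    static String join(Iterable<String> readings) {
        StringBuilder sb = new StringBuilder();
        for (String reading : readings) {
            if (sb.length() > 0) {
                sb.append(", ");
            }
            sb.append(reading);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        IterableReadingsSketch saw = new IterableReadingsSketch("saw/NN", "see/VBD");
        System.out.println(join(saw)); // prints "saw/NN, see/VBD"
    }
}
```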
-
-