Package org.languagetool.rules.patterns
Class Unifier
- java.lang.Object
-
- org.languagetool.rules.patterns.Unifier
-
public class Unifier extends java.lang.Object
Implements unification of features over tokens.
-
-
Field Summary
Fields (modifier and type, field, description)
- private boolean allFeatsIn
- private java.util.Map<java.lang.String,java.util.List<java.lang.String>> equivalenceFeatures
  A Map that stores all possible equivalence types listed for features.
- private java.util.List<java.util.Map<java.lang.String,java.util.Set<java.lang.String>>> equivalencesMatched
  Map of sets of matched equivalences in the unified sequence.
- private java.util.Map<java.lang.String,java.util.Set<java.lang.String>> equivalencesToBeKept
- private java.util.Map<EquivalenceTypeLocator,PatternToken> equivalenceTypes
  A Map for storing the equivalence types for features.
- private java.util.List<java.lang.Boolean> featuresFound
- private boolean inUnification
- private int readingsCounter
- private java.util.List<java.lang.Boolean> tmpFeaturesFound
- private int tokCnt
- private java.util.List<AnalyzedTokenReadings> tokSequence
- private java.util.List<java.util.List<java.util.Map<java.lang.String,java.util.Set<java.lang.String>>>> tokSequenceEquivalences
  List of all equivalences matched per token in the sequence, kept exactly in sync with the list in tokSequence, so that reading 2 of token 1 has its equivalence map addressable as tokSequenceEquivalences.get(1).get(2).
- private boolean uniAllMatched
- private java.util.Map<java.lang.String,java.util.List<java.lang.String>> unificationFeats
- private static java.lang.String UNIFY_IGNORE
- private boolean uniMatched
-
Constructor Summary
Constructors (constructor, description)
- Unifier(java.util.Map<EquivalenceTypeLocator,PatternToken> equivalenceTypes, java.util.Map<java.lang.String,java.util.List<java.lang.String>> equivalenceFeatures)
  Instantiates the unifier.
-
Method Summary
All Methods, Instance Methods, Concrete Methods (modifier and type, method, description)
- void addNeutralElement(AnalyzedTokenReadings analyzedTokenReadings)
  Used to add neutral elements (AnalyzedTokenReadings) to the unified sequence.
- private void addTokenToSequence(java.util.List<AnalyzedTokenReadings> tokenSequence, AnalyzedToken token, int pos)
- private boolean checkNext(AnalyzedToken aToken, java.util.Map<java.lang.String,java.util.List<java.lang.String>> uFeatures)
- boolean getFinalUnificationValue(java.util.Map<java.lang.String,java.util.List<java.lang.String>> uFeatures)
  Make sure that we really matched all the required features of the unification.
- @Nullable AnalyzedTokenReadings[] getFinalUnified()
  Used for getting a unified sequence when the simple test method isUnified(AnalyzedToken, Map, boolean) was used.
- @Nullable AnalyzedTokenReadings[] getUnifiedTokens()
  Gets a full sequence of filtered tokens.
- protected boolean isSatisfied(AnalyzedToken aToken, java.util.Map<java.lang.String,java.util.List<java.lang.String>> uFeatures)
  Tests if a token has shared features with other tokens.
- boolean isUnified(AnalyzedToken matchToken, java.util.Map<java.lang.String,java.util.List<java.lang.String>> uFeatures, boolean lastReading)
- boolean isUnified(AnalyzedToken matchToken, java.util.Map<java.lang.String,java.util.List<java.lang.String>> uFeatures, boolean lastReading, boolean isMatched)
  Tests if the token sequence is unified.
- void reset()
  Resets after use of unification.
- void startNextToken()
  Call after every complete token (AnalyzedTokenReadings) checked.
- void startUnify()
  Starts testing only those equivalences that were previously matched.
-
-
-
Field Detail
-
UNIFY_IGNORE
private static final java.lang.String UNIFY_IGNORE
- See Also:
- Constant Field Values
-
tokSequence
private final java.util.List<AnalyzedTokenReadings> tokSequence
-
tokSequenceEquivalences
private final java.util.List<java.util.List<java.util.Map<java.lang.String,java.util.Set<java.lang.String>>>> tokSequenceEquivalences
List of all equivalences matched per token in the sequence, kept exactly in sync with the list in tokSequence, so that reading 2 of token 1 has its equivalence map addressable as tokSequenceEquivalences.get(1).get(2).
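The indexing scheme described for this field can be illustrated with a small stand-alone sketch. It is not LanguageTool code: the class EquivalenceIndexSketch and its methods are hypothetical, and it assumes the map keys are feature names and the sets hold the equivalence types matched for that feature.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical stand-in for the nested list; this is not LanguageTool code.
final class EquivalenceIndexSketch {

  // Outer list: one entry per token. Inner list: one map per reading of that token.
  // Assumption: each map goes from a feature name to the types matched for it.
  private final List<List<Map<String, Set<String>>>> equivalences = new ArrayList<>();

  /** Appends a token with no readings yet and returns its index. */
  int addToken() {
    equivalences.add(new ArrayList<>());
    return equivalences.size() - 1;
  }

  /** Appends an empty equivalence map for a new reading and returns the reading index. */
  int addReading(int token) {
    List<Map<String, Set<String>>> readings = equivalences.get(token);
    readings.add(new HashMap<>());
    return readings.size() - 1;
  }

  /** Records that the given reading matched one equivalence type of a feature. */
  void record(int token, int reading, String feature, String type) {
    equivalences.get(token).get(reading)
        .computeIfAbsent(feature, key -> new HashSet<>())
        .add(type);
  }

  /** Same addressing as tokSequenceEquivalences.get(token).get(reading). */
  Map<String, Set<String>> lookup(int token, int reading) {
    return equivalences.get(token).get(reading);
  }

  public static void main(String[] args) {
    EquivalenceIndexSketch sketch = new EquivalenceIndexSketch();
    sketch.addToken();                        // token 0, left without readings
    int token = sketch.addToken();            // token 1
    sketch.addReading(token);                 // reading 0
    sketch.addReading(token);                 // reading 1
    int reading = sketch.addReading(token);   // reading 2
    sketch.record(token, reading, "number", "singular");
    System.out.println(sketch.lookup(1, 2));
  }
}
```

Keeping the token list and the equivalence list the same length means a token index is valid in both, which is what "kept exactly in sync" requires of any code that appends to one of them.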
-
equivalenceTypes
private final java.util.Map<EquivalenceTypeLocator,PatternToken> equivalenceTypes
A Map for storing the equivalence types for features. Features are specified as Strings, and map into types defined as maps from Strings to Elements.
-
equivalenceFeatures
private final java.util.Map<java.lang.String,java.util.List<java.lang.String>> equivalenceFeatures
A Map that stores all possible equivalence types listed for features.
-
equivalencesMatched
private final java.util.List<java.util.Map<java.lang.String,java.util.Set<java.lang.String>>> equivalencesMatched
Map of sets of matched equivalences in the unified sequence.
-
allFeatsIn
private boolean allFeatsIn
-
tokCnt
private int tokCnt
-
readingsCounter
private int readingsCounter
-
featuresFound
private java.util.List<java.lang.Boolean> featuresFound
-
tmpFeaturesFound
private java.util.List<java.lang.Boolean> tmpFeaturesFound
-
equivalencesToBeKept
private final java.util.Map<java.lang.String,java.util.Set<java.lang.String>> equivalencesToBeKept
-
unificationFeats
private java.util.Map<java.lang.String,java.util.List<java.lang.String>> unificationFeats
-
inUnification
private boolean inUnification
-
uniMatched
private boolean uniMatched
-
uniAllMatched
private boolean uniAllMatched
-
-
Constructor Detail
-
Unifier
public Unifier(java.util.Map<EquivalenceTypeLocator,PatternToken> equivalenceTypes, java.util.Map<java.lang.String,java.util.List<java.lang.String>> equivalenceFeatures)
Instantiates the unifier.
-
-
Method Detail
-
isSatisfied
protected final boolean isSatisfied(AnalyzedToken aToken, java.util.Map<java.lang.String,java.util.List<java.lang.String>> uFeatures)
Tests if a token has shared features with other tokens.
- Parameters:
- aToken - token to be tested
- uFeatures - features to be tested
- Returns:
- true if the token shares this type of feature with other tokens
-
checkNext
private boolean checkNext(AnalyzedToken aToken, java.util.Map<java.lang.String,java.util.List<java.lang.String>> uFeatures)
-
startNextToken
public final void startNextToken()
Call after every complete token (AnalyzedTokenReadings) checked.
-
startUnify
public final void startUnify()
Starts testing only those equivalences that were previously matched.
-
getFinalUnificationValue
public final boolean getFinalUnificationValue(java.util.Map<java.lang.String,java.util.List<java.lang.String>> uFeatures)
Make sure that we really matched all the required features of the unification.
- Parameters:
- uFeatures - Features to be checked
- Returns:
- True if the token sequence has been found.
- Since:
- 2.5
-
reset
public final void reset()
Resets after use of unification. Required.
-
getUnifiedTokens
@Nullable public final @Nullable AnalyzedTokenReadings[] getUnifiedTokens()
Gets a full sequence of filtered tokens.
- Returns:
- Array of AnalyzedTokenReadings that match the equivalence relation defined for the features tested, or null
-
addTokenToSequence
private void addTokenToSequence(java.util.List<AnalyzedTokenReadings> tokenSequence, AnalyzedToken token, int pos)
-
isUnified
public final boolean isUnified(AnalyzedToken matchToken, java.util.Map<java.lang.String,java.util.List<java.lang.String>> uFeatures, boolean lastReading, boolean isMatched)
Tests if the token sequence is unified.
Usage note: to test if the sequence of tokens is unified (i.e., shares a group of features, such as the same gender, number, grammatical case etc.), you need to test all tokens but the last one in the following way: call isUnified() for every reading of a token, and set lastReading to true. For the last token, check the truth value returned by this method. In previous cases, it may actually be discarded before the final check. See AbstractPatternRule for an example.
To make it work in XML rules, the Elements built based on <token>s inside the unify block have to be processed in a special way: namely the last Element has to be marked as the last one (by using PatternToken.setLastInUnification()).
- Parameters:
- matchToken - AnalyzedToken token to unify
- lastReading - true when the matchToken is the last reading in the AnalyzedTokenReadings
- isMatched - true if the reading matches the element in the pattern rule, otherwise the reading is not considered in the unification
- Returns:
- true if the tokens in the sequence are unified
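To make the idea of a unified sequence concrete, here is a deliberately simplified, stand-alone model. It is not the Unifier algorithm and uses none of its API: the class AgreementSketch is hypothetical, readings are modelled as plain maps from a feature name to one type, and each feature is checked independently of the others, which is an assumption made only to keep the sketch short.

```java
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical, simplified model of agreement; not the LanguageTool implementation.
final class AgreementSketch {

  /**
   * tokens: each token is a list of readings; each reading maps a feature name
   * (for example "number") to a type (for example "singular").
   * Returns true if, for every requested feature, at least one type is carried
   * by some reading of every token.
   */
  static boolean isUnified(List<List<Map<String, String>>> tokens, List<String> features) {
    for (String feature : features) {
      Set<String> shared = null; // types still shared by all tokens seen so far
      for (List<Map<String, String>> readings : tokens) {
        Set<String> typesOfToken = new HashSet<>();
        for (Map<String, String> reading : readings) {
          String type = reading.get(feature);
          if (type != null) {
            typesOfToken.add(type);
          }
        }
        if (shared == null) {
          shared = typesOfToken;          // first token seeds the candidates
        } else {
          shared.retainAll(typesOfToken); // later tokens narrow them down
        }
        if (shared.isEmpty()) {
          return false;                   // no type survives for this feature
        }
      }
    }
    return true;
  }

  public static void main(String[] args) {
    // An ambiguous token (singular or plural) followed by a plural-only token.
    List<Map<String, String>> ambiguous =
        List.of(Map.of("number", "singular"), Map.of("number", "plural"));
    List<Map<String, String>> plural = List.of(Map.of("number", "plural"));
    System.out.println(isUnified(List.of(ambiguous, plural), List.of("number")));
  }
}
```

The sketch shows why every reading of a token has to be seen before moving on: an ambiguous token only fails to agree once none of its readings shares a type with the tokens already processed.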
-
isUnified
public final boolean isUnified(AnalyzedToken matchToken, java.util.Map<java.lang.String,java.util.List<java.lang.String>> uFeatures, boolean lastReading)
-
addNeutralElement
public final void addNeutralElement(AnalyzedTokenReadings analyzedTokenReadings)
Used to add neutral elements (AnalyzedTokenReadings) to the unified sequence. Useful if the sequence contains punctuation or connectives, for example.
- Parameters:
- analyzedTokenReadings - A neutral element to be added.
- Since:
- 2.5
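The role of a neutral element can be sketched with a small stand-alone model. This is an illustration under stated assumptions, not LanguageTool code: the classes NeutralElementSketch and Tok are hypothetical, each token carries a flat set of types for a single feature, and a null result loosely mirrors the nullable return of getUnifiedTokens().

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical model of neutral elements; not the LanguageTool implementation.
final class NeutralElementSketch {

  /** Hypothetical token: a surface form plus the types it can carry. */
  static final class Tok {
    final String text;
    final Set<String> types;  // left empty and never consulted when neutral
    final boolean neutral;

    private Tok(String text, Set<String> types, boolean neutral) {
      this.text = text;
      this.types = types;
      this.neutral = neutral;
    }

    static Tok of(String text, String... types) {
      return new Tok(text, new HashSet<>(Arrays.asList(types)), false);
    }

    static Tok neutral(String text) {
      return new Tok(text, new HashSet<>(), true);
    }
  }

  /**
   * Returns the surface forms of the whole sequence if all non-neutral tokens
   * share at least one type, or null otherwise. Neutral tokens are kept in the
   * result but never take part in the agreement check.
   */
  static List<String> unify(List<Tok> sequence) {
    Set<String> shared = null;
    List<String> kept = new ArrayList<>();
    for (Tok tok : sequence) {
      kept.add(tok.text);
      if (tok.neutral) {
        continue; // e.g. a comma or a conjunction between agreeing words
      }
      if (shared == null) {
        shared = new HashSet<>(tok.types);
      } else {
        shared.retainAll(tok.types);
      }
      if (shared.isEmpty()) {
        return null;
      }
    }
    return kept;
  }

  public static void main(String[] args) {
    List<Tok> sequence = Arrays.asList(
        Tok.of("small", "singular", "plural"), Tok.neutral(","), Tok.of("dog", "singular"));
    System.out.println(unify(sequence));
  }
}
```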
-
getFinalUnified
@Nullable public final @Nullable AnalyzedTokenReadings[] getFinalUnified()
Used for getting a unified sequence when the simple test method isUnified(AnalyzedToken, Map, boolean) was used.
- Returns:
- An array of AnalyzedTokenReadings, or null when not in unification
-
-