Package org.languagetool.rules.ngrams
Class NgramProbabilityRule
java.lang.Object
org.languagetool.rules.Rule
org.languagetool.rules.ngrams.NgramProbabilityRule
LanguageTool's probability check that uses ngram lookups
to decide if an ngram of the input text is so rare in our
ngram index that it should be considered an error.
Also see http://wiki.languagetool.org/finding-errors-using-n-gram-data.
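The idea described above can be illustrated with a small, self-contained sketch. It does not use LanguageTool's actual LanguageModel API: the class NgramRaritySketch, its in-memory count map, and the threshold passed to the constructor are all hypothetical, chosen only to show how a trigram whose estimated probability falls below a minimum gets flagged.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration only: this is NOT LanguageTool's LanguageModel API.
final class NgramRaritySketch {

  // Toy "ngram index": trigram text mapped to its occurrence count.
  private final Map<String, Long> trigramCounts = new HashMap<>();
  private long totalCount = 0;
  private final double minProbability;

  NgramRaritySketch(double minProbability) {
    this.minProbability = minProbability;
  }

  void addTrigram(String trigram, long count) {
    trigramCounts.merge(trigram, count, Long::sum);
    totalCount += count;
  }

  // Relative frequency of the trigram in the toy index (0.0 if unseen).
  double probability(String trigram) {
    if (totalCount == 0) {
      return 0.0;
    }
    return trigramCounts.getOrDefault(trigram, 0L) / (double) totalCount;
  }

  // Returns every trigram of the text whose probability is below the minimum.
  List<String> rareTrigrams(String text) {
    String[] words = text.trim().split("\\s+");
    List<String> rare = new ArrayList<>();
    for (int i = 0; i + 2 < words.length; i++) {
      String trigram = words[i] + " " + words[i + 1] + " " + words[i + 2];
      if (probability(trigram) < minProbability) {
        rare.add(trigram);
      }
    }
    return rare;
  }
}
```

Raising the threshold in this sketch flags more trigrams, lowering it flags fewer; the real rule works on a much larger index and its scoring may differ.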
Since: 3.2
Nested Class Summary
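Nested Classes (Modifier and Type, Class)
- (package private) static class NgramProbabilityRule.AdvancedReplacement
- (package private) class NgramProbabilityRule.Alternative
- (package private) class NgramProbabilityRule.Alternatives
- (package private) static class NgramProbabilityRule.Replacement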
Field Summary
Fields (Modifier and Type, Field)
- private static final List<NgramProbabilityRule.AdvancedReplacement> ADV_REPLACEMENTS
- private static final boolean DEBUG
- private final Language language
- private final LanguageModel lm
- private double minProbability
- private static final List<NgramProbabilityRule.Replacement> REPLACEMENTS
- static final String RULE_ID
Constructor Summary
Constructors
- NgramProbabilityRule(ResourceBundle messages, LanguageModel languageModel, Language language)
Method Summary
Methods (Modifier and Type, Method, Description)
- protected boolean acceptMatch(RuleMatch match, Probability p, AnalyzedSentence sentence): Override this method to discard matches by returning false.
- private void debug
- private NgramProbabilityRule.Alternatives getBetterAlternatives(GoogleToken prevToken, String token, GoogleToken next, GoogleToken googleToken, Probability p, AnalyzedSentence sentence)
- private Optional<List<NgramProbabilityRule.Alternative>> getBetterAlternatives(NgramProbabilityRule.Replacement replacement, GoogleToken prevToken, GoogleToken token, GoogleToken next, Probability p)
- private Optional<AnalyzedToken> getByPosTag(Set<AnalyzedToken> tokens, String wantedPosTagRegex)
- String getDescription(): A short description of the error this rule can detect, usually in the language of the text that is checked.
- protected Tokenizer getGoogleStyleWordTokenizer()
- String getId(): A string used to identify the rule in e.g. configuration files.
- RuleMatch[] match(AnalyzedSentence sentence): Check whether the given sentence matches this error rule, i.e. whether it contains the error detected by this rule.
- void setMinProbability(double minProbability)
Methods inherited from class org.languagetool.rules.Rule
addExamplePair, estimateContextForSureMatch, getAntiPatterns, getCategory, getConfigureText, getCorrectExamples, getDefaultValue, getErrorTriggeringExamples, getIncorrectExamples, getLocQualityIssueType, getMaxConfigurableValue, getMinConfigurableValue, getSentenceWithImmunization, getUrl, hasConfigurableValue, isDefaultOff, isDefaultTempOff, isDictionaryBasedSpellingRule, isOfficeDefaultOff, isOfficeDefaultOn, makeAntiPatterns, setCategory, setCorrectExamples, setDefaultOff, setDefaultOn, setDefaultTempOff, setErrorTriggeringExamples, setIncorrectExamples, setLocQualityIssueType, setOfficeDefaultOff, setOfficeDefaultOn, setUrl, supportsLanguage, toRuleMatchArray, useInOffice
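Field Details
RULE_ID
public static final String RULE_ID
Since: 3.2
DEBUG
private static final boolean DEBUG
REPLACEMENTS
private static final List<NgramProbabilityRule.Replacement> REPLACEMENTS
ADV_REPLACEMENTS
private static final List<NgramProbabilityRule.AdvancedReplacement> ADV_REPLACEMENTS
lm
private final LanguageModel lm
language
private final Language language
minProbability
private double minProbability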
Constructor Details
NgramProbabilityRule
public NgramProbabilityRule(ResourceBundle messages, LanguageModel languageModel, Language language)
Method Details
getId
Description copied from class: Rule
A string used to identify the rule in e.g. configuration files. This string is supposed to be unique and to stay the same in all upcoming versions of LanguageTool. It's supposed to contain only the characters A-Z and the underscore.
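The naming convention stated above (only the characters A-Z and the underscore) can be checked mechanically. The following is a minimal sketch; the class RuleIdCheckSketch and the method isConventionalId are hypothetical helpers and not part of LanguageTool, and the IDs in the usage note are made-up examples.

```java
import java.util.regex.Pattern;

// Hypothetical helper, not part of LanguageTool: checks the documented ID convention.
final class RuleIdCheckSketch {

  // One or more characters, each either an uppercase ASCII letter or an underscore.
  private static final Pattern VALID_ID = Pattern.compile("[A-Z_]+");

  static boolean isConventionalId(String id) {
    return id != null && VALID_ID.matcher(id).matches();
  }
}
```

Under this check, an ID such as MY_EXAMPLE_RULE passes, while my-example-rule or an empty string does not.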
setMinProbability
public void setMinProbability(double minProbability)
match
public RuleMatch[] match(AnalyzedSentence sentence) throws IOException
Description copied from class: Rule
Check whether the given sentence matches this error rule, i.e. whether it contains the error detected by this rule. Note that the call order is not guaranteed: sentences may be passed to this method in a different order than they appear in the text (this may be the case when LanguageTool is used as a LibreOffice/OpenOffice add-on, for example).
Specified by: match in class Rule
Parameters: sentence - a pre-analyzed sentence
Returns: an array of RuleMatch objects
Throws: IOException
acceptMatch
protected boolean acceptMatch(RuleMatch match, Probability p, AnalyzedSentence sentence)
Override this method to discard matches by returning false.
Since: 3.3
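The hook pattern behind acceptMatch can be sketched without any LanguageTool dependency. Everything below is a hypothetical stand-in: BaseRuleSketch, QuotedTextIgnoringRuleSketch, keepAccepted, and the simplified (String, double) parameters are assumptions made for illustration, whereas the real method takes a RuleMatch, a Probability, and an AnalyzedSentence.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a rule with an overridable filter hook.
class BaseRuleSketch {

  // Hook: the default accepts every candidate; subclasses return false to discard one.
  protected boolean acceptMatch(String matchedText, double probability) {
    return true;
  }

  // Collects only the candidates that the hook accepts.
  final List<String> keepAccepted(String[] candidates, double[] probabilities) {
    List<String> kept = new ArrayList<>();
    for (int i = 0; i < candidates.length; i++) {
      if (acceptMatch(candidates[i], probabilities[i])) {
        kept.add(candidates[i]);
      }
    }
    return kept;
  }
}

// Example subclass: discards any candidate that contains a double quote.
class QuotedTextIgnoringRuleSketch extends BaseRuleSketch {
  @Override
  protected boolean acceptMatch(String matchedText, double probability) {
    return !matchedText.contains("\"");
  }
}
```

A subclass of the real rule would follow the same shape: inspect the arguments it is given and return false for matches it wants suppressed.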
getBetterAlternatives
private NgramProbabilityRule.Alternatives getBetterAlternatives(GoogleToken prevToken, String token, GoogleToken next, GoogleToken googleToken, Probability p, AnalyzedSentence sentence) throws IOException
Throws: IOException
getBetterAlternatives
private Optional<List<NgramProbabilityRule.Alternative>> getBetterAlternatives(NgramProbabilityRule.Replacement replacement, GoogleToken prevToken, GoogleToken token, GoogleToken next, Probability p) throws IOException
Throws: IOException
getByPosTag
private Optional<AnalyzedToken> getByPosTag(Set<AnalyzedToken> tokens, String wantedPosTagRegex)
getDescription
Description copied from class: Rule
A short description of the error this rule can detect, usually in the language of the text that is checked.
Specified by: getDescription in class Rule
getGoogleStyleWordTokenizer
protected Tokenizer getGoogleStyleWordTokenizer()
debug