Class MorfologikSpellerRule
java.lang.Object
  org.languagetool.rules.Rule
    org.languagetool.rules.spelling.SpellingCheckRule
      org.languagetool.rules.spelling.morfologik.MorfologikSpellerRule
-
public abstract class MorfologikSpellerRule extends SpellingCheckRule
-
-
Field Summary
Fields
Modifier and Type, Field
private boolean checkCompound
private java.util.regex.Pattern compoundRegex
protected java.util.Locale conversionLocale
private boolean ignoreTaggedWords
(package private) static int MAX_FREQUENCY_FOR_SPLITTING
private boolean runningExperiment
protected MorfologikMultiSpeller speller1
protected MorfologikMultiSpeller speller2
protected MorfologikMultiSpeller speller3
private SuggestionsOrderer suggestionsOrderer
private UserConfig userConfig
-
Fields inherited from class org.languagetool.rules.spelling.SpellingCheckRule
ignoreWordsWithLength, language, languageModel, LANGUAGETOOL, LANGUAGETOOLER, wordListLoader
-
-
Constructor Summary
Constructors
MorfologikSpellerRule(java.util.ResourceBundle messages, Language language)
MorfologikSpellerRule(java.util.ResourceBundle messages, Language language, UserConfig userConfig)
MorfologikSpellerRule(java.util.ResourceBundle messages, Language language, UserConfig userConfig, java.util.List<Language> altLanguages)
MorfologikSpellerRule(java.util.ResourceBundle messages, Language language, UserConfig userConfig, java.util.List<Language> altLanguages, LanguageModel languageModel)
-
Method Summary
Methods
Modifier and Type, Method, Description
private boolean canBeIgnored(AnalyzedTokenReadings[] tokens, int idx, AnalyzedTokenReadings token)
java.lang.String getDescription()
    A short description of the error this rule can detect, usually in the language of the text that is checked.
abstract java.lang.String getFileName()
    Get the filename, e.g., /resource/pl/spelling.dict.
protected int getFrequency(MorfologikMultiSpeller speller, java.lang.String word)
abstract java.lang.String getId()
    A string used to identify the rule in e.g. configuration files.
protected java.util.List<RuleMatch> getRuleMatches(java.lang.String word, int startPos, AnalyzedSentence sentence, java.util.List<RuleMatch> ruleMatchesSoFar, int idx, AnalyzedTokenReadings[] tokens)
protected boolean ignoreWord(java.lang.String word)
    Ignore surrogate pairs (emojis)
private void initSpeller(java.lang.String binaryDict)
private boolean initSpellers()
boolean isMisspelled(java.lang.String word)
protected boolean isMisspelled(MorfologikMultiSpeller speller, java.lang.String word)
protected boolean isSurrogatePairCombination(java.lang.String word)
    Checks whether a given String consists only of surrogate pairs.
private java.util.List<java.lang.String> joinBeforeAfterSuggestions(java.util.List<java.lang.String> suggestionsList, java.lang.String beforeSuggestionStr, java.lang.String afterSuggestionStr)
    Join strings before and after a suggestion.
RuleMatch[] match(AnalyzedSentence sentence)
    Check whether the given sentence matches this error rule, i.e. whether it contains the error detected by this rule.
protected java.util.List<java.lang.String> orderSuggestions(java.util.List<java.lang.String> suggestions, java.lang.String word)
private java.util.List<java.lang.String> orderSuggestions(java.util.List<java.lang.String> suggestions, java.lang.String word, AnalyzedSentence sentence, int startPos)
protected void setCheckCompound(boolean checkCompound)
protected void setCompoundRegex(java.lang.String compoundRegex)
void setIgnoreTaggedWords()
    Skip words that are known in the POS tagging dictionary, assuming they cannot be incorrect.
void setLocale(java.util.Locale locale)
@Nullable java.util.regex.Pattern tokenizingPattern()
    Get the regular expression pattern used to tokenize the words as in the source dictionary.
-
Methods inherited from class org.languagetool.rules.spelling.SpellingCheckRule
acceptedInAlternativeLanguage, acceptPhrases, addIgnoreTokens, addIgnoreWords, addProhibitedWords, addSuggestionsToRuleMatch, createWrongSplitMatch, expandLine, filterDupes, filterSuggestions, getAdditionalProhibitFileNames, getAdditionalSpellingFileNames, getAdditionalSuggestions, getAdditionalTopSuggestions, getAlternativeLangSpellingRules, getAntiPatterns, getIgnoreFileName, getLanguageVariantSpellingFileName, getProhibitFileName, getSpellingFileName, ignoreToken, ignoreWord, init, isDictionaryBasedSpellingRule, isEMail, isProhibited, isUrl, reorderSuggestions, setConsiderIgnoreWords, setConvertsCase, startsWithIgnoredWord
-
Methods inherited from class org.languagetool.rules.Rule
addExamplePair, estimateContextForSureMatch, getCategory, getConfigureText, getCorrectExamples, getDefaultValue, getErrorTriggeringExamples, getIncorrectExamples, getLocQualityIssueType, getMaxConfigurableValue, getMinConfigurableValue, getSentenceWithImmunization, getUrl, hasConfigurableValue, isDefaultOff, isDefaultTempOff, isOfficeDefaultOff, isOfficeDefaultOn, makeAntiPatterns, setCategory, setCorrectExamples, setDefaultOff, setDefaultOn, setDefaultTempOff, setErrorTriggeringExamples, setIncorrectExamples, setLocQualityIssueType, setOfficeDefaultOff, setOfficeDefaultOn, setUrl, supportsLanguage, toRuleMatchArray, useInOffice
-
-
-
-
Field Detail
-
speller1
protected MorfologikMultiSpeller speller1
-
speller2
protected MorfologikMultiSpeller speller2
-
speller3
protected MorfologikMultiSpeller speller3
-
conversionLocale
protected java.util.Locale conversionLocale
-
suggestionsOrderer
private final SuggestionsOrderer suggestionsOrderer
-
runningExperiment
private final boolean runningExperiment
-
ignoreTaggedWords
private boolean ignoreTaggedWords
-
checkCompound
private boolean checkCompound
-
compoundRegex
private java.util.regex.Pattern compoundRegex
-
userConfig
private final UserConfig userConfig
-
MAX_FREQUENCY_FOR_SPLITTING
static final int MAX_FREQUENCY_FOR_SPLITTING
- See Also:
- Constant Field Values
-
-
Constructor Detail
-
MorfologikSpellerRule
public MorfologikSpellerRule(java.util.ResourceBundle messages, Language language) throws java.io.IOException
- Throws:
java.io.IOException
-
MorfologikSpellerRule
public MorfologikSpellerRule(java.util.ResourceBundle messages, Language language, UserConfig userConfig) throws java.io.IOException
- Throws:
java.io.IOException
-
MorfologikSpellerRule
public MorfologikSpellerRule(java.util.ResourceBundle messages, Language language, UserConfig userConfig, java.util.List<Language> altLanguages) throws java.io.IOException
- Throws:
java.io.IOException
-
MorfologikSpellerRule
public MorfologikSpellerRule(java.util.ResourceBundle messages, Language language, UserConfig userConfig, java.util.List<Language> altLanguages, LanguageModel languageModel) throws java.io.IOException
- Throws:
java.io.IOException
-
-
Method Detail
-
getFileName
public abstract java.lang.String getFileName()
Get the filename, e.g., /resource/pl/spelling.dict.
-
getId
public abstract java.lang.String getId()
Description copied from class: Rule
A string used to identify the rule in e.g. configuration files. This string is supposed to be unique and to stay the same in all upcoming versions of LanguageTool. It's supposed to contain only the characters A-Z and the underscore.
- Specified by:
getId in class SpellingCheckRule
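The character constraint described for rule IDs can be checked mechanically. The following is a minimal sketch, not part of LanguageTool: the class name RuleIdSketch, the method looksLikeValidRuleId, and the sample IDs are all hypothetical, and the check only covers the "A-Z and underscore" wording above (it does not verify uniqueness).

```java
import java.util.regex.Pattern;

// Hypothetical helper, not a LanguageTool class.
final class RuleIdSketch {

    // One or more characters, each an uppercase ASCII letter or an underscore.
    private static final Pattern VALID_ID = Pattern.compile("[A-Z_]+");

    static boolean looksLikeValidRuleId(String id) {
        return id != null && VALID_ID.matcher(id).matches();
    }

    public static void main(String[] args) {
        System.out.println(looksLikeValidRuleId("MY_SPELLER_RULE")); // prints true
        System.out.println(looksLikeValidRuleId("my-speller-rule")); // prints false
    }
}
```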
-
getDescription
public java.lang.String getDescription()
Description copied from class: Rule
A short description of the error this rule can detect, usually in the language of the text that is checked.
- Specified by:
getDescription in class SpellingCheckRule
-
setLocale
public void setLocale(java.util.Locale locale)
-
setIgnoreTaggedWords
public void setIgnoreTaggedWords()
Skip words that are known in the POS tagging dictionary, assuming they cannot be incorrect.
-
match
public RuleMatch[] match(AnalyzedSentence sentence) throws java.io.IOException
Description copied from class: Rule
Check whether the given sentence matches this error rule, i.e. whether it contains the error detected by this rule. Note that the order in which this method is called is not always guaranteed, i.e. the sentence order in the text may be different than the order in which you get the sentences (this may be the case when LanguageTool is used as a LibreOffice/OpenOffice add-on, for example).
- Specified by:
match in class SpellingCheckRule
- Parameters:
sentence - a pre-analyzed sentence
- Returns:
- an array of RuleMatch objects
- Throws:
java.io.IOException
-
initSpellers
private boolean initSpellers() throws java.io.IOException
- Throws:
java.io.IOException
-
initSpeller
private void initSpeller(java.lang.String binaryDict) throws java.io.IOException
- Throws:
java.io.IOException
-
canBeIgnored
private boolean canBeIgnored(AnalyzedTokenReadings[] tokens, int idx, AnalyzedTokenReadings token) throws java.io.IOException
- Throws:
java.io.IOException
-
isMisspelled
@Experimental public boolean isMisspelled(java.lang.String word) throws java.io.IOException
- Specified by:
isMisspelled in class SpellingCheckRule
- Throws:
java.io.IOException
- Since:
- 4.8
-
isMisspelled
protected boolean isMisspelled(MorfologikMultiSpeller speller, java.lang.String word)
- Returns:
- true if the word is misspelled
- Since:
- 2.4
-
getFrequency
protected int getFrequency(MorfologikMultiSpeller speller, java.lang.String word)
-
getRuleMatches
protected java.util.List<RuleMatch> getRuleMatches(java.lang.String word, int startPos, AnalyzedSentence sentence, java.util.List<RuleMatch> ruleMatchesSoFar, int idx, AnalyzedTokenReadings[] tokens) throws java.io.IOException
- Throws:
java.io.IOException
-
tokenizingPattern
@Nullable public java.util.regex.Pattern tokenizingPattern()
Get the regular expression pattern used to tokenize the words as in the source dictionary. For example, it may contain a hyphen if words with hyphens are not included in the dictionary.
- Returns:
- A compiled Pattern that is used to tokenize words, or null.
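Because the return value may be null, callers have to handle both cases. The sketch below shows one plausible way to do that. It is not taken from the LanguageTool sources: the class TokenizingPatternSketch, the tokenize helper, and the hyphen pattern are assumptions made for illustration.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical helper, not a LanguageTool class.
final class TokenizingPatternSketch {

    // 'pattern' may be null, mirroring the @Nullable return value described above.
    static List<String> tokenize(String word, Pattern pattern) {
        if (pattern == null) {
            // No tokenizing pattern: treat the word as a single unit.
            return Collections.singletonList(word);
        }
        // Otherwise split the word wherever the pattern matches.
        return Arrays.asList(pattern.split(word));
    }

    public static void main(String[] args) {
        // Assumed case: a dictionary that does not contain hyphenated words.
        System.out.println(tokenize("tic-tac", Pattern.compile("-"))); // prints [tic, tac]
        System.out.println(tokenize("tic-tac", null));                 // prints [tic-tac]
    }
}
```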
-
orderSuggestions
protected java.util.List<java.lang.String> orderSuggestions(java.util.List<java.lang.String> suggestions, java.lang.String word)
-
orderSuggestions
private java.util.List<java.lang.String> orderSuggestions(java.util.List<java.lang.String> suggestions, java.lang.String word, AnalyzedSentence sentence, int startPos)
-
setCheckCompound
protected void setCheckCompound(boolean checkCompound)
- Parameters:
checkCompound - If true and the word is not in the dictionary, it will be split (see setCompoundRegex(String)) and each component will be checked separately
- Since:
- 2.4
-
setCompoundRegex
protected void setCompoundRegex(java.lang.String compoundRegex)
- Parameters:
compoundRegex - see setCheckCompound(boolean)
- Since:
- 2.4
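The interplay between setCheckCompound and setCompoundRegex can be illustrated with a small standalone model. This is a sketch under stated assumptions and does not reproduce the actual implementation: the class CompoundCheckSketch is hypothetical, a plain Set<String> stands in for the binary dictionary, and the initial hyphen regex is an assumption of this sketch.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;
import java.util.regex.Pattern;

// Hypothetical model of the compound check, not a LanguageTool class.
final class CompoundCheckSketch {

    private final Set<String> dictionary;                 // stand-in for the binary dictionary
    private boolean checkCompound = false;
    private Pattern compoundRegex = Pattern.compile("-"); // assumed initial value for this sketch

    CompoundCheckSketch(Set<String> dictionary) {
        this.dictionary = dictionary;
    }

    void setCheckCompound(boolean checkCompound) {
        this.checkCompound = checkCompound;
    }

    void setCompoundRegex(String compoundRegex) {
        this.compoundRegex = Pattern.compile(compoundRegex);
    }

    boolean isMisspelled(String word) {
        if (dictionary.contains(word)) {
            return false;
        }
        if (!checkCompound) {
            return true;
        }
        // The word is unknown: split it and check each component separately.
        String[] parts = compoundRegex.split(word);
        if (parts.length < 2) {
            return true; // nothing was split off, so the whole word stays unknown
        }
        for (String part : parts) {
            if (!dictionary.contains(part)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        CompoundCheckSketch rule =
            new CompoundCheckSketch(new HashSet<>(Arrays.asList("well", "known")));
        System.out.println(rule.isMisspelled("well-known")); // prints true
        rule.setCheckCompound(true);
        System.out.println(rule.isMisspelled("well-known")); // prints false
        System.out.println(rule.isMisspelled("well-knwon")); // prints true
    }
}
```

With compound checking off, any word missing from the dictionary is reported. With it on, a word is accepted when every component produced by the regex is itself in the dictionary.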
-
isSurrogatePairCombination
protected boolean isSurrogatePairCombination(java.lang.String word)
Checks whether a given String consists only of surrogate pairs.
- Parameters:
word - to be checked
- Since:
- 4.2
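A check of this kind can be written with the standard java.lang.Character API. The following is a minimal sketch of the idea, not the LanguageTool implementation: the class SurrogatePairSketch and the method name are hypothetical, and returning false for the empty string is an assumption of this sketch.

```java
// Hypothetical helper, not a LanguageTool class.
final class SurrogatePairSketch {

    // Returns true only if the string is non-empty and every char belongs to a
    // well-formed high/low surrogate pair. Treating "" as false is an assumption.
    static boolean consistsOnlyOfSurrogatePairs(String word) {
        if (word.isEmpty() || word.length() % 2 != 0) {
            return false;
        }
        for (int i = 0; i < word.length(); i += 2) {
            if (!Character.isSurrogatePair(word.charAt(i), word.charAt(i + 1))) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        // U+1F600 (an emoji) is encoded in UTF-16 as the surrogate pair D83D DE00.
        System.out.println(consistsOnlyOfSurrogatePairs("\uD83D\uDE00"));  // prints true
        System.out.println(consistsOnlyOfSurrogatePairs("a\uD83D\uDE00")); // prints false
    }
}
```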
-
ignoreWord
protected boolean ignoreWord(java.lang.String word) throws java.io.IOException
Ignore surrogate pairs (emojis)
- Overrides:
ignoreWord in class SpellingCheckRule
- Throws:
java.io.IOException
- Since:
- 4.3
- See Also:
SpellingCheckRule.ignoreWord(java.lang.String)
-
joinBeforeAfterSuggestions
private java.util.List<java.lang.String> joinBeforeAfterSuggestions(java.util.List<java.lang.String> suggestionsList, java.lang.String beforeSuggestionStr, java.lang.String afterSuggestionStr)
Join strings before and after a suggestion. Used when there is also a suggestion for split words. Example: to thow > tot how | to throw
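The joining step can be pictured as plain string concatenation around each suggestion. This is a sketch of one reading of the description, not the private method itself: the class JoinSuggestionsSketch and the method joinBeforeAfter are hypothetical, and the input values are chosen to mirror the "to throw" case from the example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, not a LanguageTool class.
final class JoinSuggestionsSketch {

    // Wraps every suggestion with the text that precedes and follows it.
    static List<String> joinBeforeAfter(List<String> suggestions, String before, String after) {
        List<String> joined = new ArrayList<>();
        for (String suggestion : suggestions) {
            joined.add(before + suggestion + after);
        }
        return joined;
    }

    public static void main(String[] args) {
        // Assumed scenario: "thow" is corrected to "throw" and the preceding "to " is re-attached.
        System.out.println(joinBeforeAfter(Arrays.asList("throw"), "to ", "")); // prints [to throw]
    }
}
```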
-
-