Package org.languagetool.synthesis
Class BaseSynthesizer
- java.lang.Object
-
- org.languagetool.synthesis.BaseSynthesizer
-
- All Implemented Interfaces:
Synthesizer
public class BaseSynthesizer extends java.lang.Object implements Synthesizer
-
-
Field Summary
Fields
Modifier and Type                            Field
private morfologik.stemming.Dictionary       dictionary
private ManualSynthesizer                    manualSynthesizer
private Soros                                numberSpeller
protected java.util.List<java.lang.String>   possibleTags
private ManualSynthesizer                    removalSynthesizer
private java.lang.String                     resourceFileName
private java.lang.String                     sorosFileName
java.lang.String                             SPELLNUMBER_TAG
private morfologik.stemming.IStemmer         stemmer
private java.lang.String                     tagFileName
-
Constructor Summary
Constructors
Constructor    Description
BaseSynthesizer(java.lang.String sorosFileName, java.lang.String resourceFileName, java.lang.String tagFileName, Language lang)
BaseSynthesizer(java.lang.String resourceFileName, java.lang.String tagFileName, Language lang)
-
Method Summary
Modifier and Type, Method, Description
private Soros createNumberSpeller(java.lang.String langcode)
protected morfologik.stemming.IStemmer createStemmer()
    Creates a new IStemmer based on the configured dictionary.
protected morfologik.stemming.Dictionary getDictionary()
    Returns the Dictionary used for this synthesizer.
java.lang.String getPosTagCorrection(java.lang.String posTag)
    Gets a corrected version of the POS tag used for synthesis.
java.lang.String getSpelledNumber(java.lang.String arabicNumeral)
    Spells out a number.
morfologik.stemming.IStemmer getStemmer()
protected void initPossibleTags()
protected void lookup(java.lang.String lemma, java.lang.String posTag, java.util.List<java.lang.String> results)
    Looks up the inflected forms of a lemma defined by a part-of-speech tag.
java.lang.String[] synthesize(AnalyzedToken token, java.lang.String posTag)
    Gets a form of a given AnalyzedToken, where the form is defined by a part-of-speech tag.
java.lang.String[] synthesize(AnalyzedToken token, java.lang.String posTag, boolean posTagRegExp)
    Generates a form of the word with a given POS tag for a given lemma.
-
-
-
Field Detail
-
possibleTags
protected volatile java.util.List<java.lang.String> possibleTags
-
tagFileName
private final java.lang.String tagFileName
-
resourceFileName
private final java.lang.String resourceFileName
-
stemmer
private final morfologik.stemming.IStemmer stemmer
-
manualSynthesizer
private final ManualSynthesizer manualSynthesizer
-
removalSynthesizer
private final ManualSynthesizer removalSynthesizer
-
sorosFileName
private final java.lang.String sorosFileName
-
numberSpeller
private final Soros numberSpeller
-
SPELLNUMBER_TAG
public final java.lang.String SPELLNUMBER_TAG
- See Also:
- Constant Field Values
-
dictionary
private volatile morfologik.stemming.Dictionary dictionary
-
-
Constructor Detail
-
BaseSynthesizer
public BaseSynthesizer(java.lang.String sorosFileName, java.lang.String resourceFileName, java.lang.String tagFileName, Language lang)
- Parameters:
resourceFileName - The dictionary file name.
tagFileName - The name of a file containing all possible tags.
-
BaseSynthesizer
public BaseSynthesizer(java.lang.String resourceFileName, java.lang.String tagFileName, Language lang)
-
-
Method Detail
-
getDictionary
protected morfologik.stemming.Dictionary getDictionary() throws java.io.IOException
Returns the Dictionary used for this synthesizer. The dictionary file can be defined in the constructor.
- Throws:
java.io.IOException - In case the dictionary cannot be loaded.
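
The volatile dictionary field together with the IOException on this getter suggests that the dictionary is loaded on first use, although this page does not say so. As a minimal sketch under that assumption, lazy loading of a volatile field is commonly written with double-checked locking. The class LazyDictionarySketch, the Dict type, the load helper and the file name "example.dict" are hypothetical stand-ins, not LanguageTool or Morfologik code.

```java
import java.io.IOException;

// Hypothetical sketch of lazy, thread-safe loading; not the BaseSynthesizer source.
class LazyDictionarySketch {

    // Stand-in for morfologik.stemming.Dictionary.
    static final class Dict {
        final String source;

        Dict(String source) {
            this.source = source;
        }
    }

    private final String resourceFileName;
    private volatile Dict dictionary; // volatile so a finished load is visible to all threads
    private int loadCount;            // only modified while holding the lock

    LazyDictionarySketch(String resourceFileName) {
        this.resourceFileName = resourceFileName;
    }

    Dict getDictionary() throws IOException {
        Dict result = dictionary;          // first check, without locking
        if (result == null) {
            synchronized (this) {
                result = dictionary;       // second check, under the lock
                if (result == null) {
                    result = load(resourceFileName);
                    dictionary = result;
                }
            }
        }
        return result;
    }

    synchronized int loadCount() {
        return loadCount;
    }

    // Pretends to read a dictionary; an empty name simulates a missing resource.
    private Dict load(String name) throws IOException {
        if (name == null || name.isEmpty()) {
            throw new IOException("dictionary cannot be loaded");
        }
        loadCount++;
        return new Dict(name);
    }

    public static void main(String[] args) throws IOException {
        LazyDictionarySketch sketch = new LazyDictionarySketch("example.dict");
        Dict first = sketch.getDictionary();
        Dict second = sketch.getDictionary();
        System.out.println("same instance: " + (first == second));
        System.out.println("loads performed: " + sketch.loadCount());
    }
}
```

In this sketch a failed load leaves the field unset, so a later call would try again.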
-
createStemmer
protected morfologik.stemming.IStemmer createStemmer()
Creates a new IStemmer based on the configured dictionary. The result must not be shared among threads.
- Since:
- 2.3
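
One way a caller could honor the rule that a stemmer must not be shared among threads is to keep one instance per thread. The following is a minimal sketch of that idea using java.lang.ThreadLocal. The names PerThreadStemmerSketch and StatefulStemmer are hypothetical stand-ins for a stemmer with mutable internal state, and nothing here is meant to describe how BaseSynthesizer manages its own stemmer.

```java
import java.util.Locale;

// Hypothetical sketch: one stemmer instance per thread; not LanguageTool code.
class PerThreadStemmerSketch {

    // Stand-in for a stemmer that reuses an internal buffer and is therefore not thread-safe.
    static final class StatefulStemmer {
        private final StringBuilder buffer = new StringBuilder();

        String lookup(String word) {
            buffer.setLength(0);
            buffer.append(word.toLowerCase(Locale.ROOT));
            return buffer.toString();
        }
    }

    // Each thread lazily receives its own StatefulStemmer on first access.
    private final ThreadLocal<StatefulStemmer> stemmers =
            ThreadLocal.withInitial(StatefulStemmer::new);

    StatefulStemmer stemmerForCurrentThread() {
        return stemmers.get();
    }

    // Helper for the demo: returns the instance that a freshly started thread sees.
    StatefulStemmer stemmerOnNewThread() {
        StatefulStemmer[] holder = new StatefulStemmer[1];
        Thread t = new Thread(() -> holder[0] = stemmerForCurrentThread());
        t.start();
        try {
            t.join(); // join makes the worker's write to holder[0] visible here
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return holder[0];
    }

    public static void main(String[] args) {
        PerThreadStemmerSketch sketch = new PerThreadStemmerSketch();
        StatefulStemmer mine = sketch.stemmerForCurrentThread();
        StatefulStemmer theirs = sketch.stemmerOnNewThread();
        System.out.println("same thread reuses instance: " + (mine == sketch.stemmerForCurrentThread()));
        System.out.println("other thread gets its own: " + (mine != theirs));
    }
}
```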
-
createNumberSpeller
private Soros createNumberSpeller(java.lang.String langcode)
-
lookup
protected void lookup(java.lang.String lemma, java.lang.String posTag, java.util.List<java.lang.String> results)
Looks up the inflected forms of a lemma defined by a part-of-speech tag.
- Parameters:
lemma - the lemma to be inflected.
posTag - the desired part-of-speech tag.
results - the list to collect the inflected forms.
-
synthesize
public java.lang.String[] synthesize(AnalyzedToken token, java.lang.String posTag) throws java.io.IOException
Gets a form of a given AnalyzedToken, where the form is defined by a part-of-speech tag.
- Specified by:
synthesize in interface Synthesizer
- Parameters:
token - AnalyzedToken to be inflected.
posTag - The desired part-of-speech tag.
- Returns:
- inflected words, or an empty array if no forms were found
- Throws:
java.io.IOException
-
synthesize
public java.lang.String[] synthesize(AnalyzedToken token, java.lang.String posTag, boolean posTagRegExp) throws java.io.IOException
Description copied from interface: Synthesizer
Generates a form of the word with a given POS tag for a given lemma. The POS tag can be specified using regular expressions.
- Specified by:
synthesize in interface Synthesizer
- Parameters:
token - the token to be used for synthesis
posTag - POS tag of the form to be generated
posTagRegExp - Specifies whether the posTag string is a regular expression.
- Throws:
java.io.IOException
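
To illustrate what the posTagRegExp flag could mean in practice, here is a minimal, self-contained sketch. It assumes that a regular-expression tag is matched against every entry in a list of possible tags and that each matching tag is then looked up individually, which mirrors the possibleTags field and the lookup method listed on this page but is not confirmed by it. The class RegexTagSynthesisSketch, the map-based dictionary and the sample tags and word forms are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Pattern;

// Hypothetical sketch of regex-based tag expansion; not the BaseSynthesizer source.
class RegexTagSynthesisSketch {

    // Stand-in for the tag list that a real synthesizer would read from its tag file.
    private final List<String> possibleTags;
    // Stand-in for the dictionary: key is "lemma|posTag", value is the inflected forms.
    private final Map<String, List<String>> forms;

    RegexTagSynthesisSketch(List<String> possibleTags, Map<String, List<String>> forms) {
        this.possibleTags = possibleTags;
        this.forms = forms;
    }

    // Appends the forms stored for one exact (lemma, tag) pair to results.
    void lookup(String lemma, String posTag, List<String> results) {
        List<String> found = forms.get(lemma + "|" + posTag);
        if (found != null) {
            results.addAll(found);
        }
    }

    String[] synthesize(String lemma, String posTag, boolean posTagRegExp) {
        List<String> results = new ArrayList<>();
        if (posTagRegExp) {
            // Treat posTag as a pattern and try every known tag that matches it fully.
            Pattern pattern = Pattern.compile(posTag);
            for (String tag : possibleTags) {
                if (pattern.matcher(tag).matches()) {
                    lookup(lemma, tag, results);
                }
            }
        } else {
            // Treat posTag as a literal tag.
            lookup(lemma, posTag, results);
        }
        return results.toArray(new String[0]); // empty array when nothing was found
    }

    // Builds a tiny sample instance with illustrative tags and forms.
    static RegexTagSynthesisSketch demo() {
        Map<String, List<String>> forms = new HashMap<>();
        forms.put("walk|VBD", Arrays.asList("walked"));
        forms.put("walk|VBG", Arrays.asList("walking"));
        forms.put("walk|NNS", Arrays.asList("walks"));
        return new RegexTagSynthesisSketch(Arrays.asList("VBD", "VBG", "NNS"), forms);
    }

    public static void main(String[] args) {
        RegexTagSynthesisSketch sketch = demo();
        System.out.println(Arrays.toString(sketch.synthesize("walk", "VB.", true)));
        System.out.println(Arrays.toString(sketch.synthesize("walk", "VB.", false)));
    }
}
```

With the sample data, the pattern "VB." selects the tags VBD and VBG when the flag is true, while the same string passed as a literal tag matches nothing and yields an empty array.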
-
getPosTagCorrection
public java.lang.String getPosTagCorrection(java.lang.String posTag)
Description copied from interface: Synthesizer
Gets a corrected version of the POS tag used for synthesis. Useful when the tagset defines special disjunctions that need to be converted into regexp disjunctions.
- Specified by:
getPosTagCorrection in interface Synthesizer
- Parameters:
posTag - original POS tag to correct
- Returns:
- converted POS tag
-
getStemmer
public morfologik.stemming.IStemmer getStemmer()
- Returns:
- the stemmer interface to be used.
- Since:
- 2.5
-
initPossibleTags
protected void initPossibleTags() throws java.io.IOException
- Throws:
java.io.IOException
-
getSpelledNumber
public java.lang.String getSpelledNumber(java.lang.String arabicNumeral)
Description copied from interface: Synthesizer
Spells out a number.
- Specified by:
getSpelledNumber in interface Synthesizer
- Parameters:
arabicNumeral - the number, written in Arabic numerals
- Returns:
- String of the spelled out number
-
-