Package morfologik.stemming
Class DictionaryMetadata
- java.lang.Object
-
- morfologik.stemming.DictionaryMetadata
-
public final class DictionaryMetadata extends java.lang.Object
Description of attributes, their types and default values.
-
-
Field Summary
Fields (modifier and type, field, description):
- private java.util.EnumMap<DictionaryAttribute,java.lang.String> attributes: All attributes.
- private java.util.EnumMap<DictionaryAttribute,java.lang.Boolean> boolAttributes: All "enabled" boolean attributes.
- private java.nio.charset.Charset charset
- private static java.util.Map<DictionaryAttribute,java.lang.String> DEFAULT_ATTRIBUTES: Default attribute values.
- private EncoderType encoderType: Sequence encoder.
- private java.lang.String encoding: Encoding used for converting bytes to characters and vice versa.
- private java.util.LinkedHashMap<java.lang.Character,java.util.List<java.lang.Character>> equivalentChars: Equivalent characters (treated as interchangeable, in the same way as characters with and without diacritics).
- private java.util.LinkedHashMap<java.lang.String,java.lang.String> inputConversion: Conversion pairs for input conversion, for example to replace ligatures.
- private java.util.Locale locale
- static java.lang.String METADATA_FILE_EXTENSION: Expected metadata file extension.
- private java.util.LinkedHashMap<java.lang.String,java.lang.String> outputConversion: Conversion pairs for output conversion, for example to replace ligatures.
- private java.util.LinkedHashMap<java.lang.String,java.util.List<java.lang.String>> replacementPairs: Replacement pairs for non-obvious candidate search in a speller dictionary.
- private static java.util.EnumSet<DictionaryAttribute> REQUIRED_ATTRIBUTES: Required attributes.
- private byte separator: A separator character between fields (stem, lemma, form).
- private char separatorChar
-
Constructor Summary
Constructors (constructor, description):
- DictionaryMetadata(java.util.Map<DictionaryAttribute,java.lang.String> attrs): Create an instance from an attribute map.
-
Method Summary
Methods (modifier and type, method, description):
- static DictionaryMetadataBuilder builder()
- java.util.Map<DictionaryAttribute,java.lang.String> getAttributes()
- java.nio.charset.CharsetDecoder getDecoder()
- java.nio.charset.CharsetEncoder getEncoder()
- java.lang.String getEncoding()
- java.util.LinkedHashMap<java.lang.Character,java.util.List<java.lang.Character>> getEquivalentChars()
- static java.lang.String getExpectedMetadataFileName(java.lang.String dictionaryFile): Returns the expected name of the metadata file, based on the name of the dictionary file.
- static java.nio.file.Path getExpectedMetadataLocation(java.nio.file.Path dictionary)
- java.util.LinkedHashMap<java.lang.String,java.lang.String> getInputConversionPairs()
- java.util.Locale getLocale()
- java.util.LinkedHashMap<java.lang.String,java.lang.String> getOutputConversionPairs()
- java.util.LinkedHashMap<java.lang.String,java.util.List<java.lang.String>> getReplacementPairs()
- byte getSeparator()
- char getSeparatorAsChar()
- EncoderType getSequenceEncoderType()
- boolean isConvertingCase()
- boolean isFrequencyIncluded()
- boolean isIgnoringAllUppercase()
- boolean isIgnoringCamelCase()
- boolean isIgnoringDiacritics()
- boolean isIgnoringNumbers()
- boolean isIgnoringPunctuation()
- boolean isSupportingRunOnWords()
- static DictionaryMetadata read(java.io.InputStream metadataStream): Read dictionary metadata from a property file (stream).
- void write(java.io.Writer writer): Write dictionary attributes (metadata).
-
-
-
Field Detail
-
DEFAULT_ATTRIBUTES
private static java.util.Map<DictionaryAttribute,java.lang.String> DEFAULT_ATTRIBUTES
Default attribute values.
-
REQUIRED_ATTRIBUTES
private static java.util.EnumSet<DictionaryAttribute> REQUIRED_ATTRIBUTES
Required attributes.
-
separator
private byte separator
A separator character between fields (stem, lemma, form). The character must be within byte range (FSA uses bytes internally).
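The byte-range requirement can be illustrated with a minimal sketch that uses only the JDK. It assumes the separator is checked by encoding it in the dictionary's charset and requiring exactly one resulting byte. The class and method names (SeparatorSketch, toSeparatorByte) are hypothetical and not part of morfologik; the library's own validation may differ.

```java
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of morfologik.
final class SeparatorSketch {
    /** Encodes the separator and insists that it occupies exactly one byte. */
    static byte toSeparatorByte(char separator, Charset charset) {
        try {
            // A fresh encoder reports unmappable characters instead of replacing them.
            ByteBuffer encoded =
                charset.newEncoder().encode(CharBuffer.wrap(new char[] {separator}));
            if (encoded.remaining() != 1) {
                throw new IllegalArgumentException(
                    "Separator must encode to a single byte in " + charset);
            }
            return encoded.get();
        } catch (CharacterCodingException e) {
            throw new IllegalArgumentException(
                "Separator cannot be encoded in " + charset, e);
        }
    }

    public static void main(String[] args) {
        // '+' is a single byte in UTF-8, so this call succeeds.
        System.out.println(toSeparatorByte('+', StandardCharsets.UTF_8));
        // '\u0142' (Polish l with stroke) needs two bytes in UTF-8 and would be rejected.
    }
}
```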
-
separatorChar
private char separatorChar
-
encoding
private java.lang.String encoding
Encoding used for converting bytes to characters and vice versa.
-
charset
private java.nio.charset.Charset charset
-
locale
private java.util.Locale locale
-
replacementPairs
private java.util.LinkedHashMap<java.lang.String,java.util.List<java.lang.String>> replacementPairs
Replacement pairs for non-obvious candidate search in a speller dictionary.
-
inputConversion
private java.util.LinkedHashMap<java.lang.String,java.lang.String> inputConversion
Conversion pairs for input conversion, for example to replace ligatures.
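As an illustration of what conversion pairs can look like, the sketch below replaces two ligatures using a LinkedHashMap of pairs. It assumes the pairs are applied one after another in insertion order by plain substring replacement; how morfologik actually applies them is not stated here. ConversionSketch and applyConversions are hypothetical names.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper, not part of morfologik.
final class ConversionSketch {
    /** Applies each (from, to) pair in insertion order; an assumed strategy. */
    static String applyConversions(String text, Map<String, String> pairs) {
        String result = text;
        for (Map.Entry<String, String> pair : pairs.entrySet()) {
            result = result.replace(pair.getKey(), pair.getValue());
        }
        return result;
    }

    public static void main(String[] args) {
        LinkedHashMap<String, String> pairs = new LinkedHashMap<>();
        pairs.put("\uFB01", "fi"); // the "fi" ligature
        pairs.put("\uFB02", "fl"); // the "fl" ligature
        System.out.println(applyConversions("\uFB01sh and \uFB02ag", pairs));
    }
}
```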
-
outputConversion
private java.util.LinkedHashMap<java.lang.String,java.lang.String> outputConversion
Conversion pairs for output conversion, for example to replace ligatures.
-
equivalentChars
private java.util.LinkedHashMap<java.lang.Character,java.util.List<java.lang.Character>> equivalentChars
Equivalent characters (treated as interchangeable, in the same way as characters with and without diacritics). For example, Polish ł can be specified as equivalent to l. This implements a feature similar to the hunspell MAP option in the affix file.
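The shape of this map can be shown with a small JDK-only sketch using the Polish example. It assumes a one-directional lookup (ł maps to l, but not the reverse unless that entry is added too); whether the library treats the relation as symmetric is not stated here. EquivalentCharsSketch and its methods are hypothetical names.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper, not part of morfologik.
final class EquivalentCharsSketch {
    /** Builds the example map: Polish l with stroke is equivalent to plain l. */
    static LinkedHashMap<Character, List<Character>> polishExample() {
        LinkedHashMap<Character, List<Character>> map = new LinkedHashMap<>();
        map.put('\u0142', Arrays.asList('l'));
        return map;
    }

    /** True if b is listed as an equivalent of a (one direction only; an assumption). */
    static boolean isEquivalent(Map<Character, List<Character>> map, char a, char b) {
        if (a == b) {
            return true;
        }
        return map.getOrDefault(a, Collections.emptyList()).contains(b);
    }

    public static void main(String[] args) {
        System.out.println(isEquivalent(polishExample(), '\u0142', 'l'));
    }
}
```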
-
attributes
private final java.util.EnumMap<DictionaryAttribute,java.lang.String> attributes
All attributes.
-
boolAttributes
private final java.util.EnumMap<DictionaryAttribute,java.lang.Boolean> boolAttributes
All "enabled" boolean attributes.
-
encoderType
private EncoderType encoderType
Sequence encoder.
-
METADATA_FILE_EXTENSION
public static final java.lang.String METADATA_FILE_EXTENSION
Expected metadata file extension.
- See Also:
- Constant Field Values
-
-
Constructor Detail
-
DictionaryMetadata
public DictionaryMetadata(java.util.Map<DictionaryAttribute,java.lang.String> attrs)
Create an instance from an attribute map.
- Parameters:
- attrs - A set of DictionaryAttribute keys and their associated values.
- See Also:
- DictionaryMetadataBuilder
-
-
Method Detail
-
getAttributes
public java.util.Map<DictionaryAttribute,java.lang.String> getAttributes()
- Returns:
- All metadata attributes.
-
getEncoding
public java.lang.String getEncoding()
-
getSeparator
public byte getSeparator()
-
getLocale
public java.util.Locale getLocale()
-
getInputConversionPairs
public java.util.LinkedHashMap<java.lang.String,java.lang.String> getInputConversionPairs()
-
getOutputConversionPairs
public java.util.LinkedHashMap<java.lang.String,java.lang.String> getOutputConversionPairs()
-
getReplacementPairs
public java.util.LinkedHashMap<java.lang.String,java.util.List<java.lang.String>> getReplacementPairs()
-
getEquivalentChars
public java.util.LinkedHashMap<java.lang.Character,java.util.List<java.lang.Character>> getEquivalentChars()
-
isFrequencyIncluded
public boolean isFrequencyIncluded()
-
isIgnoringPunctuation
public boolean isIgnoringPunctuation()
-
isIgnoringNumbers
public boolean isIgnoringNumbers()
-
isIgnoringCamelCase
public boolean isIgnoringCamelCase()
-
isIgnoringAllUppercase
public boolean isIgnoringAllUppercase()
-
isIgnoringDiacritics
public boolean isIgnoringDiacritics()
-
isConvertingCase
public boolean isConvertingCase()
-
isSupportingRunOnWords
public boolean isSupportingRunOnWords()
-
getDecoder
public java.nio.charset.CharsetDecoder getDecoder()
- Returns:
- A new CharsetDecoder for the encoding.
-
getEncoder
public java.nio.charset.CharsetEncoder getEncoder()
- Returns:
- A new CharsetEncoder for the encoding.
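Both getDecoder() and getEncoder() are documented as returning a new object on each call, which fits the JDK contract: CharsetDecoder and CharsetEncoder carry internal state and are not safe for concurrent use. The sketch below shows the same pattern with the JDK alone. CodecSketch is a hypothetical stand-in for the metadata class, and the REPORT error actions are an assumption rather than documented behaviour.

```java
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.Charset;
import java.nio.charset.CharsetDecoder;
import java.nio.charset.CharsetEncoder;
import java.nio.charset.CodingErrorAction;

// Hypothetical stand-in, not part of morfologik.
final class CodecSketch {
    private final Charset charset;

    CodecSketch(String encoding) {
        this.charset = Charset.forName(encoding);
    }

    /** A fresh decoder per call; failing on bad input is an assumed policy. */
    CharsetDecoder newDecoder() {
        return charset.newDecoder()
            .onMalformedInput(CodingErrorAction.REPORT)
            .onUnmappableCharacter(CodingErrorAction.REPORT);
    }

    /** A fresh encoder per call, with the same assumed error policy. */
    CharsetEncoder newEncoder() {
        return charset.newEncoder()
            .onMalformedInput(CodingErrorAction.REPORT)
            .onUnmappableCharacter(CodingErrorAction.REPORT);
    }

    /** Encodes and decodes text to show the two objects working together. */
    String roundTrip(String text) {
        try {
            ByteBuffer bytes = newEncoder().encode(CharBuffer.wrap(text));
            return newDecoder().decode(bytes).toString();
        } catch (CharacterCodingException e) {
            throw new IllegalArgumentException("Text is not representable in " + charset, e);
        }
    }
}
```

Callers that decode from several threads would each request their own decoder instead of sharing one.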
-
getSequenceEncoderType
public EncoderType getSequenceEncoderType()
- Returns:
- The sequence encoder type.
-
getSeparatorAsChar
public char getSeparatorAsChar()
-
builder
public static DictionaryMetadataBuilder builder()
- Returns:
- A shortcut returning a DictionaryMetadataBuilder.
-
getExpectedMetadataFileName
public static java.lang.String getExpectedMetadataFileName(java.lang.String dictionaryFile)
Returns the expected name of the metadata file, based on the name of the dictionary file. The expected name is resolved by truncating any file extension of dictionaryFile and appending METADATA_FILE_EXTENSION.
- Parameters:
- dictionaryFile - The name of the dictionary (*.dict) file.
- Returns:
- The expected name of the metadata file.
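The truncate-and-append rule described above can be sketched without the library. The extension value "info" and the "." separator are assumptions (the actual constant is listed under Constant Field Values), and MetadataNameSketch is a hypothetical name.

```java
// Hypothetical helper, not part of morfologik.
final class MetadataNameSketch {
    // Assumed value; check Constant Field Values for the real one.
    static final String METADATA_FILE_EXTENSION = "info";

    /** Drops the last extension (if any) and appends the metadata extension. */
    static String expectedMetadataFileName(String dictionaryFile) {
        int dot = dictionaryFile.lastIndexOf('.');
        String base = dot >= 0 ? dictionaryFile.substring(0, dot) : dictionaryFile;
        return base + "." + METADATA_FILE_EXTENSION;
    }

    public static void main(String[] args) {
        System.out.println(expectedMetadataFileName("polish.dict"));
    }
}
```

Under these assumptions, a name without any extension simply gains the metadata extension.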
-
getExpectedMetadataLocation
public static java.nio.file.Path getExpectedMetadataLocation(java.nio.file.Path dictionary)
- Parameters:
- dictionary - The location of the dictionary file.
- Returns:
- The expected location of a metadata file.
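A Path-based variant might look like the following sketch, which places the metadata file next to the dictionary. It assumes the same naming rule as getExpectedMetadataFileName and an extension of "info"; both are assumptions, and MetadataLocationSketch is a hypothetical name.

```java
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical helper, not part of morfologik.
final class MetadataLocationSketch {
    /** Resolves a sibling file whose name has the (assumed) "info" extension. */
    static Path expectedMetadataLocation(Path dictionary) {
        // getFileName() is null only for root paths, which cannot be dictionaries.
        String name = dictionary.getFileName().toString();
        int dot = name.lastIndexOf('.');
        String base = dot >= 0 ? name.substring(0, dot) : name;
        return dictionary.resolveSibling(base + ".info");
    }

    public static void main(String[] args) {
        System.out.println(expectedMetadataLocation(Paths.get("dicts", "polish.dict")));
    }
}
```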
-
read
public static DictionaryMetadata read(java.io.InputStream metadataStream) throws java.io.IOException
Read dictionary metadata from a property file (stream).
- Parameters:
- metadataStream - The stream with metadata.
- Returns:
- The DictionaryMetadata read from the stream (property file).
- Throws:
- java.io.IOException - Thrown if an I/O exception occurs.
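Since the metadata is a property file, the reading step can be approximated with java.util.Properties. The sketch below returns a plain map instead of a DictionaryMetadata so that it runs without the library. The UTF-8 decoding and the key names used in main (fsa.dict.separator, fsa.dict.encoding) are assumptions for illustration, and MetadataReadSketch is a hypothetical name.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;
import java.util.Map;
import java.util.Properties;
import java.util.TreeMap;

// Hypothetical helper, not part of morfologik.
final class MetadataReadSketch {
    /** Loads key=value pairs from the stream, assuming UTF-8 content. */
    static Map<String, String> read(InputStream metadataStream) throws IOException {
        Properties properties = new Properties();
        properties.load(new InputStreamReader(metadataStream, StandardCharsets.UTF_8));
        // A sorted map keeps the result deterministic for printing and comparison.
        Map<String, String> attributes = new TreeMap<>();
        for (String key : properties.stringPropertyNames()) {
            attributes.put(key, properties.getProperty(key));
        }
        return attributes;
    }

    public static void main(String[] args) throws IOException {
        // Assumed key names, shown only to illustrate the file shape.
        String file = "fsa.dict.separator=+\nfsa.dict.encoding=UTF-8\n";
        InputStream in = new ByteArrayInputStream(file.getBytes(StandardCharsets.UTF_8));
        System.out.println(read(in));
    }
}
```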
-
write
public void write(java.io.Writer writer) throws java.io.IOException
Write dictionary attributes (metadata).
- Parameters:
- writer - The writer to write to.
- Throws:
- java.io.IOException - Thrown when an I/O error occurs.
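Writing is the mirror image of reading a property file. The sketch below stores a map of attributes through java.util.Properties, which is one plausible way to produce such a file; the library's exact output format, the header comment, and the key name used here are assumptions. MetadataWriteSketch is a hypothetical name.

```java
import java.io.IOException;
import java.io.StringWriter;
import java.io.Writer;
import java.util.Map;
import java.util.Properties;
import java.util.TreeMap;

// Hypothetical helper, not part of morfologik.
final class MetadataWriteSketch {
    /** Writes the attributes in java.util.Properties format. */
    static void write(Map<String, String> attributes, Writer writer) throws IOException {
        Properties properties = new Properties();
        for (Map.Entry<String, String> e : attributes.entrySet()) {
            properties.setProperty(e.getKey(), e.getValue());
        }
        // store() also emits the comment line and a timestamp comment.
        properties.store(writer, "Dictionary metadata (sketch)");
    }

    public static void main(String[] args) throws IOException {
        Map<String, String> attributes = new TreeMap<>();
        attributes.put("fsa.dict.separator", "+"); // assumed key name
        StringWriter out = new StringWriter();
        write(attributes, out);
        System.out.print(out);
    }
}
```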
-
-