Package jflex.core.unicode
Class UnicodeProperties
- java.lang.Object
-
- jflex.core.unicode.UnicodeProperties
-
public class UnicodeProperties extends java.lang.Object
Unicode properties that can be bound to a specific Unicode version.
Supported Unicode versions are defined in UNICODE_VERSIONS.
-
-
Nested Class Summary
Nested Classes
Modifier and Type    Class    Description
static class    UnicodeProperties.UnsupportedUnicodeVersionException
-
Field Summary
Fields
Modifier and Type    Field    Description
private IntCharSet[]    caselessMatches
private java.lang.String    caselessMatchPartitions
private int    caselessMatchPartitionSize
private static java.lang.String    DEFAULT_UNICODE_VERSION
private int    maximumCodePoint
private java.util.Map<java.lang.String,IntCharSet>    propertyValueIntervals
static java.lang.String    UNICODE_VERSIONS    Human-readable list of all supported Unicode versions.
private static java.util.regex.Pattern    WORD_SEP_PATTERN
-
Constructor Summary
Constructors
Constructor    Description
UnicodeProperties()    Unpacks the Unicode data corresponding to the default Unicode version.
UnicodeProperties(java.lang.String version)    Unpacks the Unicode data corresponding to the given version.
-
Method Summary
Methods
Modifier and Type    Method    Description
private void    bind(java.lang.String[] propertyValues, java.lang.String[] intervals, java.lang.String[] propertyValueAliases, int maximumCodePoint, java.lang.String caselessMatchPartitions, int caselessMatchPartitionSize)    Unpacks data for the selected Unicode version, populating propertyValueIntervals.
private void    bindInvariantIntervals()    Adds intervals for \p{ASCII} and \p{Any} to propertyValueIntervals.
IntCharSet    getCaselessMatches(int c)    Returns a set of character intervals representing all characters that are case-insensitively equivalent to the given character, including the given character itself.
IntCharSet    getIntCharSet(java.lang.String propertyValue)    Returns the character interval set associated with the given property value for the selected Unicode version.
int    getMaximumCodePoint()    Returns the maximum code point for the selected Unicode version.
java.util.Set<java.lang.String>    getPropertyValues()    Returns the set of all properties, property values, and their aliases supported by the specified Unicode version.
private void    init(java.lang.String version)    Based on the given version, selects and binds the corresponding Unicode data to facilitate mappings from property values to character intervals.
private void    initCaselessMatches()    Unpacks the caseless match data.
private static java.lang.String    normalize(java.lang.String identifier)    Normalizes the given identifier, by: downcasing; removing whitespace, underscores, hyphens, and parentheses; and substituting '=' for every ':'.
-
-
-
Field Detail
-
UNICODE_VERSIONS
public static final java.lang.String UNICODE_VERSIONS
Human-readable list of all supported Unicode versions.
See Also:
Constant Field Values
-
DEFAULT_UNICODE_VERSION
private static final java.lang.String DEFAULT_UNICODE_VERSION
See Also:
Constant Field Values
-
WORD_SEP_PATTERN
private static final java.util.regex.Pattern WORD_SEP_PATTERN
-
maximumCodePoint
private int maximumCodePoint
-
propertyValueIntervals
private final java.util.Map<java.lang.String,IntCharSet> propertyValueIntervals
-
caselessMatchPartitions
private java.lang.String caselessMatchPartitions
-
caselessMatchPartitionSize
private int caselessMatchPartitionSize
-
caselessMatches
private IntCharSet[] caselessMatches
-
-
Constructor Detail
-
UnicodeProperties
public UnicodeProperties() throws UnicodeProperties.UnsupportedUnicodeVersionException
Unpacks the Unicode data corresponding to the default Unicode version.
Throws:
UnicodeProperties.UnsupportedUnicodeVersionException - if the default version is not supported.
-
UnicodeProperties
public UnicodeProperties(java.lang.String version) throws UnicodeProperties.UnsupportedUnicodeVersionException
Unpacks the Unicode data corresponding to the given version.
Parameters:
version - The Unicode version for which to unpack data.
Throws:
UnicodeProperties.UnsupportedUnicodeVersionException - if the given version is not supported.
-
-
Method Detail
-
getMaximumCodePoint
public int getMaximumCodePoint()
Returns the maximum code point for the selected Unicode version.
Returns:
the maximum code point for the selected Unicode version.
-
getIntCharSet
public IntCharSet getIntCharSet(java.lang.String propertyValue)
Returns the character interval set associated with the given property value for the selected Unicode version.
Parameters:
propertyValue - The Unicode property or property value (or alias for one of these) for which to return the corresponding character intervals.
Returns:
The character interval set corresponding to the given property value, if a match exists, and null otherwise.
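Because the method returns null for an unknown name, callers need a null check before using the result. The following is a minimal, self-contained sketch of that contract, not JFlex code: PropertyLookupSketch, getIntervals, and contains are hypothetical names, the int[][] ranges stand in for IntCharSet, and the lower-casing of the lookup key is an assumption about how names are matched.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

final class PropertyLookupSketch {
  // Hypothetical stand-in for propertyValueIntervals:
  // property name -> inclusive [start, end] code point ranges.
  private final Map<String, int[][]> intervals = new HashMap<>();

  PropertyLookupSketch() {
    // Only ASCII is registered here, so every other name is a miss.
    intervals.put("ascii", new int[][] {{0x00, 0x7F}});
  }

  // Returns the ranges for the property, or null when the name is unknown.
  // Assumption: keys are compared case-insensitively.
  int[][] getIntervals(String propertyValue) {
    return intervals.get(propertyValue.toLowerCase(Locale.ENGLISH));
  }

  // Caller-side pattern: check for null before using the result.
  boolean contains(String propertyValue, int codePoint) {
    int[][] ranges = getIntervals(propertyValue);
    if (ranges == null) {
      return false; // unknown property: nothing can match
    }
    for (int[] range : ranges) {
      if (codePoint >= range[0] && codePoint <= range[1]) {
        return true;
      }
    }
    return false;
  }
}
```

With this stand-in, contains("ASCII", 'A') is true, while a lookup of an unregistered name returns null from getIntervals and false from contains.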
-
getPropertyValues
public java.util.Set<java.lang.String> getPropertyValues()
Returns the set of all properties, property values, and their aliases supported by the specified Unicode version.
Returns:
The set of all properties supported by the specified Unicode version.
-
getCaselessMatches
public IntCharSet getCaselessMatches(int c)
Returns a set of character intervals representing all characters that are case-insensitively equivalent to the given character, including the given character itself.
The first call to this method lazily initializes the backing data.
Parameters:
c - The character for which to return case-insensitive equivalents.
Returns:
All case-insensitively equivalent characters, or null if the given character is case-insensitively equivalent only to itself.
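The two behaviours described above (lazy initialization on first call, and a null result for characters that match only themselves) can be illustrated with a small self-contained sketch. It is not the JFlex implementation: CaselessMatchSketch, initTable, and initCount are hypothetical names, a Set<Integer> stands in for IntCharSet, and the table is built from java.lang.Character case mappings over ASCII only rather than from the packed partition data.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

final class CaselessMatchSketch {
  // Hypothetical backing table; stays null until the first lookup.
  private Map<Integer, Set<Integer>> table;
  // Counts how often the table was built, to make the laziness observable.
  private int initCount = 0;

  // Returns all case variants of c (including c), or null if c has none.
  Set<Integer> getCaselessMatches(int c) {
    if (table == null) {
      initTable(); // built once, on first use
    }
    return table.get(c);
  }

  // Assumption: simple upper/lower mappings over ASCII approximate the data.
  private void initTable() {
    initCount++;
    table = new HashMap<>();
    for (int cp = 0; cp <= 0x7F; cp++) {
      Set<Integer> variants = new TreeSet<>();
      variants.add(cp);
      variants.add(Character.toLowerCase(cp));
      variants.add(Character.toUpperCase(cp));
      if (variants.size() > 1) {
        table.put(cp, variants); // only store characters with real variants
      }
    }
  }

  int initCount() {
    return initCount;
  }
}
```

In this sketch a lookup of 'a' yields the set {65, 97}, a lookup of '1' yields null, and the table is built exactly once regardless of how many lookups follow.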
-
initCaselessMatches
private void initCaselessMatches()
Unpacks the caseless match data. Called from getCaselessMatches(int) to lazily initialize.
-
init
private void init(java.lang.String version) throws UnicodeProperties.UnsupportedUnicodeVersionException
Based on the given version, selects and binds the corresponding Unicode data to facilitate mappings from property values to character intervals.
Parameters:
version - The Unicode version for which to bind data.
Throws:
UnicodeProperties.UnsupportedUnicodeVersionException - if the given version is not supported.
-
bind
private void bind(java.lang.String[] propertyValues, java.lang.String[] intervals, java.lang.String[] propertyValueAliases, int maximumCodePoint, java.lang.String caselessMatchPartitions, int caselessMatchPartitionSize)
Unpacks data for the selected Unicode version, populating propertyValueIntervals.
Parameters:
propertyValues - The list of property values, in same order as the packed data corresponding to them, in the given intervals, for the selected Unicode version.
intervals - The packed character intervals corresponding to and in the same order as the given propertyValues, for the selected Unicode version.
propertyValueAliases - Key/value pairs mapping property value aliases to property values, for the selected Unicode version.
maximumCodePoint - The maximum code point for the selected Unicode version.
caselessMatchPartitions - The packed caseless match partition data for the selected Unicode version.
caselessMatchPartitionSize - The partition data record length (the maximum number of elements in a caseless match partition) for the selected Unicode version.
-
bindInvariantIntervals
private void bindInvariantIntervals()
Adds intervals for \p{ASCII} and \p{Any} to propertyValueIntervals.
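As a rough illustration of what these two invariant properties cover, the sketch below builds them as inclusive code point ranges. It is self-contained and not JFlex code: InvariantIntervalsSketch and invariantIntervals are hypothetical names, int[] pairs stand in for IntCharSet, and it assumes that \p{Any} spans every code point from 0 up to the maximum code point of the selected Unicode version.

```java
import java.util.LinkedHashMap;
import java.util.Map;

final class InvariantIntervalsSketch {
  // Returns inclusive [start, end] ranges keyed by lower-cased property name.
  static Map<String, int[]> invariantIntervals(int maximumCodePoint) {
    Map<String, int[]> intervals = new LinkedHashMap<>();
    // ASCII is the fixed range U+0000..U+007F in every Unicode version.
    intervals.put("ascii", new int[] {0x00, 0x7F});
    // Assumption: Any covers everything up to the version's maximum code point.
    intervals.put("any", new int[] {0x00, maximumCodePoint});
    return intervals;
  }
}
```

Under that assumption, the upper bound of the "any" range is whatever getMaximumCodePoint() reports for the bound version, while the "ascii" range never changes.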
-
normalize
private static java.lang.String normalize(java.lang.String identifier)
Normalizes the given identifier, by: downcasing; removing whitespace, underscores, hyphens, and parentheses; and substituting '=' for every ':'.
Parameters:
identifier - The identifier to normalize.
Returns:
The normalized identifier.
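The normalization steps listed above can be sketched in a few lines. This is a stand-alone approximation, not the JFlex source: NormalizeSketch is a hypothetical class, and its WORD_SEP regular expression is an assumption derived from the description (whitespace, underscores, hyphens, parentheses); the actual WORD_SEP_PATTERN may be defined differently.

```java
import java.util.Locale;
import java.util.regex.Pattern;

final class NormalizeSketch {
  // Assumed separator set: whitespace, '_', '-', '(' and ')'.
  private static final Pattern WORD_SEP = Pattern.compile("[\\s_\\-()]");

  static String normalize(String identifier) {
    // Lower-case with a fixed locale so the result does not depend on the JVM default.
    String lower = identifier.toLowerCase(Locale.ENGLISH);
    // Drop all separator characters.
    String stripped = WORD_SEP.matcher(lower).replaceAll("");
    // Use '=' as the single property/value delimiter.
    return stripped.replace(':', '=');
  }

  public static void main(String[] args) {
    System.out.println(normalize("General_Category: Lu")); // generalcategory=lu
    System.out.println(normalize("Line-Break (LB)"));      // linebreaklb
  }
}
```

Normalizing both stored names and lookup keys in this way would let spellings such as "General_Category:Lu" and "general category = lu" resolve to the same entry.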
-
-