Class UnitFormat
- All Implemented Interfaces:
Serializable, Cloneable, javax.measure.format.UnitFormat, Localized
This class combines the API from java.text and the API from javax.measure.format.
In addition to the symbols of the Système international (SI), this class can also handle
some symbols found in Well Known Text (WKT) definitions or in XML files.
Parsing authority codes
If a character sequence given to the parse(CharSequence) method is of the form "EPSG:####",
"urn:ogc:def:uom:EPSG::####" or "http://www.opengis.net/def/uom/EPSG/0/####" (ignoring case
and whitespace around path separators), then "####" is parsed as an integer and forwarded to the
Units.valueOfEPSG(int) method.
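The recognition of the three forms above can be illustrated with a small JDK-only sketch. The class name EpsgCodeSketch and the method parseCode are hypothetical and are not part of Apache SIS. The sketch only extracts the integer code, where the real parser forwards that integer to Units.valueOfEPSG(int). The limit of 9 digits is an assumption made here so that the value always fits in an int.

```java
import java.util.OptionalInt;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Illustrative helper only; not part of Apache SIS. */
final class EpsgCodeSketch {
    /** The three documented prefixes followed by the numeric code (at most 9 digits, an assumption of this sketch). */
    private static final Pattern AUTHORITY_CODE = Pattern.compile(
            "(?:EPSG:|urn:ogc:def:uom:EPSG::|http://www\\.opengis\\.net/def/uom/EPSG/0/)(\\d{1,9})",
            Pattern.CASE_INSENSITIVE);

    /** Returns the EPSG code found in the given text, or an empty value if the text is not an authority code. */
    static OptionalInt parseCode(CharSequence text) {
        // Drop whitespace around the ':' and '/' path separators, then leading and trailing whitespace.
        String normalized = text.toString().replaceAll("\\s*([:/])\\s*", "$1").trim();
        Matcher m = AUTHORITY_CODE.matcher(normalized);
        return m.matches() ? OptionalInt.of(Integer.parseInt(m.group(1))) : OptionalInt.empty();
    }

    public static void main(String[] args) {
        System.out.println(parseCode("EPSG:9001"));
        System.out.println(parseCode("urn:ogc:def:uom:epsg::9102"));
        System.out.println(parseCode("m/s"));       // Not an authority code.
    }
}
```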
Note on netCDF unit symbols
In netCDF files, values of the "unit" attribute are concatenations of an angular unit with an axis direction, as in "degrees_east" or "degrees_north". This class ignores those suffixes and unconditionally
returns Units.DEGREE for all axis directions.
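As a rough illustration of the suffix handling described above, the following JDK-only sketch tests whether a symbol is a degree name with an optional direction suffix. The class NetcdfDegreesSketch is hypothetical and is not how Apache SIS implements the rule. Only "degrees_east" and "degrees_north" come from the text above; the singular form and the "_south" and "_west" suffixes are assumptions of this sketch.

```java
import java.util.regex.Pattern;

/** Illustrative helper only; not part of Apache SIS. */
final class NetcdfDegreesSketch {
    /** "degree" or "degrees", optionally followed by an axis direction suffix. */
    private static final Pattern DEGREES_WITH_DIRECTION = Pattern.compile(
            "degrees?(?:_(?:north|south|east|west))?", Pattern.CASE_INSENSITIVE);

    /** Returns true if the symbol should be treated as plain degrees, whatever the axis direction. */
    static boolean isDegrees(String symbol) {
        return DEGREES_WITH_DIRECTION.matcher(symbol.trim()).matches();
    }

    public static void main(String[] args) {
        System.out.println(isDegrees("degrees_east"));
        System.out.println(isDegrees("degrees_north"));
        System.out.println(isDegrees("metre"));
    }
}
```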
Multi-threading
UnitFormat is generally not thread-safe. If units need to be parsed or formatted in different threads,
each thread should have its own UnitFormat instance.
- Since:
- 0.8
- Version:
- 1.3
Nested Class Summary
Nested Classes (modifier and type, class, description)
- private static final class UnitFormat.Operation: Represents an operation to be applied between two terms parsed by parseTerm(CharSequence, int, int, Operation).
- private static final class: Parse position when text to be parsed is expected to contain nothing else than a unit symbol.
- static enum UnitFormat.Style: Identifies whether unit formatting uses ASCII symbols, Unicode symbols or full localized names.
Nested classes/interfaces inherited from class java.text.Format
Format.Field
Field Summary
Fields (modifier and type, field, description)
- private static final String DEGREES: The unit name for degrees (not necessarily angular), to be handled in a special way.
- (package private) static final UnitFormat INSTANCE: The default instance used by Units.valueOf(String) for parsing units of measurement.
- labelToUnit: Units associated to a given label (in addition to the system-wide UnitRegistry).
- private Locale locale: The locale specified at construction time or modified by setLocale(Locale).
- nameToUnit: Mapping from long localized and unlocalized names to unit instances.
- private static final boolean PARSE_AUTHORITY_CODES: Whether the parsing of authority codes such as "EPSG:9001" is allowed.
- private static final long serialVersionUID: For cross-version compatibility.
- private static final WeakValueHashMap<Locale, Map<String, javax.measure.Unit<?>>> SHARED: Cached values of nameToUnit, to avoid loading the same information many times and to save memory if the user creates many UnitFormat instances.
- private UnitFormat.Style style: Whether this UnitFormat should format long names like "metre" or use unit symbols.
- private ResourceBundle symbolToName: The mapping from unit symbols to long localized names.
- unitToLabel: Symbols or names to use for formatting units in replacement of the default unit symbols or names.
- private static final String UNITY: The unit name for the dimensionless unit.
Constructor Summary
Constructors (modifier, constructor, description)
- private UnitFormat(): Creates the unique INSTANCE.
- UnitFormat(Locale locale): Creates a new format for the given locale.
Method Summary
Methods (modifier and type, method, description)
- clone(): Returns a clone of this unit format.
- private static Object clone: Clones the given map, which can be either a HashMap or the instance returned by Collections.emptyMap().
- private static void copy(Locale locale, ResourceBundle symbolToName, Map<String, javax.measure.Unit<?>> nameToUnit): Copies all entries from the given "symbols to names" mapping to the given "names to units" mapping.
- private static int exponentOperator(CharSequence symbols, int i, int length): Returns 0 or 1 if the '*' character at the given index stands for exponentiation instead of multiplication, or a negative value if the character stands for multiplication.
- private static void finish(ParsePosition pos): Reports that the parsing is finished and no more content should be parsed.
- format(Object unit, StringBuffer toAppendTo, FieldPosition pos): Formats the specified unit in the given buffer.
- format(javax.measure.Unit<?> unit): Formats the given unit.
- format(javax.measure.Unit<?> unit, Appendable toAppendTo): Formats the specified unit.
- private static void formatComponent(Map.Entry<?, ? extends Number> entry, boolean inverse, UnitFormat.Style style, Appendable toAppendTo): Formats a single unit or dimension raised to the given power.
- (package private) static void formatComponents(Map<?, ? extends Number> components, UnitFormat.Style style, Appendable toAppendTo): Creates a new symbol (e.g. "m/s") from the given symbols and factors.
- private static void formatSymbol(Object base, UnitFormat.Style style, Appendable toAppendTo): Appends the symbol for the given base unit or base dimension, or "?" if no symbol was found.
- private javax.measure.Unit<?> fromName: Returns the unit instance for the given long (un)localized name.
- (package private) static ResourceBundle getBundle: Loads the UnitNames resource bundle for the given locale.
- getLocale(): Returns the locale used by this UnitFormat.
- getStyle(): Returns whether unit formatting uses ASCII symbols, Unicode symbols or full localized names.
- private static boolean hasDigit(CharSequence symbol, int lower, int upper): Returns true if the given character sequence contains at least one digit.
- private static boolean isDecimalSeparator(CharSequence symbols, int i, int length): Returns true if the '.' character at the given index is surrounded by digits or is at the beginning or the end of the character sequence.
- private static boolean isDigit(int c): Returns true if the given character is a digit in the sense of the UnitFormat parser.
- private static boolean isDivisor(int c): Returns true if the given character is the sign of a division operator.
- boolean isLocaleSensitive(): Returns whether this UnitFormat depends on the Locale given at construction time for performing its tasks.
- private static boolean isSign(int c): Returns true if the given character is the sign of a number according to the UnitFormat parser.
- void label(Unit, String): Attaches a label to the specified unit.
- javax.measure.Unit<?> parse(CharSequence symbols): Parses the given text as an instance of Unit.
- javax.measure.Unit<?> parse(CharSequence symbols, ParsePosition position): Parses a portion of the given text as an instance of Unit.
- private static double parseMultiplicationFactor: Parses a multiplication factor, which may be a single number or a base raised to an exponent.
- parseObject(String source): Parses text from a string to produce a unit.
- parseObject(String source, ParsePosition pos): Parses text from a string to produce a unit, or returns null if the parsing failed.
- private javax.measure.Unit<?> parseTerm(CharSequence symbols, int lower, int upper, UnitFormat.Operation operation): Parses a single unit symbol with its exponent.
- void setLocale(Locale locale): Sets the locale that this UnitFormat will use for long names.
- void setStyle(UnitFormat.Style style): Sets whether unit formatting should use ASCII symbols, Unicode symbols or full localized names.
- private ResourceBundle symbolToName(): Returns the mapping from unit symbols to long localized names.
Methods inherited from class java.text.Format
format, formatToCharacterIterator
Field Details
-
serialVersionUID
private static final long serialVersionUID
For cross-version compatibility.
-
PARSE_AUTHORITY_CODES
private static final boolean PARSE_AUTHORITY_CODES
Whether the parsing of authority codes such as "EPSG:9001" is allowed.
-
DEGREES
The unit name for degrees (not necessarily angular), to be handled in a special way. Must contain only ASCII lower case letters ([a … z]).
-
UNITY
The unit name for the dimensionless unit.
-
INSTANCE
The default instance used by Units.valueOf(String) for parsing units of measurement. While UnitFormat is generally not thread-safe, this particular instance is safe if we never invoke any setter method and we do not format with UnitFormat.Style.NAME.
-
locale
The locale specified at construction time or modified by setLocale(Locale).
-
style
Whether this UnitFormat should format long names like "metre" or use unit symbols.
-
unitToLabel
Symbols or names to use for formatting units in replacement of the default unit symbols or names. The Unit instances are the ones specified by the user in calls to label(Unit, String).
-
labelToUnit
Units associated to a given label (in addition to the system-wide UnitRegistry). This map is the converse of unitToLabel. The Unit instances may differ from the ones specified by the user since AbstractUnit.symbol may have been set to the label specified by the user. The labels may contain some characters normally not allowed in unit symbols, like white spaces.
-
symbolToName
The mapping from unit symbols to long localized names. Those resources are locale-dependent and loaded when first needed.
-
nameToUnit
Mapping from long localized and unlocalized names to unit instances. This map is used only for parsing and created when first needed.
-
SHARED
Cached values of nameToUnit, to avoid loading the same information many times and to save memory if the user creates many UnitFormat instances. Note that we do not cache symbolToName because ResourceBundle already provides its own caching mechanism.
-
-
Constructor Details
-
UnitFormat
private UnitFormat()
Creates the unique INSTANCE.
-
UnitFormat
Creates a new format for the given locale.
- Parameters:
- locale - the locale to use for parsing and formatting units.
-
-
Method Details
-
getLocale
Returns the locale used by this UnitFormat.
-
setLocale
Sets the locale that this UnitFormat will use for long names. For example, a call to setLocale(Locale.US) instructs this formatter to use the “meter” spelling instead of “metre”.
- Parameters:
- locale - the new locale for this UnitFormat.
-
isLocaleSensitive
public boolean isLocaleSensitive()
Returns whether this UnitFormat depends on the Locale given at construction time for performing its tasks. This method returns true if formatting long names (e.g. “metre” or “meter”) and false if formatting only the unit symbol (e.g. “m”).
- Specified by:
- isLocaleSensitive in interface javax.measure.format.UnitFormat
- Returns:
- true if formatting depends on the locale.
-
getStyle
Returns whether unit formatting uses ASCII symbols, Unicode symbols or full localized names.
- Returns:
- the style of units formatted by this UnitFormat instance.
-
setStyle
Sets whether unit formatting should use ASCII symbols, Unicode symbols or full localized names.
- Parameters:
- style - the desired style of units.
-
label
Attaches a label to the specified unit. A label can be a substitute for either the unit symbol or the unit name, depending on the format style. If the specified label is already associated to another unit, then the previous association is discarded.
Restriction on character set
Current implementation accepts only letters, subscripts, spaces (including non-breaking spaces but not CR/LF characters), the degree sign (°) and a few other characters like underscore. The set of legal characters may be expanded in future Apache SIS versions, but the following restrictions are likely to remain:
- The following characters are reserved since they have special meaning in UCUM format, in URI or in Apache SIS parser: " # ( ) * + - . / : = ? [ ] { } ^ ⋅ ∕
- The symbol cannot begin or end with digits, since such digits would be confused with unit power.
- Specified by:
- label in interface javax.measure.format.UnitFormat
- Parameters:
- unit - the unit being labeled.
- label - the new label for the given unit.
- Throws:
- IllegalArgumentException - if the given label is not a valid unit name.
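The two restrictions that are likely to remain can be checked with a few lines of JDK-only code. This is a sketch under stated assumptions: the class LabelCheckSketch is hypothetical, it tests only the reserved characters and the leading or trailing digits, and it does not reproduce the full set of characters accepted by the current implementation. Treating "digit" as an ASCII digit and rejecting the empty string are assumptions of this sketch.

```java
/** Illustrative helper only; not part of Apache SIS. */
final class LabelCheckSketch {
    /** The reserved characters listed above; the last two are the dot operator and the division slash. */
    private static final String RESERVED = "\"#()*+-./:=?[]{}^\u22C5\u2215";

    /** Returns true if the label uses no reserved character and neither begins nor ends with a digit. */
    static boolean satisfiesRestrictions(String label) {
        if (label.isEmpty()) {
            return false;                           // Assumption: an empty label is never acceptable.
        }
        if (isAsciiDigit(label.charAt(0)) || isAsciiDigit(label.charAt(label.length() - 1))) {
            return false;                           // Would be confused with a unit power.
        }
        for (int i = 0; i < label.length(); i++) {
            if (RESERVED.indexOf(label.charAt(i)) >= 0) {
                return false;
            }
        }
        return true;
    }

    private static boolean isAsciiDigit(char c) {
        return c >= '0' && c <= '9';
    }

    public static void main(String[] args) {
        System.out.println(satisfiesRestrictions("survey_ft"));
        System.out.println(satisfiesRestrictions("m/s"));
        System.out.println(satisfiesRestrictions("m2"));
    }
}
```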
-
getBundle
Loads the UnitNames resource bundle for the given locale.
-
symbolToName
Returns the mapping from unit symbols to long localized names. This mapping is loaded when first needed and memorized as long as the locale does not change.
-
fromName
Returns the unit instance for the given long (un)localized name. This method is somewhat the converse of symbolToName(), but also recognizes international and American spellings of unit names in addition to localized names. The intent is to recognize "meter" as well as "metre".
While we said that UnitFormat is not thread-safe, we make an exception for this method to allow the singleton INSTANCE to parse symbols in a multi-threaded environment.
- Parameters:
- uom - the unit symbol, without leading or trailing spaces.
- Returns:
- the unit for the given name, or null if unknown.
-
copy
private static void copy(Locale locale, ResourceBundle symbolToName, Map<String, javax.measure.Unit<?>> nameToUnit)
Copies all entries from the given "symbols to names" mapping to the given "names to units" mapping. During this copy, keys are converted from symbols to names and values are converted from symbols to Unit instances. We use Unit values instead of their symbols because all Unit instances are created at Units class initialization anyway (so we do not create new instances here), and it avoids retaining references to the String instances loaded by the resource bundle.
format
Formats the specified unit. This method performs the first of the following actions that can be done.
- If a label has been specified for the given unit, then that label is appended unconditionally.
- Otherwise if the formatting style is UnitFormat.Style.NAME and the Unit.getName() method returns a non-null value, then that value is appended. Unit instances implemented by Apache SIS are handled in a special way for localizing the name according to the locale specified to this format.
- Otherwise if the Unit.getSymbol() method returns a non-null value, then that value is appended.
- Otherwise a default symbol is created from the entries returned by Unit.getBaseUnits().
- Specified by:
- format in interface javax.measure.format.UnitFormat
- Parameters:
- unit - the unit to format.
- toAppendTo - where to format the unit.
- Returns:
- the given toAppendTo argument, for method calls chaining.
- Throws:
- IOException - if an error occurred while writing to the destination.
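The order of the four actions above can be sketched with JDK classes only. Everything in this block is hypothetical: SimpleUnit is a reduced stand-in for javax.measure.Unit, the symbolFromBaseUnits field stands for the symbol that would be derived from Unit.getBaseUnits(), and the boolean nameStyle stands for the UnitFormat.Style.NAME check. Localization of names is not covered.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.HashMap;
import java.util.Map;

/** Illustrative helper only; not part of Apache SIS. */
final class FormatOrderSketch {
    /** Hypothetical stand-in for javax.measure.Unit, reduced to what this sketch needs. */
    static final class SimpleUnit {
        final String name, symbol, symbolFromBaseUnits;

        SimpleUnit(String name, String symbol, String symbolFromBaseUnits) {
            this.name = name;
            this.symbol = symbol;
            this.symbolFromBaseUnits = symbolFromBaseUnits;
        }
    }

    /** Appends the first available representation: label, then name (NAME style only), then symbol, then default. */
    static Appendable format(SimpleUnit unit, Map<SimpleUnit, String> labels,
                             boolean nameStyle, Appendable toAppendTo) throws IOException {
        String label = labels.get(unit);
        if (label != null) {
            return toAppendTo.append(label);        // 1. A user label always wins.
        }
        if (nameStyle && unit.name != null) {
            return toAppendTo.append(unit.name);    // 2. Long name, only in NAME style.
        }
        if (unit.symbol != null) {
            return toAppendTo.append(unit.symbol);  // 3. Unit symbol.
        }
        return toAppendTo.append(unit.symbolFromBaseUnits);     // 4. Symbol built from base units.
    }

    /** Convenience variant returning a string, in the spirit of format(Unit) delegating to format(Unit, Appendable). */
    static String format(SimpleUnit unit, Map<SimpleUnit, String> labels, boolean nameStyle) {
        try {
            return format(unit, labels, nameStyle, new StringBuilder()).toString();
        } catch (IOException e) {
            throw new UncheckedIOException(e);      // Not expected when appending to a StringBuilder.
        }
    }

    public static void main(String[] args) {
        SimpleUnit metre = new SimpleUnit("metre", "m", "m");
        Map<SimpleUnit, String> labels = new HashMap<>();
        System.out.println(format(metre, labels, true));
        System.out.println(format(metre, labels, false));
    }
}
```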
-
formatComponents
static void formatComponents(Map<?, ? extends Number> components, UnitFormat.Style style, Appendable toAppendTo) throws IOException
Creates a new symbol (e.g. "m/s") from the given symbols and factors. Keys in the given map can be either Unit or Dimension instances. Values in the given map are either Integer or Fraction instances.
- Parameters:
- components - the components of the symbol to format.
- style - whether to allow Unicode characters.
- toAppendTo - where to write the symbol.
- Throws:
- IOException
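To make the idea of building a symbol such as "m/s" from symbols and powers more concrete, here is a simplified JDK-only sketch. It is not the Apache SIS algorithm: the class ComponentsSketch is hypothetical, keys are plain strings instead of Unit or Dimension instances, powers are integers only (no Fraction), and the choice of ASCII '.' and '/' operators with the exponent written directly after the symbol is an assumption of this sketch.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Illustrative helper only; not part of Apache SIS. */
final class ComponentsSketch {
    /** Formats terms with positive powers first, then terms with negative powers after a '/' sign. */
    static String format(Map<String, Integer> components) {
        StringBuilder sb = new StringBuilder();
        for (Map.Entry<String, Integer> e : components.entrySet()) {
            if (e.getValue() > 0) {
                if (sb.length() != 0) sb.append('.');
                append(sb, e.getKey(), e.getValue());
            }
        }
        if (sb.length() == 0) {
            sb.append('1');                         // No numerator, as in "1/s".
        }
        for (Map.Entry<String, Integer> e : components.entrySet()) {
            if (e.getValue() < 0) {
                sb.append('/');
                append(sb, e.getKey(), -e.getValue());      // Power sign inverted after the division.
            }
        }
        return sb.toString();
    }

    /** Appends the symbol, followed by its power if different from 1. */
    private static void append(StringBuilder sb, String symbol, int power) {
        sb.append(symbol);
        if (power != 1) sb.append(power);
    }

    public static void main(String[] args) {
        Map<String, Integer> speed = new LinkedHashMap<>();
        speed.put("m", 1);
        speed.put("s", -1);
        System.out.println(format(speed));
    }
}
```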
-
formatComponent
private static void formatComponent(Map.Entry<?, ? extends Number> entry, boolean inverse, UnitFormat.Style style, Appendable toAppendTo) throws IOException
Formats a single unit or dimension raised to the given power.
- Parameters:
- entry - the base unit or base dimension to format, together with its power.
- inverse - true for inverting the power sign.
- style - whether to allow Unicode characters.
- Throws:
- IOException
-
formatSymbol
private static void formatSymbol(Object base, UnitFormat.Style style, Appendable toAppendTo) throws IOException
Appends the symbol for the given base unit or base dimension, or "?" if no symbol was found. If the given object is a unit, then it should be an instance of SystemUnit.
- Parameters:
- base - the base unit or base dimension to format.
- style - whether to allow Unicode characters.
- toAppendTo - where to append the symbol.
- Throws:
- IOException
-
format
Formats the specified unit in the given buffer. This method delegates to format(Unit, Appendable).
-
format
Formats the given unit. This method delegates to format(Unit, Appendable).
- Specified by:
- format in interface javax.measure.format.UnitFormat
- Parameters:
- unit - the unit to format.
- Returns:
- the formatted unit.
-
exponentOperator
Returns 0 or 1 if the '*' character at the given index stands for exponentiation instead of multiplication, or a negative value if the character stands for multiplication. This check is used for heuristic rules at parsing time. Current implementation applies the following rules:
- The operation is presumed an exponentiation if the '*' symbol is doubled, as in "m**s-1".
- The operation is presumed an exponentiation if it is surrounded by digits or a sign on its right side. Example: "10*-6", which means 1E-6 in UCUM syntax.
- All other cases are currently presumed multiplication. Example: "m*s".
- Returns:
- -1 for parsing as a multiplication, or a non-negative value for exponentiation. In the latter case, this is the number of characters in the exponent symbol minus 1.
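The rules above can be sketched as follows with JDK classes only. This is one possible reading, not the Apache SIS source: the class ExponentOperatorSketch is hypothetical, the second rule is interpreted here as "a digit on the left, and a digit or a sign on the right", and digits are limited to ASCII. The caller is assumed to pass the index of a '*' character.

```java
/** Illustrative helper only; not part of Apache SIS. */
final class ExponentOperatorSketch {
    /**
     * Returns 1 for "**", 0 for a single '*' presumed to be an exponentiation,
     * or -1 if the '*' at index i is presumed to be a multiplication.
     */
    static int exponentOperator(CharSequence symbols, int i) {
        int next = i + 1;
        if (next >= symbols.length()) {
            return -1;                              // Nothing after '*': cannot be an exponent.
        }
        char after = symbols.charAt(next);
        if (after == '*') {
            return 1;                               // "**": one extra character to skip.
        }
        boolean digitBefore = (i > 0) && isDigit(symbols.charAt(i - 1));
        if (digitBefore && (isDigit(after) || after == '+' || after == '-')) {
            return 0;                               // As in "10*-6": no extra character to skip.
        }
        return -1;                                  // As in "m*s".
    }

    private static boolean isDigit(char c) {
        return c >= '0' && c <= '9';
    }

    public static void main(String[] args) {
        System.out.println(exponentOperator("m**s-1", 1));
        System.out.println(exponentOperator("10*-6", 2));
        System.out.println(exponentOperator("m*s", 1));
    }
}
```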
-
isDecimalSeparator
Returns true if the '.' character at the given index is surrounded by digits or is at the beginning or the end of the character sequence. This check is used for heuristic rules.
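A minimal JDK-only sketch of this heuristic is given below. It assumes one particular reading of the sentence above: each side of the '.' must be either a digit or an edge of the character sequence. The class DecimalSeparatorSketch is hypothetical, digits are limited to ASCII, and the actual Apache SIS rule may differ in corner cases.

```java
/** Illustrative helper only; not part of Apache SIS. */
final class DecimalSeparatorSketch {
    /** Returns true if the '.' at index i looks like a decimal separator rather than a product operator. */
    static boolean isDecimalSeparator(CharSequence symbols, int i) {
        boolean before = (i == 0) || isDigit(symbols.charAt(i - 1));
        boolean after  = (i + 1 >= symbols.length()) || isDigit(symbols.charAt(i + 1));
        return before && after;
    }

    private static boolean isDigit(char c) {
        return c >= '0' && c <= '9';
    }

    public static void main(String[] args) {
        System.out.println(isDecimalSeparator("1.5", 1));   // Decimal number.
        System.out.println(isDecimalSeparator("m.s", 1));   // Product of two units.
    }
}
```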
-
isDigit
private static boolean isDigit(int c)
Returns true if the given character is a digit in the sense of the UnitFormat parser. Note that "digit" is taken here in a much more restrictive way than Character.isDigit(int).
A return value of true guarantees that the given character is in the Basic Multilingual Plane (BMP). Consequently, the c argument value does not need to be the result of String.codePointAt(int); the result of String.charAt(int) is sufficient. We nevertheless use the int type to avoid the need for a cast if the caller uses code points for another reason.
-
isSign
private static boolean isSign(int c)
Returns true if the given character is the sign of a number according to the UnitFormat parser. A return value of true guarantees that the given character is in the Basic Multilingual Plane (BMP). Consequently, the c argument value does not need to be the result of String.codePointAt(int).
-
isDivisor
private static boolean isDivisor(int c)
Returns true if the given character is the sign of a division operator. A return value of true guarantees that the given character is in the Basic Multilingual Plane (BMP). Consequently, the c argument value does not need to be the result of String.codePointAt(int).
-
hasDigit
Returns true if the given character sequence contains at least one digit. This is a hack to allow recognizing units like "100 feet" (in principle not legal, but seen in practice). This verification has some value if digits are not allowed as unit label or symbol.
-
finish
Reports that the parsing is finished and no more content should be parsed. This method is invoked when the last parsed term is possibly one or more words instead of unit symbols. The intent is to avoid trying to parse "degree minute" as "degree × minute". By contrast, this method is not invoked if the string to parse is "m kg**-2" because it can be interpreted as "m × kg**-2".
-
parse
public javax.measure.Unit<?> parse(CharSequence symbols) throws javax.measure.format.ParserException
Parses the given text as an instance of Unit. If the parse completes without reading the entire length of the text, an exception is thrown.
The parsing is lenient: symbols can be products or quotients of units like “m∕s”, words like “meters per second”, or authority codes like "urn:ogc:def:uom:EPSG::1026". The product operator can be either the '.' (ASCII) or the '⋅' (Unicode) character. The exponent after a symbol can be decimal digits as in “m2” or a superscript as in “m²”.
This method differs from parse(CharSequence, ParsePosition) in the treatment of white spaces: that method with a ParsePosition argument stops parsing at the first white space, while this parse(…) method treats white spaces as multiplications. The reason for this difference is that white space is normally not a valid multiplication symbol; it could be followed by a text which is not part of the unit symbol. But in the case of this parse(CharSequence) method, the whole CharSequence shall be a unit symbol. In such case, white spaces are less ambiguous.
The default implementation delegates to parse(symbols, new ParsePosition(0)) and verifies that all non-white characters have been parsed. Units separated by spaces are multiplied; for example "kg m**-2" is parsed as kg/m².
- Specified by:
- parse in interface javax.measure.format.UnitFormat
- Parameters:
- symbols - the unit symbols or URI to parse.
- Returns:
- the unit parsed from the specified symbols.
- Throws:
- javax.measure.format.ParserException - if a problem occurred while parsing the given symbols.
-
parse
public javax.measure.Unit<?> parse(CharSequence symbols, ParsePosition position) throws javax.measure.format.ParserException
Parses a portion of the given text as an instance of Unit. Parsing begins at the index given by ParsePosition.getIndex(). After parsing, the above-cited index is updated to the first unparsed character.
The parsing is lenient: symbols can be products or quotients of units like “m∕s”, words like “meters per second”, or authority codes like "urn:ogc:def:uom:EPSG::1026". The product operator can be either the '.' (ASCII) or the '⋅' (Unicode) character. The exponent after a symbol can be decimal digits as in “m2” or a superscript as in “m²”.
Note that contrary to parseObject(String, ParsePosition), this method never returns null. If an error occurs at parsing time, an unchecked ParserException is thrown.
- Parameters:
- symbols - the unit symbols to parse.
- position - on input, index of the first character to parse. On output, index after the last parsed character.
- Returns:
- the unit parsed from the specified symbols.
- Throws:
- javax.measure.format.ParserException - if a problem occurred while parsing the given symbols.
-
parseTerm
private javax.measure.Unit<?> parseTerm(CharSequence symbols, int lower, int upper, UnitFormat.Operation operation) throws javax.measure.format.ParserException
Parses a single unit symbol with its exponent. The given symbol shall not contain a multiplication or division operator except in the exponent. Parsing of a fractional exponent as in "m2/3" is supported; other operations in the exponent will cause an exception to be thrown.
- Parameters:
- symbols - the complete string specified by the user.
- lower - index where to begin parsing in the symbols string.
- upper - index after the last character to parse in the symbols string.
- operation - the operation to be applied (e.g. the term to be parsed is a multiplier or divisor of another unit).
- Returns:
- the parsed unit symbol (never null).
- Throws:
- javax.measure.format.ParserException - if a problem occurred while parsing the given symbols.
-
parseMultiplicationFactor
Parses a multiplication factor, which may be a single number or a base raised to an exponent. For example, all the following strings are equivalent: "1000", "1000.0", "1E3", "10*3", "10^3", "10³".
- Throws:
- NumberFormatException
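The equivalence of the notations listed above can be checked with a short JDK-only sketch. It is not the Apache SIS implementation: the class MultiplicationFactorSketch is hypothetical, exponents are assumed to be integers, and the support of "**" and of the superscript minus sign is an assumption added for this sketch. Malformed input results in a NumberFormatException from the JDK parsing methods.

```java
/** Illustrative helper only; not part of Apache SIS. */
final class MultiplicationFactorSketch {
    /** Superscript digits 0 to 9, in order. */
    private static final String SUPERSCRIPTS =
            "\u2070\u00B9\u00B2\u00B3\u2074\u2075\u2076\u2077\u2078\u2079";

    /** Converts a superscript digit or minus sign to ASCII, or returns 0 if the character is not a superscript. */
    private static char toAscii(char c) {
        if (c == '\u207B') return '-';
        int digit = SUPERSCRIPTS.indexOf(c);
        return (digit >= 0) ? (char) ('0' + digit) : 0;
    }

    /** Parses a plain number, or a base raised to an integer exponent written with '*', '**', '^' or superscripts. */
    static double parse(String term) {
        int op = term.indexOf('*');
        if (op < 0) op = term.indexOf('^');
        if (op >= 0) {
            int exponentStart = term.startsWith("**", op) ? op + 2 : op + 1;
            return Math.pow(Double.parseDouble(term.substring(0, op)),
                            Integer.parseInt(term.substring(exponentStart)));
        }
        int start = term.length();
        while (start > 0 && toAscii(term.charAt(start - 1)) != 0) {
            start--;                                // Walk back over trailing superscript characters.
        }
        if (start < term.length()) {
            StringBuilder exponent = new StringBuilder();
            for (int i = start; i < term.length(); i++) {
                exponent.append(toAscii(term.charAt(i)));
            }
            return Math.pow(Double.parseDouble(term.substring(0, start)),
                            Integer.parseInt(exponent.toString()));
        }
        return Double.parseDouble(term);            // Handles "1000", "1000.0" and "1E3".
    }

    public static void main(String[] args) {
        for (String s : new String[] {"1000", "1000.0", "1E3", "10*3", "10^3", "10\u00B3"}) {
            System.out.println(s + " = " + parse(s));
        }
    }
}
```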
-
parseObject
Parses text from a string to produce a unit. The default implementation delegates to parse(CharSequence) and wraps the ParserException into a ParseException for compatibility with the java.text API.
- Overrides:
- parseObject in class Format
- Parameters:
- source - the text, part of which should be parsed.
- Returns:
- a unit parsed from the string.
- Throws:
- ParseException - if the given string cannot be fully parsed.
-
parseObject
Parses text from a string to produce a unit, or returns null if the parsing failed. The default implementation delegates to parse(CharSequence, ParsePosition) and catches the ParserException.
- Specified by:
- parseObject in class Format
- Parameters:
- source - the text, part of which should be parsed.
- pos - index and error index information as described above.
- Returns:
- a unit parsed from the string, or null in case of error.
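The relationship between the two parseObject methods and the stricter parse methods can be sketched with a toy java.text.Format subclass. All names below are hypothetical: SketchParserException stands in for javax.measure.format.ParserException, and the toy parser accepts only the single symbol "m" and returns it as a String instead of a Unit. The sketch shows only the exception handling pattern, not the actual Apache SIS code.

```java
import java.text.FieldPosition;
import java.text.Format;
import java.text.ParseException;
import java.text.ParsePosition;

/** Illustrative format only; not part of Apache SIS. */
final class LenientFormatSketch extends Format {
    /** Hypothetical unchecked exception standing in for ParserException. */
    static final class SketchParserException extends IllegalArgumentException {
        final int position;

        SketchParserException(String message, int position) {
            super(message);
            this.position = position;
        }
    }

    /** Toy strict parser: accepts only "m" at the current index and never returns null. */
    String parse(CharSequence text, ParsePosition pos) {
        int i = pos.getIndex();
        if (i < text.length() && text.charAt(i) == 'm') {
            pos.setIndex(i + 1);
            return "m";
        }
        throw new SketchParserException("Unknown unit symbol.", i);
    }

    /** java.text contract: report the failure through the error index and return null. */
    @Override
    public Object parseObject(String source, ParsePosition pos) {
        try {
            return parse(source, pos);
        } catch (SketchParserException e) {
            pos.setErrorIndex(e.position);
            return null;
        }
    }

    /** java.text contract: wrap the unchecked exception into a checked ParseException. */
    @Override
    public Object parseObject(String source) throws ParseException {
        ParsePosition pos = new ParsePosition(0);
        try {
            Object unit = parse(source, pos);
            if (pos.getIndex() != source.length()) {
                throw new SketchParserException("Unparsable trailing text.", pos.getIndex());
            }
            return unit;
        } catch (SketchParserException e) {
            ParseException wrapped = new ParseException(e.getMessage(), e.position);
            wrapped.initCause(e);
            throw wrapped;
        }
    }

    @Override
    public StringBuffer format(Object obj, StringBuffer toAppendTo, FieldPosition pos) {
        return toAppendTo.append(obj);
    }
}
```

With this pattern, callers that prefer the java.text conventions test for null or catch ParseException, while callers of the parse methods get an unchecked exception and never see a null value.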
-
clone
-
clone
Clones the given map, which can be either a HashMap or the instance returned by Collections.emptyMap().
-