Package com.univocity.parsers.common
Class NormalizedString
- java.lang.Object
-
- com.univocity.parsers.common.NormalizedString
-
- All Implemented Interfaces:
java.io.Serializable, java.lang.CharSequence, java.lang.Comparable<NormalizedString>
public final class NormalizedString extends java.lang.Object implements java.io.Serializable, java.lang.Comparable<NormalizedString>, java.lang.CharSequence
A NormalizedString represents text in a normalized fashion: strings that differ only in character case or surrounding whitespace are considered the same. It is used to represent groups of fields, where users may refer to field names using different character case or surrounding whitespace. Where the character case or the surrounding whitespace is relevant, the NormalizedString will have its isLiteral() method return true, meaning the exact character case and surrounding whitespace are required for matching it. Invoking valueOf(String) with a String surrounded by single quotes will create a literal NormalizedString. Use literalValueOf(String) to obtain the same NormalizedString without having to introduce single quotes.
- See Also:
- Serialized Form
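The matching rules described above can be illustrated with a small, self-contained sketch. NormalizedKeySketch is a hypothetical stand-in written for this page, not the library class; it assumes that normalization means trimming and lower-casing, and that a literal on either side of a comparison forces an exact match.

```java
import java.util.Locale;

/** Hypothetical stand-in that models the described matching rules; not the library class. */
final class NormalizedKeySketch {
    private final String original;
    private final String normalized;
    private final boolean literal;

    private NormalizedKeySketch(String original, boolean literal) {
        this.original = original;
        this.literal = literal;
        // Assumption: normalization trims surrounding whitespace and lower-cases the text.
        this.normalized = literal ? original : original.trim().toLowerCase(Locale.ROOT);
    }

    /** Models valueOf(String): input enclosed in single quotes yields a literal. */
    static NormalizedKeySketch valueOf(String s) {
        if (s.length() > 2 && s.charAt(0) == '\'' && s.charAt(s.length() - 1) == '\'') {
            return new NormalizedKeySketch(s.substring(1, s.length() - 1), true);
        }
        return new NormalizedKeySketch(s, false);
    }

    /** Models literalValueOf(String): a literal without the need for quotes. */
    static NormalizedKeySketch literalValueOf(String s) {
        return new NormalizedKeySketch(s, true);
    }

    boolean isLiteral() {
        return literal;
    }

    boolean matches(NormalizedKeySketch other) {
        // Assumption: if either side is a literal, the exact content is required.
        if (literal || other.literal) {
            return original.equals(other.original);
        }
        return normalized.equals(other.normalized);
    }
}
```

Under these assumptions, valueOf(" Name ") matches valueOf("name"), while valueOf("'Name'") is a literal and only matches content that is exactly Name.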
-
-
Method Summary
Modifier and Type, Method, and Description
- char charAt(int index)
- int compareTo(NormalizedString o)
- int compareTo(java.lang.String o): Compares a NormalizedString against a String lexicographically.
- boolean equals(java.lang.Object anObject)
- static StringCache<NormalizedString> getCache(): Returns the internal string cache to allow users to tweak its size limit or clear it when appropriate.
- int hashCode()
- static boolean identifyLiterals(NormalizedString[] strings): Analyzes a group of NormalizedString to identify any instances whose normalized content will generate clashes.
- static boolean identifyLiterals(NormalizedString[] strings, boolean lowercaseIdentifiers, boolean uppercaseIdentifiers): Analyzes a group of NormalizedString to identify any instances whose normalized content will generate clashes.
- boolean isLiteral()
- int length()
- static NormalizedString literalValueOf(java.lang.String string): Creates a literal NormalizedString, meaning it will only match with other String or NormalizedString if they have the exact same content, including character case and surrounding whitespace.
- java.lang.CharSequence subSequence(int start, int end)
- static java.lang.String[] toArray(NormalizedString... args): Converts multiple normalized strings into an array of String.
- static NormalizedString[] toArray(java.lang.String... args): Converts multiple plain strings into an array of NormalizedString.
- static NormalizedString[] toArray(java.util.Collection<java.lang.String> args): Converts a collection of plain strings into an array of NormalizedString.
- static java.util.ArrayList<NormalizedString> toArrayList(java.lang.String... args): Converts multiple plain strings into an ArrayList of NormalizedString.
- static java.util.ArrayList<NormalizedString> toArrayList(java.util.Collection<java.lang.String> args): Converts multiple plain strings into an ArrayList of NormalizedString.
- static java.util.ArrayList<java.lang.String> toArrayListOfStrings(NormalizedString... args): Converts multiple normalized strings into an ArrayList of String.
- static java.util.ArrayList<java.lang.String> toArrayListOfStrings(java.util.Collection<NormalizedString> args): Converts multiple normalized strings into an ArrayList of String.
- static java.util.HashSet<NormalizedString> toHashSet(java.lang.String... args): Converts multiple plain strings into a HashSet of NormalizedString.
- static java.util.HashSet<NormalizedString> toHashSet(java.util.Collection<java.lang.String> args): Converts multiple plain strings into a HashSet of NormalizedString.
- static java.util.HashSet<java.lang.String> toHashSetOfStrings(NormalizedString... args): Converts multiple normalized strings into a HashSet of String.
- static java.util.HashSet<java.lang.String> toHashSetOfStrings(java.util.Collection<NormalizedString> args): Converts multiple normalized strings into a HashSet of String.
- static NormalizedString[] toIdentifierGroupArray(NormalizedString[] strings): Analyzes a group of NormalizedString to identify any instances whose normalized content will generate clashes.
- static NormalizedString[] toIdentifierGroupArray(java.lang.String[] strings): Analyzes a group of String to identify any instances whose normalized content will generate clashes.
- static java.util.LinkedHashSet<NormalizedString> toLinkedHashSet(java.lang.String... args): Converts multiple plain strings into a LinkedHashSet of NormalizedString.
- static java.util.LinkedHashSet<NormalizedString> toLinkedHashSet(java.util.Collection<java.lang.String> args): Converts multiple plain strings into a LinkedHashSet of NormalizedString.
- static java.util.LinkedHashSet<java.lang.String> toLinkedHashSetOfStrings(NormalizedString... args): Converts multiple normalized strings into a LinkedHashSet of String.
- static java.util.LinkedHashSet<java.lang.String> toLinkedHashSetOfStrings(java.util.Collection<NormalizedString> args): Converts multiple normalized strings into a LinkedHashSet of String.
- NormalizedString toLiteral(): Returns the literal representation of this NormalizedString, meaning it will only match with other String or NormalizedString if they have the exact same content, including character case and surrounding whitespace.
- java.lang.String toString()
- static java.lang.String[] toStringArray(java.util.Collection<NormalizedString> args): Converts a collection of normalized strings into an array of String.
- static java.util.TreeSet<NormalizedString> toTreeSet(java.lang.String... args): Converts multiple plain strings into a TreeSet of NormalizedString.
- static java.util.TreeSet<NormalizedString> toTreeSet(java.util.Collection<java.lang.String> args): Converts multiple plain strings into a TreeSet of NormalizedString.
- static java.util.TreeSet<java.lang.String> toTreeSetOfStrings(NormalizedString... args): Converts multiple normalized strings into a TreeSet of String.
- static java.util.TreeSet<java.lang.String> toTreeSetOfStrings(java.util.Collection<NormalizedString> args): Converts multiple normalized strings into a TreeSet of String.
- static NormalizedString[] toUniqueArray(java.lang.String... args): Converts multiple plain strings into an array of NormalizedString, ensuring no duplicate NormalizedString elements exist, even if their original Strings are different.
- static java.lang.String valueOf(NormalizedString string): Converts a NormalizedString back to its original String representation.
- static NormalizedString valueOf(java.lang.Object o): Creates a non-literal NormalizedString, meaning it will match with other String or NormalizedString regardless of differences in character case and surrounding whitespace.
- static NormalizedString valueOf(java.lang.String string): Creates a non-literal NormalizedString, meaning it will match with other String or NormalizedString regardless of differences in character case and surrounding whitespace.
-
-
-
Method Detail
-
isLiteral
public boolean isLiteral()
-
equals
public boolean equals(java.lang.Object anObject)
- Overrides:
equals in class java.lang.Object
-
hashCode
public int hashCode()
- Overrides:
hashCode in class java.lang.Object
-
length
public int length()
- Specified by:
length in interface java.lang.CharSequence
-
charAt
public char charAt(int index)
- Specified by:
charAt in interface java.lang.CharSequence
-
subSequence
public java.lang.CharSequence subSequence(int start, int end)
- Specified by:
subSequence in interface java.lang.CharSequence
-
compareTo
public int compareTo(NormalizedString o)
- Specified by:
compareTo in interface java.lang.Comparable<NormalizedString>
-
compareTo
public int compareTo(java.lang.String o)
Compares a NormalizedString against a String lexicographically.
- Parameters:
o - a plain String
- Returns:
the result of String.compareTo(String). If this NormalizedString is a literal, the original argument string will be compared. If this NormalizedString is not a literal, the result will be from the comparison of the normalized content of both strings (i.e. surrounding whitespace and character case differences will be ignored).
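The two comparison paths described above can be sketched with plain strings. CompareSketch and its methods are hypothetical names introduced for illustration; the sketch assumes normalization is trimming plus lower-casing, which the page implies but does not spell out.

```java
import java.util.Locale;

/** Hypothetical helper modelling the literal and non-literal comparison paths. */
final class CompareSketch {
    // Assumption: normalized content is the trimmed, lower-cased text.
    static String normalize(String s) {
        return s.trim().toLowerCase(Locale.ROOT);
    }

    /**
     * Compares some content against a plain String.
     * A literal compares the text exactly as given; otherwise both sides are normalized first.
     */
    static int compare(String content, boolean literal, String other) {
        if (literal) {
            return content.compareTo(other);
        }
        return normalize(content).compareTo(normalize(other));
    }
}
```

With these assumptions, compare(" ID ", false, "id") returns 0, while compare("ID", true, "id") is negative because the exact characters differ.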
-
toString
public java.lang.String toString()
- Specified by:
toString in interface java.lang.CharSequence
- Overrides:
toString in class java.lang.Object
-
literalValueOf
public static NormalizedString literalValueOf(java.lang.String string)
Creates a literal NormalizedString, meaning it will only match with other String or NormalizedString if they have the exact same content, including character case and surrounding whitespace.
- Parameters:
string - the input String
- Returns:
the literal NormalizedString version of the given string.
-
valueOf
public static NormalizedString valueOf(java.lang.Object o)
Creates a non-literal NormalizedString, meaning it will match with other String or NormalizedString regardless of differences in character case and surrounding whitespace. If the input value is enclosed in single quotes, a literal NormalizedString will be returned, as described in literalValueOf(String).
- Parameters:
o - the input object whose String representation will be used
- Returns:
the NormalizedString of the given object.
-
valueOf
public static NormalizedString valueOf(java.lang.String string)
Creates a non-literal NormalizedString, meaning it will match with other String or NormalizedString regardless of differences in character case and surrounding whitespace. If the input string is enclosed in single quotes, a literal NormalizedString will be returned, as described in literalValueOf(String).
- Parameters:
string - the input string
- Returns:
the NormalizedString of the given string.
-
valueOf
public static java.lang.String valueOf(NormalizedString string)
Converts a NormalizedString back to its original String representation.
- Parameters:
string - the normalized string
- Returns:
the original string used to create the given normalized representation.
-
toArray
public static NormalizedString[] toArray(java.util.Collection<java.lang.String> args)
Converts a collection of plain strings into an array of NormalizedString.
- Parameters:
args - the strings to convert to NormalizedString
- Returns:
the NormalizedString representations of all input strings.
-
toStringArray
public static java.lang.String[] toStringArray(java.util.Collection<NormalizedString> args)
Converts a collection of normalized strings into an array of String.
- Parameters:
args - the normalized strings to convert back to String
- Returns:
the String representations of all normalized strings.
-
toUniqueArray
public static NormalizedString[] toUniqueArray(java.lang.String... args)
Converts multiple plain strings into an array of NormalizedString, ensuring no duplicate NormalizedString elements exist, even if their original Strings are different.
- Parameters:
args - the strings to convert to NormalizedString
- Returns:
the NormalizedString representations of all input strings.
-
toArray
public static NormalizedString[] toArray(java.lang.String... args)
Converts multiple plain strings into an array of NormalizedString.
- Parameters:
args - the strings to convert to NormalizedString
- Returns:
the NormalizedString representations of all input strings.
-
toArray
public static java.lang.String[] toArray(NormalizedString... args)
Converts multiple normalized strings into an array of String.
- Parameters:
args - the normalized strings to convert to String
- Returns:
the String representations of all input strings.
-
toArrayList
public static java.util.ArrayList<NormalizedString> toArrayList(java.lang.String... args)
Converts multiple plain strings into an ArrayList of NormalizedString.
- Parameters:
args - the strings to convert to NormalizedString
- Returns:
the NormalizedString representations of all input strings.
-
toArrayList
public static java.util.ArrayList<NormalizedString> toArrayList(java.util.Collection<java.lang.String> args)
Converts multiple plain strings into an ArrayList of NormalizedString.
- Parameters:
args - the strings to convert to NormalizedString
- Returns:
the NormalizedString representations of all input strings.
-
toArrayListOfStrings
public static java.util.ArrayList<java.lang.String> toArrayListOfStrings(NormalizedString... args)
Converts multiple normalized strings into an ArrayList of String.
- Parameters:
args - the normalized strings to convert to String
- Returns:
the original Strings of all input normalized strings.
-
toArrayListOfStrings
public static java.util.ArrayList<java.lang.String> toArrayListOfStrings(java.util.Collection<NormalizedString> args)
Converts multiple normalized strings into an ArrayList of String.
- Parameters:
args - the normalized strings to convert to String
- Returns:
the original Strings of all input normalized strings.
-
toTreeSet
public static java.util.TreeSet<NormalizedString> toTreeSet(java.lang.String... args)
Converts multiple plain strings into a TreeSet of NormalizedString.
- Parameters:
args - the strings to convert to NormalizedString
- Returns:
the NormalizedString representations of all input strings.
-
toTreeSet
public static java.util.TreeSet<NormalizedString> toTreeSet(java.util.Collection<java.lang.String> args)
Converts multiple plain strings into a TreeSet of NormalizedString.
- Parameters:
args - the strings to convert to NormalizedString
- Returns:
the NormalizedString representations of all input strings.
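Because a sorted set orders and de-duplicates by comparison, a TreeSet of normalized values treats entries that differ only in case or surrounding whitespace as one element. The sketch below imitates that effect with plain strings and a comparator. NormalizedTreeSetSketch is a hypothetical name, and the trim-plus-lower-case comparator is an assumption standing in for the class's own ordering.

```java
import java.util.Comparator;
import java.util.Locale;
import java.util.TreeSet;

/** Hypothetical helper: a TreeSet of plain strings ordered by their normalized content. */
final class NormalizedTreeSetSketch {
    static TreeSet<String> toNormalizedTreeSet(String... args) {
        // Assumption: ordering ignores surrounding whitespace and character case.
        Comparator<String> byNormalizedContent =
                Comparator.comparing((String s) -> s.trim().toLowerCase(Locale.ROOT));
        TreeSet<String> out = new TreeSet<>(byNormalizedContent);
        for (String arg : args) {
            // An entry that compares equal to an existing one is not added again.
            out.add(arg);
        }
        return out;
    }
}
```

For the inputs " B", "a" and "b ", the resulting set holds two elements, and a lookup for "B" succeeds because the comparator considers it equal to " B".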
-
toTreeSetOfStrings
public static java.util.TreeSet<java.lang.String> toTreeSetOfStrings(NormalizedString... args)
Converts multiple normalized strings into a TreeSet of String.
- Parameters:
args - the normalized strings to convert to String
- Returns:
the original Strings of all input normalized strings.
-
toTreeSetOfStrings
public static java.util.TreeSet<java.lang.String> toTreeSetOfStrings(java.util.Collection<NormalizedString> args)
Converts multiple normalized strings into a TreeSet of String.
- Parameters:
args - the normalized strings to convert to String
- Returns:
the original Strings of all input normalized strings.
-
toHashSet
public static java.util.HashSet<NormalizedString> toHashSet(java.lang.String... args)
Converts multiple plain strings into a HashSet of NormalizedString.
- Parameters:
args - the strings to convert to NormalizedString
- Returns:
the NormalizedString representations of all input strings.
-
toHashSet
public static java.util.HashSet<NormalizedString> toHashSet(java.util.Collection<java.lang.String> args)
Converts multiple plain strings into a HashSet of NormalizedString.
- Parameters:
args - the strings to convert to NormalizedString
- Returns:
the NormalizedString representations of all input strings.
-
toHashSetOfStrings
public static java.util.HashSet<java.lang.String> toHashSetOfStrings(NormalizedString... args)
Converts multiple normalized strings into a HashSet of String.
- Parameters:
args - the normalized strings to convert to String
- Returns:
the original Strings of all input normalized strings.
-
toHashSetOfStrings
public static java.util.HashSet<java.lang.String> toHashSetOfStrings(java.util.Collection<NormalizedString> args)
Converts multiple normalized strings into a HashSet of String.
- Parameters:
args - the normalized strings to convert to String
- Returns:
the original Strings of all input normalized strings.
-
toLinkedHashSet
public static java.util.LinkedHashSet<NormalizedString> toLinkedHashSet(java.lang.String... args)
Converts multiple plain strings into a LinkedHashSet of NormalizedString.
- Parameters:
args - the strings to convert to NormalizedString
- Returns:
the NormalizedString representations of all input strings.
-
toLinkedHashSet
public static java.util.LinkedHashSet<NormalizedString> toLinkedHashSet(java.util.Collection<java.lang.String> args)
Converts multiple plain strings into a LinkedHashSet of NormalizedString.
- Parameters:
args - the strings to convert to NormalizedString
- Returns:
the NormalizedString representations of all input strings.
-
toLinkedHashSetOfStrings
public static java.util.LinkedHashSet<java.lang.String> toLinkedHashSetOfStrings(NormalizedString... args)
Converts multiple normalized strings into a LinkedHashSet of String.
- Parameters:
args - the normalized strings to convert to String
- Returns:
the original Strings of all input normalized strings.
-
toLinkedHashSetOfStrings
public static java.util.LinkedHashSet<java.lang.String> toLinkedHashSetOfStrings(java.util.Collection<NormalizedString> args)
Converts multiple normalized strings into a LinkedHashSet of String.
- Parameters:
args - the normalized strings to convert to String
- Returns:
the original Strings of all input normalized strings.
-
toLiteral
public NormalizedString toLiteral()
Returns the literal representation of this NormalizedString, meaning it will only match with other String or NormalizedString if they have the exact same content, including character case and surrounding whitespace.
- Returns:
the literal representation of the current NormalizedString
-
toIdentifierGroupArray
public static NormalizedString[] toIdentifierGroupArray(NormalizedString[] strings)
Analyzes a group of NormalizedString to identify any instances whose normalized content will generate clashes. Any clashing entries will be converted to their literal counterparts (using toLiteral()), making it possible to identify one from the other.
- Parameters:
strings - a group of identifiers that may contain ambiguous entries if their character case or surrounding whitespace is not considered. This array will be modified.
- Returns:
the input string array, with NormalizedString literals in the positions where clashes would originally occur.
-
toIdentifierGroupArray
public static NormalizedString[] toIdentifierGroupArray(java.lang.String[] strings)
Analyzes a group of String to identify any instances whose normalized content will generate clashes. Any clashing entries will be converted to their literal counterparts (using toLiteral()), making it possible to identify one from the other.
- Parameters:
strings - a group of identifiers that may contain ambiguous entries if their character case or surrounding whitespace is not considered.
- Returns:
a NormalizedString array with literals in the positions where clashes would originally occur.
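The idea of a clash can be sketched without the library: two identifiers clash when their text differs but their normalized content is the same. ClashSketch and findClashes are hypothetical names. The sketch assumes normalization is trimming plus lower-casing and that exact duplicates do not count as a clash; it only reports which positions would need to become literals.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Locale;
import java.util.Map;
import java.util.Set;

/** Hypothetical helper that flags identifiers whose normalized content clashes. */
final class ClashSketch {
    // Assumption: normalized content is the trimmed, lower-cased text.
    private static String normalize(String s) {
        return s.trim().toLowerCase(Locale.ROOT);
    }

    /** Returns a flag per position: true where the entry would need to become a literal. */
    static boolean[] findClashes(String[] names) {
        // Collect the distinct original spellings seen for each normalized key.
        Map<String, Set<String>> originalsByKey = new HashMap<>();
        for (String name : names) {
            originalsByKey.computeIfAbsent(normalize(name), k -> new HashSet<>()).add(name);
        }
        boolean[] mustBeLiteral = new boolean[names.length];
        for (int i = 0; i < names.length; i++) {
            // More than one spelling under the same key means the entries are ambiguous.
            mustBeLiteral[i] = originalsByKey.get(normalize(names[i])).size() > 1;
        }
        return mustBeLiteral;
    }
}
```

For the input "id", "ID", "name", the first two positions are flagged and the third is not.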
-
identifyLiterals
public static boolean identifyLiterals(NormalizedString[] strings)
Analyzes a group of NormalizedString to identify any instances whose normalized content will generate clashes. Any clashing entries will be converted to their literal counterparts (using toLiteral()), making it possible to identify one from the other.
- Parameters:
strings - a group of identifiers that may contain ambiguous entries if their character case or surrounding whitespace is not considered. This array will be modified.
- Returns:
true if any entry has been modified to be a literal, otherwise false
-
identifyLiterals
public static boolean identifyLiterals(NormalizedString[] strings, boolean lowercaseIdentifiers, boolean uppercaseIdentifiers)
Analyzes a group of NormalizedString to identify any instances whose normalized content will generate clashes. Any clashing entries will be converted to their literal counterparts (using toLiteral()), making it possible to identify one from the other.
- Parameters:
strings - a group of identifiers that may contain ambiguous entries if their character case or surrounding whitespace is not considered. This array will be modified.
lowercaseIdentifiers - flag indicating that identifiers are stored in lower case (for compatibility with databases). If a string has an uppercase character, it must become a literal.
uppercaseIdentifiers - flag indicating that identifiers are stored in upper case (for compatibility with databases). If a string has a lowercase character, it must become a literal.
- Returns:
true if any entry has been modified to be a literal, otherwise false
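The effect of the two flags can be sketched as a per-string check. CaseRuleSketch and mustBeLiteral are hypothetical names, and the rule below follows only the wording of the parameter descriptions above; how characters without case (digits, underscores) are treated by the class itself is not stated here, so the sketch simply ignores them.

```java
/** Hypothetical helper applying the lowercase/uppercase identifier rule to one string. */
final class CaseRuleSketch {
    static boolean mustBeLiteral(String name, boolean lowercaseIdentifiers, boolean uppercaseIdentifiers) {
        for (int i = 0; i < name.length(); i++) {
            char ch = name.charAt(i);
            // Identifiers stored in lower case: an uppercase character forces a literal.
            if (lowercaseIdentifiers && Character.isUpperCase(ch)) {
                return true;
            }
            // Identifiers stored in upper case: a lowercase character forces a literal.
            if (uppercaseIdentifiers && Character.isLowerCase(ch)) {
                return true;
            }
        }
        return false;
    }
}
```

Under this reading, "userId" must become a literal when lowercaseIdentifiers is set, "userid" does not, and neither does when both flags are false.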
-
getCache
public static StringCache<NormalizedString> getCache()
Returns the internal string cache to allow users to tweak its size limit or clear it when appropriate.
- Returns:
the string cache used to store NormalizedString instances associated with their original String.
-
-