Class StringTool
java.lang.Object
net.sf.saxon.str.StringTool
-
Constructor Summary
Constructors
StringTool()
Method Summary
Modifier and Type | Method | Description
static void | appendRepeated(StringBuilder builder, char ch, int count) | Insert repeated occurrences of a given character at the end of a StringBuilder
static IntIterator | codePoints(CharSequence value) |
static UnicodeString | compress(char[] in, int offset, int len, boolean compressWS) | Attempt to compress a UnicodeString consisting entirely of whitespace.
static boolean | containsSurrogates(String str) | Ask whether a string contains astral characters (represented as surrogate pairs)
static void | copy16to24(char[] source, int sourcePos, byte[] dest, int destPos, int count) | Copy from an array of 16-bit characters to an array holding 24-bit characters.
static void | copy8to16(byte[] source, int sourcePos, char[] dest, int destPos, int count) | Copy from an array of 8-bit characters to an array holding 16-bit characters.
static void | copy8to24(byte[] source, int sourcePos, byte[] dest, int destPos, int count) | Copy from an array of 8-bit characters to an array holding 24-bit characters, organised as three bytes per character. The caller is responsible for ensuring that the offsets are in range and that the destination array is large enough.
static String | diagnosticDisplay | Produce a diagnostic representation of the contents of the string
static int[] | expand | Expand a string into an array of 32-bit characters
static UnicodeString | fromCharSequence(CharSequence chars) |
static UnicodeString | fromCodePoints(int[] codes, int used) | Contract an array of integers containing Unicode codepoints into a string
static UnicodeString | fromLatin1(String str) |
static int | getStringLength | Get the length of a string, as defined in XPath.
static int | lastCodePoint | Get the last codepoint in a UnicodeString
static long | lastIndexOf(UnicodeString str, int codePoint) | Get the position of the last occurrence of a given codepoint within a string
static void | prependRepeated(StringBuilder builder, char ch, int count) | Insert repeated occurrences of a given character at the start of a StringBuilder
static void | prependWideChar(StringBuilder builder, int ch) | Insert a wide character (surrogate pair) at the start of a StringBuilder
static int | requireInt(long value) | Utility method for use where strings longer than 2^31 characters cannot yet be handled.
-
Constructor Details
-
StringTool
public StringTool()
-
-
Method Details
-
getStringLength
Get the length of a string, as defined in XPath. This is not the same as the Java length, because a Unicode surrogate pair counts as a single character.
- Parameters:
s - the string whose length is required
- Returns:
- the length of the string in Unicode code points
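The difference between the Java length and the XPath length can be illustrated with the JDK alone. The sketch below does not call Saxon; the class `XPathLengthSketch` and the method `codePointLength` are hypothetical names that mimic the documented behaviour using `String.codePointCount`.

```java
final class XPathLengthSketch {
    // Counts Unicode code points, so a surrogate pair contributes 1, not 2.
    static int codePointLength(String s) {
        return s.codePointCount(0, s.length());
    }

    public static void main(String[] args) {
        String s = "a\uD83D\uDE00b";              // 'a', U+1F600, 'b'
        System.out.println(s.length());           // 4 (UTF-16 code units)
        System.out.println(codePointLength(s));   // 3 (code points)
    }
}
```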
-
expand
Expand a string into an array of 32-bit characters
- Parameters:
s - the string to be expanded
- Returns:
- an array of integers representing the Unicode code points
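As a rough JDK-only illustration of what such an expansion produces, the following sketch converts a `String` to an `int[]` of code points. `ExpandSketch` and `toCodePoints` are hypothetical names and are not part of Saxon; the real method may be implemented quite differently.

```java
import java.util.Arrays;

final class ExpandSketch {
    // Each surrogate pair in the input becomes a single int in the output.
    static int[] toCodePoints(String s) {
        return s.codePoints().toArray();
    }

    public static void main(String[] args) {
        // 'A' is U+0041 (65); the pair D83D DE00 is U+1F600 (128512).
        System.out.println(Arrays.toString(toCodePoints("A\uD83D\uDE00"))); // [65, 128512]
    }
}
```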
-
containsSurrogates
Ask whether a string contains astral characters (represented as surrogate pairs)
- Parameters:
str - the string to be tested
- Returns:
- true if the string contains surrogate characters
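A test of this kind can be approximated with a scan over the UTF-16 code units. This is only a sketch under the assumption that any surrogate code unit counts as a hit; `SurrogateCheckSketch` and `hasSurrogates` are illustrative names, not Saxon API.

```java
final class SurrogateCheckSketch {
    // Returns true as soon as any high or low surrogate code unit is seen.
    static boolean hasSurrogates(String str) {
        for (int i = 0; i < str.length(); i++) {
            if (Character.isSurrogate(str.charAt(i))) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        System.out.println(hasSurrogates("plain BMP text"));   // false
        System.out.println(hasSurrogates("smile \uD83D\uDE00")); // true
    }
}
```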
-
fromCodePoints
Contract an array of integers containing Unicode codepoints into a string
- Parameters:
codes - an array of integers representing the Unicode code points
used - the number of items in the array that are actually used
- Returns:
- the constructed string
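The role of the `used` parameter (only a prefix of the array is significant) can be shown with the JDK constructor `String(int[], int, int)`. Note that this sketch returns a plain `java.lang.String`, whereas the documented method returns a `UnicodeString`; `FromCodePointsSketch` and `contract` are hypothetical names.

```java
final class FromCodePointsSketch {
    // Builds a string from the first `used` code points; the rest of the array is ignored.
    static String contract(int[] codes, int used) {
        return new String(codes, 0, used);
    }

    public static void main(String[] args) {
        int[] buffer = {72, 105, 0x1F600, 0, 0};  // two trailing slots are unused capacity
        String s = contract(buffer, 3);
        System.out.println(s.length());           // 4: 'H', 'i', plus one surrogate pair
    }
}
```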
-
fromCharSequence
-
fromLatin1
-
codePoints
-
diagnosticDisplay
-
prependWideChar
Insert a wide character (surrogate pair) at the start of a StringBuilder
- Parameters:
builder - the string builder
ch - the codepoint of the character to be inserted
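One way to get this effect with the JDK is to convert the codepoint to its UTF-16 form and insert that at index 0. The sketch below is an assumption about the behaviour, not Saxon's code; `PrependWideSketch` and `prependCodePoint` are illustrative names.

```java
final class PrependWideSketch {
    // Character.toChars yields one char for a BMP codepoint and two for an astral one.
    static void prependCodePoint(StringBuilder builder, int ch) {
        builder.insert(0, Character.toChars(ch));
    }

    public static void main(String[] args) {
        StringBuilder sb = new StringBuilder("x");
        prependCodePoint(sb, 0x1F600);
        System.out.println(sb.length());   // 3: a surrogate pair followed by 'x'
    }
}
```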
-
prependRepeated
Insert repeated occurrences of a given character at the start of a StringBuilder
- Parameters:
builder - the string builder
ch - the character to be inserted
count - the number of repetitions
-
appendRepeated
Insert repeated occurrences of a given character at the end of a StringBuilder
- Parameters:
builder - the string builder
ch - the character to be inserted
count - the number of repetitions
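Padding at either end of a StringBuilder, as done by prependRepeated and appendRepeated, might look like the following JDK-only sketch. The names `PaddingSketch`, `padStart` and `padEnd` are hypothetical, and treating a negative count as zero is an assumption made here, not documented behaviour.

```java
import java.util.Arrays;

final class PaddingSketch {
    // Inserts `count` copies of `ch` at index 0 with a single insert call.
    static void padStart(StringBuilder builder, char ch, int count) {
        char[] fill = new char[Math.max(count, 0)];
        Arrays.fill(fill, ch);
        builder.insert(0, fill);
    }

    // Appends `count` copies of `ch` at the end.
    static void padEnd(StringBuilder builder, char ch, int count) {
        for (int i = 0; i < count; i++) {
            builder.append(ch);
        }
    }

    public static void main(String[] args) {
        StringBuilder sb = new StringBuilder("42");
        padStart(sb, '0', 3);
        padEnd(sb, '!', 2);
        System.out.println(sb);   // 00042!!
    }
}
```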
-
lastCodePoint
Get the last codepoint in a UnicodeString
- Parameters:
str - the input string
- Returns:
- the integer value of the last character in the string
- Throws:
IndexOutOfBoundsException - if the string is empty
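For a plain `java.lang.String`, the same contract (last codepoint, or an exception for empty input) can be sketched with `String.codePointBefore`. This operates on `String` rather than `UnicodeString`, and `LastCodePointSketch` is a hypothetical class used only for illustration.

```java
final class LastCodePointSketch {
    // Reads the codepoint ending at the end of the string, joining a trailing surrogate pair.
    static int lastCodePoint(String str) {
        if (str.isEmpty()) {
            throw new IndexOutOfBoundsException("empty string has no last codepoint");
        }
        return str.codePointBefore(str.length());
    }

    public static void main(String[] args) {
        System.out.println(Integer.toHexString(lastCodePoint("a\uD83D\uDE00"))); // 1f600
    }
}
```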
-
lastIndexOf
Get the position of the last occurrence of a given codepoint within a string
- Parameters:
str - the input string
codePoint - the sought codepoint
- Returns:
- the zero-based position of the last occurrence of the codepoint within the input string, or -1 if the codepoint does not appear within the string
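The sketch below illustrates one plausible reading of this contract, assuming the returned position is counted in codepoints rather than UTF-16 code units (which is where it would differ from `String.lastIndexOf`). It works on `java.lang.String`, and `LastIndexSketch` and `lastIndexOfCodePoint` are hypothetical names.

```java
final class LastIndexSketch {
    // Walks the string codepoint by codepoint, remembering the last match position.
    static long lastIndexOfCodePoint(String str, int codePoint) {
        long position = 0;
        long found = -1;
        for (int i = 0; i < str.length(); ) {
            int cp = str.codePointAt(i);
            if (cp == codePoint) {
                found = position;
            }
            i += Character.charCount(cp);   // skip 2 code units for an astral character
            position++;
        }
        return found;
    }

    public static void main(String[] args) {
        String s = "\uD83D\uDE00a-a";
        System.out.println(lastIndexOfCodePoint(s, 'a'));  // 3 (codepoint position)
        System.out.println(s.lastIndexOf('a'));            // 4 (UTF-16 index)
    }
}
```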
-
requireInt
public static int requireInt(long value)
Utility method for use where strings longer than 2^31 characters cannot yet be handled.
- Parameters:
value - the actual value of a character position within a string, or the length of a string
- Returns:
- the value as an integer if it is within range
- Throws:
UnsupportedOperationException - if the supplied value exceeds Integer.MAX_VALUE
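A narrowing check with the documented exception type could be sketched as follows. `RequireIntSketch` and `narrowToInt` are hypothetical stand-ins; only the upper bound is checked because that is the only condition the description states, and the handling of other out-of-range values is not specified here.

```java
final class RequireIntSketch {
    // Narrows a long position or length to int, refusing values above Integer.MAX_VALUE.
    static int narrowToInt(long value) {
        if (value > Integer.MAX_VALUE) {
            throw new UnsupportedOperationException(
                    "String position or length exceeds Integer.MAX_VALUE: " + value);
        }
        return (int) value;
    }

    public static void main(String[] args) {
        System.out.println(narrowToInt(42L));   // 42
        System.out.println(narrowToInt(-1L));   // -1, e.g. a "not found" result
    }
}
```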
-
compress
Attempt to compress a UnicodeString consisting entirely of whitespace. This is the first thing we do to an incoming text node.
- Parameters:
in - the Unicode string to be compressed
offset - the start position of the substring we are interested in
len - the length of the substring we are interested in
compressWS - set to true if whitespace compression is to be attempted
- Returns:
- the compressed sequence if it can be compressed; or the uncompressed UnicodeString otherwise
-
copy8to16
public static void copy8to16(byte[] source, int sourcePos, char[] dest, int destPos, int count)
Copy from an array of 8-bit characters to an array holding 16-bit characters. The caller is responsible for ensuring that the offsets are in range and that the destination array is large enough.
- Parameters:
source - the source array
sourcePos - the position in the source array where copying is to start
dest - the destination array
destPos - the position in the destination array where copying is to start
count - the number of characters (codepoints) to copy
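The main pitfall in an 8-bit to 16-bit copy in Java is that `byte` is signed, so values above 0x7F must be masked before widening. The following is a minimal sketch of that idea, not Saxon's implementation; `Copy8to16Sketch` and `widen8to16` are hypothetical names, and like the documented method it performs no bounds checks of its own.

```java
final class Copy8to16Sketch {
    // Widens each byte to a char; the 0xff mask prevents sign extension (0xE9 must not become 0xFFE9).
    static void widen8to16(byte[] source, int sourcePos, char[] dest, int destPos, int count) {
        for (int i = 0; i < count; i++) {
            dest[destPos + i] = (char) (source[sourcePos + i] & 0xff);
        }
    }

    public static void main(String[] args) {
        byte[] latin1 = {0x41, (byte) 0xE9};   // 'A', U+00E9
        char[] out = new char[4];
        widen8to16(latin1, 0, out, 1, 2);
        System.out.println((int) out[2]);      // 233
    }
}
```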
-
copy8to24
public static void copy8to24(byte[] source, int sourcePos, byte[] dest, int destPos, int count)
Copy from an array of 8-bit characters to an array holding 24-bit characters, organised as three bytes per character. The caller is responsible for ensuring that the offsets are in range and that the destination array is large enough.
- Parameters:
source - the source array
sourcePos - the position in the source array where copying is to start
dest - the destination array, using three bytes per codepoint
destPos - the codepoint position (not byte position) in the destination array where copying is to start
count - the number of characters (codepoints) to copy
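Because destPos is a codepoint position, the byte offset into the destination is three times that value. The sketch below shows the index arithmetic under the assumption that the three bytes are stored most significant first; that byte order is not stated in this documentation. `Copy8to24Sketch` and `widen8to24` are hypothetical names.

```java
final class Copy8to24Sketch {
    // Writes each 8-bit character as three bytes: 0, 0, value (assumed big-endian layout).
    static void widen8to24(byte[] source, int sourcePos, byte[] dest, int destPos, int count) {
        for (int i = 0; i < count; i++) {
            int d = 3 * (destPos + i);   // destPos counts codepoints, so scale by 3
            dest[d] = 0;
            dest[d + 1] = 0;
            dest[d + 2] = source[sourcePos + i];
        }
    }

    public static void main(String[] args) {
        byte[] latin1 = {0x41, (byte) 0xE9};
        byte[] out = new byte[9];            // room for three 24-bit characters
        widen8to24(latin1, 0, out, 1, 2);    // start at codepoint 1, i.e. byte offset 3
        System.out.println(out[5]);          // 65
    }
}
```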
-
copy16to24
public static void copy16to24(char[] source, int sourcePos, byte[] dest, int destPos, int count)
Copy from an array of 16-bit characters to an array holding 24-bit characters. The caller is responsible for ensuring that the offsets are in range and that the destination array is large enough.
- Parameters:
source - the source array. The caller is responsible for ensuring that this contains no surrogates
sourcePos - the position in the source array where copying is to start
dest - the destination array
destPos - the position in the destination array where copying is to start
count - the number of characters (codepoints) to copy
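A 16-bit to 24-bit widening can be sketched together with a helper that reads a codepoint back, which makes the layout easy to verify. Two assumptions are made here and are not confirmed by this documentation: destPos is treated as a codepoint position (by analogy with copy8to24), and bytes are stored most significant first. `Copy16to24Sketch`, `widen16to24` and `codePointAt24` are hypothetical names.

```java
final class Copy16to24Sketch {
    // Writes each char as three bytes: 0, high byte, low byte. Input must contain no surrogates.
    static void widen16to24(char[] source, int sourcePos, byte[] dest, int destPos, int count) {
        for (int i = 0; i < count; i++) {
            char c = source[sourcePos + i];
            int d = 3 * (destPos + i);
            dest[d] = 0;
            dest[d + 1] = (byte) (c >> 8);
            dest[d + 2] = (byte) c;
        }
    }

    // Reassembles the codepoint stored at the given codepoint index.
    static int codePointAt24(byte[] data, int index) {
        int d = 3 * index;
        return ((data[d] & 0xff) << 16) | ((data[d + 1] & 0xff) << 8) | (data[d + 2] & 0xff);
    }

    public static void main(String[] args) {
        char[] bmp = {'A', '\u20AC'};
        byte[] out = new byte[6];
        widen16to24(bmp, 0, out, 0, 2);
        System.out.println(Integer.toHexString(codePointAt24(out, 1)));  // 20ac
    }
}
```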
-