java.lang.Object
kala.compress.utils.Charsets
Utility methods for charsets.
See kala.compress.archivers.zip.ZipEncoding
- Since:
- 1.21.0.1
-
Field Summary
Fields -
Constructor Summary
Constructors -
Method Summary
Modifier and Type, Method, Description:
- static boolean canEncode: Checks whether the given string can be losslessly encoded using this encoding.
- static String decode: Use this method instead of org.apache.commons.compress.archivers.zip.ZipEncoding#decode(byte[])
- private static CharsetDecoder decoderFor(Charset charset)
- static ByteBuffer encode: Encodes a file name or a comment to a byte array suitable for storing it in a serialized ZIP entry.
- private static ByteBuffer encodeFully(CharsetEncoder enc, CharBuffer cb, ByteBuffer out)
- private static CharsetEncoder encoderFor(Charset charset)
- private static CharBuffer encodeSurrogate(CharBuffer cb, char c)
- private static int estimateIncrementalEncodingSize(CharsetEncoder enc, int charCount): Estimates the size needed for the remaining characters.
- private static int estimateInitialBufferSize(CharsetEncoder enc, int charChount): Estimates the initial encoded size (in bytes) for a character buffer.
- private static ByteBuffer growBufferBy(ByteBuffer buffer, int increment)
- static boolean isUTF8: Tests whether a given encoding is UTF-8 or null.
- static Charset nativeCharset: Returns the platform default charset.
- static Charset toCharset: Returns a charset object for the named charset.
- static Charset toCharset: Returns a charset object for the named charset.
- static Charset toCharset: Returns the given charset, or UTF-8 if the given charset is null.
- static Charset toCharset: Returns the given charset, or UTF-8 if the given charset is null.
-
Field Details
-
HEX_CHARS
private static final char[] HEX_CHARS
-
NATIVE_CHARSET
-
-
Constructor Details
-
Charsets
public Charsets()
-
-
Method Details
-
nativeCharset
Returns the platform default charset. Users can override it by setting the system property 'kala.compress.native.charset'.
- Since:
- 1.21.0.1
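The override would typically be supplied at JVM startup, for example with -Dkala.compress.native.charset=ISO-8859-1. The following is only a sketch of how such a lookup might behave, not the library's code: the class NativeCharsetSketch and the method resolve are hypothetical, and falling back to Charset.defaultCharset() for a missing or unusable value is an assumption.

```java
import java.nio.charset.Charset;

final class NativeCharsetSketch {
    // Property name taken from the documentation above.
    static final String PROPERTY = "kala.compress.native.charset";

    // Hypothetical helper: resolve a property value to a Charset.
    // Assumption: an unset, empty, illegal, or unsupported value
    // falls back to the platform default.
    static Charset resolve(String propertyValue) {
        if (propertyValue == null || propertyValue.isEmpty()) {
            return Charset.defaultCharset();
        }
        try {
            return Charset.forName(propertyValue);
        } catch (IllegalArgumentException e) {
            // IllegalCharsetNameException and UnsupportedCharsetException
            // are both subclasses of IllegalArgumentException.
            return Charset.defaultCharset();
        }
    }

    public static void main(String[] args) {
        System.out.println(resolve(System.getProperty(PROPERTY)));
    }
}
```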
-
isUTF8
Tests whether a given encoding is UTF-8 or null.
- Since:
- 1.27.1-0
-
toCharset
Returns a charset object for the named charset. Use this method instead of kala.compress.archivers.zip.ZipEncodingHelper#getZipEncoding(String).
- Parameters:
- name - The name of the encoding. Specify null for UTF-8.
- Returns:
- A charset object for the named encoding
- Throws:
- IllegalCharsetNameException - If the given charset name is illegal
- UnsupportedCharsetException - If no support for the named charset is available in this instance of the Java virtual machine
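The documented contract (null maps to UTF-8, any other name is looked up and may throw) can be sketched with the standard java.nio.charset API alone. This is an illustration of the described behaviour, not the library's source; the class name ToCharsetSketch is hypothetical.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

final class ToCharsetSketch {
    // null selects UTF-8; otherwise Charset.forName performs the lookup and
    // throws IllegalCharsetNameException or UnsupportedCharsetException.
    static Charset toCharset(String name) {
        return name == null ? StandardCharsets.UTF_8 : Charset.forName(name);
    }

    public static void main(String[] args) {
        System.out.println(toCharset(null));
        System.out.println(toCharset("US-ASCII"));
    }
}
```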
-
toCharset
Returns a charset object for the named charset. If the requested character set cannot be found, defaultCharset will be used instead.
- Parameters:
- name - The name of the encoding. Specify null for UTF-8.
- Returns:
- A charset object for the named encoding
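A sketch of the fallback variant follows. The parameter list (String name, Charset defaultCharset) is inferred from the description and is an assumption, as is treating an illegal charset name the same way as an unsupported one; the class name ToCharsetFallbackSketch is hypothetical.

```java
import java.nio.charset.Charset;
import java.nio.charset.IllegalCharsetNameException;
import java.nio.charset.StandardCharsets;
import java.nio.charset.UnsupportedCharsetException;

final class ToCharsetFallbackSketch {
    // Assumed signature: returns defaultCharset instead of throwing
    // when the named charset cannot be resolved.
    static Charset toCharset(String name, Charset defaultCharset) {
        if (name == null) {
            return StandardCharsets.UTF_8; // documented: null means UTF-8
        }
        try {
            return Charset.forName(name);
        } catch (IllegalCharsetNameException | UnsupportedCharsetException e) {
            return defaultCharset;
        }
    }

    public static void main(String[] args) {
        System.out.println(toCharset("no-such-charset-xyz", StandardCharsets.ISO_8859_1));
    }
}
```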
-
toCharset
Returns the given charset, or UTF-8 if the given charset is null.
- Parameters:
- charset - A charset or null.
- Returns:
- the given Charset, or UTF-8 if the given Charset is null
-
toCharset
Returns the given charset, or UTF-8 if the given charset is null.
- Parameters:
- charset - A charset or null.
- Returns:
- the given Charset, or UTF-8 if the given Charset is null
-
canEncode
Checks whether the given string can be losslessly encoded using this encoding. Use this method instead of kala.compress.archivers.zip.ZipEncoding#canEncode(String).
- Parameters:
- name - A file name or ZIP comment.
- Returns:
- Whether the given name can be encoded without any loss.
- See Also:
-
encode
Encodes a file name or a comment to a byte array suitable for storing it in a serialized ZIP entry.
Examples for CP 437 (in pseudo-notation; the right-hand side is in C-style notation):
encode("€_for_Dollar.txt") = "%U20AC_for_Dollar.txt"
encode("Ölfässer.txt") = "\231lf\204sser.txt"
Use this method instead of org.apache.commons.compress.archivers.zip.ZipEncoding#encode(String).
- Parameters:
- name - A file name or ZIP comment.
- Returns:
- A byte buffer with a backing array containing the encoded name. Unmappable characters or malformed character sequences are mapped to a sequence of UTF-16 words encoded in the format %Uxxxx. The byte buffer is assumed to be positioned at the beginning of the encoded result, to have a backing array, and to have its limit at the end of the encoded result.
- Throws:
- IOException - on error
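The %Uxxxx escaping described above can be illustrated with a simplified, character-by-character sketch. It is not the library's implementation (which works on buffers through encodeFully and encodeSurrogate); the class PercentUEncodingSketch and its encode signature are hypothetical, and the approach is only meant for single-byte charsets. ISO-8859-1 is used instead of CP 437 because it is guaranteed to be available on every JVM.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.Charset;
import java.nio.charset.CharsetEncoder;
import java.nio.charset.StandardCharsets;

final class PercentUEncodingSketch {
    private static final char[] HEX = "0123456789ABCDEF".toCharArray();

    // Encodes each UTF-16 code unit on its own; units the charset cannot
    // represent are written as the ASCII sequence %Uxxxx.
    static byte[] encode(Charset charset, String name) {
        CharsetEncoder enc = charset.newEncoder();
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        for (int i = 0; i < name.length(); i++) {
            char c = name.charAt(i);
            if (enc.canEncode(c)) {
                byte[] bytes = String.valueOf(c).getBytes(charset);
                out.write(bytes, 0, bytes.length);
            } else {
                out.write('%');
                out.write('U');
                out.write(HEX[(c >> 12) & 0xF]);
                out.write(HEX[(c >> 8) & 0xF]);
                out.write(HEX[(c >> 4) & 0xF]);
                out.write(HEX[c & 0xF]);
            }
        }
        return out.toByteArray();
    }

    public static void main(String[] args) {
        // \u20AC is the euro sign, which ISO-8859-1 cannot represent.
        byte[] encoded = encode(StandardCharsets.ISO_8859_1, "\u20AC_for_Dollar.txt");
        System.out.println(new String(encoded, StandardCharsets.US_ASCII));
    }
}
```

Escaping per UTF-16 code unit keeps the output reversible without requiring the target charset to support supplementary characters.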
-
decode
Use this method instead of org.apache.commons.compress.archivers.zip.ZipEncoding#decode(byte[]).
- Parameters:
- data - The byte values to decode.
- Returns:
- The decoded string.
- Throws:
- IOException - on error
-
growBufferBy
-
encodeFully
-
encodeSurrogate
-
encoderFor
-
decoderFor
-
estimateInitialBufferSize
Estimates the initial encoded size (in bytes) for a character buffer.
The estimate assumes that one character uses the maximum-length encoding, whilst the rest use an average-size encoding. This accounts for any BOM for UTF-16, at the expense of a couple of extra bytes for UTF-8 encoded ASCII.
- Parameters:
- enc - encoder to use for estimates
- charChount - number of characters in string
- Returns:
- estimated size in bytes.
-
estimateIncrementalEncodingSize
Estimates the size needed for the remaining characters.
- Parameters:
- enc - encoder to use for estimates
- charCount - number of characters remaining
- Returns:
- estimated size in bytes.
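Both estimates can be expressed with CharsetEncoder.maxBytesPerChar() and CharsetEncoder.averageBytesPerChar(). The sketch below follows the descriptions above (one character at maximum size plus the rest at average size, and average size for every remaining character); the class name BufferEstimateSketch is hypothetical, and the exact rounding is an assumption.

```java
import java.nio.charset.CharsetEncoder;
import java.nio.charset.StandardCharsets;

final class BufferEstimateSketch {
    // One character at the maximum size, the remaining ones at the average size.
    static int estimateInitialBufferSize(CharsetEncoder enc, int charCount) {
        float first = enc.maxBytesPerChar();
        float rest = (charCount - 1) * enc.averageBytesPerChar();
        return (int) Math.ceil(first + rest);
    }

    // Every remaining character at the average size.
    static int estimateIncrementalEncodingSize(CharsetEncoder enc, int charCount) {
        return (int) Math.ceil(charCount * enc.averageBytesPerChar());
    }

    public static void main(String[] args) {
        CharsetEncoder utf8 = StandardCharsets.UTF_8.newEncoder();
        System.out.println(estimateInitialBufferSize(utf8, "hello".length()));
        System.out.println(estimateIncrementalEncodingSize(utf8, 3));
    }
}
```

If the initial estimate turns out to be too small, the incremental estimate gives the amount by which a buffer would be grown, which is presumably the role of growBufferBy.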
-