Class Charsets


  • public class Charsets
    extends java.lang.Object
    Utility methods for charsets. See kala.compress.archivers.zip.ZipEncoding
    Since:
    1.21.0.1
    • Field Summary

      Fields 
      Modifier and Type Field Description
      private static char[] HEX_CHARS  
      private static java.nio.charset.Charset NATIVE_CHARSET  
    • Constructor Summary

      Constructors 
      Constructor Description
      Charsets()  
    • Method Summary

      All Methods Static Methods Concrete Methods 
      Modifier and Type Method Description
      static boolean canEncode​(java.nio.charset.Charset charset, java.lang.String name)
      Checks whether the given string can be losslessly encoded using the given charset.
      static java.lang.String decode​(java.nio.charset.Charset charset, byte[] data)
      Use this method instead of org.apache.commons.compress.archivers.zip.ZipEncoding#decode(byte[])
      private static java.nio.charset.CharsetDecoder decoderFor​(java.nio.charset.Charset charset)  
      static java.nio.ByteBuffer encode​(java.nio.charset.Charset charset, java.lang.String name)
      Encodes a file name or comment into a byte buffer suitable for storing in a serialized ZIP entry.
      private static java.nio.ByteBuffer encodeFully​(java.nio.charset.CharsetEncoder enc, java.nio.CharBuffer cb, java.nio.ByteBuffer out)  
      private static java.nio.charset.CharsetEncoder encoderFor​(java.nio.charset.Charset charset)  
      private static java.nio.CharBuffer encodeSurrogate​(java.nio.CharBuffer cb, char c)  
      private static int estimateIncrementalEncodingSize​(java.nio.charset.CharsetEncoder enc, int charCount)
      Estimates the size (in bytes) needed for the remaining characters.
      private static int estimateInitialBufferSize​(java.nio.charset.CharsetEncoder enc, int charChount)
      Estimate the initial encoded size (in bytes) for a character buffer.
      private static java.nio.ByteBuffer growBufferBy​(java.nio.ByteBuffer buffer, int increment)  
      static boolean isUTF8​(java.nio.charset.Charset charset)
      Tests whether a given encoding is UTF-8 or null.
      static java.nio.charset.Charset nativeCharset()
      Returns the platform default charset.
      static java.nio.charset.Charset toCharset​(java.lang.String name)
      Returns a charset object for the named charset.
      static java.nio.charset.Charset toCharset​(java.lang.String name, java.nio.charset.Charset defaultCharset)
      Returns a charset object for the named charset.
      static java.nio.charset.Charset toCharset​(java.nio.charset.Charset charset)
      Returns the given charset, or UTF-8 if the given charset is null.
      static java.nio.charset.Charset toCharset​(java.nio.charset.Charset charset, java.nio.charset.Charset defaultCharset)
      Returns the given charset, or defaultCharset if the given charset is null.
      • Methods inherited from class java.lang.Object

        clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
    • Field Detail

      • HEX_CHARS

        private static final char[] HEX_CHARS
      • NATIVE_CHARSET

        private static final java.nio.charset.Charset NATIVE_CHARSET
    • Constructor Detail

      • Charsets

        public Charsets()
    • Method Detail

      • nativeCharset

        public static java.nio.charset.Charset nativeCharset()
        Returns the platform default charset. Users can override it by setting the system property 'kala.compress.native.charset'.
        Since:
        1.21.0.1
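
        The override described above could be resolved roughly as follows. This is a minimal sketch using only java.nio.charset, not the library's actual code: the class name NativeCharsetSketch, the method resolve() and the fallback to Charset.defaultCharset() are assumptions for illustration. Only the property name comes from the description.

```java
import java.nio.charset.Charset;

final class NativeCharsetSketch {
    // Property name taken from the method description.
    static final String PROPERTY = "kala.compress.native.charset";

    // Hypothetical resolution order: the property if it names a supported
    // charset, otherwise the JVM default. The real lookup may differ.
    static Charset resolve() {
        String name = System.getProperty(PROPERTY);
        if (name != null) {
            try {
                return Charset.forName(name);
            } catch (IllegalArgumentException e) {
                // Covers both IllegalCharsetNameException and
                // UnsupportedCharsetException; fall through to the default.
            }
        }
        return Charset.defaultCharset();
    }
}
```

        Because NATIVE_CHARSET is a static final field, the property presumably has to be set before the Charsets class is first loaded, for example with -Dkala.compress.native.charset=ISO-8859-1 on the command line.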
      • isUTF8

        public static boolean isUTF8​(java.nio.charset.Charset charset)
        Tests whether a given encoding is UTF-8 or null.
        Since:
        1.27.1-0
      • toCharset

        public static java.nio.charset.Charset toCharset​(java.lang.String name)
        Returns a charset object for the named charset. Use this method instead of kala.compress.archivers.zip.ZipEncodingHelper#getZipEncoding(String)
        Parameters:
        name - The name of the encoding. Specify null for UTF-8.
        Returns:
        A charset object for the named encoding
        Throws:
        java.nio.charset.IllegalCharsetNameException - If the given charset name is illegal
        java.nio.charset.UnsupportedCharsetException - If no support for the named charset is available in this instance of the Java virtual machine
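
        The documented contract (null selects UTF-8, any other name is looked up and may throw) can be illustrated with plain JDK calls. The class ToCharsetSketch below is a hypothetical stand-in written for this example, not the library's implementation.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

final class ToCharsetSketch {
    // null selects UTF-8; anything else is resolved by name.
    // Charset.forName throws IllegalCharsetNameException or
    // UnsupportedCharsetException for bad or unknown names.
    static Charset toCharset(String name) {
        return name == null ? StandardCharsets.UTF_8 : Charset.forName(name);
    }
}
```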
      • toCharset

        public static java.nio.charset.Charset toCharset​(java.lang.String name,
                                                         java.nio.charset.Charset defaultCharset)
        Returns a charset object for the named charset. If the requested character set cannot be found, defaultCharset will be used instead.
        Parameters:
        name - The name of the encoding. Specify null for UTF-8.
        defaultCharset - The charset to use if the requested charset cannot be found.
        Returns:
        A charset object for the named encoding
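
        A fallback of this kind might look like the sketch below. It is an illustration under stated assumptions, not the library's code: the class name is hypothetical, null handling follows the parameter description, and treating an illegal name the same as an unsupported one is a guess.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

final class ToCharsetFallbackSketch {
    static Charset toCharset(String name, Charset defaultCharset) {
        if (name == null) {
            // Assumption: follows the parameter description (null means UTF-8).
            return StandardCharsets.UTF_8;
        }
        try {
            return Charset.forName(name);
        } catch (IllegalArgumentException e) {
            // Assumption: illegal and unsupported names both fall back.
            return defaultCharset;
        }
    }
}
```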
      • toCharset

        public static java.nio.charset.Charset toCharset​(java.nio.charset.Charset charset)
        Returns the given charset, or UTF-8 if the given charset is null.
        Parameters:
        charset - A charset or null.
        Returns:
        The given charset, or UTF-8 if the given charset is null.
      • toCharset

        public static java.nio.charset.Charset toCharset​(java.nio.charset.Charset charset,
                                                         java.nio.charset.Charset defaultCharset)
        Returns the given charset, or defaultCharset if the given charset is null.
        Parameters:
        charset - A charset or null.
        defaultCharset - The charset to return if charset is null.
        Returns:
        The given charset, or defaultCharset if the given charset is null.
      • canEncode

        public static boolean canEncode​(java.nio.charset.Charset charset,
                                        java.lang.String name)
        Checks whether the given string can be losslessly encoded using the given charset. Use this method instead of kala.compress.archivers.zip.ZipEncoding#canEncode(String)
        Parameters:
        charset - The charset to test against.
        name - A file name or ZIP comment.
        Returns:
        Whether the given name can be encoded without any loss.
        See Also:
        CharsetEncoder.canEncode(CharSequence)
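
        The referenced CharsetEncoder.canEncode(CharSequence) is the standard JDK check for lossless encodability. The sketch below shows that check directly; the wrapper class CanEncodeSketch is hypothetical, and whether the library delegates in exactly this way is an assumption.

```java
import java.nio.charset.Charset;

final class CanEncodeSketch {
    static boolean canEncode(Charset charset, String name) {
        // A fresh encoder per call: CharsetEncoder instances are not thread-safe.
        return charset.newEncoder().canEncode(name);
    }
}
```

        With this check, an ASCII-only name passes for US-ASCII, while a name containing umlauts does not and would need a wider charset such as UTF-8.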
      • encode

        public static java.nio.ByteBuffer encode​(java.nio.charset.Charset charset,
                                                 java.lang.String name)
        Encodes a file name or comment into a byte buffer suitable for storing in a serialized ZIP entry.

        Examples for CP 437 (in pseudo-notation; the right-hand side is in C-style notation):

          encode("€_for_Dollar.txt") = "%U20AC_for_Dollar.txt"
          encode("Ölfässer.txt") = "\231lf\204sser.txt"
         
        Use this method instead of org.apache.commons.compress.archivers.zip.ZipEncoding#encode(String)
        Parameters:
        name - A file name or ZIP comment.
        Returns:
        A byte buffer with a backing array containing the encoded name. Unmappable characters and malformed character sequences are mapped to a sequence of UTF-16 code units, each encoded in the format %Uxxxx. The buffer is positioned at the beginning of the encoded result, and its limit marks the end of the encoded result.
        Throws:
        java.io.IOException - on error
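
        The %Uxxxx escaping described above can be approximated with plain JDK calls. The sketch below is a simplified illustration, not the library's algorithm: the class EncodeSketch is hypothetical, it escapes into a string first instead of growing a buffer incrementally, and it assumes the target charset can encode '%', 'U' and hexadecimal digits. ISO-8859-1 is used in place of CP 437 because it is available on every JVM.

```java
import java.nio.ByteBuffer;
import java.nio.charset.Charset;
import java.nio.charset.CharsetEncoder;
import java.util.Locale;

final class EncodeSketch {
    static ByteBuffer encode(Charset charset, String name) {
        CharsetEncoder enc = charset.newEncoder();
        StringBuilder escaped = new StringBuilder(name.length());
        for (int i = 0; i < name.length(); ) {
            int cp = name.codePointAt(i);
            int len = Character.charCount(cp);
            String unit = name.substring(i, i + len);
            if (enc.canEncode(unit)) {
                escaped.append(unit);
            } else {
                // Escape every UTF-16 code unit of the unencodable code point.
                for (int j = 0; j < len; j++) {
                    escaped.append(String.format(Locale.ROOT, "%%U%04X", (int) unit.charAt(j)));
                }
            }
            i += len;
        }
        // The returned heap buffer starts at position 0; its limit marks the end.
        return charset.encode(escaped.toString());
    }
}
```

        For the euro example, encoding "€_for_Dollar.txt" with ISO-8859-1 through this sketch yields the bytes of "%U20AC_for_Dollar.txt", matching the CP 437 example above, since neither charset contains the euro sign.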
      • decode

        public static java.lang.String decode​(java.nio.charset.Charset charset,
                                              byte[] data)
                                       throws java.io.IOException
        Use this method instead of org.apache.commons.compress.archivers.zip.ZipEncoding#decode(byte[])
        Parameters:
        data - The byte values to decode.
        Returns:
        The decoded string.
        Throws:
        java.io.IOException - on error
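
        One way such a decode can surface an IOException is through a CharsetDecoder configured to report bad input, because CharacterCodingException is a subclass of IOException. The sketch below shows that pattern with plain JDK calls. The class DecodeSketch and the choice of CodingErrorAction.REPORT are assumptions; the library may instead replace bad input for some charsets.

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.Charset;
import java.nio.charset.CodingErrorAction;

final class DecodeSketch {
    static String decode(Charset charset, byte[] data) throws CharacterCodingException {
        return charset.newDecoder()
                // Assumption: fail loudly instead of substituting a replacement character.
                .onMalformedInput(CodingErrorAction.REPORT)
                .onUnmappableCharacter(CodingErrorAction.REPORT)
                .decode(ByteBuffer.wrap(data))
                .toString();
    }
}
```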
      • growBufferBy

        private static java.nio.ByteBuffer growBufferBy​(java.nio.ByteBuffer buffer,
                                                        int increment)
      • encodeFully

        private static java.nio.ByteBuffer encodeFully​(java.nio.charset.CharsetEncoder enc,
                                                       java.nio.CharBuffer cb,
                                                       java.nio.ByteBuffer out)
      • encodeSurrogate

        private static java.nio.CharBuffer encodeSurrogate​(java.nio.CharBuffer cb,
                                                           char c)
      • encoderFor

        private static java.nio.charset.CharsetEncoder encoderFor​(java.nio.charset.Charset charset)
      • decoderFor

        private static java.nio.charset.CharsetDecoder decoderFor​(java.nio.charset.Charset charset)
      • estimateInitialBufferSize

        private static int estimateInitialBufferSize​(java.nio.charset.CharsetEncoder enc,
                                                     int charChount)
        Estimate the initial encoded size (in bytes) for a character buffer.

        The estimate assumes that one character uses the maximum-length encoding, whilst the rest use an average-size encoding. This accounts for any BOM for UTF-16, at the expense of a couple of extra bytes for UTF-8 encoded ASCII.

        Parameters:
        enc - encoder to use for estimates
        charChount - number of characters in string
        Returns:
        estimated size in bytes.
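
        The estimate described above can be written out with the two figures every CharsetEncoder exposes, maxBytesPerChar() and averageBytesPerChar(). This is a sketch of the stated formula only; the class InitialSizeSketch is hypothetical and the rounding up is an assumption.

```java
import java.nio.charset.CharsetEncoder;

final class InitialSizeSketch {
    static int estimateInitialBufferSize(CharsetEncoder enc, int charCount) {
        // One character at the maximum width (covers a BOM where the encoder emits one).
        float first = enc.maxBytesPerChar();
        // The remaining characters at the average width.
        float rest = (charCount - 1) * enc.averageBytesPerChar();
        return (int) Math.ceil(first + rest);
    }
}
```

        For a single-byte charset such as ISO-8859-1, where both figures are 1.0, the estimate for a 10-character name is exactly 10 bytes.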
      • estimateIncrementalEncodingSize

        private static int estimateIncrementalEncodingSize​(java.nio.charset.CharsetEncoder enc,
                                                           int charCount)
        Estimates the size (in bytes) needed for the remaining characters.
        Parameters:
        enc - encoder to use for estimates
        charCount - number of characters remaining
        Returns:
        estimated size in bytes.
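
        A plausible reading of this estimate is the remaining character count multiplied by the encoder's average width, rounded up. The sketch below spells that out; the class IncrementalSizeSketch is hypothetical and the exact formula is an assumption, since the description does not state it.

```java
import java.nio.charset.CharsetEncoder;

final class IncrementalSizeSketch {
    static int estimateIncrementalEncodingSize(CharsetEncoder enc, int charCount) {
        // Assumption: every remaining character is budgeted at the average width.
        return (int) Math.ceil(charCount * enc.averageBytesPerChar());
    }
}
```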