Class Utf8.DecodeUtil

  • Enclosing class:
    Utf8

    static class Utf8.DecodeUtil
    extends java.lang.Object
    Utility methods for decoding bytes into a String. Callers are responsible for extracting the bytes (possibly using Unsafe methods) and for checking how many bytes remain. All other UTF-8 validity checks and code point conversion happen in this class.
    • Constructor Summary

      Constructors 
      Constructor Description
      DecodeUtil()  
    • Method Summary

      Modifier and Type Method Description
      (package private) static void handleFourBytes​(byte byte1, byte byte2, byte byte3, byte byte4, char[] resultArr, int resultPos)  
      (package private) static void handleOneByte​(byte byte1, char[] resultArr, int resultPos)  
      (package private) static void handleThreeBytes​(byte byte1, byte byte2, byte byte3, char[] resultArr, int resultPos)  
      (package private) static void handleTwoBytes​(byte byte1, byte byte2, char[] resultArr, int resultPos)  
      private static char highSurrogate​(int codePoint)  
      private static boolean isNotTrailingByte​(byte b)
      Returns whether the byte is not a valid continuation of the form '10XXXXXX'.
      (package private) static boolean isOneByte​(byte b)
      Returns whether this is a single-byte codepoint (i.e., ASCII) with the form '0XXXXXXX'.
      (package private) static boolean isThreeBytes​(byte b)
      Returns whether this is a three-byte codepoint with the form '1110XXXX'.
      (package private) static boolean isTwoBytes​(byte b)
      Returns whether this is a two-byte codepoint with the form '110XXXXX'.
      private static char lowSurrogate​(int codePoint)  
      private static int trailingByteValue​(byte b)
      Returns the actual value of the trailing byte (removes the prefix '10') for composition.
      • Methods inherited from class java.lang.Object

        clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
    • Constructor Detail

      • DecodeUtil

        DecodeUtil()
    • Method Detail

      • isOneByte

        static boolean isOneByte​(byte b)
        Returns whether this is a single-byte codepoint (i.e., ASCII) with the form '0XXXXXXX'.
      • isTwoBytes

        static boolean isTwoBytes​(byte b)
        Returns whether this is a two-byte codepoint with the form '110XXXXX'.
      • isThreeBytes

        static boolean isThreeBytes​(byte b)
        Returns whether this is a three-byte codepoint with the form '1110XXXX'.
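
The three lead-byte checks above can be illustrated with explicit bit masks. In standard UTF-8 the lead byte of a one-, two-, three- or four-byte sequence has the form '0XXXXXXX', '110XXXXX', '1110XXXX' or '11110XXX' respectively. This is a minimal sketch under that assumption, not necessarily how this class implements the checks; the class name `LeadByteSketch` and the method `isFourBytes` are hypothetical.

```java
// Hypothetical stand-alone sketch; not the library's implementation.
final class LeadByteSketch {
    private LeadByteSketch() {}

    // '0XXXXXXX': the top bit is clear (ASCII).
    static boolean isOneByte(byte b) {
        return (b & 0x80) == 0x00;
    }

    // '110XXXXX': the top three bits are 110.
    static boolean isTwoBytes(byte b) {
        return (b & 0xE0) == 0xC0;
    }

    // '1110XXXX': the top four bits are 1110.
    static boolean isThreeBytes(byte b) {
        return (b & 0xF0) == 0xE0;
    }

    // '11110XXX': the top five bits are 11110 (assumed helper, not listed above).
    static boolean isFourBytes(byte b) {
        return (b & 0xF8) == 0xF0;
    }
}
```

A caller would typically test these in order and fall through to the four-byte case when none of the first three match.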
      • handleOneByte

        static void handleOneByte​(byte byte1,
                                  char[] resultArr,
                                  int resultPos)
      • handleTwoBytes

        static void handleTwoBytes​(byte byte1,
                                   byte byte2,
                                   char[] resultArr,
                                   int resultPos)
                            throws java.lang.IllegalArgumentException
        Throws:
        java.lang.IllegalArgumentException
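
As an illustration of what a two-byte handler with this signature plausibly does, the sketch below validates both bytes, combines their payload bits and writes one char. The validity rules are those of standard UTF-8 (lead bytes 0xC0 and 0xC1 are rejected as overlong); the class name `TwoByteSketch` and the exception message are assumptions, not taken from the library.

```java
// Hypothetical stand-alone sketch; not the library's implementation.
final class TwoByteSketch {
    private TwoByteSketch() {}

    static void handleTwoBytes(byte byte1, byte byte2, char[] resultArr, int resultPos) {
        int lead = byte1 & 0xFF;
        // Valid two-byte leads are 0xC2..0xDF; 0xC0 and 0xC1 would encode
        // values below 0x80, which must use the one-byte form.
        boolean badLead = lead < 0xC2 || lead > 0xDF;
        // The second byte must be a continuation byte of the form 10XXXXXX.
        boolean badTrail = (byte2 & 0xC0) != 0x80;
        if (badLead || badTrail) {
            throw new IllegalArgumentException("Invalid UTF-8");
        }
        // 5 payload bits from the lead byte, 6 from the continuation byte.
        resultArr[resultPos] = (char) (((byte1 & 0x1F) << 6) | (byte2 & 0x3F));
    }
}
```

For example, the bytes 0xC3 0xA9 decode to U+00E9.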
      • handleThreeBytes

        static void handleThreeBytes​(byte byte1,
                                     byte byte2,
                                     byte byte3,
                                     char[] resultArr,
                                     int resultPos)
                              throws java.lang.IllegalArgumentException
        Throws:
        java.lang.IllegalArgumentException
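
A three-byte handler has two extra conditions to enforce beyond the continuation-byte checks: the decoded value must be at least U+0800 (no overlong forms) and must not fall in the surrogate range U+D800..U+DFFF. The sketch below shows one way to do that under standard UTF-8 rules; the class name `ThreeByteSketch` and the exception message are assumptions.

```java
// Hypothetical stand-alone sketch; not the library's implementation.
final class ThreeByteSketch {
    private ThreeByteSketch() {}

    static void handleThreeBytes(
            byte byte1, byte byte2, byte byte3, char[] resultArr, int resultPos) {
        // Lead byte must be 1110XXXX; both trailing bytes must be 10XXXXXX.
        if ((byte1 & 0xF0) != 0xE0
                || (byte2 & 0xC0) != 0x80
                || (byte3 & 0xC0) != 0x80) {
            throw new IllegalArgumentException("Invalid UTF-8");
        }
        // 4 + 6 + 6 payload bits.
        int codePoint = ((byte1 & 0x0F) << 12) | ((byte2 & 0x3F) << 6) | (byte3 & 0x3F);
        // Reject overlong encodings and encoded surrogates.
        if (codePoint < 0x800 || (codePoint >= 0xD800 && codePoint <= 0xDFFF)) {
            throw new IllegalArgumentException("Invalid UTF-8");
        }
        resultArr[resultPos] = (char) codePoint;
    }
}
```

For example, the bytes 0xE2 0x82 0xAC decode to U+20AC (the euro sign).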
      • handleFourBytes

        static void handleFourBytes​(byte byte1,
                                    byte byte2,
                                    byte byte3,
                                    byte byte4,
                                    char[] resultArr,
                                    int resultPos)
                             throws java.lang.IllegalArgumentException
        Throws:
        java.lang.IllegalArgumentException
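
Four-byte sequences encode supplementary code points (U+10000..U+10FFFF), which do not fit in a single Java char, so a handler with this signature presumably writes a surrogate pair at resultPos and resultPos + 1. The sketch below assumes that behavior and uses the JDK's `Character.highSurrogate` and `Character.lowSurrogate` in place of the private helpers; the class name `FourByteSketch` and the exception message are assumptions.

```java
// Hypothetical stand-alone sketch; not the library's implementation.
final class FourByteSketch {
    private FourByteSketch() {}

    static void handleFourBytes(
            byte byte1, byte byte2, byte byte3, byte byte4, char[] resultArr, int resultPos) {
        // Lead byte must be 11110XXX; all three trailing bytes must be 10XXXXXX.
        if ((byte1 & 0xF8) != 0xF0
                || (byte2 & 0xC0) != 0x80
                || (byte3 & 0xC0) != 0x80
                || (byte4 & 0xC0) != 0x80) {
            throw new IllegalArgumentException("Invalid UTF-8");
        }
        // 3 + 6 + 6 + 6 payload bits.
        int codePoint = ((byte1 & 0x07) << 18)
                | ((byte2 & 0x3F) << 12)
                | ((byte3 & 0x3F) << 6)
                | (byte4 & 0x3F);
        // Reject overlong encodings and values beyond the Unicode range.
        if (codePoint < 0x10000 || codePoint > 0x10FFFF) {
            throw new IllegalArgumentException("Invalid UTF-8");
        }
        // Assumed behavior: the result occupies two chars in the output array.
        resultArr[resultPos] = Character.highSurrogate(codePoint);
        resultArr[resultPos + 1] = Character.lowSurrogate(codePoint);
    }
}
```

Under this assumption the caller has to advance its output position by two after a four-byte sequence, and by one after the shorter forms.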
      • isNotTrailingByte

        private static boolean isNotTrailingByte​(byte b)
        Returns whether the byte is not a valid continuation of the form '10XXXXXX'.
      • trailingByteValue

        private static int trailingByteValue​(byte b)
        Returns the actual value of the trailing byte (removes the prefix '10') for composition.
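
The two trailing-byte helpers described above reduce to simple mask operations. This is a minimal sketch of the bit manipulation involved, written as a separate hypothetical class (`TrailingByteSketch`) rather than a copy of the private methods.

```java
// Hypothetical stand-alone sketch; not the library's implementation.
final class TrailingByteSketch {
    private TrailingByteSketch() {}

    // A continuation byte has the form 10XXXXXX, so its top two bits are 10.
    static boolean isNotTrailingByte(byte b) {
        return (b & 0xC0) != 0x80;
    }

    // Drops the '10' prefix and keeps the low six payload bits.
    static int trailingByteValue(byte b) {
        return b & 0x3F;
    }
}
```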
      • highSurrogate

        private static char highSurrogate​(int codePoint)
      • lowSurrogate

        private static char lowSurrogate​(int codePoint)
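
The surrogate helpers carry no description, but their names and signatures suggest the standard UTF-16 split of a supplementary code point. The sketch below shows that arithmetic and assumes the input is already known to lie in U+10000..U+10FFFF; the class name `SurrogateSketch` is hypothetical. The JDK offers the same computation as `Character.highSurrogate` and `Character.lowSurrogate`.

```java
// Hypothetical stand-alone sketch; not the library's implementation.
final class SurrogateSketch {
    private SurrogateSketch() {}

    // High (leading) surrogate: 0xD800 plus the top 10 bits of (codePoint - 0x10000).
    static char highSurrogate(int codePoint) {
        return (char) (0xD800 + ((codePoint - 0x10000) >>> 10));
    }

    // Low (trailing) surrogate: 0xDC00 plus the bottom 10 bits of (codePoint - 0x10000).
    static char lowSurrogate(int codePoint) {
        return (char) (0xDC00 + ((codePoint - 0x10000) & 0x3FF));
    }
}
```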