Package org.jcodings.specific
Class GB18030Encoding
- java.lang.Object
  - org.jcodings.Encoding
    - org.jcodings.AbstractEncoding
      - org.jcodings.MultiByteEncoding
        - org.jcodings.specific.GB18030Encoding
- All Implemented Interfaces:
- java.lang.Cloneable
public final class GB18030Encoding extends MultiByteEncoding
-
-
Nested Class Summary
Nested Classes
Modifier and Type       Class
private static class    GB18030Encoding.State
-
Field Summary
Fields
Modifier and Type                   Field
private static int                  C1
private static int                  C2
private static int                  C4
private static int                  CM
private static java.lang.String     GB18030
private static int[]                GB18030_MAP
private static int[][]              GB18030Trans
static GB18030Encoding              INSTANCE
-
Constructor Summary
Constructors
Modifier     Constructor
protected    GB18030Encoding()
-
Method Summary
Methods
Modifier and Type, Method, Description
int codeToMbc(int code, byte[] bytes, int p)
    Extracts code point into its multibyte representation.
int codeToMbcLength(int code)
    Returns character length given a code point. Oniguruma equivalent: code_to_mbclen
int[] ctypeCodeRange(int ctype, IntHolder sbOut)
    Returns code range for a given character type. Oniguruma equivalent: get_ctype_code_range
java.lang.String getCharsetName()
    The name of the equivalent Java Charset for this encoding.
boolean isCodeCType(int code, int ctype)
    Performs a check whether the given code is of the given character type (e.g. used by isWord(someByte) and similar methods).
boolean isReverseMatchAllowed(byte[] bytes, int p, int end)
    Returns true if it is safe to use the reversal Boyer-Moore fail-fast search algorithm. Oniguruma equivalent: is_allowed_reverse_match
int leftAdjustCharHead(byte[] bytes, int start, int s, int end)
    Seeks the previous character head in a stream. Oniguruma equivalent: left_adjust_char_head
int length(byte[] bytes, int p, int end)
    Returns the character length given a stream, a character position and the stream end. Returns 1 for single-byte encodings; for multibyte encodings, performs sanity validations and returns the character length, or otherwise the missing characters in the stream.
private int lengthForThreeUptoFour(byte[] bytes, int p, int end, int s)
private int lengthForTwoUptoFour(byte[] bytes, int p, int end, int s)
int mbcCaseFold(int flag, byte[] bytes, IntHolder pp, int end, byte[] lower)
    onigenc_ascii_mbc_case_fold
int mbcToCode(byte[] bytes, int p, int end)
    Returns code point for a character. Oniguruma equivalent: mbc_to_code
-
Methods inherited from class org.jcodings.MultiByteEncoding
caseMap, isInRange, length, lengthForTwoUptoFour, mb2CodeToMbc, mb2CodeToMbcLength, mb2IsCodeCType, mb4CodeToMbc, mb4CodeToMbcLength, mb4IsCodeCType, mbnMbcCaseFold, mbnMbcToCode, missing, missing, safeLengthForUptoFour, safeLengthForUptoThree, safeLengthForUptoTwo, strCodeAt, strLength
-
Methods inherited from class org.jcodings.AbstractEncoding
applyAllCaseFold, asciiApplyAllCaseFold, asciiCaseFoldCodesByString, asciiMbcCaseFold, caseFoldCodesByString, isCodeCTypeInternal, isNewLine, propertyNameToCType
-
Methods inherited from class org.jcodings.Encoding
asciiToLower, asciiToUpper, digitVal, equals, getCharset, getIndex, getName, hashCode, isAlnum, isAlpha, isAscii, isAscii, isAsciiCompatible, isBlank, isCntrl, isDigit, isDummy, isFixedWidth, isGraph, isLower, isMbcAscii, isMbcCrnl, isMbcHead, isMbcWord, isNewLine, isPrint, isPunct, isSbWord, isSingleByte, isSpace, isUnicode, isUpper, isUTF8, isWord, isWordGraphPrint, isXDigit, load, load, maxLength, maxLengthDistance, mbcodeStartPosition, minLength, odigitVal, prevCharHead, rightAdjustCharHead, rightAdjustCharHeadWithPrev, setDummy, setName, setName, step, stepBack, strByteLengthNull, strLengthNull, strNCmp, toLowerCaseTable, toString, xdigitVal
-
-
-
-
Field Detail
-
GB18030
private static final java.lang.String GB18030
- See Also:
- Constant Field Values
-
C1
private static final int C1
- See Also:
- Constant Field Values
-
C2
private static final int C2
- See Also:
- Constant Field Values
-
C4
private static final int C4
- See Also:
- Constant Field Values
-
CM
private static final int CM
- See Also:
- Constant Field Values
-
GB18030_MAP
private static final int[] GB18030_MAP
-
GB18030Trans
private static final int[][] GB18030Trans
-
INSTANCE
public static final GB18030Encoding INSTANCE
-
-
Method Detail
-
length
public int length(byte[] bytes, int p, int end)
Description copied from class: Encoding
Returns the character length given a stream, a character position and the stream end. Returns 1 for single-byte encodings; for multibyte encodings, performs sanity validations and returns the character length, or otherwise the missing characters in the stream.
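To illustrate what determining a GB18030 character length involves, here is a minimal, self-contained sketch based on the byte layout of the GB 18030 standard (one byte 0x00-0x7F; two bytes with a lead of 0x81-0xFE and a trail of 0x40-0x7E or 0x80-0xFE; four bytes with digits 0x30-0x39 in the second and fourth positions). It is not this class's implementation: the class name `Gb18030LengthSketch`, the method `charLength`, and the `-1` sentinel are hypothetical, and the library may report truncated or invalid input differently.

```java
final class Gb18030LengthSketch {
    /**
     * Returns 1, 2 or 4 for a well-formed character starting at p,
     * or -1 (this sketch's own convention) when the bytes in [p, end)
     * are truncated or do not match the GB18030 byte layout.
     */
    static int charLength(byte[] bytes, int p, int end) {
        if (p >= end) return -1;
        int b1 = bytes[p] & 0xff;
        if (b1 <= 0x7f) return 1;                    // single-byte (ASCII) range
        if (b1 == 0x80 || b1 == 0xff) return -1;     // not a lead byte in the standard layout
        if (p + 1 >= end) return -1;                 // truncated after the lead byte
        int b2 = bytes[p + 1] & 0xff;
        if ((b2 >= 0x40 && b2 <= 0x7e) || (b2 >= 0x80 && b2 <= 0xfe)) {
            return 2;                                // two-byte character
        }
        if (b2 < 0x30 || b2 > 0x39) return -1;       // neither a trail byte nor a digit
        if (p + 3 >= end) return -1;                 // truncated four-byte character
        int b3 = bytes[p + 2] & 0xff;
        int b4 = bytes[p + 3] & 0xff;
        boolean wellFormed = b3 >= 0x81 && b3 <= 0xfe && b4 >= 0x30 && b4 <= 0x39;
        return wellFormed ? 4 : -1;
    }

    public static void main(String[] args) {
        byte[] twoByte = {(byte) 0xD6, (byte) 0xD0};
        byte[] fourByte = {(byte) 0x81, 0x30, (byte) 0x81, 0x30};
        System.out.println(charLength(twoByte, 0, twoByte.length));
        System.out.println(charLength(fourByte, 0, fourByte.length));
    }
}
```

The second byte decides between the two-byte and four-byte forms, which is why a lead byte alone is not enough to know the length.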
-
lengthForTwoUptoFour
private int lengthForTwoUptoFour(byte[] bytes, int p, int end, int s)
-
lengthForThreeUptoFour
private int lengthForThreeUptoFour(byte[] bytes, int p, int end, int s)
-
mbcToCode
public int mbcToCode(byte[] bytes, int p, int end)
Description copied from class: Encoding
Returns code point for a character.
Oniguruma equivalent: mbc_to_code
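For a non-Unicode multibyte encoding, the "code point" is conventionally the character's bytes packed into one integer, most significant byte first. The sketch below shows that packing under this assumption; the page itself does not state the numeric format, and `MbcToCodeSketch` and `packBytes` are hypothetical names rather than part of the library.

```java
final class MbcToCodeSketch {
    /** Packs len bytes starting at p into an int, most significant byte first. */
    static int packBytes(byte[] bytes, int p, int len) {
        int code = 0;
        for (int i = 0; i < len; i++) {
            code = (code << 8) | (bytes[p + i] & 0xff); // mask to avoid sign extension
        }
        return code;
    }

    public static void main(String[] args) {
        byte[] twoByte = {(byte) 0xD6, (byte) 0xD0};
        System.out.println(Integer.toHexString(packBytes(twoByte, 0, 2)));
    }
}
```

Note that a four-byte character whose lead byte is 0x81 or higher sets the top bit, so the packed value is a negative Java `int`.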
-
codeToMbcLength
public int codeToMbcLength(int code)
Description copied from class: Encoding
Returns character length given a code point.
Oniguruma equivalent: code_to_mbclen
- Specified by:
codeToMbcLength in class Encoding
-
codeToMbc
public int codeToMbc(int code, byte[] bytes, int p)
Description copied from class: Encoding
Extracts code point into its multibyte representation.
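The reverse direction can be sketched the same way: derive the byte count from the highest non-zero byte of the code value, then write the bytes out most significant first. This assumes the packed-bytes convention for code values, which this page does not confirm; `CodeToMbcSketch`, `byteLength` and `unpack` are hypothetical names, and the sketch does no validation of the resulting byte sequence.

```java
final class CodeToMbcSketch {
    /** Number of bytes needed for a packed code value (1 to 4). */
    static int byteLength(int code) {
        if ((code & 0xff000000) != 0) return 4;
        if ((code & 0x00ff0000) != 0) return 3;
        if ((code & 0x0000ff00) != 0) return 2;
        return 1;
    }

    /** Writes the packed code into out starting at p and returns the byte count. */
    static int unpack(int code, byte[] out, int p) {
        int len = byteLength(code);
        for (int i = 0; i < len; i++) {
            int shift = 8 * (len - 1 - i);       // most significant byte first
            out[p + i] = (byte) (code >>> shift);
        }
        return len;
    }

    public static void main(String[] args) {
        byte[] out = new byte[4];
        int written = unpack(0xD6D0, out, 0);
        System.out.println(written);
    }
}
```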
-
mbcCaseFold
public int mbcCaseFold(int flag, byte[] bytes, IntHolder pp, int end, byte[] lower)
Description copied from class: AbstractEncoding
onigenc_ascii_mbc_case_fold
- Overrides:
mbcCaseFold in class AbstractEncoding
- Parameters:
flag - case fold flag
pp - an IntHolder that points at character head
lower - a buffer where to extract case folded character
Oniguruma equivalent: mbc_case_fold
-
isCodeCType
public boolean isCodeCType(int code, int ctype)
Description copied from class: Encoding
Performs a check whether the given code is of the given character type (e.g. used by isWord(someByte) and similar methods).
- Specified by:
isCodeCType in class Encoding
- Parameters:
code - a code point of a character
ctype - a character type to check against
Oniguruma equivalent: is_code_ctype
-
ctypeCodeRange
public int[] ctypeCodeRange(int ctype, IntHolder sbOut)
Description copied from class: Encoding
Returns code range for a given character type.
Oniguruma equivalent: get_ctype_code_range
- Specified by:
ctypeCodeRange in class Encoding
-
getCharsetName
public java.lang.String getCharsetName()
Description copied from class: Encoding
The name of the equivalent Java Charset for this encoding. Defaults to the name of the encoding. Subclasses can override this to provide a different name.
- Overrides:
getCharsetName in class Encoding
- Returns:
- the name of the equivalent Java Charset for this encoding
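As a rough illustration of what an "equivalent Java Charset" gives you, the sketch below encodes a few strings with the JDK's own GB18030 charset and reports the byte counts. It assumes the returned name is `GB18030` (this page does not show the value) and that the running JDK ships that charset; `Gb18030CharsetSketch` and `encode` are hypothetical names, and nothing here calls the jcodings API.

```java
final class Gb18030CharsetSketch {
    /** Encodes a string with the JDK's GB18030 charset (assumed to be available). */
    static byte[] encode(String s) {
        return s.getBytes(java.nio.charset.Charset.forName("GB18030"));
    }

    public static void main(String[] args) {
        String ascii = "A";
        String han = "\u4e2d";                                   // a common CJK ideograph
        String supplementary = new String(Character.toChars(0x10000));
        System.out.println(encode(ascii).length);
        System.out.println(encode(han).length);
        System.out.println(encode(supplementary).length);
    }
}
```

The three inputs cover the one-byte, two-byte and four-byte forms of the encoding respectively.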
-
leftAdjustCharHead
public int leftAdjustCharHead(byte[] bytes, int start, int s, int end)
Description copied from class: Encoding
Seeks the previous character head in a stream.
Oniguruma equivalent: left_adjust_char_head
- Specified by:
leftAdjustCharHead in class Encoding
- Parameters:
bytes - byte stream
start - position
s - stop
end - end
-
isReverseMatchAllowed
public boolean isReverseMatchAllowed(byte[] bytes, int p, int end)
Description copied from class: Encoding
Returns true if it is safe to use the reversal Boyer-Moore fail-fast search algorithm.
Oniguruma equivalent: is_allowed_reverse_match
- Specified by:
isReverseMatchAllowed in class Encoding
-
-