Package org.jcodings.specific
Class BaseUTF8Encoding
- java.lang.Object
-
- org.jcodings.Encoding
-
- org.jcodings.AbstractEncoding
-
- org.jcodings.MultiByteEncoding
-
- org.jcodings.unicode.UnicodeEncoding
-
- org.jcodings.specific.BaseUTF8Encoding
-
- All Implemented Interfaces:
java.lang.Cloneable
- Direct Known Subclasses:
NonStrictUTF8Encoding, UTF8Encoding
abstract class BaseUTF8Encoding extends UnicodeEncoding
-
-
Field Summary
Fields (Modifier and Type, Field):
- private static int INVALID_CODE_FE
- private static int INVALID_CODE_FF
- (package private) static boolean USE_INVALID_CODE_SCHEME
- private static int VALID_CODE_LIMIT
-
Constructor Summary
Constructors (Modifier, Constructor):
- protected BaseUTF8Encoding(int[] EncLen, int[][] Trans)
-
Method Summary
Methods (Modifier and Type, Method, Description):
- int codeToMbc(int code, byte[] bytes, int p): Extracts code point into its multibyte representation
- int codeToMbcLength(int code): Returns character length given a code point. Oniguruma equivalent: code_to_mbclen
- int[] ctypeCodeRange(int ctype, IntHolder sbOut): utf8_get_ctype_code_range
- java.lang.String getCharsetName(): The name of the equivalent Java Charset for this encoding.
- boolean isNewLine(byte[] bytes, int p, int end): onigenc_is_mbc_newline_0x0a / used also by multibyte encodings
- boolean isReverseMatchAllowed(byte[] bytes, int p, int end): onigenc_always_true_is_allowed_reverse_match
- int leftAdjustCharHead(byte[] bytes, int p, int s, int end): utf8_left_adjust_char_head
- int mbcCaseFold(int flag, byte[] bytes, IntHolder pp, int end, byte[] fold): onigenc_ascii_mbc_case_fold
- int mbcToCode(byte[] bytes, int p, int end): Returns code point for a character. Oniguruma equivalent: mbc_to_code
- (package private) static byte trail0(int code)
- (package private) static byte trailS(int code, int shift)
- private static boolean utf8IsLead(int c)
-
Methods inherited from class org.jcodings.unicode.UnicodeEncoding
applyAllCaseFold, caseFoldCodesByString, caseMap, ctypeCodeRange, isCodeCType, isInCodeRange, propertyNameToCType
-
Methods inherited from class org.jcodings.MultiByteEncoding
isInRange, length, lengthForTwoUptoFour, mb2CodeToMbc, mb2CodeToMbcLength, mb2IsCodeCType, mb4CodeToMbc, mb4CodeToMbcLength, mb4IsCodeCType, mbnMbcCaseFold, mbnMbcToCode, missing, missing, safeLengthForUptoFour, safeLengthForUptoThree, safeLengthForUptoTwo, strCodeAt, strLength
-
Methods inherited from class org.jcodings.AbstractEncoding
asciiApplyAllCaseFold, asciiCaseFoldCodesByString, asciiMbcCaseFold, isCodeCTypeInternal
-
Methods inherited from class org.jcodings.Encoding
asciiToLower, asciiToUpper, digitVal, equals, getCharset, getIndex, getName, hashCode, isAlnum, isAlpha, isAscii, isAscii, isAsciiCompatible, isBlank, isCntrl, isDigit, isDummy, isFixedWidth, isGraph, isLower, isMbcAscii, isMbcCrnl, isMbcHead, isMbcWord, isNewLine, isPrint, isPunct, isSbWord, isSingleByte, isSpace, isUnicode, isUpper, isUTF8, isWord, isWordGraphPrint, isXDigit, length, load, load, maxLength, maxLengthDistance, mbcodeStartPosition, minLength, odigitVal, prevCharHead, rightAdjustCharHead, rightAdjustCharHeadWithPrev, setDummy, setName, setName, step, stepBack, strByteLengthNull, strLengthNull, strNCmp, toLowerCaseTable, toString, xdigitVal
-
Field Detail
-
USE_INVALID_CODE_SCHEME
static final boolean USE_INVALID_CODE_SCHEME
- See Also:
- Constant Field Values
-
INVALID_CODE_FE
private static final int INVALID_CODE_FE
- See Also:
- Constant Field Values
-
INVALID_CODE_FF
private static final int INVALID_CODE_FF
- See Also:
- Constant Field Values
-
VALID_CODE_LIMIT
private static final int VALID_CODE_LIMIT
- See Also:
- Constant Field Values
-
-
Method Detail
-
getCharsetName
public java.lang.String getCharsetName()
Description copied from class: Encoding
The name of the equivalent Java Charset for this encoding. Defaults to the name of the encoding. Subclasses can override this to provide a different name.
- Overrides:
getCharsetName in class UnicodeEncoding
- Returns:
- the name of the equivalent Java Charset for this encoding
-
isNewLine
public boolean isNewLine(byte[] bytes, int p, int end)
Description copied from class: AbstractEncoding
onigenc_is_mbc_newline_0x0a / used also by multibyte encodings
- Overrides:
isNewLine in class AbstractEncoding
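The Oniguruma name above suggests a check for a single line feed byte (0x0A) at the given position. A minimal standalone sketch under that assumption follows; the class and method names are hypothetical and are not part of jcodings.

```java
// Hypothetical illustration, not the jcodings implementation.
final class NewlineSketch {
    /** True if the byte at p, inside [p, end), is a line feed (0x0A). */
    static boolean isLineFeedAt(byte[] bytes, int p, int end) {
        return p < end && bytes[p] == 0x0A;
    }

    public static void main(String[] args) {
        byte[] data = {'a', 0x0A, 'b'};
        System.out.println(isLineFeedAt(data, 1, data.length)); // true
        System.out.println(isLineFeedAt(data, 0, data.length)); // false
    }
}
```

Because UTF-8 never uses the byte 0x0A inside a multibyte sequence, a single-byte comparison is sufficient for this check.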
-
codeToMbcLength
public int codeToMbcLength(int code)
Description copied from class: Encoding
Returns character length given a code point. Oniguruma equivalent: code_to_mbclen
- Specified by:
codeToMbcLength in class Encoding
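As a rough illustration of what "character length given a code point" means for UTF-8, the sketch below maps a code point to its byte count using the standard UTF-8 ranges. It is a standalone assumption-based example with hypothetical names, and it does not model INVALID_CODE_FE, INVALID_CODE_FF, USE_INVALID_CODE_SCHEME or VALID_CODE_LIMIT, whose values this page does not show.

```java
// Hypothetical illustration of standard UTF-8 lengths, not the jcodings code.
final class Utf8LengthSketch {
    /** Bytes needed to encode a code point in UTF-8, or -1 if out of range. */
    static int utf8Length(int code) {
        if (code < 0) return -1;
        if (code <= 0x7F) return 1;      // ASCII
        if (code <= 0x7FF) return 2;     // two-byte form
        if (code <= 0xFFFF) return 3;    // three-byte form
        if (code <= 0x10FFFF) return 4;  // four-byte form
        return -1;                       // beyond the Unicode range
    }

    public static void main(String[] args) {
        System.out.println(utf8Length('A'));     // 1
        System.out.println(utf8Length(0xE9));    // 2
        System.out.println(utf8Length(0x20AC));  // 3
        System.out.println(utf8Length(0x1F600)); // 4
    }
}
```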
-
mbcToCode
public int mbcToCode(byte[] bytes, int p, int end)
Description copied from class: Encoding
Returns code point for a character. Oniguruma equivalent: mbc_to_code
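To make the idea of turning a multibyte character back into a code point concrete, here is a minimal standalone decoder for one UTF-8 sequence starting at p. It is a sketch with hypothetical names, not the library's implementation; in particular it does not reject overlong forms or surrogates, and its -1 error value is an assumption of this example.

```java
// Hypothetical illustration of UTF-8 decoding, not the jcodings code.
final class Utf8DecodeSketch {
    /** Decodes the character starting at p within [p, end); -1 on malformed input. */
    static int decode(byte[] bytes, int p, int end) {
        if (p >= end) return -1;
        int lead = bytes[p] & 0xFF;
        if (lead < 0x80) return lead;                 // single-byte (ASCII)
        int len;
        int code;
        if ((lead & 0xE0) == 0xC0) { len = 2; code = lead & 0x1F; }
        else if ((lead & 0xF0) == 0xE0) { len = 3; code = lead & 0x0F; }
        else if ((lead & 0xF8) == 0xF0) { len = 4; code = lead & 0x07; }
        else return -1;                               // trail byte or invalid lead
        if (p + len > end) return -1;                 // truncated sequence
        for (int i = 1; i < len; i++) {
            int b = bytes[p + i] & 0xFF;
            if ((b & 0xC0) != 0x80) return -1;        // not a 10xxxxxx trail byte
            code = (code << 6) | (b & 0x3F);          // append six payload bits
        }
        return code <= 0x10FFFF ? code : -1;
    }

    public static void main(String[] args) {
        byte[] euro = {(byte) 0xE2, (byte) 0x82, (byte) 0xAC};
        System.out.println(Integer.toHexString(decode(euro, 0, euro.length))); // 20ac
    }
}
```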
-
trailS
static byte trailS(int code, int shift)
-
trail0
static byte trail0(int code)
-
codeToMbc
public int codeToMbc(int code, byte[] bytes, int p)
Description copied from class: Encoding
Extracts code point into its multibyte representation
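The following standalone sketch shows one way a code point can be written into a byte array as UTF-8, returning the number of bytes written. All names are hypothetical. The helper trailByte is an assumption of this example and is not claimed to match the undocumented trailS and trail0 methods listed on this page.

```java
// Hypothetical illustration of UTF-8 encoding, not the jcodings code.
final class Utf8EncodeSketch {
    /** Builds a 10xxxxxx trail byte from the six bits of code starting at shift. */
    static byte trailByte(int code, int shift) {
        return (byte) (((code >>> shift) & 0x3F) | 0x80);
    }

    /** Writes code at bytes[p..] and returns the byte count, or -1 if out of range. */
    static int encode(int code, byte[] bytes, int p) {
        if (code < 0 || code > 0x10FFFF) return -1;
        if (code <= 0x7F) {
            bytes[p] = (byte) code;
            return 1;
        }
        if (code <= 0x7FF) {
            bytes[p] = (byte) (0xC0 | (code >>> 6));
            bytes[p + 1] = trailByte(code, 0);
            return 2;
        }
        if (code <= 0xFFFF) {
            bytes[p] = (byte) (0xE0 | (code >>> 12));
            bytes[p + 1] = trailByte(code, 6);
            bytes[p + 2] = trailByte(code, 0);
            return 3;
        }
        bytes[p] = (byte) (0xF0 | (code >>> 18));
        bytes[p + 1] = trailByte(code, 12);
        bytes[p + 2] = trailByte(code, 6);
        bytes[p + 3] = trailByte(code, 0);
        return 4;
    }

    public static void main(String[] args) {
        byte[] out = new byte[4];
        int n = encode(0x20AC, out, 0);
        System.out.println(n); // 3
    }
}
```

The caller is assumed to supply a buffer with at least four free bytes from p onward; this sketch does no bounds checking of its own.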
-
mbcCaseFold
public int mbcCaseFold(int flag, byte[] bytes, IntHolder pp, int end, byte[] fold)
Description copied from class: AbstractEncoding
onigenc_ascii_mbc_case_fold
- Overrides:
mbcCaseFold in class UnicodeEncoding
- Parameters:
flag - case fold flag
pp - an IntHolder that points at the character head
fold - a buffer into which the case-folded character is written
Oniguruma equivalent: mbc_case_fold
-
ctypeCodeRange
public int[] ctypeCodeRange(int ctype, IntHolder sbOut)
utf8_get_ctype_code_range
- Specified by:
ctypeCodeRange in class Encoding
-
utf8IsLead
private static boolean utf8IsLead(int c)
-
leftAdjustCharHead
public int leftAdjustCharHead(byte[] bytes, int p, int s, int end)
utf8_left_adjust_char_head
- Specified by:
leftAdjustCharHead in class Encoding
- Parameters:
bytes - byte stream
p - position
s - stop
end - end
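Adjusting to a character head in UTF-8 generally means stepping backwards over trail bytes (bit pattern 10xxxxxx) until a lead byte is found. The standalone sketch below illustrates that idea. Its names and its parameter roles (start, pos) are assumptions of this example, since this page does not spell out how p and s are used, and isLead is only a guess at the kind of test utf8IsLead performs.

```java
// Hypothetical illustration of backing up to a UTF-8 character head.
final class Utf8CharHeadSketch {
    /** True for any byte that is not a 10xxxxxx trail byte. */
    static boolean isLead(int c) {
        return (c & 0xC0) != 0x80;
    }

    /** Moves pos left, but not before start, until it sits on a lead byte. */
    static int leftAdjust(byte[] bytes, int start, int pos) {
        int i = pos;
        while (i > start && !isLead(bytes[i] & 0xFF)) {
            i--;
        }
        return i;
    }

    public static void main(String[] args) {
        // 'a', the three bytes of U+20AC, then 'b'
        byte[] data = {'a', (byte) 0xE2, (byte) 0x82, (byte) 0xAC, 'b'};
        System.out.println(leftAdjust(data, 0, 3)); // 1
        System.out.println(leftAdjust(data, 0, 4)); // 4
    }
}
```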
-
isReverseMatchAllowed
public boolean isReverseMatchAllowed(byte[] bytes, int p, int end)
onigenc_always_true_is_allowed_reverse_match
- Specified by:
isReverseMatchAllowed in class Encoding
-
-