Package org.jcodings.unicode
Class UnicodeEncoding
- java.lang.Object
  - org.jcodings.Encoding
    - org.jcodings.AbstractEncoding
      - org.jcodings.MultiByteEncoding
        - org.jcodings.unicode.UnicodeEncoding
- All Implemented Interfaces: java.lang.Cloneable
- Direct Known Subclasses: BaseUTF8Encoding, CESU8Encoding, FixedWidthUnicodeEncoding, UTF16BEEncoding, UTF16LEEncoding
public abstract class UnicodeEncoding extends MultiByteEncoding
-
Nested Class Summary
Nested Classes (Modifier and Type, Class):
- private static class UnicodeEncoding.CaseFold
- private static class UnicodeEncoding.CaseMappingSpecials
- private static class UnicodeEncoding.CaseUnfold11
- private static class UnicodeEncoding.CaseUnfold12
- private static class UnicodeEncoding.CaseUnfold13
- private static class UnicodeEncoding.CodeList
- (package private) static class UnicodeEncoding.CTypeName
-
Field Summary
Fields (Modifier and Type, Field):
- (package private) static int CASE_MAPPING_SLACK
- (package private) static int DOT_ABOVE
- (package private) static int DOTLESS_i
- (package private) static int I_WITH_DOT_ABOVE
- private static int PROPERTY_NAME_MAX_SIZE
- (package private) static short[] UNICODE_ISO_8859_1_CTypeTable
-
Constructor Summary
Constructors (Modifier, Constructor):
- protected UnicodeEncoding(java.lang.String name, int minLength, int maxLength, int[] EncLen)
- protected UnicodeEncoding(java.lang.String name, int minLength, int maxLength, int[] EncLen, int[][] Trans)
-
Method Summary
Methods (Modifier and Type, Method, Description):
- void applyAllCaseFold(int flag, ApplyAllCaseFoldFunction fun, java.lang.Object arg): onigenc_ascii_apply_all_case_fold / used also by multibyte encodings
- CaseFoldCodeItem[] caseFoldCodesByString(int flag, byte[] bytes, int p, int end): onigenc_ascii_get_case_fold_codes_by_str / used also by multibyte encodings
- int caseMap(IntHolder flagP, byte[] bytes, IntHolder pp, int end, byte[] to, int toP, int toEnd): Oniguruma equivalent: case_map
- protected int[] ctypeCodeRange(int ctype)
- private static int extractCode(int packed)
- private static int extractLength(int packed)
- java.lang.String getCharsetName(): The name of the equivalent Java Charset for this encoding.
- boolean isCodeCType(int code, int ctype): Checks whether the given code is of the given character type (e.g. used by isWord(someByte) and similar methods).
- static boolean isInCodeRange(UnicodeCodeRange range, int code)
- int mbcCaseFold(int flag, byte[] bytes, IntHolder pp, int end, byte[] fold): onigenc_ascii_mbc_case_fold
- int propertyNameToCType(byte[] name, int p, int end): onigenc_minimum_property_name_to_ctype notably overridden by unicode encodings
- private static java.lang.Object[] readFoldN(int fromSize, java.lang.String table)
-
Methods inherited from class org.jcodings.MultiByteEncoding
isInRange, length, lengthForTwoUptoFour, mb2CodeToMbc, mb2CodeToMbcLength, mb2IsCodeCType, mb4CodeToMbc, mb4CodeToMbcLength, mb4IsCodeCType, mbnMbcCaseFold, mbnMbcToCode, missing, missing, safeLengthForUptoFour, safeLengthForUptoThree, safeLengthForUptoTwo, strCodeAt, strLength
-
Methods inherited from class org.jcodings.AbstractEncoding
asciiApplyAllCaseFold, asciiCaseFoldCodesByString, asciiMbcCaseFold, isCodeCTypeInternal, isNewLine
-
Methods inherited from class org.jcodings.Encoding
asciiToLower, asciiToUpper, codeToMbc, codeToMbcLength, ctypeCodeRange, digitVal, equals, getCharset, getIndex, getName, hashCode, isAlnum, isAlpha, isAscii, isAscii, isAsciiCompatible, isBlank, isCntrl, isDigit, isDummy, isFixedWidth, isGraph, isLower, isMbcAscii, isMbcCrnl, isMbcHead, isMbcWord, isNewLine, isPrint, isPunct, isReverseMatchAllowed, isSbWord, isSingleByte, isSpace, isUnicode, isUpper, isUTF8, isWord, isWordGraphPrint, isXDigit, leftAdjustCharHead, length, load, load, maxLength, maxLengthDistance, mbcodeStartPosition, mbcToCode, minLength, odigitVal, prevCharHead, rightAdjustCharHead, rightAdjustCharHeadWithPrev, setDummy, setName, setName, step, stepBack, strByteLengthNull, strLengthNull, strNCmp, toLowerCaseTable, toString, xdigitVal
-
Field Detail
-
PROPERTY_NAME_MAX_SIZE
private static final int PROPERTY_NAME_MAX_SIZE
- See Also:
- Constant Field Values
-
I_WITH_DOT_ABOVE
static final int I_WITH_DOT_ABOVE
- See Also:
- Constant Field Values
-
DOTLESS_i
static final int DOTLESS_i
- See Also:
- Constant Field Values
-
DOT_ABOVE
static final int DOT_ABOVE
- See Also:
- Constant Field Values
-
CASE_MAPPING_SLACK
static final int CASE_MAPPING_SLACK
- See Also:
- Constant Field Values
-
UNICODE_ISO_8859_1_CTypeTable
static final short[] UNICODE_ISO_8859_1_CTypeTable
-
Method Detail
-
getCharsetName
public java.lang.String getCharsetName()
Description copied from class: Encoding
The name of the equivalent Java Charset for this encoding. Defaults to the name of the encoding. Subclasses can override this to provide a different name.
- Overrides: getCharsetName in class Encoding
- Returns: the name of the equivalent Java Charset for this encoding
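A caller might resolve the returned name to a java.nio.charset.Charset. The sketch below is not part of jcodings: CharsetNameSketch and resolve are hypothetical names, and the string passed in stands in for whatever getCharsetName() would return. It falls back to a caller-supplied charset when the JVM does not recognise the name.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class CharsetNameSketch {
    /**
     * Hypothetical helper: resolves a charset name to a Charset,
     * returning the fallback if the name is illegal or unsupported.
     */
    static Charset resolve(String charsetName, Charset fallback) {
        try {
            return Charset.forName(charsetName);
        } catch (IllegalArgumentException e) {
            // Covers IllegalCharsetNameException and UnsupportedCharsetException.
            return fallback;
        }
    }

    public static void main(String[] args) {
        // "UTF-8" stands in for a value returned by getCharsetName().
        Charset cs = resolve("UTF-8", StandardCharsets.ISO_8859_1);
        System.out.println(cs.name());
    }
}
```

Catching IllegalArgumentException keeps the lookup total, which matters if a subclass overrides getCharsetName() with a name the running JVM does not ship.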
-
isCodeCType
public boolean isCodeCType(int code, int ctype)
Description copied from class: Encoding
Checks whether the given code is of the given character type (e.g. used by isWord(someByte) and similar methods).
- Specified by: isCodeCType in class Encoding
- Parameters:
  code - a code point of a character
  ctype - a character type to check against
Oniguruma equivalent: is_code_ctype
-
isInCodeRange
public static boolean isInCodeRange(UnicodeCodeRange range, int code)
-
ctypeCodeRange
protected final int[] ctypeCodeRange(int ctype)
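This page does not document the layout of the int[] that ctypeCodeRange returns. As a rough illustration of how a membership test over such a table could work, the sketch below assumes an Oniguruma-style layout: element 0 holds the number of ranges, followed by sorted, non-overlapping (from, to) pairs with inclusive bounds. CodeRangeSketch and its method are hypothetical and only mirror the name isInCodeRange, and the sample table is made up for the example.

```java
class CodeRangeSketch {
    /**
     * Binary search over an assumed table layout:
     * table[0] = n, then n pairs (from, to), sorted and inclusive.
     */
    static boolean isInCodeRange(int[] table, int code) {
        int n = table[0];
        int low = 0;
        int high = n;
        while (low < high) {
            int mid = (low + high) >>> 1;
            if (code > table[2 * mid + 2]) {
                low = mid + 1;      // code lies above the end of range mid
            } else {
                high = mid;         // range mid ends at or after code
            }
        }
        // low is now the first range whose end is >= code, if there is one.
        return low < n && code >= table[2 * low + 1];
    }

    public static void main(String[] args) {
        // Illustrative table: ASCII digits and ASCII uppercase letters.
        int[] table = {2, 0x30, 0x39, 0x41, 0x5A};
        System.out.println(isInCodeRange(table, 'A'));
        System.out.println(isInCodeRange(table, ':'));
    }
}
```

Keeping the ranges sorted lets a lookup run in logarithmic time, which matters for Unicode properties that span hundreds of ranges.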
-
propertyNameToCType
public int propertyNameToCType(byte[] name, int p, int end)
Description copied from class: AbstractEncoding
onigenc_minimum_property_name_to_ctype notably overridden by unicode encodings
- Overrides: propertyNameToCType in class AbstractEncoding
-
mbcCaseFold
public int mbcCaseFold(int flag, byte[] bytes, IntHolder pp, int end, byte[] fold)
Description copied from class: AbstractEncoding
onigenc_ascii_mbc_case_fold
- Overrides: mbcCaseFold in class AbstractEncoding
- Parameters:
  flag - case fold flag
  pp - an IntHolder that points at the character head
  fold - a buffer that receives the case-folded character
Oniguruma equivalent: mbc_case_fold
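The parameter notes above leave the calling convention implicit. The sketch below shows one plausible reading under several assumptions that this page does not confirm: the input is well-formed UTF-8, the method advances pp past the consumed character, and the return value is the number of bytes written to fold. It is not the jcodings implementation. CaseFoldSketch, foldOne and the local IntHolder are hypothetical stand-ins, and String.toLowerCase(Locale.ROOT) substitutes for the real Unicode fold tables.

```java
import java.nio.charset.StandardCharsets;
import java.util.Locale;

class CaseFoldSketch {
    /** Local stand-in for a mutable int cursor; not the jcodings class. */
    static final class IntHolder {
        int value;
    }

    /** Byte length of the UTF-8 character with lead byte b (assumes well-formed input). */
    static int utf8Length(byte b) {
        if ((b & 0x80) == 0) return 1;
        if ((b & 0xE0) == 0xC0) return 2;
        if ((b & 0xF0) == 0xE0) return 3;
        return 4;
    }

    /**
     * Folds the single character starting at pp.value into fold,
     * advances pp past it, and returns the number of bytes written.
     * The caller must supply a fold buffer large enough for the result.
     */
    static int foldOne(byte[] bytes, IntHolder pp, int end, byte[] fold) {
        int len = Math.min(utf8Length(bytes[pp.value]), end - pp.value);
        String ch = new String(bytes, pp.value, len, StandardCharsets.UTF_8);
        // Lowercasing is only an approximation of Unicode case folding.
        byte[] out = ch.toLowerCase(Locale.ROOT).getBytes(StandardCharsets.UTF_8);
        System.arraycopy(out, 0, fold, 0, out.length);
        pp.value += len;
        return out.length;
    }
}
```

Passing the position through a holder lets one call report both the folded length (return value) and the new read position (pp), which is the pattern the pp parameter suggests.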
-
applyAllCaseFold
public void applyAllCaseFold(int flag, ApplyAllCaseFoldFunction fun, java.lang.Object arg)
Description copied from class: AbstractEncoding
onigenc_ascii_apply_all_case_fold / used also by multibyte encodings
- Overrides: applyAllCaseFold in class AbstractEncoding
- Parameters:
  flag - case fold flag
  fun - case folding functor (see: ApplyCaseFold)
  arg - case folding functor argument (see: ApplyCaseFoldArg)
Oniguruma equivalent: apply_all_case_fold
-
caseFoldCodesByString
public CaseFoldCodeItem[] caseFoldCodesByString(int flag, byte[] bytes, int p, int end)
Description copied from class: AbstractEncoding
onigenc_ascii_get_case_fold_codes_by_str / used also by multibyte encodings
- Overrides: caseFoldCodesByString in class AbstractEncoding
-
caseMap
public final int caseMap(IntHolder flagP, byte[] bytes, IntHolder pp, int end, byte[] to, int toP, int toEnd)
Description copied from class: Encoding
Oniguruma equivalent: case_map
- Overrides: caseMap in class MultiByteEncoding
-
readFoldN
private static java.lang.Object[] readFoldN(int fromSize, java.lang.String table)
-
extractLength
private static int extractLength(int packed)
-
extractCode
private static int extractCode(int packed)