Package com.fasterxml.aalto.async
Class AsyncByteScanner
- java.lang.Object
-
- com.fasterxml.aalto.in.XmlScanner
-
- com.fasterxml.aalto.in.ByteBasedScanner
-
- com.fasterxml.aalto.async.AsyncByteScanner
-
- All Implemented Interfaces:
AsyncInputFeeder, XmlConsts, javax.xml.namespace.NamespaceContext, javax.xml.stream.XMLStreamConstants
- Direct Known Subclasses:
AsyncByteArrayScanner, AsyncByteBufferScanner
public abstract class AsyncByteScanner extends ByteBasedScanner implements AsyncInputFeeder
-
-
Field Summary
Fields
protected XmlCharTypes _charTypes: This is a simple container object that is used to access the decoding tables for characters.
protected int _currQuad: Bytes parsed for the current, incomplete, quad
protected int _currQuadBytes: Number of bytes pending/buffered, stored in _currQuad
protected boolean _elemAllNsBound
protected boolean _elemAttrCount
protected PName _elemAttrName
protected int _elemAttrPtr: Pointer for the next character of the value currently being parsed, within the attribute value buffer
protected byte _elemAttrQuote
protected int _elemNsPtr: Pointer for the next character of the namespace URI currently being parsed for the current namespace declaration
protected boolean _endOfInput: Flag that is set when the calling application indicates that there will be no more input to parse.
protected int _entityValue: Entity value accumulated so far
protected boolean _inDtdDeclaration: Flag that indicates whether we are inside a declaration during parsing of internal DTD subset.
protected int _nextEvent: Due to asynchronous nature of parsing, we may know what event we are trying to parse, even if it's not yet complete.
protected int _pendingInput: There are some multi-byte combinations that must be handled as a unit: CR+LF linefeeds, multi-byte UTF-8 characters, and multi-character end markers for comments and PIs.
protected int[] _quadBuffer: This buffer is used for name parsing.
protected int _quadCount: Number of complete quads parsed for current name (quads themselves are stored in _quadBuffer).
protected int _state: In addition to the event type, there is need for additional state information
protected int _surroundingEvent: For token/state combinations that are 'shared' between events (or embedded in them), this is where the surrounding event state is retained.
protected ByteBasedPNameTable _symbols: For now, symbol table contains prefixed names.
protected static int EVENT_INCOMPLETE
protected static int PENDING_STATE_ATTR_VALUE_AMP
protected static int PENDING_STATE_ATTR_VALUE_AMP_HASH
protected static int PENDING_STATE_ATTR_VALUE_AMP_HASH_X
protected static int PENDING_STATE_ATTR_VALUE_DEC_DIGIT
protected static int PENDING_STATE_ATTR_VALUE_ENTITY_NAME
protected static int PENDING_STATE_ATTR_VALUE_HEX_DIGIT
protected static int PENDING_STATE_CDATA_BRACKET1
protected static int PENDING_STATE_CDATA_BRACKET2
protected static int PENDING_STATE_COMMENT_HYPHEN1
protected static int PENDING_STATE_COMMENT_HYPHEN2
protected static int PENDING_STATE_CR
protected static int PENDING_STATE_ENT_IN_DEC_DIGIT
protected static int PENDING_STATE_ENT_IN_HEX_DIGIT
protected static int PENDING_STATE_ENT_SEEN_HASH
protected static int PENDING_STATE_ENT_SEEN_HASH_X
protected static int PENDING_STATE_PI_QMARK
protected static int PENDING_STATE_TEXT_AMP
protected static int PENDING_STATE_TEXT_AMP_HASH
protected static int PENDING_STATE_TEXT_BRACKET1
protected static int PENDING_STATE_TEXT_BRACKET2
protected static int PENDING_STATE_TEXT_DEC_ENTITY
protected static int PENDING_STATE_TEXT_HEX_ENTITY
protected static int PENDING_STATE_TEXT_IN_ENTITY
protected static int PENDING_STATE_XMLDECL_LT
protected static int PENDING_STATE_XMLDECL_LTQ
protected static int PENDING_STATE_XMLDECL_TARGET
protected static int STATE_CDATA_C
protected static int STATE_CDATA_CD
protected static int STATE_CDATA_CDA
protected static int STATE_CDATA_CDAT
protected static int STATE_CDATA_CDATA
protected static int STATE_CDATA_CONTENT
protected static int STATE_COMMENT_CONTENT
protected static int STATE_COMMENT_HYPHEN
protected static int STATE_COMMENT_HYPHEN2
protected static int STATE_DEFAULT: Default starting state for many events/contexts -- nothing has been seen so far, no event incomplete.
protected static int STATE_DTD_AFTER_DOCTYPE
protected static int STATE_DTD_AFTER_PUBLIC
protected static int STATE_DTD_AFTER_PUBLIC_ID
protected static int STATE_DTD_AFTER_ROOT_NAME
protected static int STATE_DTD_AFTER_SYSTEM
protected static int STATE_DTD_AFTER_SYSTEM_ID
protected static int STATE_DTD_BEFORE_IDS
protected static int STATE_DTD_BEFORE_PUBLIC_ID
protected static int STATE_DTD_BEFORE_ROOT_NAME
protected static int STATE_DTD_BEFORE_SYSTEM_ID
protected static int STATE_DTD_DOCTYPE
protected static int STATE_DTD_EXPECT_CLOSING_GT
protected static int STATE_DTD_INT_SUBSET
protected static int STATE_DTD_PUBLIC_ID
protected static int STATE_DTD_PUBLIC_OR_SYSTEM
protected static int STATE_DTD_ROOT_NAME
protected static int STATE_DTD_SYSTEM_ID
protected static int STATE_EE_NEED_GT
protected static int STATE_PI_AFTER_TARGET
protected static int STATE_PI_AFTER_TARGET_QMARK
protected static int STATE_PI_AFTER_TARGET_WS
protected static int STATE_PI_IN_DATA
protected static int STATE_PI_IN_TARGET
protected static int STATE_PROLOG_DECL
protected static int STATE_PROLOG_INITIAL: State in which a less-than sign has been seen
protected static int STATE_PROLOG_SEEN_LT
protected static int STATE_SE_ATTR_NAME
protected static int STATE_SE_ATTR_VALUE_NORMAL
protected static int STATE_SE_ATTR_VALUE_NSDECL
protected static int STATE_SE_ELEM_NAME
protected static int STATE_SE_SEEN_SLASH
protected static int STATE_SE_SPACE_OR_ATTRNAME
protected static int STATE_SE_SPACE_OR_ATTRVALUE
protected static int STATE_SE_SPACE_OR_END
protected static int STATE_SE_SPACE_OR_EQ
protected static int STATE_TEXT_AMP
protected static int STATE_TEXT_AMP_NAME
protected static int STATE_TREE_NAMED_ENTITY_START
protected static int STATE_TREE_NUMERIC_ENTITY_START
protected static int STATE_TREE_SEEN_AMP
protected static int STATE_TREE_SEEN_EXCL
protected static int STATE_TREE_SEEN_LT
protected static int STATE_TREE_SEEN_SLASH
protected static int STATE_XMLDECL_AFTER_ENCODING
protected static int STATE_XMLDECL_AFTER_ENCODING_VALUE
protected static int STATE_XMLDECL_AFTER_STANDALONE
protected static int STATE_XMLDECL_AFTER_STANDALONE_VALUE
protected static int STATE_XMLDECL_AFTER_VERSION
protected static int STATE_XMLDECL_AFTER_VERSION_VALUE
protected static int STATE_XMLDECL_AFTER_XML
protected static int STATE_XMLDECL_BEFORE_ENCODING
protected static int STATE_XMLDECL_BEFORE_STANDALONE
protected static int STATE_XMLDECL_BEFORE_VERSION
protected static int STATE_XMLDECL_ENCODING
protected static int STATE_XMLDECL_ENCODING_EQ
protected static int STATE_XMLDECL_ENCODING_VALUE
protected static int STATE_XMLDECL_ENDQ
protected static int STATE_XMLDECL_STANDALONE
protected static int STATE_XMLDECL_STANDALONE_EQ
protected static int STATE_XMLDECL_STANDALONE_VALUE
protected static int STATE_XMLDECL_VERSION
protected static int STATE_XMLDECL_VERSION_EQ
protected static int STATE_XMLDECL_VERSION_VALUE
-
Fields inherited from class com.fasterxml.aalto.in.ByteBasedScanner
_inputEnd, _inputPtr, _tmpChar, BYTE_a, BYTE_A, BYTE_AMP, BYTE_APOS, BYTE_C, BYTE_CR, BYTE_D, BYTE_EQ, BYTE_EXCL, BYTE_g, BYTE_GT, BYTE_HASH, BYTE_HYPHEN, BYTE_l, BYTE_LBRACKET, BYTE_LF, BYTE_LT, BYTE_m, BYTE_NULL, BYTE_o, BYTE_p, BYTE_P, BYTE_q, BYTE_QMARK, BYTE_QUOT, BYTE_RBRACKET, BYTE_s, BYTE_S, BYTE_SEMICOLON, BYTE_SLASH, BYTE_SPACE, BYTE_t, BYTE_T, BYTE_TAB, BYTE_u, BYTE_x
-
Fields inherited from class com.fasterxml.aalto.in.XmlScanner
_attrCollector, _attrCount, _cfgCoalescing, _cfgLazyParsing, _config, _currElem, _currNsCount, _currRow, _currToken, _defaultNs, _depth, _entityPending, _isEmptyTag, _lastNsContext, _lastNsDecl, _nameBuffer, _nsBindingCache, _nsBindingCount, _nsBindings, _nsBindMisses, _pastBytesOrChars, _publicId, _rowStartOffset, _startColumn, _startRawOffset, _startRow, _systemId, _textBuilder, _tokenIncomplete, _tokenName, _xml11, CDATA_STR, INT_0, INT_9, INT_a, INT_A, INT_AMP, INT_APOS, INT_COLON, INT_CR, INT_EQ, INT_EXCL, INT_f, INT_F, INT_GT, INT_HYPHEN, INT_LBRACKET, INT_LF, INT_LT, INT_NULL, INT_QMARK, INT_QUOTE, INT_RBRACKET, INT_SLASH, INT_SPACE, INT_TAB, INT_z, MAX_UNICODE_CHAR, TOKEN_EOI
-
Fields inherited from interface com.fasterxml.aalto.util.XmlConsts
CHAR_CR, CHAR_LF, CHAR_NULL, CHAR_SPACE, STAX_DEFAULT_OUTPUT_ENCODING, STAX_DEFAULT_OUTPUT_VERSION, XML_DECL_KW_ENCODING, XML_DECL_KW_STANDALONE, XML_DECL_KW_VERSION, XML_SA_NO, XML_SA_YES, XML_V_10, XML_V_10_STR, XML_V_11, XML_V_11_STR, XML_V_UNKNOWN
-
-
Constructor Summary
Constructors
protected AsyncByteScanner(ReaderConfig cfg)
-
Method Summary
Methods
protected void _activateEncoding(): Initialization method to call when encoding has been definitely figured out, from XML declarations, or, from lack of one (using defaults).
protected void _closeSource(): Since the async scanner has no access to whatever passes content, there is no input source in same sense as with blocking scanner; and there is nothing to close.
protected abstract byte _currentByte()
protected PName _findXmlDeclName(int lastQuad, int lastByteCount)
protected abstract byte _nextByte()
private PName _parseNewXmlDeclName(byte b)
private PName _parseXmlDeclName()
protected abstract byte _prevByte()
protected void _releaseBuffers()
protected int _startDocumentNoXmlDecl(): Helper method called when it is determined that the document does NOT start with an xml declaration.
protected PName addPName(ByteBasedPNameTable symbols, int hash, int[] quads, int qlen, int lastQuadBytes)
protected abstract boolean asyncSkipSpace()
protected void checkPITargetName(PName targetName)
protected int decodeCharForError(byte b): Method called by methods when encountering a byte that can not be part of a valid character in the current context.
void endOfInput(): Method that should be called after last chunk of data to parse has been fed.
protected PName findPName(int lastQuad, int lastByteCount): Method called to process a sequence of bytes that is likely to be a PName.
protected void finishCData()
protected abstract void finishCharacters()
protected void finishComment()
protected void finishDTD(boolean copyContents)
protected void finishPI()
protected void finishSpace()
protected void finishToken(): This method is called to ensure that the current token/event has been completely parsed, such that we have all the data needed to return it (textual content, PI data, comment text etc)
protected abstract boolean handleAttrValue()
protected abstract int handleComment()
private int handleDTD()
protected abstract boolean handleDTDInternalSubset(boolean init)
protected abstract boolean handleNsDecl()
protected abstract boolean handlePartialCR()
protected abstract int handlePI()
private int handlePrologDeclStart(boolean isProlog)
protected abstract int handleStartElement()
protected abstract int handleStartElementStart(byte b)
private int handleXmlDeclaration(): Method called to complete parsing of XML declaration, once it has been reliably detected.
protected boolean loadMore()
int nextFromProlog(boolean isProlog)
private boolean parseDtdId(char[] outputBuffer, int outputPtr, boolean system)
protected abstract PName parseNewName(byte b)
protected abstract PName parsePName()
protected boolean parseXmlDeclAttr(char[] outputBuffer, int outputPtr): Method called to try to parse an XML pseudo-attribute value.
protected void reportInvalidOther(int mask, int ptr)
protected void skipCData()
protected abstract boolean skipCharacters()
protected void skipComment()
protected void skipPI()
protected void skipSpace()
protected abstract int startCharacters(byte b): Method called to initialize state for CHARACTERS event, after just a single byte has been seen.
private java.lang.Boolean startXmlDeclaration(): Method that deals with recognizing XML declaration, but not with parsing its contents.
protected int throwInternal()
protected boolean validPublicIdChar(int c): Checks that a character for a PublicId
protected void verifyAndAppendEntityCharacter(int charFromEntity): Method called to verify validity of given character (from entity) and append it to the text buffer
protected void verifyAndSetPublicId()
protected void verifyAndSetSystemId()
protected void verifyAndSetXmlEncoding()
protected void verifyAndSetXmlStandalone()
protected void verifyAndSetXmlVersion()
-
Methods inherited from class com.fasterxml.aalto.in.ByteBasedScanner
addUTFPName, getCurrentColumnNr, getCurrentLocation, getEndingByteOffset, getEndingCharOffset, getStartingByteOffset, getStartingCharOffset, markLF, markLF, reportInvalidInitial, reportInvalidOther, setStartLocation
-
Methods inherited from class com.fasterxml.aalto.in.XmlScanner
bindName, bindNs, checkImmutableBinding, close, decodeAttrBinaryValue, decodeAttrValue, decodeAttrValues, decodeElements, findAttrIndex, findOrCreateBinding, fireSaxCharacterEvents, fireSaxCommentEvent, fireSaxEndElement, fireSaxPIEvent, fireSaxSpaceEvents, fireSaxStartElement, getAttrCollector, getAttrCount, getAttrLocalName, getAttrNsURI, getAttrPrefix, getAttrPrefixedName, getAttrQName, getAttrType, getAttrValue, getAttrValue, getConfig, getCurrentLineNr, getDepth, getDTDPublicId, getDTDSystemId, getEndLocation, getInputPublicId, getInputSystemId, getName, getNamespacePrefix, getNamespaceURI, getNamespaceURI, getNamespaceURI, getNonTransientNamespaceContext, getNsCount, getPrefix, getPrefixes, getQName, getStartLocation, getText, getText, getTextCharacters, getTextCharacters, getTextLength, handleInvalidXmlChar, hasEmptyStack, isAttrSpecified, isEmptyTag, isTextWhitespace, loadMoreGuaranteed, loadMoreGuaranteed, nextFromTree, reportDoubleHyphenInComments, reportDuplicateNsDecl, reportEntityOverflow, reportEofInName, reportIllegalCDataEnd, reportIllegalNsDecl, reportIllegalNsDecl, reportInputProblem, reportInvalidNameChar, reportInvalidNsIndex, reportInvalidXmlChar, reportMissingPISpace, reportMultipleColonsInName, reportPrologProblem, reportPrologUnexpChar, reportPrologUnexpElement, reportTreeUnexpChar, reportUnboundPrefix, reportUnexpandedEntityInAttr, reportUnexpectedEndTag, resetForDecoding, skipCoalescedText, skipToken, throwInvalidSpace, throwNullChar, throwUnexpectedChar, verifyXmlChar
-
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
-
Methods inherited from interface com.fasterxml.aalto.AsyncInputFeeder
needMoreInput
-
-
-
-
Field Detail
-
EVENT_INCOMPLETE
protected static final int EVENT_INCOMPLETE
- See Also:
- Constant Field Values
-
STATE_DEFAULT
protected static final int STATE_DEFAULT
Default starting state for many events/contexts -- nothing has been seen so far, no event incomplete. Not used for all event types.
- See Also:
- Constant Field Values
-
STATE_PROLOG_INITIAL
protected static final int STATE_PROLOG_INITIAL
State in which a less-than sign has been seen
- See Also:
- Constant Field Values
-
STATE_PROLOG_SEEN_LT
protected static final int STATE_PROLOG_SEEN_LT
- See Also:
- Constant Field Values
-
STATE_PROLOG_DECL
protected static final int STATE_PROLOG_DECL
- See Also:
- Constant Field Values
-
STATE_TREE_SEEN_LT
protected static final int STATE_TREE_SEEN_LT
- See Also:
- Constant Field Values
-
STATE_TREE_SEEN_AMP
protected static final int STATE_TREE_SEEN_AMP
- See Also:
- Constant Field Values
-
STATE_TREE_SEEN_EXCL
protected static final int STATE_TREE_SEEN_EXCL
- See Also:
- Constant Field Values
-
STATE_TREE_SEEN_SLASH
protected static final int STATE_TREE_SEEN_SLASH
- See Also:
- Constant Field Values
-
STATE_TREE_NUMERIC_ENTITY_START
protected static final int STATE_TREE_NUMERIC_ENTITY_START
- See Also:
- Constant Field Values
-
STATE_TREE_NAMED_ENTITY_START
protected static final int STATE_TREE_NAMED_ENTITY_START
- See Also:
- Constant Field Values
-
STATE_XMLDECL_AFTER_XML
protected static final int STATE_XMLDECL_AFTER_XML
- See Also:
- Constant Field Values
-
STATE_XMLDECL_BEFORE_VERSION
protected static final int STATE_XMLDECL_BEFORE_VERSION
- See Also:
- Constant Field Values
-
STATE_XMLDECL_VERSION
protected static final int STATE_XMLDECL_VERSION
- See Also:
- Constant Field Values
-
STATE_XMLDECL_AFTER_VERSION
protected static final int STATE_XMLDECL_AFTER_VERSION
- See Also:
- Constant Field Values
-
STATE_XMLDECL_VERSION_EQ
protected static final int STATE_XMLDECL_VERSION_EQ
- See Also:
- Constant Field Values
-
STATE_XMLDECL_VERSION_VALUE
protected static final int STATE_XMLDECL_VERSION_VALUE
- See Also:
- Constant Field Values
-
STATE_XMLDECL_AFTER_VERSION_VALUE
protected static final int STATE_XMLDECL_AFTER_VERSION_VALUE
- See Also:
- Constant Field Values
-
STATE_XMLDECL_BEFORE_ENCODING
protected static final int STATE_XMLDECL_BEFORE_ENCODING
- See Also:
- Constant Field Values
-
STATE_XMLDECL_ENCODING
protected static final int STATE_XMLDECL_ENCODING
- See Also:
- Constant Field Values
-
STATE_XMLDECL_AFTER_ENCODING
protected static final int STATE_XMLDECL_AFTER_ENCODING
- See Also:
- Constant Field Values
-
STATE_XMLDECL_ENCODING_EQ
protected static final int STATE_XMLDECL_ENCODING_EQ
- See Also:
- Constant Field Values
-
STATE_XMLDECL_ENCODING_VALUE
protected static final int STATE_XMLDECL_ENCODING_VALUE
- See Also:
- Constant Field Values
-
STATE_XMLDECL_AFTER_ENCODING_VALUE
protected static final int STATE_XMLDECL_AFTER_ENCODING_VALUE
- See Also:
- Constant Field Values
-
STATE_XMLDECL_BEFORE_STANDALONE
protected static final int STATE_XMLDECL_BEFORE_STANDALONE
- See Also:
- Constant Field Values
-
STATE_XMLDECL_STANDALONE
protected static final int STATE_XMLDECL_STANDALONE
- See Also:
- Constant Field Values
-
STATE_XMLDECL_AFTER_STANDALONE
protected static final int STATE_XMLDECL_AFTER_STANDALONE
- See Also:
- Constant Field Values
-
STATE_XMLDECL_STANDALONE_EQ
protected static final int STATE_XMLDECL_STANDALONE_EQ
- See Also:
- Constant Field Values
-
STATE_XMLDECL_STANDALONE_VALUE
protected static final int STATE_XMLDECL_STANDALONE_VALUE
- See Also:
- Constant Field Values
-
STATE_XMLDECL_AFTER_STANDALONE_VALUE
protected static final int STATE_XMLDECL_AFTER_STANDALONE_VALUE
- See Also:
- Constant Field Values
-
STATE_XMLDECL_ENDQ
protected static final int STATE_XMLDECL_ENDQ
- See Also:
- Constant Field Values
-
STATE_DTD_DOCTYPE
protected static final int STATE_DTD_DOCTYPE
- See Also:
- Constant Field Values
-
STATE_DTD_AFTER_DOCTYPE
protected static final int STATE_DTD_AFTER_DOCTYPE
- See Also:
- Constant Field Values
-
STATE_DTD_BEFORE_ROOT_NAME
protected static final int STATE_DTD_BEFORE_ROOT_NAME
- See Also:
- Constant Field Values
-
STATE_DTD_ROOT_NAME
protected static final int STATE_DTD_ROOT_NAME
- See Also:
- Constant Field Values
-
STATE_DTD_AFTER_ROOT_NAME
protected static final int STATE_DTD_AFTER_ROOT_NAME
- See Also:
- Constant Field Values
-
STATE_DTD_BEFORE_IDS
protected static final int STATE_DTD_BEFORE_IDS
- See Also:
- Constant Field Values
-
STATE_DTD_PUBLIC_OR_SYSTEM
protected static final int STATE_DTD_PUBLIC_OR_SYSTEM
- See Also:
- Constant Field Values
-
STATE_DTD_AFTER_PUBLIC
protected static final int STATE_DTD_AFTER_PUBLIC
- See Also:
- Constant Field Values
-
STATE_DTD_AFTER_SYSTEM
protected static final int STATE_DTD_AFTER_SYSTEM
- See Also:
- Constant Field Values
-
STATE_DTD_BEFORE_PUBLIC_ID
protected static final int STATE_DTD_BEFORE_PUBLIC_ID
- See Also:
- Constant Field Values
-
STATE_DTD_PUBLIC_ID
protected static final int STATE_DTD_PUBLIC_ID
- See Also:
- Constant Field Values
-
STATE_DTD_AFTER_PUBLIC_ID
protected static final int STATE_DTD_AFTER_PUBLIC_ID
- See Also:
- Constant Field Values
-
STATE_DTD_BEFORE_SYSTEM_ID
protected static final int STATE_DTD_BEFORE_SYSTEM_ID
- See Also:
- Constant Field Values
-
STATE_DTD_SYSTEM_ID
protected static final int STATE_DTD_SYSTEM_ID
- See Also:
- Constant Field Values
-
STATE_DTD_AFTER_SYSTEM_ID
protected static final int STATE_DTD_AFTER_SYSTEM_ID
- See Also:
- Constant Field Values
-
STATE_DTD_INT_SUBSET
protected static final int STATE_DTD_INT_SUBSET
- See Also:
- Constant Field Values
-
STATE_DTD_EXPECT_CLOSING_GT
protected static final int STATE_DTD_EXPECT_CLOSING_GT
- See Also:
- Constant Field Values
-
STATE_TEXT_AMP
protected static final int STATE_TEXT_AMP
- See Also:
- Constant Field Values
-
STATE_TEXT_AMP_NAME
protected static final int STATE_TEXT_AMP_NAME
- See Also:
- Constant Field Values
-
STATE_COMMENT_CONTENT
protected static final int STATE_COMMENT_CONTENT
- See Also:
- Constant Field Values
-
STATE_COMMENT_HYPHEN
protected static final int STATE_COMMENT_HYPHEN
- See Also:
- Constant Field Values
-
STATE_COMMENT_HYPHEN2
protected static final int STATE_COMMENT_HYPHEN2
- See Also:
- Constant Field Values
-
STATE_CDATA_CONTENT
protected static final int STATE_CDATA_CONTENT
- See Also:
- Constant Field Values
-
STATE_CDATA_C
protected static final int STATE_CDATA_C
- See Also:
- Constant Field Values
-
STATE_CDATA_CD
protected static final int STATE_CDATA_CD
- See Also:
- Constant Field Values
-
STATE_CDATA_CDA
protected static final int STATE_CDATA_CDA
- See Also:
- Constant Field Values
-
STATE_CDATA_CDAT
protected static final int STATE_CDATA_CDAT
- See Also:
- Constant Field Values
-
STATE_CDATA_CDATA
protected static final int STATE_CDATA_CDATA
- See Also:
- Constant Field Values
-
STATE_PI_AFTER_TARGET
protected static final int STATE_PI_AFTER_TARGET
- See Also:
- Constant Field Values
-
STATE_PI_AFTER_TARGET_WS
protected static final int STATE_PI_AFTER_TARGET_WS
- See Also:
- Constant Field Values
-
STATE_PI_AFTER_TARGET_QMARK
protected static final int STATE_PI_AFTER_TARGET_QMARK
- See Also:
- Constant Field Values
-
STATE_PI_IN_TARGET
protected static final int STATE_PI_IN_TARGET
- See Also:
- Constant Field Values
-
STATE_PI_IN_DATA
protected static final int STATE_PI_IN_DATA
- See Also:
- Constant Field Values
-
STATE_SE_ELEM_NAME
protected static final int STATE_SE_ELEM_NAME
- See Also:
- Constant Field Values
-
STATE_SE_SPACE_OR_END
protected static final int STATE_SE_SPACE_OR_END
- See Also:
- Constant Field Values
-
STATE_SE_SPACE_OR_ATTRNAME
protected static final int STATE_SE_SPACE_OR_ATTRNAME
- See Also:
- Constant Field Values
-
STATE_SE_ATTR_NAME
protected static final int STATE_SE_ATTR_NAME
- See Also:
- Constant Field Values
-
STATE_SE_SPACE_OR_EQ
protected static final int STATE_SE_SPACE_OR_EQ
- See Also:
- Constant Field Values
-
STATE_SE_SPACE_OR_ATTRVALUE
protected static final int STATE_SE_SPACE_OR_ATTRVALUE
- See Also:
- Constant Field Values
-
STATE_SE_ATTR_VALUE_NORMAL
protected static final int STATE_SE_ATTR_VALUE_NORMAL
- See Also:
- Constant Field Values
-
STATE_SE_ATTR_VALUE_NSDECL
protected static final int STATE_SE_ATTR_VALUE_NSDECL
- See Also:
- Constant Field Values
-
STATE_SE_SEEN_SLASH
protected static final int STATE_SE_SEEN_SLASH
- See Also:
- Constant Field Values
-
STATE_EE_NEED_GT
protected static final int STATE_EE_NEED_GT
- See Also:
- Constant Field Values
-
PENDING_STATE_CR
protected static final int PENDING_STATE_CR
- See Also:
- Constant Field Values
-
PENDING_STATE_XMLDECL_LT
protected static final int PENDING_STATE_XMLDECL_LT
- See Also:
- Constant Field Values
-
PENDING_STATE_XMLDECL_LTQ
protected static final int PENDING_STATE_XMLDECL_LTQ
- See Also:
- Constant Field Values
-
PENDING_STATE_XMLDECL_TARGET
protected static final int PENDING_STATE_XMLDECL_TARGET
- See Also:
- Constant Field Values
-
PENDING_STATE_PI_QMARK
protected static final int PENDING_STATE_PI_QMARK
- See Also:
- Constant Field Values
-
PENDING_STATE_COMMENT_HYPHEN1
protected static final int PENDING_STATE_COMMENT_HYPHEN1
- See Also:
- Constant Field Values
-
PENDING_STATE_COMMENT_HYPHEN2
protected static final int PENDING_STATE_COMMENT_HYPHEN2
- See Also:
- Constant Field Values
-
PENDING_STATE_CDATA_BRACKET1
protected static final int PENDING_STATE_CDATA_BRACKET1
- See Also:
- Constant Field Values
-
PENDING_STATE_CDATA_BRACKET2
protected static final int PENDING_STATE_CDATA_BRACKET2
- See Also:
- Constant Field Values
-
PENDING_STATE_ENT_SEEN_HASH
protected static final int PENDING_STATE_ENT_SEEN_HASH
- See Also:
- Constant Field Values
-
PENDING_STATE_ENT_SEEN_HASH_X
protected static final int PENDING_STATE_ENT_SEEN_HASH_X
- See Also:
- Constant Field Values
-
PENDING_STATE_ENT_IN_DEC_DIGIT
protected static final int PENDING_STATE_ENT_IN_DEC_DIGIT
- See Also:
- Constant Field Values
-
PENDING_STATE_ENT_IN_HEX_DIGIT
protected static final int PENDING_STATE_ENT_IN_HEX_DIGIT
- See Also:
- Constant Field Values
-
PENDING_STATE_ATTR_VALUE_AMP
protected static final int PENDING_STATE_ATTR_VALUE_AMP
- See Also:
- Constant Field Values
-
PENDING_STATE_ATTR_VALUE_AMP_HASH
protected static final int PENDING_STATE_ATTR_VALUE_AMP_HASH
- See Also:
- Constant Field Values
-
PENDING_STATE_ATTR_VALUE_AMP_HASH_X
protected static final int PENDING_STATE_ATTR_VALUE_AMP_HASH_X
- See Also:
- Constant Field Values
-
PENDING_STATE_ATTR_VALUE_ENTITY_NAME
protected static final int PENDING_STATE_ATTR_VALUE_ENTITY_NAME
- See Also:
- Constant Field Values
-
PENDING_STATE_ATTR_VALUE_DEC_DIGIT
protected static final int PENDING_STATE_ATTR_VALUE_DEC_DIGIT
- See Also:
- Constant Field Values
-
PENDING_STATE_ATTR_VALUE_HEX_DIGIT
protected static final int PENDING_STATE_ATTR_VALUE_HEX_DIGIT
- See Also:
- Constant Field Values
-
PENDING_STATE_TEXT_AMP
protected static final int PENDING_STATE_TEXT_AMP
- See Also:
- Constant Field Values
-
PENDING_STATE_TEXT_AMP_HASH
protected static final int PENDING_STATE_TEXT_AMP_HASH
- See Also:
- Constant Field Values
-
PENDING_STATE_TEXT_DEC_ENTITY
protected static final int PENDING_STATE_TEXT_DEC_ENTITY
- See Also:
- Constant Field Values
-
PENDING_STATE_TEXT_HEX_ENTITY
protected static final int PENDING_STATE_TEXT_HEX_ENTITY
- See Also:
- Constant Field Values
-
PENDING_STATE_TEXT_IN_ENTITY
protected static final int PENDING_STATE_TEXT_IN_ENTITY
- See Also:
- Constant Field Values
-
PENDING_STATE_TEXT_BRACKET1
protected static final int PENDING_STATE_TEXT_BRACKET1
- See Also:
- Constant Field Values
-
PENDING_STATE_TEXT_BRACKET2
protected static final int PENDING_STATE_TEXT_BRACKET2
- See Also:
- Constant Field Values
-
_charTypes
protected XmlCharTypes _charTypes
This is a simple container object that is used to access the decoding tables for characters. Indirection is needed since we actually support multiple UTF-8 compatible encodings, not just UTF-8 itself.
NOTE: non-final due to xml declaration handling occurring later.
-
_symbols
protected ByteBasedPNameTable _symbols
For now, symbol table contains prefixed names. In future it is possible that they may be split into prefixes and local names?
NOTE: non-final for async scanners
-
_quadBuffer
protected int[] _quadBuffer
This buffer is used for name parsing. Will be expanded if/as needed; 32 ints can hold names 128 ascii chars long.
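The capacity figure above follows from storing four name bytes per int. A minimal sketch of that arithmetic, assuming bytes are packed with the earliest byte in the highest-order position of each quad (the byte order and the class name `QuadPackingSketch` are assumptions for illustration, not taken from the Aalto sources):

```java
// Illustrative sketch only; not Aalto's implementation.
final class QuadPackingSketch {
    /** Packs name bytes four per int. The byte order within a quad is an assumption. */
    static int[] pack(byte[] name) {
        int[] quads = new int[(name.length + 3) / 4];
        for (int i = 0; i < name.length; i++) {
            // Shift what is already in the quad and append the next byte
            quads[i / 4] = (quads[i / 4] << 8) | (name[i] & 0xFF);
        }
        return quads;
    }

    public static void main(String[] args) {
        byte[] name = "element".getBytes(java.nio.charset.StandardCharsets.US_ASCII);
        int[] quads = pack(name);
        System.out.println(quads.length);                   // 2
        System.out.println(Integer.toHexString(quads[0]));  // 656c656d
        System.out.println(Integer.toHexString(quads[1]));  // 656e74
        System.out.println(pack(new byte[128]).length);     // 32
    }
}
```

In this sketch the seven-byte name fills one complete quad and leaves three bytes in a second, partial one.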
-
_nextEvent
protected int _nextEvent
Due to asynchronous nature of parsing, we may know what event we are trying to parse, even if it's not yet complete. Type of that event is stored here.
-
_state
protected int _state
In addition to the event type, additional state information is needed; it is stored here.
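To illustrate why per-event state has to be kept between calls, here is a minimal sketch of a resumable scanner that looks for the comment end marker across chunk boundaries. The class `ResumableCommentSketch`, its constants, and their values are hypothetical and only loosely mirror names such as STATE_COMMENT_CONTENT, STATE_COMMENT_HYPHEN and STATE_COMMENT_HYPHEN2; it is not the scanner's actual logic.

```java
// Illustrative sketch only; not Aalto's implementation.
final class ResumableCommentSketch {
    // Hypothetical state values, chosen for this example
    static final int CONTENT = 0, HYPHEN = 1, HYPHEN2 = 2, DONE = 3;

    private int state = CONTENT;
    private final StringBuilder text = new StringBuilder();

    /** Consumes one chunk (ASCII assumed); returns true once "-->" has been seen. */
    boolean feed(byte[] chunk) {
        for (byte b : chunk) {
            char c = (char) (b & 0xFF);
            switch (state) {
                case CONTENT:
                    if (c == '-') state = HYPHEN; else text.append(c);
                    break;
                case HYPHEN:
                    if (c == '-') {
                        state = HYPHEN2;
                    } else {
                        // Lone hyphen was ordinary content after all
                        text.append('-').append(c);
                        state = CONTENT;
                    }
                    break;
                case HYPHEN2:
                    if (c == '>') {
                        state = DONE;
                        return true;
                    }
                    // XML does not allow "--" inside a comment
                    throw new IllegalStateException("'--' not allowed inside a comment");
                default:
                    return true; // already complete
            }
        }
        return state == DONE;
    }

    String text() {
        return text.toString();
    }
}
```

Feeding `"a-b -"` and then `"->"` as two separate chunks returns false, then true, and leaves `"a-b "` as the comment text: the state field is what lets the second call pick up in the middle of the end marker.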
-
_surroundingEvent
protected int _surroundingEvent
For token/state combinations that are 'shared' between events (or embedded in them), this is where the surrounding event state is retained.
-
_pendingInput
protected int _pendingInput
There are some multi-byte combinations that must be handled as a unit: CR+LF linefeeds, multi-byte UTF-8 characters, and multi-character end markers for comments and PIs. Since they can be split across input buffer boundaries, first byte(s) may need to be temporarily stored.
If so, this int will store byte(s), in little-endian format (that is, first pending byte is at 0x000000FF, second [if any] at 0x0000FF00, and third at 0x00FF0000). This can be (and is) used to figure out actual number of bytes pending, for multi-byte (UTF-8) character decoding.
Note: a value of 0 is assumed to mean that there is no pending data. Thus, if a 0 byte needs to be stored as pending, it has to be masked.
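The packing described above can be sketched as follows. The class `PendingInputSketch` and its method names are hypothetical; the layout (first byte in the lowest 8 bits) and the "0 means nothing pending" convention come from the description, while the way the count is derived is one possible approach, not necessarily the scanner's.

```java
// Illustrative sketch only; not Aalto's implementation.
final class PendingInputSketch {
    /** Stores byte b as the pending byte at position index (0, 1 or 2). */
    static int append(int pending, int b, int index) {
        return pending | ((b & 0xFF) << (8 * index));
    }

    /** Number of pending bytes, assuming the last stored byte is non-zero. */
    static int count(int pending) {
        if (pending == 0) return 0;
        if ((pending & 0xFFFFFF00) == 0) return 1;
        if ((pending & 0xFFFF0000) == 0) return 2;
        return 3;
    }

    /** Reads back the pending byte at position index as an unsigned value. */
    static int byteAt(int pending, int index) {
        return (pending >> (8 * index)) & 0xFF;
    }

    public static void main(String[] args) {
        // First two bytes of the three-byte UTF-8 sequence E2 82 AC (U+20AC)
        int pending = append(0, 0xE2, 0);
        pending = append(pending, 0x82, 1);
        System.out.println(Integer.toHexString(pending)); // 82e2
        System.out.println(count(pending));               // 2
    }
}
```

Because UTF-8 lead and continuation bytes are never zero, the count can be read straight off the packed value; a literal zero byte is the case the note above says must be masked.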
-
_endOfInput
protected boolean _endOfInput
Flag that is set when the calling application indicates that there will be no more input to parse.
-
_quadCount
protected int _quadCount
Number of complete quads parsed for current name (quads themselves are stored in _quadBuffer).
-
_currQuad
protected int _currQuad
Bytes parsed for the current, incomplete, quad
-
_currQuadBytes
protected int _currQuadBytes
Number of bytes pending/buffered, stored in _currQuad
-
_entityValue
protected int _entityValue
Entity value accumulated so far
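As an illustration of accumulating a numeric character reference one digit at a time (so that parsing can stop and resume between digits), here is a minimal sketch. The class `EntityValueSketch` and the method `accumulate` are hypothetical, and the range check against 0x10FFFF is an assumption about where validation would happen.

```java
// Illustrative sketch only; not Aalto's implementation.
final class EntityValueSketch {
    /** Folds one more digit into the value accumulated so far. */
    static int accumulate(int value, char digit, boolean hex) {
        int d;
        if (digit >= '0' && digit <= '9') {
            d = digit - '0';
        } else if (hex && digit >= 'a' && digit <= 'f') {
            d = digit - 'a' + 10;
        } else if (hex && digit >= 'A' && digit <= 'F') {
            d = digit - 'A' + 10;
        } else {
            throw new IllegalArgumentException("Not a valid digit: " + digit);
        }
        int next = value * (hex ? 16 : 10) + d;
        if (next > 0x10FFFF) {
            throw new IllegalArgumentException("Value beyond the Unicode range");
        }
        return next;
    }

    public static void main(String[] args) {
        // Digits of "&#x20AC;" arriving one at a time, possibly in different chunks
        int value = 0;
        for (char c : new char[] {'2', '0', 'A', 'C'}) {
            value = accumulate(value, c, true);
        }
        System.out.println(Integer.toHexString(value)); // 20ac
    }
}
```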
-
_elemAllNsBound
protected boolean _elemAllNsBound
-
_elemAttrCount
protected boolean _elemAttrCount
-
_elemAttrQuote
protected byte _elemAttrQuote
-
_elemAttrName
protected PName _elemAttrName
-
_elemAttrPtr
protected int _elemAttrPtr
Pointer for the next character of the value currently being parsed, within the attribute value buffer
-
_elemNsPtr
protected int _elemNsPtr
Pointer for the next character of the namespace URI currently being parsed for the current namespace declaration
-
_inDtdDeclaration
protected boolean _inDtdDeclaration
Flag that indicates whether we are inside a declaration during parsing of internal DTD subset.
-
-
Constructor Detail
-
AsyncByteScanner
protected AsyncByteScanner(ReaderConfig cfg)
-
-
Method Detail
-
_activateEncoding
protected void _activateEncoding()
Initialization method to call when encoding has been definitely figured out, either from the XML declaration or from the lack of one (using defaults).
- Since:
- 1.1.1
-
endOfInput
public void endOfInput()
Description copied from interface: AsyncInputFeeder
Method that should be called after last chunk of data to parse has been fed. May be called regardless of what AsyncInputFeeder.needMoreInput() returns. After calling this method, no more data can be fed; and parser assumes no more data will be available.
- Specified by:
endOfInput in interface AsyncInputFeeder
-
_releaseBuffers
protected void _releaseBuffers()
- Overrides:
_releaseBuffers in class XmlScanner
-
_closeSource
protected void _closeSource() throws java.io.IOException
Since the async scanner has no access to whatever passes content, there is no input source in same sense as with blocking scanner; and there is nothing to close. But we can at least mark input as having ended.
- Specified by:
_closeSource in class ByteBasedScanner
- Throws:
java.io.IOException
-
verifyAndSetXmlVersion
protected void verifyAndSetXmlVersion() throws javax.xml.stream.XMLStreamException
- Throws:
javax.xml.stream.XMLStreamException
-
verifyAndSetXmlEncoding
protected void verifyAndSetXmlEncoding() throws javax.xml.stream.XMLStreamException
- Throws:
javax.xml.stream.XMLStreamException
-
verifyAndSetXmlStandalone
protected void verifyAndSetXmlStandalone() throws javax.xml.stream.XMLStreamException
- Throws:
javax.xml.stream.XMLStreamException
-
verifyAndSetPublicId
protected void verifyAndSetPublicId() throws javax.xml.stream.XMLStreamException
- Throws:
javax.xml.stream.XMLStreamException
-
verifyAndSetSystemId
protected void verifyAndSetSystemId() throws javax.xml.stream.XMLStreamException
- Throws:
javax.xml.stream.XMLStreamException
-
_currentByte
protected abstract byte _currentByte() throws javax.xml.stream.XMLStreamException
- Throws:
javax.xml.stream.XMLStreamException
-
_nextByte
protected abstract byte _nextByte() throws javax.xml.stream.XMLStreamException
- Throws:
javax.xml.stream.XMLStreamException
-
_prevByte
protected abstract byte _prevByte() throws javax.xml.stream.XMLStreamException
- Throws:
javax.xml.stream.XMLStreamException
-
handlePI
protected abstract int handlePI() throws javax.xml.stream.XMLStreamException
- Throws:
javax.xml.stream.XMLStreamException
-
handleDTDInternalSubset
protected abstract boolean handleDTDInternalSubset(boolean init) throws javax.xml.stream.XMLStreamException
- Throws:
javax.xml.stream.XMLStreamException
-
handleComment
protected abstract int handleComment() throws javax.xml.stream.XMLStreamException
- Throws:
javax.xml.stream.XMLStreamException
-
handleStartElementStart
protected abstract int handleStartElementStart(byte b) throws javax.xml.stream.XMLStreamException
- Throws:
javax.xml.stream.XMLStreamException
-
handleStartElement
protected abstract int handleStartElement() throws javax.xml.stream.XMLStreamException- Throws:
javax.xml.stream.XMLStreamException
-
parsePName
protected abstract PName parsePName() throws javax.xml.stream.XMLStreamException
- Throws:
javax.xml.stream.XMLStreamException
-
parseNewName
protected abstract PName parseNewName(byte b) throws javax.xml.stream.XMLStreamException
- Throws:
javax.xml.stream.XMLStreamException
-
asyncSkipSpace
protected abstract boolean asyncSkipSpace() throws javax.xml.stream.XMLStreamException
- Throws:
javax.xml.stream.XMLStreamException
-
handlePartialCR
protected abstract boolean handlePartialCR() throws javax.xml.stream.XMLStreamException
- Throws:
javax.xml.stream.XMLStreamException
-
finishToken
protected final void finishToken() throws javax.xml.stream.XMLStreamException
Description copied from class: XmlScanner
This method is called to ensure that the current token/event has been completely parsed, so that all the data needed to return it (textual content, PI data, comment text, etc.) is available.
- Specified by:
finishToken in class XmlScanner
- Throws:
javax.xml.stream.XMLStreamException
-
startCharacters
protected abstract int startCharacters(byte b) throws javax.xml.stream.XMLStreamException
Method called to initialize state for a CHARACTERS event after just a single byte has been seen. What needs to be done next depends on whether coalescing mode is set. If it is not set, only a single character needs to be decoded, after which the current event is incomplete but already defined as CHARACTERS. In coalescing mode, the whole content must be read before the current event can be defined. The reason for the difference is that once XMLStreamReader.next() returns, no blocking can occur when calling other methods.
- Returns:
- Event type detected: CHARACTERS if at least one full character was decoded (and can be returned); EVENT_INCOMPLETE if not (part of a multi-byte character was split across the input buffer boundary)
- Throws:
javax.xml.stream.XMLStreamException
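The non-coalescing case described above can be illustrated with a small, self-contained sketch. It is not Aalto's implementation: the class name CharStartSketch, the three-argument method signature, and the numeric event codes are assumptions made for illustration only. It shows just the decision between CHARACTERS and EVENT_INCOMPLETE when a multi-byte UTF-8 character may be split across the end of the available input.

```java
final class CharStartSketch {
    // Placeholder event codes for this sketch only; not taken from the Aalto sources.
    static final int CHARACTERS = 4;
    static final int EVENT_INCOMPLETE = 257;

    /** Returns how many bytes a UTF-8 sequence needs, judged from its lead byte; -1 if invalid. */
    static int utf8Length(byte lead) {
        int b = lead & 0xFF;
        if (b < 0x80) return 1;                  // single-byte (ASCII)
        if (b >= 0xC2 && b <= 0xDF) return 2;    // two-byte sequence
        if (b >= 0xE0 && b <= 0xEF) return 3;    // three-byte sequence
        if (b >= 0xF0 && b <= 0xF4) return 4;    // four-byte sequence
        return -1;                               // continuation byte or illegal lead byte
    }

    /**
     * Decides whether at least one full character is available in buf[ptr, end).
     * Hypothetical stand-in for the non-coalescing branch only.
     */
    static int startCharacters(byte[] buf, int ptr, int end) {
        int needed = utf8Length(buf[ptr]);
        if (needed < 0) {
            throw new IllegalArgumentException("Invalid UTF-8 lead byte at offset " + ptr);
        }
        // If the character is split across the buffer boundary, report incomplete.
        return (end - ptr >= needed) ? CHARACTERS : EVENT_INCOMPLETE;
    }
}
```

In this sketch the caller would retry once more input has been fed; a coalescing variant would instead have to keep reading until the whole text segment is available.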
-
handleAttrValue
protected abstract boolean handleAttrValue() throws javax.xml.stream.XMLStreamException
- Throws:
javax.xml.stream.XMLStreamException
-
handleNsDecl
protected abstract boolean handleNsDecl() throws javax.xml.stream.XMLStreamException
- Throws:
javax.xml.stream.XMLStreamException
-
finishCData
protected void finishCData() throws javax.xml.stream.XMLStreamException
- Specified by:
finishCData in class XmlScanner
- Throws:
javax.xml.stream.XMLStreamException
-
finishComment
protected void finishComment() throws javax.xml.stream.XMLStreamException
- Specified by:
finishComment in class XmlScanner
- Throws:
javax.xml.stream.XMLStreamException
-
finishDTD
protected void finishDTD(boolean copyContents) throws javax.xml.stream.XMLStreamException
- Specified by:
finishDTD in class XmlScanner
- Throws:
javax.xml.stream.XMLStreamException
-
finishPI
protected void finishPI() throws javax.xml.stream.XMLStreamException
- Specified by:
finishPI in class XmlScanner
- Throws:
javax.xml.stream.XMLStreamException
-
finishSpace
protected void finishSpace() throws javax.xml.stream.XMLStreamException
- Specified by:
finishSpace in class XmlScanner
- Throws:
javax.xml.stream.XMLStreamException
-
skipCharacters
protected abstract boolean skipCharacters() throws javax.xml.stream.XMLStreamException
- Specified by:
skipCharacters in class XmlScanner
- Returns:
- True if the whole characters segment was successfully skipped; false if not
- Throws:
javax.xml.stream.XMLStreamException
-
skipCData
protected void skipCData() throws javax.xml.stream.XMLStreamException
- Specified by:
skipCData in class XmlScanner
- Throws:
javax.xml.stream.XMLStreamException
-
skipComment
protected void skipComment() throws javax.xml.stream.XMLStreamException
- Specified by:
skipComment in class XmlScanner
- Throws:
javax.xml.stream.XMLStreamException
-
skipPI
protected void skipPI() throws javax.xml.stream.XMLStreamException
- Specified by:
skipPI in class XmlScanner
- Throws:
javax.xml.stream.XMLStreamException
-
skipSpace
protected void skipSpace() throws javax.xml.stream.XMLStreamException
- Specified by:
skipSpace in class XmlScanner
- Throws:
javax.xml.stream.XMLStreamException
-
loadMore
protected boolean loadMore() throws javax.xml.stream.XMLStreamException
- Specified by:
loadMore in class XmlScanner
- Throws:
javax.xml.stream.XMLStreamException
-
finishCharacters
protected abstract void finishCharacters() throws javax.xml.stream.XMLStreamException
- Specified by:
finishCharacters in class XmlScanner
- Throws:
javax.xml.stream.XMLStreamException
-
findPName
protected final PName findPName(int lastQuad, int lastByteCount) throws javax.xml.stream.XMLStreamException
Method called to process a sequence of bytes that is likely to be a PName. At this point an end marker has been encountered, and the sequence is one of: a previously seen well-formed PName; a not-yet-seen well-formed PName; or a non-well-formed sequence (containing one or more non-name characters without any valid end markers).
- Parameters:
lastQuad - Word with the last 0 to 3 bytes of the PName; not included in the quad array
lastByteCount - Number of bytes contained in lastQuad; 0 to 3
- Throws:
javax.xml.stream.XMLStreamException
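To make the lastQuad and lastByteCount parameters more concrete, the following self-contained sketch packs the bytes of a name into 32-bit words ("quads"). It is an illustration only: the class name QuadPackingSketch and the packing order (earlier bytes shifted toward the high end of the word) are assumptions, not something this page specifies. Complete groups of four bytes go into the quad array, and the remaining 0 to 3 bytes are kept separately, matching the parameter descriptions above.

```java
final class QuadPackingSketch {
    final int[] quads;        // complete 4-byte groups
    final int lastQuad;       // trailing 0 to 3 bytes, packed into one word
    final int lastByteCount;  // how many bytes lastQuad holds (0 to 3)

    QuadPackingSketch(byte[] name) {
        int full = name.length / 4;
        quads = new int[full];
        int ptr = 0;
        for (int i = 0; i < full; i++) {
            int q = 0;
            for (int j = 0; j < 4; j++) {
                // Assumed packing order: shift previous bytes up, add the new one at the bottom.
                q = (q << 8) | (name[ptr++] & 0xFF);
            }
            quads[i] = q;
        }
        int q = 0;
        int count = 0;
        while (ptr < name.length) {
            q = (q << 8) | (name[ptr++] & 0xFF);
            count++;
        }
        lastQuad = q;
        lastByteCount = count;
    }
}
```

Comparing names as a handful of ints rather than byte by byte is the usual motivation for this kind of layout; whether the symbol table here relies on exactly this packing is not stated on this page.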
-
addPName
protected final PName addPName(ByteBasedPNameTable symbols, int hash, int[] quads, int qlen, int lastQuadBytes) throws javax.xml.stream.XMLStreamException
- Throws:
javax.xml.stream.XMLStreamException
-
verifyAndAppendEntityCharacter
protected void verifyAndAppendEntityCharacter(int charFromEntity) throws javax.xml.stream.XMLStreamException
Method called to verify the validity of the given character (from an entity) and append it to the text buffer.
- Throws:
javax.xml.stream.XMLStreamException
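As a rough illustration of "verify, then append", the sketch below checks a code point against the Char production of XML 1.0 and appends it to a StringBuilder. The class name EntityCharSketch, the use of StringBuilder as the text buffer, and the IllegalArgumentException are assumptions for this example; the actual method throws XMLStreamException and may apply different rules (for instance for XML 1.1 documents).

```java
final class EntityCharSketch {
    /** XML 1.0 Char production: #x9 | #xA | #xD | [#x20-#xD7FF] | [#xE000-#xFFFD] | [#x10000-#x10FFFF]. */
    static boolean isXml10Char(int c) {
        return c == 0x9 || c == 0xA || c == 0xD
                || (c >= 0x20 && c <= 0xD7FF)
                || (c >= 0xE000 && c <= 0xFFFD)
                || (c >= 0x10000 && c <= 0x10FFFF);
    }

    /** Verifies the code point and appends it; supplementary characters become a surrogate pair. */
    static void verifyAndAppend(StringBuilder out, int charFromEntity) {
        if (!isXml10Char(charFromEntity)) {
            throw new IllegalArgumentException(
                    "Invalid character from entity: 0x" + Integer.toHexString(charFromEntity));
        }
        out.appendCodePoint(charFromEntity);
    }
}
```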
-
validPublicIdChar
protected boolean validPublicIdChar(int c)
Checks whether a character is valid for a PublicId.
- Parameters:
c - A character
- Returns:
- true if the character is valid for use in the Public ID of an XML doctype declaration
- See Also:
- "http://www.w3.org/TR/xml/#NT-PubidLiteral"
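For reference, the PubidChar production in the XML specification linked above allows space, carriage return, linefeed, ASCII letters and digits, and a fixed set of punctuation characters. A minimal sketch of such a check follows; the class and method names are hypothetical, and it is not claimed to match how this method is actually implemented.

```java
final class PubidCharSketch {
    // Punctuation allowed by the PubidChar production of XML 1.0.
    private static final String PUNCTUATION = "-'()+,./:=?;!*#@$_%";

    static boolean isPubidChar(int c) {
        if (c == 0x20 || c == 0xD || c == 0xA) {
            return true; // space, CR, LF (note: tab is not allowed)
        }
        if ((c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z') || (c >= '0' && c <= '9')) {
            return true;
        }
        return c < 0x80 && PUNCTUATION.indexOf(c) >= 0;
    }
}
```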
-
decodeCharForError
protected int decodeCharForError(byte b) throws javax.xml.stream.XMLStreamException
Description copied from class: ByteBasedScanner
Method called when encountering a byte that cannot be part of a valid character in the current context. Should return the actual decoded character for error-reporting purposes.
- Specified by:
decodeCharForError in class ByteBasedScanner
- Throws:
javax.xml.stream.XMLStreamException
-
checkPITargetName
protected void checkPITargetName(PName targetName) throws javax.xml.stream.XMLStreamException
- Throws:
javax.xml.stream.XMLStreamException
-
throwInternal
protected int throwInternal()
-
reportInvalidOther
protected void reportInvalidOther(int mask, int ptr) throws javax.xml.stream.XMLStreamException
- Throws:
javax.xml.stream.XMLStreamException
-
nextFromProlog
public final int nextFromProlog(boolean isProlog) throws javax.xml.stream.XMLStreamException
- Specified by:
nextFromProlog in class XmlScanner
- Throws:
javax.xml.stream.XMLStreamException
-
_startDocumentNoXmlDecl
protected int _startDocumentNoXmlDecl() throws javax.xml.stream.XMLStreamException
Helper method called when it is determined that the document does NOT start with an XML declaration. Needs to return START_DOCUMENT and initialize other state appropriately.
- Throws:
javax.xml.stream.XMLStreamException
-
handlePrologDeclStart
private final int handlePrologDeclStart(boolean isProlog) throws javax.xml.stream.XMLStreamException
- Throws:
javax.xml.stream.XMLStreamException
-
startXmlDeclaration
private final java.lang.Boolean startXmlDeclaration() throws javax.xml.stream.XMLStreamException
Method that deals with recognizing the XML declaration, but not with parsing its contents.
- Returns:
- null if parsing is inconclusive (may or may not be XML declaration); Boolean.TRUE if complete XML declaration, and Boolean.FALSE if something else
- Throws:
javax.xml.stream.XMLStreamException
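The three-valued return (null, Boolean.TRUE, Boolean.FALSE) is a natural fit for incremental input, where the bytes seen so far may be too few to decide. The sketch below illustrates that idea only; the class name XmlDeclDetectSketch, the method signature, and the exact recognition rule (the literal "<?xml" followed by one whitespace byte) are assumptions and may differ from what this method actually checks.

```java
final class XmlDeclDetectSketch {
    private static final byte[] PREFIX = {'<', '?', 'x', 'm', 'l'};

    /**
     * Inspects buf[ptr, end) and returns null if more input is needed to decide,
     * TRUE if the input starts like an XML declaration, FALSE otherwise.
     */
    static Boolean looksLikeXmlDecl(byte[] buf, int ptr, int end) {
        for (int i = 0; i < PREFIX.length; i++) {
            if (ptr + i >= end) {
                return null;            // ran out of input while still matching
            }
            if (buf[ptr + i] != PREFIX[i]) {
                return Boolean.FALSE;   // definitely something else
            }
        }
        int after = ptr + PREFIX.length;
        if (after >= end) {
            return null;                // "<?xml" seen, but "<?xml-stylesheet" is still possible
        }
        byte b = buf[after];
        boolean ws = (b == ' ' || b == '\t' || b == '\r' || b == '\n');
        return Boolean.valueOf(ws);
    }
}
```

A caller in this sketch would treat null as "feed more bytes and ask again" rather than as an error.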
-
handleXmlDeclaration
private int handleXmlDeclaration() throws javax.xml.stream.XMLStreamException
Method called to complete parsing of the XML declaration once it has been reliably detected.
- Returns:
- Completed token (START_DOCUMENT), if fully parsed; incomplete (EVENT_INCOMPLETE) otherwise
- Throws:
javax.xml.stream.XMLStreamException
-
handleDTD
private int handleDTD() throws javax.xml.stream.XMLStreamException
- Throws:
javax.xml.stream.XMLStreamException
-
parseDtdId
private final boolean parseDtdId(char[] outputBuffer, int outputPtr, boolean system) throws javax.xml.stream.XMLStreamException
- Throws:
javax.xml.stream.XMLStreamException
-
_parseNewXmlDeclName
private final PName _parseNewXmlDeclName(byte b) throws javax.xml.stream.XMLStreamException
- Throws:
javax.xml.stream.XMLStreamException
-
_parseXmlDeclName
private final PName _parseXmlDeclName() throws javax.xml.stream.XMLStreamException
- Throws:
javax.xml.stream.XMLStreamException
-
_findXmlDeclName
protected final PName _findXmlDeclName(int lastQuad, int lastByteCount) throws javax.xml.stream.XMLStreamException
- Throws:
javax.xml.stream.XMLStreamException
-
parseXmlDeclAttr
protected boolean parseXmlDeclAttr(char[] outputBuffer, int outputPtr) throws javax.xml.stream.XMLStreamException
Method called to try to parse an XML pseudo-attribute value. This is relatively simple, since there can be no linefeeds or entities; and although there are exact rules for what is allowed, we can do coarse parsing and verify validity only later on (stricter parsing could perhaps be done for encoding in the future).
NOTE: pseudo-attribute values are required to be 7-bit ASCII, so a crude cast can be used.
- Returns:
- True if we managed to parse the whole pseudo-attribute
- Throws:
javax.xml.stream.XMLStreamException
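The "coarse parsing with a crude cast" idea can be sketched as follows. This is an illustration under stated assumptions, not the actual method: the class name PseudoAttrSketch, the fixed 64-character buffer, and the way input is passed in are all hypothetical. Each byte is cast directly to a char until the closing quote is seen, and the method returns false when the available input ends first, so that the caller can resume with more data.

```java
final class PseudoAttrSketch {
    private final char[] out = new char[64]; // small fixed buffer, enough for this sketch
    private int outPtr;
    private final byte quote;                // the quote byte that opened the value

    PseudoAttrSketch(byte quote) {
        this.quote = quote;
    }

    /** Returns true once the closing quote is seen; false if buf[ptr, end) ended first. */
    boolean parse(byte[] buf, int ptr, int end) {
        while (ptr < end) {
            byte b = buf[ptr++];
            if (b == quote) {
                return true;
            }
            if (outPtr >= out.length) {
                throw new IllegalStateException("Value too long for this sketch");
            }
            // Crude cast: only meaningful for 7-bit ASCII; validity is assumed to be checked later.
            out[outPtr++] = (char) b;
        }
        return false;
    }

    String value() {
        return new String(out, 0, outPtr);
    }
}
```

Because the partially parsed value is retained between calls, a value split across two input chunks is reassembled without re-reading earlier bytes.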
-
-