Class AsyncByteScanner
java.lang.Object
com.fasterxml.aalto.in.XmlScanner
com.fasterxml.aalto.in.ByteBasedScanner
com.fasterxml.aalto.async.AsyncByteScanner
- All Implemented Interfaces:
AsyncInputFeeder, XmlConsts, NamespaceContext, XMLStreamConstants
- Direct Known Subclasses:
AsyncByteArrayScanner, AsyncByteBufferScanner
-
Field Summary
Fields
Modifier and Type | Field | Description
protected XmlCharTypes | _charTypes | This is a simple container object that is used to access the decoding tables for characters.
protected int | _currQuad | Bytes parsed for the current, incomplete, quad
protected int | _currQuadBytes | Number of bytes pending/buffered, stored in _currQuad
protected boolean | _elemAllNsBound |
protected boolean | _elemAttrCount |
protected PName | _elemAttrName |
protected int | _elemAttrPtr | Pointer for the next character of currently being parsed value within attribute value buffer
protected byte | _elemAttrQuote |
protected int | _elemNsPtr | Pointer for the next character of currently being parsed namespace URI for the current namespace declaration
protected boolean | _endOfInput | Flag that is set when calling application indicates that there will be no more input to parse.
protected int | _entityValue | Entity value accumulated so far
protected boolean | _inDtdDeclaration | Flag that indicates whether we are inside a declaration during parsing of internal DTD subset.
protected int | _nextEvent | Due to asynchronous nature of parsing, we may know what event we are trying to parse, even if it's not yet complete.
protected int | _pendingInput | There are some multi-byte combinations that must be handled as a unit: CR+LF linefeeds, multi-byte UTF-8 characters, and multi-character end markers for comments and PIs.
protected int[] | _quadBuffer | This buffer is used for name parsing.
protected int | _quadCount | Number of complete quads parsed for current name (quads themselves are stored in _quadBuffer).
protected int | _state | In addition to the event type, there is need for additional state information
protected int | _surroundingEvent | For token/state combinations that are 'shared' between events (or embedded in them), this is where the surrounding event state is retained.
protected ByteBasedPNameTable | _symbols | For now, symbol table contains prefixed names.
protected static final int | STATE_DEFAULT | Default starting state for many events/contexts -- nothing has been seen so far, no event incomplete.
protected static final int | STATE_PROLOG_INITIAL | State in which a less-than sign has been seen
protected static final int | (constants without a summary description) | EVENT_INCOMPLETE, STATE_PROLOG_SEEN_LT, STATE_PROLOG_DECL, STATE_TREE_SEEN_LT, STATE_TREE_SEEN_AMP, STATE_TREE_SEEN_EXCL, STATE_TREE_SEEN_SLASH, STATE_TREE_NUMERIC_ENTITY_START, STATE_TREE_NAMED_ENTITY_START, STATE_XMLDECL_AFTER_XML, STATE_XMLDECL_BEFORE_VERSION, STATE_XMLDECL_VERSION, STATE_XMLDECL_AFTER_VERSION, STATE_XMLDECL_VERSION_EQ, STATE_XMLDECL_VERSION_VALUE, STATE_XMLDECL_AFTER_VERSION_VALUE, STATE_XMLDECL_BEFORE_ENCODING, STATE_XMLDECL_ENCODING, STATE_XMLDECL_AFTER_ENCODING, STATE_XMLDECL_ENCODING_EQ, STATE_XMLDECL_ENCODING_VALUE, STATE_XMLDECL_AFTER_ENCODING_VALUE, STATE_XMLDECL_BEFORE_STANDALONE, STATE_XMLDECL_STANDALONE, STATE_XMLDECL_AFTER_STANDALONE, STATE_XMLDECL_STANDALONE_EQ, STATE_XMLDECL_STANDALONE_VALUE, STATE_XMLDECL_AFTER_STANDALONE_VALUE, STATE_XMLDECL_ENDQ, STATE_DTD_DOCTYPE, STATE_DTD_AFTER_DOCTYPE, STATE_DTD_BEFORE_ROOT_NAME, STATE_DTD_ROOT_NAME, STATE_DTD_AFTER_ROOT_NAME, STATE_DTD_BEFORE_IDS, STATE_DTD_PUBLIC_OR_SYSTEM, STATE_DTD_AFTER_PUBLIC, STATE_DTD_AFTER_SYSTEM, STATE_DTD_BEFORE_PUBLIC_ID, STATE_DTD_PUBLIC_ID, STATE_DTD_AFTER_PUBLIC_ID, STATE_DTD_BEFORE_SYSTEM_ID, STATE_DTD_SYSTEM_ID, STATE_DTD_AFTER_SYSTEM_ID, STATE_DTD_INT_SUBSET, STATE_DTD_EXPECT_CLOSING_GT, STATE_TEXT_AMP, STATE_TEXT_AMP_NAME, STATE_COMMENT_CONTENT, STATE_COMMENT_HYPHEN, STATE_COMMENT_HYPHEN2, STATE_CDATA_CONTENT, STATE_CDATA_C, STATE_CDATA_CD, STATE_CDATA_CDA, STATE_CDATA_CDAT, STATE_CDATA_CDATA, STATE_PI_AFTER_TARGET, STATE_PI_AFTER_TARGET_WS, STATE_PI_AFTER_TARGET_QMARK, STATE_PI_IN_TARGET, STATE_PI_IN_DATA, STATE_SE_ELEM_NAME, STATE_SE_SPACE_OR_END, STATE_SE_SPACE_OR_ATTRNAME, STATE_SE_ATTR_NAME, STATE_SE_SPACE_OR_EQ, STATE_SE_SPACE_OR_ATTRVALUE, STATE_SE_ATTR_VALUE_NORMAL, STATE_SE_ATTR_VALUE_NSDECL, STATE_SE_SEEN_SLASH, STATE_EE_NEED_GT, PENDING_STATE_CR, PENDING_STATE_XMLDECL_LT, PENDING_STATE_XMLDECL_LTQ, PENDING_STATE_XMLDECL_TARGET, PENDING_STATE_PI_QMARK, PENDING_STATE_COMMENT_HYPHEN1, PENDING_STATE_COMMENT_HYPHEN2, PENDING_STATE_CDATA_BRACKET1, PENDING_STATE_CDATA_BRACKET2, PENDING_STATE_ENT_SEEN_HASH, PENDING_STATE_ENT_SEEN_HASH_X, PENDING_STATE_ENT_IN_DEC_DIGIT, PENDING_STATE_ENT_IN_HEX_DIGIT, PENDING_STATE_ATTR_VALUE_AMP, PENDING_STATE_ATTR_VALUE_AMP_HASH, PENDING_STATE_ATTR_VALUE_AMP_HASH_X, PENDING_STATE_ATTR_VALUE_ENTITY_NAME, PENDING_STATE_ATTR_VALUE_DEC_DIGIT, PENDING_STATE_ATTR_VALUE_HEX_DIGIT, PENDING_STATE_TEXT_AMP, PENDING_STATE_TEXT_AMP_HASH, PENDING_STATE_TEXT_DEC_ENTITY, PENDING_STATE_TEXT_HEX_ENTITY, PENDING_STATE_TEXT_IN_ENTITY, PENDING_STATE_TEXT_BRACKET1, PENDING_STATE_TEXT_BRACKET2
Fields inherited from class ByteBasedScanner
_inputEnd, _inputPtr, _tmpChar, BYTE_a, BYTE_A, BYTE_AMP, BYTE_APOS, BYTE_C, BYTE_CR, BYTE_D, BYTE_EQ, BYTE_EXCL, BYTE_g, BYTE_GT, BYTE_HASH, BYTE_HYPHEN, BYTE_l, BYTE_LBRACKET, BYTE_LF, BYTE_LT, BYTE_m, BYTE_NULL, BYTE_o, BYTE_p, BYTE_P, BYTE_q, BYTE_QMARK, BYTE_QUOT, BYTE_RBRACKET, BYTE_s, BYTE_S, BYTE_SEMICOLON, BYTE_SLASH, BYTE_SPACE, BYTE_t, BYTE_T, BYTE_TAB, BYTE_u, BYTE_x
Fields inherited from class XmlScanner
_attrCollector, _attrCount, _cfgCoalescing, _cfgLazyParsing, _config, _currElem, _currNsCount, _currRow, _currToken, _defaultNs, _depth, _entityPending, _isEmptyTag, _lastNsContext, _lastNsDecl, _nameBuffer, _nsBindingCache, _nsBindingCount, _nsBindings, _nsBindMisses, _pastBytesOrChars, _publicId, _rowStartOffset, _startColumn, _startRawOffset, _startRow, _systemId, _textBuilder, _tokenIncomplete, _tokenName, _xml11, CDATA_STR, INT_0, INT_9, INT_a, INT_A, INT_AMP, INT_APOS, INT_COLON, INT_CR, INT_EQ, INT_EXCL, INT_f, INT_F, INT_GT, INT_HYPHEN, INT_LBRACKET, INT_LF, INT_LT, INT_NULL, INT_QMARK, INT_QUOTE, INT_RBRACKET, INT_SLASH, INT_SPACE, INT_TAB, INT_z, MAX_UNICODE_CHAR, TOKEN_EOI
Fields inherited from interface XmlConsts
CHAR_CR, CHAR_LF, CHAR_NULL, CHAR_SPACE, STAX_DEFAULT_OUTPUT_ENCODING, STAX_DEFAULT_OUTPUT_VERSION, XML_DECL_KW_ENCODING, XML_DECL_KW_STANDALONE, XML_DECL_KW_VERSION, XML_SA_NO, XML_SA_YES, XML_V_10, XML_V_10_STR, XML_V_11, XML_V_11_STR, XML_V_UNKNOWN
Fields inherited from interface XMLStreamConstants
ATTRIBUTE, CDATA, CHARACTERS, COMMENT, DTD, END_DOCUMENT, END_ELEMENT, ENTITY_DECLARATION, ENTITY_REFERENCE, NAMESPACE, NOTATION_DECLARATION, PROCESSING_INSTRUCTION, SPACE, START_DOCUMENT, START_ELEMENT
-
Constructor Summary
Constructors -
Method Summary
Modifier and Type | Method | Description
protected void | _activateEncoding() | Initialization method to call when encoding has been definitely figured out, from XML declarations, or, from lack of one (using defaults).
protected void | _closeSource | Since the async scanner has no access to whatever passes content, there is no input source in same sense as with blocking scanner; and there is nothing to close.
protected abstract byte | _currentByte |
protected final PName | _findXmlDeclName(int lastQuad, int lastByteCount) |
protected abstract byte | _nextByte |
private final PName | _parseNewXmlDeclName(byte b) |
private final PName | _parseXmlDeclName |
protected abstract byte | _prevByte |
protected void | _releaseBuffers() |
protected int | _startDocumentNoXmlDecl | Helper method called when it is determined that the document does NOT start with an xml declaration.
protected final PName | addPName(ByteBasedPNameTable symbols, int hash, int[] quads, int qlen, int lastQuadBytes) |
protected abstract boolean | asyncSkipSpace |
protected void | checkPITargetName(PName targetName) |
protected int | decodeCharForError(byte b) | Method called by methods when encountering a byte that cannot be part of a valid character in the current context.
void | endOfInput() | Method that should be called after last chunk of data to parse has been fed.
protected final PName | findPName(int lastQuad, int lastByteCount) | Method called to process a sequence of bytes that is likely to be a PName.
protected void | finishCData |
protected abstract void | finishCharacters |
protected void | finishComment |
protected void | finishDTD(boolean copyContents) |
protected void | finishPI() |
protected void | finishSpace |
protected final void | finishToken | This method is called to ensure that the current token/event has been completely parsed, such that we have all the data needed to return it (textual content, PI data, comment text etc)
protected abstract boolean | handleAttrValue |
protected abstract int | handleComment |
private int | handleDTD |
protected abstract boolean | handleDTDInternalSubset(boolean init) |
protected abstract boolean | handleNsDecl |
protected abstract boolean | handlePartialCR |
protected abstract int | handlePI() |
private final int | handlePrologDeclStart(boolean isProlog) |
protected abstract int | handleStartElement |
protected abstract int | handleStartElementStart(byte b) |
private int | handleXmlDeclaration | Method called to complete parsing of XML declaration, once it has been reliably detected.
protected boolean | loadMore() |
final int | nextFromProlog(boolean isProlog) |
private final boolean | parseDtdId(char[] outputBuffer, int outputPtr, boolean system) |
protected abstract PName | parseNewName(byte b) |
protected abstract PName | parsePName |
protected boolean | parseXmlDeclAttr(char[] outputBuffer, int outputPtr) | Method called to try to parse an XML pseudo-attribute value.
protected void | reportInvalidOther(int mask, int ptr) |
protected void | skipCData |
protected abstract boolean | skipCharacters |
protected void | skipComment |
protected void | skipPI() |
protected void | skipSpace |
protected abstract int | startCharacters(byte b) | Method called to initialize state for CHARACTERS event, after just a single byte has been seen.
private final Boolean | startXmlDeclaration | Method that deals with recognizing XML declaration, but not with parsing its contents.
protected int | throwInternal() |
protected boolean | validPublicIdChar(int c) | Checks that a character is valid for a PublicId
protected void | verifyAndAppendEntityCharacter(int charFromEntity) | Method called to verify validity of given character (from entity) and append it to the text buffer
protected void | verifyAndSetPublicId |
protected void | verifyAndSetSystemId |
protected void | verifyAndSetXmlEncoding |
protected void | verifyAndSetXmlStandalone |
protected void | verifyAndSetXmlVersion |
Methods inherited from class ByteBasedScanner
addUTFPName, getCurrentColumnNr, getCurrentLocation, getEndingByteOffset, getEndingCharOffset, getStartingByteOffset, getStartingCharOffset, markLF, markLF, reportInvalidInitial, reportInvalidOther, setStartLocation
Methods inherited from class XmlScanner
bindName, bindNs, checkImmutableBinding, close, decodeAttrBinaryValue, decodeAttrValue, decodeAttrValues, decodeElements, findAttrIndex, findOrCreateBinding, fireSaxCharacterEvents, fireSaxCommentEvent, fireSaxEndElement, fireSaxPIEvent, fireSaxSpaceEvents, fireSaxStartElement, getAttrCollector, getAttrCount, getAttrLocalName, getAttrNsURI, getAttrPrefix, getAttrPrefixedName, getAttrQName, getAttrType, getAttrValue, getAttrValue, getConfig, getCurrentLineNr, getDepth, getDTDPublicId, getDTDSystemId, getEndLocation, getInputPublicId, getInputSystemId, getName, getNamespacePrefix, getNamespaceURI, getNamespaceURI, getNamespaceURI, getNonTransientNamespaceContext, getNsCount, getPrefix, getPrefixes, getQName, getStartLocation, getText, getText, getTextCharacters, getTextCharacters, getTextLength, handleInvalidXmlChar, hasEmptyStack, isAttrSpecified, isEmptyTag, isTextWhitespace, loadMoreGuaranteed, loadMoreGuaranteed, nextFromTree, reportDoubleHyphenInComments, reportDuplicateNsDecl, reportEntityOverflow, reportEofInName, reportIllegalCDataEnd, reportIllegalNsDecl, reportIllegalNsDecl, reportInputProblem, reportInvalidNameChar, reportInvalidNsIndex, reportInvalidXmlChar, reportMissingPISpace, reportMultipleColonsInName, reportPrologProblem, reportPrologUnexpChar, reportPrologUnexpElement, reportTreeUnexpChar, reportUnboundPrefix, reportUnexpandedEntityInAttr, reportUnexpectedEndTag, resetForDecoding, skipCoalescedText, skipToken, throwInvalidSpace, throwNullChar, throwUnexpectedChar, verifyXmlChar
Methods inherited from class Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
Methods inherited from interface AsyncInputFeeder
needMoreInput
-
Field Details
-
EVENT_INCOMPLETE
protected static final int EVENT_INCOMPLETE
-
STATE_DEFAULT
protected static final int STATE_DEFAULT
Default starting state for many events/contexts -- nothing has been seen so far, no event incomplete. Not used for all event types.
-
STATE_PROLOG_INITIAL
protected static final int STATE_PROLOG_INITIAL
State in which a less-than sign has been seen
-
STATE_PROLOG_SEEN_LT
protected static final int STATE_PROLOG_SEEN_LT
-
STATE_PROLOG_DECL
protected static final int STATE_PROLOG_DECL
-
STATE_TREE_SEEN_LT
protected static final int STATE_TREE_SEEN_LT
-
STATE_TREE_SEEN_AMP
protected static final int STATE_TREE_SEEN_AMP
-
STATE_TREE_SEEN_EXCL
protected static final int STATE_TREE_SEEN_EXCL
-
STATE_TREE_SEEN_SLASH
protected static final int STATE_TREE_SEEN_SLASH
-
STATE_TREE_NUMERIC_ENTITY_START
protected static final int STATE_TREE_NUMERIC_ENTITY_START
-
STATE_TREE_NAMED_ENTITY_START
protected static final int STATE_TREE_NAMED_ENTITY_START
-
STATE_XMLDECL_AFTER_XML
protected static final int STATE_XMLDECL_AFTER_XML
-
STATE_XMLDECL_BEFORE_VERSION
protected static final int STATE_XMLDECL_BEFORE_VERSION
-
STATE_XMLDECL_VERSION
protected static final int STATE_XMLDECL_VERSION
-
STATE_XMLDECL_AFTER_VERSION
protected static final int STATE_XMLDECL_AFTER_VERSION
-
STATE_XMLDECL_VERSION_EQ
protected static final int STATE_XMLDECL_VERSION_EQ
-
STATE_XMLDECL_VERSION_VALUE
protected static final int STATE_XMLDECL_VERSION_VALUE
-
STATE_XMLDECL_AFTER_VERSION_VALUE
protected static final int STATE_XMLDECL_AFTER_VERSION_VALUE
-
STATE_XMLDECL_BEFORE_ENCODING
protected static final int STATE_XMLDECL_BEFORE_ENCODING
-
STATE_XMLDECL_ENCODING
protected static final int STATE_XMLDECL_ENCODING
-
STATE_XMLDECL_AFTER_ENCODING
protected static final int STATE_XMLDECL_AFTER_ENCODING
-
STATE_XMLDECL_ENCODING_EQ
protected static final int STATE_XMLDECL_ENCODING_EQ
-
STATE_XMLDECL_ENCODING_VALUE
protected static final int STATE_XMLDECL_ENCODING_VALUE
-
STATE_XMLDECL_AFTER_ENCODING_VALUE
protected static final int STATE_XMLDECL_AFTER_ENCODING_VALUE
-
STATE_XMLDECL_BEFORE_STANDALONE
protected static final int STATE_XMLDECL_BEFORE_STANDALONE
-
STATE_XMLDECL_STANDALONE
protected static final int STATE_XMLDECL_STANDALONE
-
STATE_XMLDECL_AFTER_STANDALONE
protected static final int STATE_XMLDECL_AFTER_STANDALONE
-
STATE_XMLDECL_STANDALONE_EQ
protected static final int STATE_XMLDECL_STANDALONE_EQ
-
STATE_XMLDECL_STANDALONE_VALUE
protected static final int STATE_XMLDECL_STANDALONE_VALUE
-
STATE_XMLDECL_AFTER_STANDALONE_VALUE
protected static final int STATE_XMLDECL_AFTER_STANDALONE_VALUE
-
STATE_XMLDECL_ENDQ
protected static final int STATE_XMLDECL_ENDQ
-
STATE_DTD_DOCTYPE
protected static final int STATE_DTD_DOCTYPE
-
STATE_DTD_AFTER_DOCTYPE
protected static final int STATE_DTD_AFTER_DOCTYPE
-
STATE_DTD_BEFORE_ROOT_NAME
protected static final int STATE_DTD_BEFORE_ROOT_NAME
-
STATE_DTD_ROOT_NAME
protected static final int STATE_DTD_ROOT_NAME
-
STATE_DTD_AFTER_ROOT_NAME
protected static final int STATE_DTD_AFTER_ROOT_NAME
-
STATE_DTD_BEFORE_IDS
protected static final int STATE_DTD_BEFORE_IDS
-
STATE_DTD_PUBLIC_OR_SYSTEM
protected static final int STATE_DTD_PUBLIC_OR_SYSTEM
-
STATE_DTD_AFTER_PUBLIC
protected static final int STATE_DTD_AFTER_PUBLIC
-
STATE_DTD_AFTER_SYSTEM
protected static final int STATE_DTD_AFTER_SYSTEM
-
STATE_DTD_BEFORE_PUBLIC_ID
protected static final int STATE_DTD_BEFORE_PUBLIC_ID
-
STATE_DTD_PUBLIC_ID
protected static final int STATE_DTD_PUBLIC_ID
-
STATE_DTD_AFTER_PUBLIC_ID
protected static final int STATE_DTD_AFTER_PUBLIC_ID
-
STATE_DTD_BEFORE_SYSTEM_ID
protected static final int STATE_DTD_BEFORE_SYSTEM_ID
-
STATE_DTD_SYSTEM_ID
protected static final int STATE_DTD_SYSTEM_ID
-
STATE_DTD_AFTER_SYSTEM_ID
protected static final int STATE_DTD_AFTER_SYSTEM_ID
-
STATE_DTD_INT_SUBSET
protected static final int STATE_DTD_INT_SUBSET
-
STATE_DTD_EXPECT_CLOSING_GT
protected static final int STATE_DTD_EXPECT_CLOSING_GT
-
STATE_TEXT_AMP
protected static final int STATE_TEXT_AMP
-
STATE_TEXT_AMP_NAME
protected static final int STATE_TEXT_AMP_NAME
-
STATE_COMMENT_CONTENT
protected static final int STATE_COMMENT_CONTENT
-
STATE_COMMENT_HYPHEN
protected static final int STATE_COMMENT_HYPHEN
-
STATE_COMMENT_HYPHEN2
protected static final int STATE_COMMENT_HYPHEN2
-
STATE_CDATA_CONTENT
protected static final int STATE_CDATA_CONTENT
-
STATE_CDATA_C
protected static final int STATE_CDATA_C
-
STATE_CDATA_CD
protected static final int STATE_CDATA_CD
-
STATE_CDATA_CDA
protected static final int STATE_CDATA_CDA
-
STATE_CDATA_CDAT
protected static final int STATE_CDATA_CDAT
-
STATE_CDATA_CDATA
protected static final int STATE_CDATA_CDATA
-
STATE_PI_AFTER_TARGET
protected static final int STATE_PI_AFTER_TARGET
-
STATE_PI_AFTER_TARGET_WS
protected static final int STATE_PI_AFTER_TARGET_WS
-
STATE_PI_AFTER_TARGET_QMARK
protected static final int STATE_PI_AFTER_TARGET_QMARK
-
STATE_PI_IN_TARGET
protected static final int STATE_PI_IN_TARGET
-
STATE_PI_IN_DATA
protected static final int STATE_PI_IN_DATA
-
STATE_SE_ELEM_NAME
protected static final int STATE_SE_ELEM_NAME
-
STATE_SE_SPACE_OR_END
protected static final int STATE_SE_SPACE_OR_END
-
STATE_SE_SPACE_OR_ATTRNAME
protected static final int STATE_SE_SPACE_OR_ATTRNAME
-
STATE_SE_ATTR_NAME
protected static final int STATE_SE_ATTR_NAME
-
STATE_SE_SPACE_OR_EQ
protected static final int STATE_SE_SPACE_OR_EQ
-
STATE_SE_SPACE_OR_ATTRVALUE
protected static final int STATE_SE_SPACE_OR_ATTRVALUE
-
STATE_SE_ATTR_VALUE_NORMAL
protected static final int STATE_SE_ATTR_VALUE_NORMAL
-
STATE_SE_ATTR_VALUE_NSDECL
protected static final int STATE_SE_ATTR_VALUE_NSDECL
-
STATE_SE_SEEN_SLASH
protected static final int STATE_SE_SEEN_SLASH
-
STATE_EE_NEED_GT
protected static final int STATE_EE_NEED_GT
-
PENDING_STATE_CR
protected static final int PENDING_STATE_CR
-
PENDING_STATE_XMLDECL_LT
protected static final int PENDING_STATE_XMLDECL_LT
-
PENDING_STATE_XMLDECL_LTQ
protected static final int PENDING_STATE_XMLDECL_LTQ
-
PENDING_STATE_XMLDECL_TARGET
protected static final int PENDING_STATE_XMLDECL_TARGET
-
PENDING_STATE_PI_QMARK
protected static final int PENDING_STATE_PI_QMARK
-
PENDING_STATE_COMMENT_HYPHEN1
protected static final int PENDING_STATE_COMMENT_HYPHEN1
-
PENDING_STATE_COMMENT_HYPHEN2
protected static final int PENDING_STATE_COMMENT_HYPHEN2
-
PENDING_STATE_CDATA_BRACKET1
protected static final int PENDING_STATE_CDATA_BRACKET1
-
PENDING_STATE_CDATA_BRACKET2
protected static final int PENDING_STATE_CDATA_BRACKET2
-
PENDING_STATE_ENT_SEEN_HASH
protected static final int PENDING_STATE_ENT_SEEN_HASH
-
PENDING_STATE_ENT_SEEN_HASH_X
protected static final int PENDING_STATE_ENT_SEEN_HASH_X
-
PENDING_STATE_ENT_IN_DEC_DIGIT
protected static final int PENDING_STATE_ENT_IN_DEC_DIGIT
-
PENDING_STATE_ENT_IN_HEX_DIGIT
protected static final int PENDING_STATE_ENT_IN_HEX_DIGIT
-
PENDING_STATE_ATTR_VALUE_AMP
protected static final int PENDING_STATE_ATTR_VALUE_AMP
-
PENDING_STATE_ATTR_VALUE_AMP_HASH
protected static final int PENDING_STATE_ATTR_VALUE_AMP_HASH
-
PENDING_STATE_ATTR_VALUE_AMP_HASH_X
protected static final int PENDING_STATE_ATTR_VALUE_AMP_HASH_X
-
PENDING_STATE_ATTR_VALUE_ENTITY_NAME
protected static final int PENDING_STATE_ATTR_VALUE_ENTITY_NAME
-
PENDING_STATE_ATTR_VALUE_DEC_DIGIT
protected static final int PENDING_STATE_ATTR_VALUE_DEC_DIGIT
-
PENDING_STATE_ATTR_VALUE_HEX_DIGIT
protected static final int PENDING_STATE_ATTR_VALUE_HEX_DIGIT
-
PENDING_STATE_TEXT_AMP
protected static final int PENDING_STATE_TEXT_AMP
-
PENDING_STATE_TEXT_AMP_HASH
protected static final int PENDING_STATE_TEXT_AMP_HASH
-
PENDING_STATE_TEXT_DEC_ENTITY
protected static final int PENDING_STATE_TEXT_DEC_ENTITY
-
PENDING_STATE_TEXT_HEX_ENTITY
protected static final int PENDING_STATE_TEXT_HEX_ENTITY
-
PENDING_STATE_TEXT_IN_ENTITY
protected static final int PENDING_STATE_TEXT_IN_ENTITY
-
PENDING_STATE_TEXT_BRACKET1
protected static final int PENDING_STATE_TEXT_BRACKET1
-
PENDING_STATE_TEXT_BRACKET2
protected static final int PENDING_STATE_TEXT_BRACKET2
-
_charTypes
protected XmlCharTypes _charTypes
This is a simple container object that is used to access the decoding tables for characters. Indirection is needed since we actually support multiple UTF-8 compatible encodings, not just UTF-8 itself.
NOTE: non-final due to XML declaration handling occurring later.
-
_symbols
protected ByteBasedPNameTable _symbols
For now, symbol table contains prefixed names. In future it is possible that they may be split into prefixes and local names?
NOTE: non-final for async scanners
-
_quadBuffer
protected int[] _quadBuffer
This buffer is used for name parsing. Will be expanded if/as needed; 32 ints can hold names 128 ASCII chars long.
-
_nextEvent
protected int _nextEvent
Due to asynchronous nature of parsing, we may know what event we are trying to parse, even if it's not yet complete. Type of that event is stored here.
-
_state
protected int _state
In addition to the event type, there is need for additional state information
-
_surroundingEvent
protected int _surroundingEvent
For token/state combinations that are 'shared' between events (or embedded in them), this is where the surrounding event state is retained.
-
_pendingInput
protected int _pendingInput
There are some multi-byte combinations that must be handled as a unit: CR+LF linefeeds, multi-byte UTF-8 characters, and multi-character end markers for comments and PIs. Since they can be split across input buffer boundaries, first byte(s) may need to be temporarily stored.
If so, this int will store byte(s), in little-endian format (that is, first pending byte is at 0x000000FF, second [if any] at 0x0000FF00, and third at 0x00FF0000). This can be (and is) used to figure out actual number of bytes pending, for multi-byte (UTF-8) character decoding.
Note: it is assumed that if value is 0, there is no data. Thus, if a 0 byte needs to be stored as pending, it has to be masked.
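The packing scheme described for _pendingInput can be illustrated with a small standalone sketch. The class and method names below are hypothetical and are not part of Aalto; the code only mirrors the byte layout stated above (first pending byte in the lowest 8 bits) and assumes the stored bytes are non-zero, as UTF-8 lead and continuation bytes always are.

```java
// Hypothetical illustration of the little-endian pending-byte layout; not Aalto code.
final class PendingBytesSketch {

    /** Stores byte b as pending byte number pendingCount (0-based); byte 0 lands in the lowest 8 bits. */
    static int addPending(int pending, int pendingCount, byte b) {
        return pending | ((b & 0xFF) << (8 * pendingCount));
    }

    /** Derives the number of pending bytes from the occupied byte slots (assumes non-zero bytes). */
    static int pendingCount(int pending) {
        if (pending == 0) return 0;
        if ((pending & 0x00FF0000) != 0) return 3;
        if ((pending & 0x0000FF00) != 0) return 2;
        return 1;
    }

    /** Reads back pending byte number index (0-based) as an unsigned value. */
    static int byteAt(int pending, int index) {
        return (pending >> (8 * index)) & 0xFF;
    }

    public static void main(String[] args) {
        // First two bytes of the three-byte UTF-8 sequence E2 82 AC (U+20AC),
        // as if the input buffer ended before the third byte arrived.
        int pending = 0;
        pending = addPending(pending, 0, (byte) 0xE2);
        pending = addPending(pending, 1, (byte) 0x82);
        System.out.printf("0x%08X, %d byte(s) pending%n", pending, pendingCount(pending));
    }
}
```

Because a value of 0 means "nothing pending", a literal zero byte could not be represented this way without extra masking, which is the caveat the note above refers to.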
-
_endOfInput
protected boolean _endOfInput
Flag that is set when the calling application indicates that there will be no more input to parse.
-
_quadCount
protected int _quadCount
Number of complete quads parsed for current name (quads themselves are stored in _quadBuffer).
-
_currQuad
protected int _currQuad
Bytes parsed for the current, incomplete, quad
-
_currQuadBytes
protected int _currQuadBytes
Number of bytes pending/buffered, stored in _currQuad
-
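How the quad-related fields cooperate can be sketched as follows. This is a minimal illustration, not Aalto's implementation: the class is hypothetical, the field names merely echo the ones documented above, and the byte order inside a quad (earlier bytes in higher bits) is an assumption.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical sketch of packing name bytes four at a time into ints ("quads").
final class QuadPackingSketch {
    int[] quadBuffer = new int[32]; // 32 ints hold 128 single-byte (ASCII) name characters
    int quadCount;                  // number of complete quads stored in quadBuffer
    int currQuad;                   // bytes of the current, incomplete quad
    int currQuadBytes;              // number of bytes held in currQuad (0 to 3)

    /** Adds one name byte; flushes currQuad into quadBuffer once four bytes are collected. */
    void addByte(byte b) {
        currQuad = (currQuad << 8) | (b & 0xFF); // assumed order: earlier bytes end up in higher bits
        if (++currQuadBytes == 4) {
            if (quadCount == quadBuffer.length) {
                quadBuffer = Arrays.copyOf(quadBuffer, quadBuffer.length * 2); // expand as needed
            }
            quadBuffer[quadCount++] = currQuad;
            currQuad = 0;
            currQuadBytes = 0;
        }
    }

    public static void main(String[] args) {
        QuadPackingSketch s = new QuadPackingSketch();
        for (byte b : "element".getBytes(StandardCharsets.US_ASCII)) {
            s.addByte(b);
        }
        // Seven bytes give one complete quad ("elem") and three bytes ("ent") in the partial quad.
        System.out.println(s.quadCount + " complete quad(s), " + s.currQuadBytes + " byte(s) in the partial quad");
    }
}
```

Keeping the partial quad and its byte count in fields rather than locals is what lets name parsing resume after the input buffer runs out in the middle of a name.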
_entityValue
protected int _entityValue
Entity value accumulated so far
-
_elemAllNsBound
protected boolean _elemAllNsBound
-
_elemAttrCount
protected boolean _elemAttrCount
-
_elemAttrQuote
protected byte _elemAttrQuote
-
_elemAttrName
protected PName _elemAttrName
-
_elemAttrPtr
protected int _elemAttrPtr
Pointer to the next character of the value currently being parsed, within the attribute value buffer
-
_elemNsPtr
protected int _elemNsPtr
Pointer to the next character of the namespace URI currently being parsed for the current namespace declaration
-
_inDtdDeclaration
protected boolean _inDtdDeclaration
Flag that indicates whether we are inside a declaration during parsing of internal DTD subset.
-
-
Constructor Details
-
AsyncByteScanner
-
-
Method Details
-
_activateEncoding
protected void _activateEncoding()
Initialization method to call when encoding has been definitely figured out, either from the XML declaration or from the lack of one (using defaults).
- Since:
- 1.1.1
-
endOfInput
public void endOfInput()
Description copied from interface: AsyncInputFeeder
Method that should be called after last chunk of data to parse has been fed. May be called regardless of what AsyncInputFeeder.needMoreInput() returns. After calling this method, no more data can be fed; and parser assumes no more data will be available.
- Specified by:
endOfInput in interface AsyncInputFeeder
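The feeding contract described for endOfInput can be sketched with a toy feeder. Everything below is hypothetical and self-contained: it does not use Aalto classes, and the feedInput and next methods as well as the END and NEED_MORE markers are invented for illustration. Only the relationship between needMoreInput() and endOfInput() follows the description above.

```java
import java.nio.charset.StandardCharsets;

// Toy model of a non-blocking input feeder; hypothetical, not Aalto's AsyncInputFeeder.
final class FeederContractSketch {
    static final int END = -1;       // no more input will ever arrive
    static final int NEED_MORE = -2; // current chunk is exhausted; caller must feed another

    private byte[] chunk = new byte[0];
    private int pos;
    private boolean endOfInput;

    /** True only while the current chunk is used up and end of input has not been signalled. */
    boolean needMoreInput() {
        return pos >= chunk.length && !endOfInput;
    }

    /** Hands over the next chunk; rejected once endOfInput() has been called. */
    void feedInput(byte[] data) {
        if (endOfInput) throw new IllegalStateException("endOfInput() was already called");
        if (pos < chunk.length) throw new IllegalStateException("previous chunk not fully consumed");
        chunk = data;
        pos = 0;
    }

    /** May be called at any time, regardless of what needMoreInput() returns. */
    void endOfInput() {
        endOfInput = true;
    }

    /** Returns the next byte as an unsigned value, or END / NEED_MORE when none is available. */
    int next() {
        if (pos < chunk.length) return chunk[pos++] & 0xFF;
        return endOfInput ? END : NEED_MORE;
    }

    public static void main(String[] args) {
        FeederContractSketch feeder = new FeederContractSketch();
        byte[][] chunks = {
            "<ro".getBytes(StandardCharsets.UTF_8),
            "ot/>".getBytes(StandardCharsets.UTF_8)
        };
        int consumed = 0;
        for (byte[] c : chunks) {
            feeder.feedInput(c);
            while (feeder.next() >= 0) consumed++; // stops at NEED_MORE
        }
        feeder.endOfInput();
        System.out.println(consumed + " bytes consumed, end reached: " + (feeder.next() == END));
    }
}
```

The distinction between "need more" and "end" is the point of the flag: without endOfInput(), a consumer cannot tell a pause in delivery from a truncated document.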
-
_releaseBuffers
protected void _releaseBuffers()
- Overrides:
_releaseBuffers in class XmlScanner
-
_closeSource
Since the async scanner has no access to whatever passes content, there is no input source in same sense as with blocking scanner; and there is nothing to close. But we can at least mark input as having ended.
- Specified by:
_closeSource in class ByteBasedScanner
- Throws:
IOException
-
verifyAndSetXmlVersion
- Throws:
XMLStreamException
-
verifyAndSetXmlEncoding
- Throws:
XMLStreamException
-
verifyAndSetXmlStandalone
- Throws:
XMLStreamException
-
verifyAndSetPublicId
- Throws:
XMLStreamException
-
verifyAndSetSystemId
- Throws:
XMLStreamException
-
_currentByte
- Throws:
XMLStreamException
-
_nextByte
- Throws:
XMLStreamException
-
_prevByte
- Throws:
XMLStreamException
-
handlePI
- Throws:
XMLStreamException
-
handleDTDInternalSubset
- Throws:
XMLStreamException
-
handleComment
- Throws:
XMLStreamException
-
handleStartElementStart
- Throws:
XMLStreamException
-
handleStartElement
- Throws:
XMLStreamException
-
parsePName
- Throws:
XMLStreamException
-
parseNewName
- Throws:
XMLStreamException
-
asyncSkipSpace
- Throws:
XMLStreamException
-
handlePartialCR
- Throws:
XMLStreamException
-
finishToken
Description copied from class: XmlScanner
This method is called to ensure that the current token/event has been completely parsed, such that we have all the data needed to return it (textual content, PI data, comment text etc)
- Specified by:
finishToken in class XmlScanner
- Throws:
XMLStreamException
-
startCharacters
Method called to initialize state for CHARACTERS event, after just a single byte has been seen. What needs to be done next depends on whether coalescing mode is set or not: if it is not set, just a single character needs to be decoded, after which current event will be incomplete, but defined as CHARACTERS. In coalescing mode, the whole content must be read before current event can be defined. The reason for difference is that when XMLStreamReader.next() returns, no blocking can occur when calling other methods.
- Returns:
- Event type detected; either CHARACTERS, if at least one full character was decoded (and can be returned), EVENT_INCOMPLETE if not (part of a multi-byte character split across input buffer boundary)
- Throws:
XMLStreamException
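The return-value rule for the non-coalescing case can be sketched in isolation. The class below is hypothetical and much simpler than a real scanner: it only checks whether the whole UTF-8 character starting at the given position is already in the buffer. CHARACTERS uses the value defined by javax.xml.stream.XMLStreamConstants; the value chosen here for EVENT_INCOMPLETE is a placeholder.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical sketch: decide between CHARACTERS and "incomplete" from the first byte alone.
final class SplitCharSketch {
    static final int CHARACTERS = 4;         // same value as XMLStreamConstants.CHARACTERS
    static final int EVENT_INCOMPLETE = 257; // placeholder value for this sketch

    /** Length in bytes of the UTF-8 sequence introduced by the given lead byte. */
    static int utf8Length(byte lead) {
        int c = lead & 0xFF;
        if (c < 0x80) return 1;
        if ((c & 0xE0) == 0xC0) return 2;
        if ((c & 0xF0) == 0xE0) return 3;
        if ((c & 0xF8) == 0xF0) return 4;
        throw new IllegalArgumentException("not a UTF-8 lead byte: 0x" + Integer.toHexString(c));
    }

    /** CHARACTERS if one full character is available in buf[ptr..end), EVENT_INCOMPLETE otherwise. */
    static int startCharacters(byte[] buf, int ptr, int end) {
        int needed = utf8Length(buf[ptr]);
        return (end - ptr >= needed) ? CHARACTERS : EVENT_INCOMPLETE;
    }

    public static void main(String[] args) {
        byte[] euro = "\u20AC".getBytes(StandardCharsets.UTF_8); // three bytes: E2 82 AC
        System.out.println(startCharacters(euro, 0, 2) == EVENT_INCOMPLETE); // split across buffers
        System.out.println(startCharacters(euro, 0, 3) == CHARACTERS);       // fully available
    }
}
```

A real scanner would additionally stash the bytes seen so far (see _pendingInput) so decoding can continue when the next buffer arrives.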
-
handleAttrValue
- Throws:
XMLStreamException
-
handleNsDecl
- Throws:
XMLStreamException
-
finishCData
- Specified by:
finishCData in class XmlScanner
- Throws:
XMLStreamException
-
finishComment
- Specified by:
finishComment in class XmlScanner
- Throws:
XMLStreamException
-
finishDTD
- Specified by:
finishDTD in class XmlScanner
- Throws:
XMLStreamException
-
finishPI
- Specified by:
finishPI in class XmlScanner
- Throws:
XMLStreamException
-
finishSpace
- Specified by:
finishSpace in class XmlScanner
- Throws:
XMLStreamException
-
skipCharacters
- Specified by:
skipCharacters in class XmlScanner
- Returns:
- True if the whole characters segment was successfully skipped; false if not
- Throws:
XMLStreamException
-
skipCData
- Specified by:
skipCData in class XmlScanner
- Throws:
XMLStreamException
-
skipComment
- Specified by:
skipComment in class XmlScanner
- Throws:
XMLStreamException
-
skipPI
- Specified by:
skipPI in class XmlScanner
- Throws:
XMLStreamException
-
skipSpace
- Specified by:
skipSpace in class XmlScanner
- Throws:
XMLStreamException
-
loadMore
- Specified by:
loadMore in class XmlScanner
- Throws:
XMLStreamException
-
finishCharacters
- Specified by:
finishCharacters in class XmlScanner
- Throws:
XMLStreamException
-
findPName
Method called to process a sequence of bytes that is likely to be a PName. At this point we encountered an end marker, and may either hit a formerly seen well-formed PName; an as-of-yet unseen well-formed PName; or a non-well-formed sequence (containing one or more non-name chars without any valid end markers).
- Parameters:
lastQuad - Word with last 0 to 3 bytes of the PName; not included in the quad array
lastByteCount - Number of bytes contained in lastQuad; 0 to 3.
- Throws:
XMLStreamException
-
addPName
protected final PName addPName(ByteBasedPNameTable symbols, int hash, int[] quads, int qlen, int lastQuadBytes) throws XMLStreamException
- Throws:
XMLStreamException
-
verifyAndAppendEntityCharacter
Method called to verify validity of given character (from entity) and append it to the text buffer
- Throws:
XMLStreamException
-
validPublicIdChar
protected boolean validPublicIdChar(int c)
Checks that a character is valid for a PublicId
- Parameters:
c - A character
- Returns:
- true if the character is valid for use in the Public ID of an XML doctype declaration
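For reference, the characters allowed in a public identifier are given by the PubidChar production of the XML 1.0 specification: space, CR, LF, ASCII letters and digits, and the punctuation characters -'()+,./:=?;!*#@$_%. A minimal sketch of such a check follows; the class is hypothetical, and Aalto's own validPublicIdChar may be implemented differently (for example with a lookup table).

```java
// Hypothetical check following the XML 1.0 PubidChar production; not Aalto's implementation.
final class PubidCharSketch {
    private static final String PUNCTUATION = "-'()+,./:=?;!*#@$_%";

    /** True if c may appear in the public identifier of a DOCTYPE declaration. */
    static boolean isPubidChar(int c) {
        if (c == 0x20 || c == 0x0D || c == 0x0A) return true; // space, CR, LF (tab is not allowed)
        if ((c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z') || (c >= '0' && c <= '9')) return true;
        return PUNCTUATION.indexOf(c) >= 0;
    }

    public static void main(String[] args) {
        String id = "-//W3C//DTD XHTML 1.0 Strict//EN";
        boolean allValid = id.chars().allMatch(PubidCharSketch::isPubidChar);
        System.out.println("all characters valid: " + allValid);
    }
}
```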
-
decodeCharForError
Description copied from class: ByteBasedScanner
Method called by methods when encountering a byte that cannot be part of a valid character in the current context. Should return the actual decoded character for error reporting purposes.
- Specified by:
decodeCharForError in class ByteBasedScanner
- Throws:
XMLStreamException
-
checkPITargetName
- Throws:
XMLStreamException
-
throwInternal
protected int throwInternal()
-
reportInvalidOther
- Throws:
XMLStreamException
-
nextFromProlog
- Specified by:
nextFromProlog in class XmlScanner
- Throws:
XMLStreamException
-
_startDocumentNoXmlDecl
Helper method called when it is determined that the document does NOT start with an xml declaration. Needs to return START_DOCUMENT, and initialize other state appropriately.
- Throws:
XMLStreamException
-
handlePrologDeclStart
- Throws:
XMLStreamException
-
startXmlDeclaration
Method that deals with recognizing XML declaration, but not with parsing its contents.
- Returns:
- null if parsing is inconclusive (may or may not be XML declaration); Boolean.TRUE if complete XML declaration, and Boolean.FALSE if something else
- Throws:
XMLStreamException
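The three-valued result (null, Boolean.TRUE, Boolean.FALSE) is a natural fit for input that arrives in pieces. The sketch below is a simplified, hypothetical illustration and not the scanner's actual logic: it inspects only the bytes available so far and reports TRUE as soon as the opening "<?xml" plus one whitespace byte has been seen.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical tri-state detection of an XML declaration start in partially available input.
final class XmlDeclDetectSketch {
    private static final byte[] PREFIX = {'<', '?', 'x', 'm', 'l'};

    /** null: not enough input yet; TRUE: starts like an XML declaration; FALSE: something else. */
    static Boolean looksLikeXmlDecl(byte[] buf, int len) {
        for (int i = 0; i < PREFIX.length; i++) {
            if (i >= len) return null;                     // ran out of input: inconclusive
            if (buf[i] != PREFIX[i]) return Boolean.FALSE; // mismatch: definitely not a declaration
        }
        if (len == PREFIX.length) return null;             // need one more byte to rule out e.g. <?xml-stylesheet
        byte next = buf[PREFIX.length];
        boolean whitespace = next == ' ' || next == '\t' || next == '\r' || next == '\n';
        return whitespace ? Boolean.TRUE : Boolean.FALSE;
    }

    public static void main(String[] args) {
        String[] samples = {"<?x", "<?xml version='1.0'?>", "<root/>", "<?xml-stylesheet href='a.xsl'?>"};
        for (String s : samples) {
            byte[] b = s.getBytes(StandardCharsets.US_ASCII);
            System.out.println(s + " -> " + looksLikeXmlDecl(b, b.length));
        }
    }
}
```

Using a nullable Boolean instead of a two-valued boolean lets the caller distinguish "ask again when more bytes arrive" from a definite negative.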
-
handleXmlDeclaration
Method called to complete parsing of XML declaration, once it has been reliably detected.
- Returns:
- Completed token (START_DOCUMENT), if fully parsed; incomplete (EVENT_INCOMPLETE) otherwise
- Throws:
XMLStreamException
-
handleDTD
- Throws:
XMLStreamException
-
parseDtdId
private final boolean parseDtdId(char[] outputBuffer, int outputPtr, boolean system) throws XMLStreamException
- Throws:
XMLStreamException
-
_parseNewXmlDeclName
- Throws:
XMLStreamException
-
_parseXmlDeclName
- Throws:
XMLStreamException
-
_findXmlDeclName
- Throws:
XMLStreamException
-
parseXmlDeclAttr
Method called to try to parse an XML pseudo-attribute value. This is relatively simple, since we can't have linefeeds or entities; and although there are exact rules for what is allowed, we can do coarse parsing and only later on verify validity (for encoding could do stricter parsing in future?)
NOTE: pseudo-attribute values are required to be 7-bit ASCII, so a crude cast from byte to char is sufficient.
- Returns:
- True if we managed to parse the whole pseudo-attribute
- Throws:
XMLStreamException
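The "crude cast" approach can be sketched as a resumable copy loop. The class below is hypothetical and simplified: it keeps its own output position between calls, copies bytes to chars until the closing quote, and returns false when the chunk ends first. Unlike the real method, it does not track how far into the input it has read.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical resumable parser for a quoted, ASCII-only pseudo-attribute value.
final class PseudoAttrSketch {
    final char[] value = new char[64];
    int length; // number of chars copied so far; survives across chunks

    /** Copies bytes until the closing quote; true if the value is complete, false if more input is needed. */
    boolean parse(byte[] chunk, byte quote) {
        for (byte b : chunk) {
            if (b == quote) return true; // bytes after the quote are ignored in this sketch
            if (length == value.length) {
                throw new IllegalStateException("pseudo-attribute value too long");
            }
            value[length++] = (char) b;  // crude cast: only correct for 7-bit ASCII
        }
        return false;
    }

    public static void main(String[] args) {
        PseudoAttrSketch p = new PseudoAttrSketch();
        boolean done = p.parse("UTF".getBytes(StandardCharsets.US_ASCII), (byte) '"');
        System.out.println("after first chunk, complete: " + done);
        done = p.parse("-8\" ?>".getBytes(StandardCharsets.US_ASCII), (byte) '"');
        System.out.println("after second chunk, complete: " + done + ", value: " + new String(p.value, 0, p.length));
    }
}
```

Validation of the collected text (for example, that a version value is 1.0 or 1.1) can then happen afterwards, which matches the coarse-parse-then-verify approach described above.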
-