Class PullParser
java.lang.Object
org.gjt.xpp.impl.pullparser.PullParser
- All Implemented Interfaces:
XmlPullParser, XmlPullParserBufferControl, XmlPullParserEventPosition
public class PullParser
extends Object
implements XmlPullParser, XmlPullParserBufferControl, XmlPullParserEventPosition
XML Pull Parser (XPP) allows pulling XML events from an input stream.
Advantages:
- very simple pull interface - ideal for deserializing XML objects (like SOAP)
- simple and efficient thin wrapper around the Tokenizer class - compared with using Tokenizer directly, it adds about 10% processing time for big documents and at most 50% for small documents
- lightweight memory model - minimized memory allocation: element content and attributes are only read on explicit method calls, both StartTag and EndTag can be reused during parsing
- small - total compiled size around 20K
- by default supports namespaces parsing (can be switched off)
- support for mixed content can be explicitly disabled
Limitations:
- this is a beta version - it may still have bugs :-)
- does not parse DTD (recognizes only predefined entities)
- Author:
- Aleksander Slominski
-
Field Summary
Fields
Modifier and Type, Field: Description
- protected Attribute[] attrPos: temporary array of current attributes
- protected int attrPosEnd: index of the last attribute in the attrPos array
- protected int attrPosSize: size of the attrPos array
- protected static final boolean CHECK_ATTRIB_UNIQ: Should attribute uniqueness be checked, as specified in the XML and NS specifications?
- protected String elContent: content of the current element if in CONTENT state
- protected ElementContent[] elStack: temporary array that keeps the ElementContent stack
- protected int elStackDepth: how many elements are on elStack
- protected int elStackSize: size of the elStack array
- protected boolean emptyElement: Have we read an empty element?
- protected int eventEnd: end position of the current event in the tokenizer buffer
- protected int eventStart: start position of the current event in the tokenizer buffer
- protected Hashtable prefix2Ns: mapping of namespace prefixes to URIs
- protected boolean reportNsAttribs: Should the parser report namespace xmlns* attributes?
- protected boolean seenRootElement: Have we seen the root element?
- protected byte state: What is the current event type, as returned from next()?
- protected boolean supportNs: Should the parser support namespaces?
- protected byte token: the current token returned from the tokenizer
- protected Tokenizer tokenizer: XML tokenizer that does the actual tokenizing of the input stream
- protected static final boolean USE_QNAMEBUF
Fields inherited from interface XmlPullParser
CONTENT, END_DOCUMENT, END_TAG, START_TAG
Constructor Summary
Constructors
- PullParser(): Create an instance of the pull parser.
Method Summary
Modifier and Type, Method: Description
- protected void ensureAttribs(int size): Make sure that there is enough space in the temporary attributes array.
- protected void ensureCapacity(int size): Make sure that there is enough space to keep an element stack of the passed size.
- int getBufferShrinkOffset()
- int getColumnNumber()
- int getContentLength(): Return how big the content is.
- int getDepth(): Returns the current depth of the element.
- char[] getEventBuffer(): NOTE: This may be an internal buffer and is valid only until the next call to next(); do NOT attempt to modify it!
- int getEventEnd()
- int getEventStart()
- byte getEventType(): Returns the type of the current element (START_TAG, END_TAG, CONTENT, etc.)
- int getHardLimit()
- int getLineNumber()
- String getLocalName(): Returns the local name of the current element (current event must be START_TAG or END_TAG).
- int getNamespacesLength(int depth)
- String getNamespaceUri(): Returns the namespace URI of the current element. Returns null if not applicable (current event must be START_TAG or END_TAG).
- String getPosDesc(): Return a string describing the current position of the parser in the input stream.
- String getPrefix(): Returns the prefix of the current element, or null if the element has no prefix.
- String getQNameLocal(String qName): Return the local part of qname.
- String getQNameUri(String qName): Return the URI part of qname.
- String getRawName(): Returns the raw name (prefix + ':' + localName) of the current element (current event must be START_TAG or END_TAG).
- int getSoftLimit()
- boolean isAllowedMixedContent(): Is mixed element content allowed?
- boolean isBufferShrinkable()
- boolean isNamespaceAttributesReporting(): Is the parser going to report namespace attributes (xmlns*)?
- boolean isNamespaceAware(): Is the parser namespace aware?
- boolean isWhitespaceContent(): Return true if the just-read CONTENT contained only whitespace.
- byte next(): This is the key method: it reads more from the input stream and returns the next event type (such as START_TAG, END_TAG, CONTENT).
- String readContent(): Return a String that contains the just-read CONTENT.
- void readEndTag(XmlEndTag etag): Read the value of the just-read END_TAG into the EndTag passed as argument.
- void readNamespacesPrefixes(int depth, String[] prefixes, int off, int len): Return namespace prefixes for the element at depth.
- void readNamespacesUris(int depth, String[] uris, int off, int len): Return namespace URIs for the element at depth.
- byte readNode: Read subtree into node: call readNodeWithoutChildren and then parse the subtree, adding children (values obtained with readContent or readNodeWithoutChildren).
- void readNodeWithoutChildren: Read node: it calls readStartTag and then, if the parser is namespace aware, currently declared namespaces will be added and defaultNamespace will be set.
- void readStartTag(XmlStartTag stag): Read the value of the just-read START_TAG into the StartTag passed as argument.
- void reset(): Reset parser state so it can be used to parse new
- protected void resetState()
- void setAllowedMixedContent(boolean enable): Allow for mixed element content.
- void setBufferShrinkable(boolean shrinkable)
- void setHardLimit(int value)
- void setInput(char[] buf): Reset parser and set new input.
- void setInput(char[] buf, int off, int len): Set the input for parser.
- void setInput: Reset parser and set new input.
- void setNamespaceAttributesReporting(boolean enable): Make the parser report xmlns* attributes.
- void setNamespaceAware(boolean awareness): Set support of namespaces.
- void setSoftLimit(int value)
- byte skipNode(): If the parser has just read a start tag, this allows skipping the whole subtree contained in this element.
-
Field Details
-
USE_QNAMEBUF
protected static final boolean USE_QNAMEBUF
-
CHECK_ATTRIB_UNIQ
protected static final boolean CHECK_ATTRIB_UNIQ
Should attribute uniqueness be checked, as specified in the XML and NS specifications?
-
emptyElement
protected boolean emptyElement
Have we read an empty element?
-
seenRootElement
protected boolean seenRootElement
Have we seen the root element?
-
elContent
protected String elContent
Content of the current element if in CONTENT state
-
tokenizer
protected Tokenizer tokenizer
XML tokenizer that does the actual tokenizing of the input stream.
-
eventStart
protected int eventStart
start position of the current event in the tokenizer buffer
-
eventEnd
protected int eventEnd
end position of the current event in the tokenizer buffer
-
state
protected byte state
What is the current event type, as returned from next()?
-
token
protected byte token
the current token returned from the tokenizer
-
supportNs
protected boolean supportNs
Should the parser support namespaces?
-
reportNsAttribs
protected boolean reportNsAttribs
Should the parser report namespace xmlns* attributes?
-
prefix2Ns
protected Hashtable prefix2Ns
mapping of namespace prefixes to URIs
-
attrPosEnd
protected int attrPosEnd
index of the last attribute in the attrPos array
-
attrPosSize
protected int attrPosSize
size of the attrPos array
-
attrPos
protected Attribute[] attrPos
temporary array of current attributes
-
elStackDepth
protected int elStackDepth
how many elements are on elStack
-
elStackSize
protected int elStackSize
size of the elStack array
-
elStack
protected ElementContent[] elStack
temporary array that keeps the ElementContent stack
-
-
Constructor Details
-
PullParser
public PullParser()
Create an instance of the pull parser.
-
-
Method Details
-
setInput
Reset parser and set new input.
- Specified by:
setInput in interface XmlPullParser
-
setInput
public void setInput(char[] buf)
Reset parser and set new input.
- Specified by:
setInput in interface XmlPullParser
-
setInput
public void setInput(char[] buf, int off, int len) throws XmlPullParserException
Description copied from interface: XmlPullParser
Set the input for parser.
- Specified by:
setInput in interface XmlPullParser
- Throws:
XmlPullParserException
-
reset
public void reset()
Reset parser state so it can be used to parse new
- Specified by:
reset in interface XmlPullParser
-
isAllowedMixedContent
public boolean isAllowedMixedContent()
Description copied from interface: XmlPullParser
Is mixed element content allowed?
- Specified by:
isAllowedMixedContent in interface XmlPullParser
-
setAllowedMixedContent
public void setAllowedMixedContent(boolean enable)
Allow for mixed element content. Enabled by default. When disabled, an element must contain either text or other elements.
- Specified by:
setAllowedMixedContent in interface XmlPullParser
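The rule stated for disabled mixed content (an element holds either text or child elements, not both) can be illustrated with a minimal sketch. This is not the parser's code: the class MixedContentSketch, the Child enum, and the use of IllegalStateException are hypothetical, and how whitespace-only text is treated is not covered here.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical model of an element's direct children; not part of XPP.
final class MixedContentSketch {
    enum Child { TEXT, ELEMENT }

    /** True when the children contain both text and child elements. */
    static boolean isMixed(List<Child> children) {
        return children.contains(Child.TEXT) && children.contains(Child.ELEMENT);
    }

    /** Throws if mixed content is present while it is not allowed. */
    static void check(List<Child> children, boolean allowedMixedContent) {
        if (!allowedMixedContent && isMixed(children)) {
            throw new IllegalStateException("element contains both text and child elements");
        }
    }

    public static void main(String[] args) {
        List<Child> mixed = Arrays.asList(Child.TEXT, Child.ELEMENT);
        check(mixed, true); // accepted because mixed content is allowed
        System.out.println(isMixed(mixed));
    }
}
```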
-
isNamespaceAware
public boolean isNamespaceAware()
Description copied from interface: XmlPullParser
Is the parser namespace aware?
- Specified by:
isNamespaceAware in interface XmlPullParser
-
setNamespaceAware
public void setNamespaceAware(boolean awareness) throws XmlPullParserException
Set support of namespaces. Disabled by default.
- Specified by:
setNamespaceAware in interface XmlPullParser
- Throws:
XmlPullParserException
-
isNamespaceAttributesReporting
public boolean isNamespaceAttributesReporting()
Description copied from interface: XmlPullParser
Is the parser going to report namespace attributes (xmlns*)?
- Specified by:
isNamespaceAttributesReporting in interface XmlPullParser
-
setNamespaceAttributesReporting
public void setNamespaceAttributesReporting(boolean enable)
Make the parser report xmlns* attributes. Disabled by default. Only meaningful when namespaces are enabled (when namespaces are disabled, all attributes are always reported).
- Specified by:
setNamespaceAttributesReporting in interface XmlPullParser
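How the two flags interact can be sketched as a simple filter over attribute names. This is an illustration under assumptions, not the parser's implementation: NsAttributeReportingSketch and its methods are hypothetical, and attributes are modelled as raw name strings only.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper; models attributes by raw name only.
final class NsAttributeReportingSketch {
    /** True for "xmlns" and for names of the form "xmlns:prefix". */
    static boolean isNamespaceDeclaration(String rawName) {
        return rawName.equals("xmlns") || rawName.startsWith("xmlns:");
    }

    /** Returns the attribute names a caller would see for the given flag settings. */
    static List<String> reportedAttributes(List<String> rawNames,
                                           boolean namespaceAware,
                                           boolean reportNsAttribs) {
        List<String> out = new ArrayList<>();
        for (String name : rawNames) {
            // xmlns* attributes are hidden only when namespaces are on and reporting is off
            boolean hidden = namespaceAware && !reportNsAttribs && isNamespaceDeclaration(name);
            if (!hidden) {
                out.add(name);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        List<String> names = Arrays.asList("xmlns", "xmlns:xsi", "id");
        System.out.println(reportedAttributes(names, true, false));
    }
}
```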
-
getNamespaceUri
public String getNamespaceUri()
Description copied from interface: XmlPullParser
Returns the namespace URI of the current element. Returns null if not applicable (current event must be START_TAG or END_TAG).
- Specified by:
getNamespaceUri in interface XmlPullParser
-
getLocalName
public String getLocalName()
Description copied from interface: XmlPullParser
Returns the local name of the current element (current event must be START_TAG or END_TAG).
- Specified by:
getLocalName in interface XmlPullParser
-
getPrefix
public String getPrefix()
Description copied from interface: XmlPullParser
Returns the prefix of the current element, or null if the element has no prefix (current event must be START_TAG or END_TAG).
- Specified by:
getPrefix in interface XmlPullParser
-
getRawName
public String getRawName()
Description copied from interface: XmlPullParser
Returns the raw name (prefix + ':' + localName) of the current element (current event must be START_TAG or END_TAG).
- Specified by:
getRawName in interface XmlPullParser
-
getQNameLocal
public String getQNameLocal(String qName)
Description copied from interface: XmlPullParser
Return the local part of qname. For example, for 'xsi:type' it returns 'type'.
- Specified by:
getQNameLocal in interface XmlPullParser
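The 'xsi:type' example can be reproduced with plain string handling. The following is a minimal sketch, not the parser's code; QNameLocalSketch and localPart are hypothetical names, and returning the whole name when there is no prefix is an assumption.

```java
// Hypothetical helper that mirrors the documented example; not part of XPP.
final class QNameLocalSketch {
    /** Returns the text after the first ':' or the whole name when there is no ':'. */
    static String localPart(String qName) {
        int colon = qName.indexOf(':');
        return colon < 0 ? qName : qName.substring(colon + 1);
    }

    public static void main(String[] args) {
        System.out.println(localPart("xsi:type")); // prints "type"
        System.out.println(localPart("type"));     // prints "type"
    }
}
```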
-
getQNameUri
public String getQNameUri(String qName) throws XmlPullParserException
Description copied from interface: XmlPullParser
Return the URI part of qname. The result depends on the current state of the parser, which determines the namespace URI that the namespace prefix is mapped to. For example, for 'xsi:type', if the xsi namespace prefix was declared as 'urn:foo', it returns 'urn:foo'.
- Specified by:
getQNameUri in interface XmlPullParser
- Throws:
XmlPullParserException
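The prefix lookup described above can be sketched with a plain map from prefix to URI. This is only an illustration: QNameUriSketch, declare, and uriOf are hypothetical names; treating the empty string as the key of the default namespace and throwing IllegalArgumentException for an undeclared prefix are assumptions (the page does not say when XmlPullParserException is thrown).

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for the parser's prefix-to-URI mapping.
final class QNameUriSketch {
    private final Map<String, String> prefixToUri = new HashMap<>();

    /** Records a namespace declaration such as xmlns:xsi="urn:foo". */
    void declare(String prefix, String uri) {
        prefixToUri.put(prefix, uri);
    }

    /** Resolves the prefix of qName against the declarations made so far. */
    String uriOf(String qName) {
        int colon = qName.indexOf(':');
        // Assumption: an unprefixed name maps to the default namespace, stored under ""
        String prefix = colon < 0 ? "" : qName.substring(0, colon);
        if (colon >= 0 && !prefixToUri.containsKey(prefix)) {
            throw new IllegalArgumentException("undeclared namespace prefix: " + prefix);
        }
        return prefixToUri.get(prefix);
    }

    public static void main(String[] args) {
        QNameUriSketch ns = new QNameUriSketch();
        ns.declare("xsi", "urn:foo");
        System.out.println(ns.uriOf("xsi:type")); // prints "urn:foo"
    }
}
```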
-
getDepth
public int getDepth()
Description copied from interface: XmlPullParser
Returns the current depth of the element.
- Specified by:
getDepth in interface XmlPullParser
-
getNamespacesLength
public int getNamespacesLength(int depth)
- Specified by:
getNamespacesLength in interface XmlPullParser
-
readNamespacesPrefixes
public void readNamespacesPrefixes(int depth, String[] prefixes, int off, int len) throws XmlPullParserException
Return namespace prefixes for the element at depth.
- Specified by:
readNamespacesPrefixes in interface XmlPullParser
- Throws:
XmlPullParserException
-
readNamespacesUris
public void readNamespacesUris(int depth, String[] uris, int off, int len) throws XmlPullParserException
Return namespace URIs for the element at depth.
- Specified by:
readNamespacesUris in interface XmlPullParser
- Throws:
XmlPullParserException
-
getPosDesc
public String getPosDesc()
Return a string describing the current position of the parser in the input stream.
- Specified by:
getPosDesc in interface XmlPullParser
-
getLineNumber
public int getLineNumber()
- Specified by:
getLineNumber in interface XmlPullParser
-
getColumnNumber
public int getColumnNumber()
- Specified by:
getColumnNumber in interface XmlPullParser
-
next
public byte next() throws XmlPullParserException, IOException
This is the key method: it reads more from the input stream and returns the next event type (such as START_TAG, END_TAG, CONTENT), or END_DOCUMENT if there is no more input.
This is a simple automaton (in pseudo-code):
    byte next() {
        while (state != END_DOCUMENT) {
            token = tokenizer.next();  // get next XML token
            switch (token) {
            case Tokenizer.END_DOCUMENT:
                return state = END_DOCUMENT;
            case Tokenizer.CONTENT:
                // check if content is allowed - only inside an element
                return state = CONTENT;
            case Tokenizer.ETAG_NAME:
                // pop element from stack - check that start and end tag match
                // if namespaces are supported, restore namespace prefix mappings
                return state = END_TAG;
            case Tokenizer.STAG_NAME:
                // create new element and push it on stack
                // process attributes (including namespaces)
                // set emptyElement = true; if empty element
                // check attribute uniqueness (including namespace prefixes)
                return state = START_TAG;
            }
        }
    }
Actual parsing is more complex, especially for start tags, because attributes are reported separately by the tokenizer and namespace prefixes and URIs must be declared.
- Specified by:
next in interface XmlPullParser
- Throws:
XmlPullParserException
IOException
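The automaton above can be turned into a small self-contained toy that runs without the library. It is a sketch under assumptions, not the actual PullParser: tokens come from a prepared list instead of a Tokenizer, attributes and namespaces are ignored, all content outside an element is rejected, and NextAutomatonSketch, Kind, Event, and Token are hypothetical names.

```java
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Deque;
import java.util.List;

// Toy model of the event automaton; not the real parser.
final class NextAutomatonSketch {
    enum Kind { STAG_NAME, ETAG_NAME, CONTENT, END_DOCUMENT }
    enum Event { START_TAG, END_TAG, CONTENT, END_DOCUMENT }

    static final class Token {
        final Kind kind;
        final String text;
        Token(Kind kind, String text) { this.kind = kind; this.text = text; }
    }

    private final List<Token> tokens;
    private int pos = 0;
    private final Deque<String> elStack = new ArrayDeque<>(); // names of open elements
    private Event state = null;

    NextAutomatonSketch(List<Token> tokens) { this.tokens = tokens; }

    Event next() {
        if (state == Event.END_DOCUMENT) {
            return state; // stay at END_DOCUMENT once reached
        }
        // Running out of prepared tokens is treated as end of document
        Token t = pos < tokens.size() ? tokens.get(pos++) : new Token(Kind.END_DOCUMENT, "");
        switch (t.kind) {
            case STAG_NAME:
                elStack.push(t.text);
                return state = Event.START_TAG;
            case ETAG_NAME:
                if (elStack.isEmpty() || !elStack.pop().equals(t.text)) {
                    throw new IllegalStateException("end tag " + t.text + " does not match a start tag");
                }
                return state = Event.END_TAG;
            case CONTENT:
                if (elStack.isEmpty()) {
                    throw new IllegalStateException("content is only allowed inside an element");
                }
                return state = Event.CONTENT;
            default:
                return state = Event.END_DOCUMENT;
        }
    }

    int depth() { return elStack.size(); }

    public static void main(String[] args) {
        NextAutomatonSketch p = new NextAutomatonSketch(Arrays.asList(
                new Token(Kind.STAG_NAME, "a"),
                new Token(Kind.CONTENT, "hi"),
                new Token(Kind.ETAG_NAME, "a")));
        for (Event e = p.next(); e != Event.END_DOCUMENT; e = p.next()) {
            System.out.println(e + " depth=" + p.depth());
        }
    }
}
```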
-
getEventType
public byte getEventType()
Description copied from interface: XmlPullParser
Returns the type of the current element (START_TAG, END_TAG, CONTENT, etc.)
- Specified by:
getEventType in interface XmlPullParser
-
isWhitespaceContent
public boolean isWhitespaceContent() throws XmlPullParserException
Return true if the just-read CONTENT contained only whitespace.
- Specified by:
isWhitespaceContent in interface XmlPullParser
- Throws:
XmlPullParserException
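A whitespace-only check can be sketched using the four white space characters defined by XML 1.0 (space, tab, carriage return, line feed). WhitespaceContentSketch is a hypothetical name, and treating empty content as whitespace-only is an assumption; the parser's own handling may differ.

```java
// Hypothetical helper; checks content against the XML 1.0 white space characters.
final class WhitespaceContentSketch {
    /** True when every character is a space, tab, carriage return, or line feed. */
    static boolean isWhitespaceOnly(CharSequence content) {
        for (int i = 0; i < content.length(); i++) {
            char c = content.charAt(i);
            if (c != ' ' && c != '\t' && c != '\r' && c != '\n') {
                return false;
            }
        }
        return true; // assumption: empty content counts as whitespace-only
    }

    public static void main(String[] args) {
        System.out.println(isWhitespaceOnly(" \n\t")); // prints "true"
        System.out.println(isWhitespaceOnly(" a "));   // prints "false"
    }
}
```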
-
getContentLength
public int getContentLength() throws XmlPullParserException
Description copied from interface: XmlPullParser
Return how big the content is.
NOTE: the parser must be on a CONTENT event.
- Specified by:
getContentLength in interface XmlPullParser
- Throws:
XmlPullParserException
-
readContent
public String readContent() throws XmlPullParserException
Return a String that contains the just-read CONTENT.
- Specified by:
readContent in interface XmlPullParser
- Throws:
XmlPullParserException
-
readEndTag
public void readEndTag(XmlEndTag etag) throws XmlPullParserException
Read the value of the just-read END_TAG into the EndTag passed as argument.
- Specified by:
readEndTag in interface XmlPullParser
- Throws:
XmlPullParserException
-
readStartTag
public void readStartTag(XmlStartTag stag) throws XmlPullParserException
Read the value of the just-read START_TAG into the StartTag passed as argument.
- Specified by:
readStartTag in interface XmlPullParser
- Throws:
XmlPullParserException
-
readNodeWithoutChildren
Description copied from interface: XmlPullParser
Read node: it calls readStartTag and then, if the parser is namespace aware, currently declared namespaces will be added and defaultNamespace will be set.
NOTE: the parser must be on a START_TAG event, and all events will be written into node!
- Specified by:
readNodeWithoutChildren in interface XmlPullParser
- Throws:
XmlPullParserException
-
readNode
Description copied from interface: XmlPullParser
Read subtree into node: call readNodeWithoutChildren and then parse the subtree, adding children (values obtained with readContent or readNodeWithoutChildren).
NOTE: the parser must be on a START_TAG event, and all events will be written into node!
- Specified by:
readNode in interface XmlPullParser
- Throws:
XmlPullParserException
IOException
-
skipNode
public byte skipNode() throws XmlPullParserException, IOException
If the parser has just read a start tag, this allows skipping the whole subtree contained in this element. Returns when it encounters the end tag matching the start tag.
- Specified by:
skipNode in interface XmlPullParser
- Throws:
XmlPullParserException
IOException
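Skipping a subtree amounts to counting open elements until the matching end tag is reached. The sketch below shows that idea over a prepared sequence of events; SkipNodeSketch, its Event enum, and the returned count are hypothetical and say nothing about what the real skipNode() returns.

```java
import java.util.Arrays;
import java.util.Iterator;

// Toy model of subtree skipping over an event sequence; not part of XPP.
final class SkipNodeSketch {
    enum Event { START_TAG, END_TAG, CONTENT, END_DOCUMENT }

    /**
     * Assumes the start tag of the element to skip was already read.
     * Consumes events up to and including the matching end tag and
     * returns how many events were consumed.
     */
    static int skipSubtree(Iterator<Event> events) {
        int open = 1;      // the element whose start tag was just read
        int consumed = 0;
        while (open > 0) {
            if (!events.hasNext()) {
                throw new IllegalStateException("input ended inside an element");
            }
            Event e = events.next();
            consumed++;
            if (e == Event.START_TAG) {
                open++;
            } else if (e == Event.END_TAG) {
                open--;
            } else if (e == Event.END_DOCUMENT) {
                throw new IllegalStateException("document ended inside an element");
            }
        }
        return consumed;
    }

    public static void main(String[] args) {
        Iterator<Event> rest = Arrays.asList(
                Event.START_TAG, Event.CONTENT, Event.END_TAG, Event.END_TAG).iterator();
        System.out.println(skipSubtree(rest)); // prints "4"
    }
}
```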
-
getHardLimit
public int getHardLimit()
- Specified by:
getHardLimit in interface XmlPullParserBufferControl
-
setHardLimit
public void setHardLimit(int value) throws XmlPullParserException
- Specified by:
setHardLimit in interface XmlPullParserBufferControl
- Throws:
XmlPullParserException
-
getSoftLimit
public int getSoftLimit()
- Specified by:
getSoftLimit in interface XmlPullParserBufferControl
-
setSoftLimit
public void setSoftLimit(int value) throws XmlPullParserException
- Specified by:
setSoftLimit in interface XmlPullParserBufferControl
- Throws:
XmlPullParserException
-
getBufferShrinkOffset
public int getBufferShrinkOffset()
- Specified by:
getBufferShrinkOffset in interface XmlPullParserBufferControl
-
setBufferShrinkable
public void setBufferShrinkable(boolean shrinkable) throws XmlPullParserException
- Specified by:
setBufferShrinkable in interface XmlPullParserBufferControl
- Throws:
XmlPullParserException
-
isBufferShrinkable
public boolean isBufferShrinkable()
- Specified by:
isBufferShrinkable in interface XmlPullParserBufferControl
-
getEventStart
public int getEventStart()
- Specified by:
getEventStart in interface XmlPullParserEventPosition
-
getEventEnd
public int getEventEnd()
- Specified by:
getEventEnd in interface XmlPullParserEventPosition
-
getEventBuffer
public char[] getEventBuffer()
Description copied from interface: XmlPullParserEventPosition
NOTE: This may be an internal buffer and is valid only until the next call to next(); do NOT attempt to modify it!
- Specified by:
getEventBuffer in interface XmlPullParserEventPosition
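Because the event buffer may be shared and is valid only until the next call to next(), a caller that needs the text later would copy it out first. The sketch below shows such a copy on a plain char array; EventTextSketch and copyEvent are hypothetical, and treating the end position as exclusive is an assumption that this page does not confirm.

```java
// Hypothetical helper; works on any char buffer with start/end positions.
final class EventTextSketch {
    /** Copies buf[start, end) into a new String (end assumed to be exclusive). */
    static String copyEvent(char[] buf, int start, int end) {
        return new String(buf, start, end - start);
    }

    public static void main(String[] args) {
        char[] buf = "<a>hi</a>".toCharArray();
        String text = copyEvent(buf, 3, 5);
        buf[3] = 'X';             // later reuse of the buffer does not affect the copy
        System.out.println(text); // prints "hi"
    }
}
```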
-
ensureCapacity
protected void ensureCapacity(int size)
Make sure that there is enough space to keep an element stack of the passed size.
-
ensureAttribs
protected void ensureAttribs(int size)
Make sure that there is enough space in the temporary attributes array.
-
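Growing a reusable array on demand, as ensureCapacity and ensureAttribs are described as doing, might look like the following sketch. The field names echo elStack, elStackSize, and elStackDepth from this page, but the class StackGrowthSketch, the String element type, and the doubling policy are assumptions rather than the parser's actual strategy.

```java
import java.util.Arrays;

// Hypothetical growable stack; the doubling policy is an assumption.
final class StackGrowthSketch {
    private String[] elStack = new String[2];
    private int elStackSize = elStack.length; // allocated slots
    private int elStackDepth = 0;             // slots in use

    /** Grows the backing array so that it can hold at least size entries. */
    void ensureCapacity(int size) {
        if (size <= elStackSize) {
            return; // already large enough, nothing is allocated
        }
        int newSize = Math.max(size, 2 * elStackSize);
        elStack = Arrays.copyOf(elStack, newSize); // keeps existing entries
        elStackSize = newSize;
    }

    void push(String name) {
        ensureCapacity(elStackDepth + 1);
        elStack[elStackDepth++] = name;
    }

    int capacity() { return elStackSize; }
    int depth() { return elStackDepth; }

    public static void main(String[] args) {
        StackGrowthSketch s = new StackGrowthSketch();
        s.push("a");
        s.push("b");
        s.push("c");
        System.out.println(s.depth() + " of " + s.capacity()); // prints "3 of 4"
    }
}
```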
resetState
protected void resetState()
-