Class SimpleXMLParser
java.lang.Object
com.lowagie.text.xml.simpleparser.SimpleXMLParser
Deprecated.
A simple XML and HTML parser. Like the SAX parser, it is event-based, but it offers much less
functionality.
The parser:
- recognizes the encoding used
- recognizes all the elements' start tags and end tags
- lists attributes, where attribute values can be enclosed in single or double quotes
- recognizes the <![CDATA[ ... ]]> construct
- recognizes the standard entities &amp;, &lt;, &gt;, &quot;, and &apos;, as well as numeric entities
- maps the line endings \r\n and \r to \n on input, in accordance with the XML Specification, Section 2.11
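The line-ending rule in the last item can be illustrated with a short, self-contained sketch. The class name `LineEndings` and the method `normalize` are hypothetical and are not part of this parser; the sketch only shows the mapping that Section 2.11 of the XML Specification describes, not how the parser implements it internally.

```java
final class LineEndings {
    private LineEndings() {
    }

    /** Maps every "\r\n" pair and every lone "\r" to a single "\n". */
    static String normalize(String input) {
        StringBuilder out = new StringBuilder(input.length());
        boolean previousWasCr = false;
        for (int i = 0; i < input.length(); i++) {
            char c = input.charAt(i);
            if (c == '\r') {
                // A carriage return always yields exactly one newline.
                out.append('\n');
                previousWasCr = true;
            } else if (c == '\n') {
                // A line feed directly after a carriage return was already emitted.
                if (!previousWasCr) {
                    out.append('\n');
                }
                previousWasCr = false;
            } else {
                out.append(c);
                previousWasCr = false;
            }
        }
        return out.toString();
    }
}
```

Tracking whether the previous character was a carriage return is enough to treat `\r\n` as one line break while still counting two consecutive `\r` characters as two.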
-
Field Summary
Fields

| Modifier and Type | Field | Description |
|---|---|---|
| private static final int | ATTRIBUTE_EQUAL | Deprecated. |
| private static final int | ATTRIBUTE_KEY | Deprecated. |
| private static final int | ATTRIBUTE_VALUE | Deprecated. |
| (package private) String | attributekey | Deprecated. The attribute key. |
| | attributes | Deprecated. Current attributes. |
| (package private) String | attributevalue | Deprecated. The attribute value. |
| private static final int | CDATA | Deprecated. |
| (package private) int | character | Deprecated. The current character. |
| (package private) int | columns | Deprecated. The column where the current character occurs. |
| (package private) SimpleXMLDocHandlerComment | comment | Deprecated. The handler to which we are going to forward comments. |
| private static final int | COMMENT | Deprecated. |
| (package private) SimpleXMLDocHandler | doc | Deprecated. The handler to which we are going to forward document content. |
| (package private) StringBuffer | entity | Deprecated. Current entity (whatever is encountered between & and ;). |
| private static final int | ENTITY | Deprecated. |
| (package private) boolean | eol | Deprecated. Was the last character equivalent to a newline? |
| private static final int | EXAMIN_TAG | Deprecated. |
| (package private) boolean | html | Deprecated. Are we parsing HTML? |
| private static final int | IN_CLOSETAG | Deprecated. |
| (package private) int | lines | Deprecated. The line we are currently reading. |
| (package private) int | nested | Deprecated. Keeps track of the number of tags that are open. |
| (package private) boolean | nowhite | Deprecated. A boolean indicating if the next character should be taken into account if it's a space character. |
| private static final int | PI | Deprecated. |
| (package private) int | previousCharacter | Deprecated. The previous character. |
| private static final int | QUOTE | Deprecated. |
| (package private) int | quoteCharacter | Deprecated. The quote character that was used to open the quote. |
| private static final int | SINGLE_TAG | Deprecated. |
| | stack | Deprecated. The state stack. |
| (package private) int | state | Deprecated. The current state. |
| (package private) String | tag | Deprecated. Current tag name. |
| private static final int | TAG_ENCOUNTERED | Deprecated. |
| private static final int | TAG_EXAMINED | Deprecated. |
| (package private) StringBuffer | text | Deprecated. Current text (whatever is encountered between tags). |
| private static final int | TEXT | Deprecated. |
| private static final int | UNKNOWN | Deprecated. Possible states. |

-
Constructor Summary
Constructors

| Modifier | Constructor | Description |
|---|---|---|
| private | SimpleXMLParser(SimpleXMLDocHandler doc, SimpleXMLDocHandlerComment comment, boolean html) | Deprecated. Creates a Simple XML parser object. |

-
Method Summary
| Modifier and Type | Method | Description |
|---|---|---|
| | detectCharsetFromBOM(byte[] bom) | Deprecated. Detects the charset from the BOM, as per the Unicode FAQ. |
| private void | doTag() | Deprecated. Sets the name of the tag. |
| private void | flush() | Deprecated. Flushes the text that is currently in the buffer. |
| private static String | getDeclaredEncoding(String decl) | Deprecated. |
| private void | go(BufferedReader) | Deprecated. Does the actual parsing. |
| private void | initTag() | Deprecated. Initializes the tag name and attributes. |
| static void | parse(SimpleXMLDocHandler doc, SimpleXMLDocHandlerComment comment, Reader r, boolean html) | Deprecated. Parses the XML document, firing the events to the handler. |
| static void | parse(SimpleXMLDocHandler doc, InputStream in) | Deprecated. Parses the XML document, firing the events to the handler. |
| static void | parse(SimpleXMLDocHandler doc, Reader r) | Deprecated. |
| private void | processTag(boolean start) | Deprecated. Processes the tag. |
| private int | restoreState() | Deprecated. Gets a state from the stack. |
| private void | saveState(int s) | Deprecated. Adds a state to the stack. |
| private void | throwException | Deprecated. Throws an exception. |
-
Field Details
-
UNKNOWN
private static final int UNKNOWN
Deprecated. Possible states.
-
TEXT
private static final int TEXT
Deprecated.
-
TAG_ENCOUNTERED
private static final int TAG_ENCOUNTERED
Deprecated.
-
EXAMIN_TAG
private static final int EXAMIN_TAG
Deprecated.
-
TAG_EXAMINED
private static final int TAG_EXAMINED
Deprecated.
-
IN_CLOSETAG
private static final int IN_CLOSETAG
Deprecated.
-
SINGLE_TAG
private static final int SINGLE_TAG
Deprecated.
-
CDATA
private static final int CDATA
Deprecated.
-
COMMENT
private static final int COMMENT
Deprecated.
-
PI
private static final int PI
Deprecated.
-
ENTITY
private static final int ENTITY
Deprecated.
-
QUOTE
private static final int QUOTE
Deprecated.
-
ATTRIBUTE_KEY
private static final int ATTRIBUTE_KEY
Deprecated.
-
ATTRIBUTE_EQUAL
private static final int ATTRIBUTE_EQUAL
Deprecated.
-
ATTRIBUTE_VALUE
private static final int ATTRIBUTE_VALUE
Deprecated.
-
stack
-
character
int character
Deprecated. The current character.
-
previousCharacter
int previousCharacter
Deprecated. The previous character.
-
lines
int lines
Deprecated. The line we are currently reading.
-
columns
int columns
Deprecated. The column where the current character occurs.
-
eol
boolean eol
Deprecated. Was the last character equivalent to a newline?
-
nowhite
boolean nowhite
Deprecated. A boolean indicating if the next character should be taken into account if it's a space character. When nowhite is false, the previous character wasn't whitespace.
Since: 2.1.5
-
state
int state
Deprecated. The current state.
-
html
boolean html
Deprecated. Are we parsing HTML?
-
text
-
entity
StringBuffer entity
Deprecated. Current entity (whatever is encountered between & and ;).
-
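To make the role of the `entity` buffer concrete, here is a minimal sketch of how the text collected between & and ; could be mapped to a character. The class `EntityDecoder` and its `decode` method are hypothetical names chosen for illustration; the sketch covers only the five standard entities and the numeric forms listed in the class description, and it is not the parser's own implementation.

```java
final class EntityDecoder {
    private EntityDecoder() {
    }

    /**
     * Returns the code point for the text found between '&' and ';',
     * or -1 if the entity is not recognized.
     */
    static int decode(String name) {
        switch (name) {
            case "amp":
                return '&';
            case "lt":
                return '<';
            case "gt":
                return '>';
            case "quot":
                return '"';
            case "apos":
                return '\'';
            default:
                break;
        }
        // Numeric entities: "#65" (decimal) or "#x41" (hexadecimal).
        if (name.length() > 1 && name.charAt(0) == '#') {
            boolean hex = name.charAt(1) == 'x';
            String digits = name.substring(hex ? 2 : 1);
            try {
                int codePoint = Integer.parseInt(digits, hex ? 16 : 10);
                boolean valid = codePoint >= 0 && codePoint <= Character.MAX_CODE_POINT;
                return valid ? codePoint : -1;
            } catch (NumberFormatException e) {
                return -1;
            }
        }
        return -1;
    }
}
```

Returning -1 for anything unrecognized lets a caller decide whether to emit the raw text unchanged or to report an error.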
tag
-
attributes
-
doc
-
comment
SimpleXMLDocHandlerComment comment
Deprecated. The handler to which we are going to forward comments.
-
nested
int nested
Deprecated. Keeps track of the number of tags that are open.
-
quoteCharacter
int quoteCharacter
Deprecated. The quote character that was used to open the quote.
-
attributekey
-
attributevalue
-
-
Constructor Details
-
SimpleXMLParser
Deprecated. Creates a Simple XML parser object. Call go(BufferedReader) immediately after creation.
-
-
Method Details
-
parse
public static void parse(SimpleXMLDocHandler doc, SimpleXMLDocHandlerComment comment, Reader r, boolean html) throws IOException
Deprecated. Parses the XML document, firing the events to the handler.
Parameters:
doc - the document handler
comment - the handler to which comments are forwarded
r - the document. The encoding is already resolved. The reader is not closed.
html - whether the input is parsed as HTML
Throws:
IOException - on error
-
detectCharsetFromBOM
Deprecated. Detects the charset from the BOM, as per the Unicode FAQ.
-
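As an illustration of what BOM-based detection for detectCharsetFromBOM(byte[] bom) involves, the following sketch checks the byte order marks listed in the Unicode FAQ. The class `BomSniffer`, the method `detect`, and the choice to return `null` when no BOM is present are assumptions made for this example; the actual method may behave differently.

```java
final class BomSniffer {
    private BomSniffer() {
    }

    /** Returns the charset name implied by a leading BOM, or null if none is found. */
    static String detect(byte[] bom) {
        int b0 = unsignedAt(bom, 0);
        int b1 = unsignedAt(bom, 1);
        int b2 = unsignedAt(bom, 2);
        int b3 = unsignedAt(bom, 3);
        // The 4-byte marks are checked first: the UTF-32LE mark begins with the UTF-16LE mark.
        if (b0 == 0x00 && b1 == 0x00 && b2 == 0xFE && b3 == 0xFF) {
            return "UTF-32BE";
        }
        if (b0 == 0xFF && b1 == 0xFE && b2 == 0x00 && b3 == 0x00) {
            return "UTF-32LE";
        }
        if (b0 == 0xEF && b1 == 0xBB && b2 == 0xBF) {
            return "UTF-8";
        }
        if (b0 == 0xFE && b1 == 0xFF) {
            return "UTF-16BE";
        }
        if (b0 == 0xFF && b1 == 0xFE) {
            return "UTF-16LE";
        }
        return null;
    }

    /** Reads one byte as an unsigned value, or -1 when the array is too short. */
    private static int unsignedAt(byte[] bytes, int index) {
        return index < bytes.length ? bytes[index] & 0xFF : -1;
    }
}
```

A caller would typically read the first four bytes of the stream, pass them to `detect`, and fall back to the encoding declared in the XML declaration when the result is `null`.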
parse
Deprecated. Parses the XML document, firing the events to the handler.
Parameters:
doc - the document handler
in - the document. The encoding is deduced from the stream. The stream is not closed.
Throws:
IOException - on error
-
getDeclaredEncoding
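This page gives no description for getDeclaredEncoding(String decl). Judging only by its name and signature, it presumably extracts the encoding pseudo-attribute from an XML declaration. The sketch below shows one way to do that with a regular expression; the class `XmlDeclaration`, the method `declaredEncoding`, and the `null` return for a missing encoding are all assumptions, not the library's behavior.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class XmlDeclaration {
    // Matches encoding="..." or encoding='...' with a name that follows the XML EncName shape.
    private static final Pattern ENCODING =
            Pattern.compile("encoding\\s*=\\s*([\"'])([A-Za-z][A-Za-z0-9._-]*)\\1");

    private XmlDeclaration() {
    }

    /** Returns the declared encoding name, or null if the declaration has none. */
    static String declaredEncoding(String decl) {
        if (decl == null) {
            return null;
        }
        Matcher matcher = ENCODING.matcher(decl);
        return matcher.find() ? matcher.group(2) : null;
    }
}
```

The back-reference `\1` ensures that the closing quote matches the opening one, which mirrors the rule that attribute values can be enclosed in single or double quotes.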
-
parse
Deprecated.
Throws:
IOException
-
go
Deprecated. Does the actual parsing. Perform this immediately after creating the parser object.
Throws:
IOException
-
restoreState
private int restoreState()
Deprecated. Gets a state from the stack.
Returns:
the previous state
-
saveState
private void saveState(int s)
Deprecated. Adds a state to the stack.
Parameters:
s - a state to add to the stack
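The pairing of saveState(int s) and restoreState() can be pictured as a plain stack of integer states: a state is pushed before the parser enters a nested construct such as an entity or a quoted value, and popped when that construct ends. The sketch below is a stand-alone illustration; the class `StateStack`, the numeric value of `UNKNOWN`, and the fallback to `UNKNOWN` on an empty stack are assumptions, not details confirmed by this page.

```java
import java.util.ArrayDeque;
import java.util.Deque;

final class StateStack {
    // Assumed placeholder value; the real constant's value is not shown on this page.
    static final int UNKNOWN = 0;

    private final Deque<Integer> stack = new ArrayDeque<>();

    /** Adds a state to the stack. */
    void saveState(int s) {
        stack.push(s);
    }

    /** Returns the most recently saved state, or UNKNOWN when nothing was saved. */
    int restoreState() {
        return stack.isEmpty() ? UNKNOWN : stack.pop();
    }
}
```

For example, saving states 1 and then 5 and calling `restoreState()` twice yields 5 first and 1 second, so the parser resumes the innermost suspended state first.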
-
flush
private void flush()
Deprecated. Flushes the text that is currently in the buffer. Depending on the current state, the text can be ignored, added to the document as content, or added as a comment.
-
initTag
private void initTag()
Deprecated. Initializes the tag name and attributes.
-
doTag
private void doTag()
Deprecated. Sets the name of the tag.
-
processTag
private void processTag(boolean start)
Deprecated. Processes the tag.
Parameters:
start - if true, we are dealing with a tag that has just been opened; if false, we are closing a tag.
-
throwException
-