Package org.htmlunit.html.parser.neko
Class HtmlUnitNekoDOMBuilder
- java.lang.Object
  - org.htmlunit.cyberneko.xerces.parsers.XMLParser
    - org.htmlunit.cyberneko.xerces.parsers.AbstractXMLDocumentParser
      - org.htmlunit.cyberneko.xerces.parsers.AbstractSAXParser
        - org.htmlunit.html.parser.neko.HtmlUnitNekoDOMBuilder
- All Implemented Interfaces:
org.htmlunit.cyberneko.HTMLTagBalancingListener, org.htmlunit.cyberneko.xerces.xni.XMLDocumentHandler, HTMLParserDOMBuilder, org.xml.sax.ContentHandler, org.xml.sax.ext.LexicalHandler, org.xml.sax.XMLReader
final class HtmlUnitNekoDOMBuilder extends org.htmlunit.cyberneko.xerces.parsers.AbstractSAXParser implements org.xml.sax.ContentHandler, org.xml.sax.ext.LexicalHandler, org.htmlunit.cyberneko.HTMLTagBalancingListener, HTMLParserDOMBuilder
INTERNAL API - SUBJECT TO CHANGE AT ANY TIME - USE AT YOUR OWN RISK.
The parser and DOM builder. This class subclasses Xerces's AbstractSAXParser and implements the ContentHandler interface. Thus all parser APIs are kept private. The ContentHandler methods consume SAX events to build the page DOM.
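As a rough illustration of how a ContentHandler can turn SAX events into a tree, the sketch below uses only the JDK's SAX parser. The class SaxTreeBuilderSketch, its Node type, and the build helper are hypothetical names chosen for this example; they are not part of HtmlUnit. The real builder handles much more, such as attributes, character data, and malformed markup.

```java
import java.io.StringReader;
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

public class SaxTreeBuilderSketch extends DefaultHandler {

    /** Hypothetical stand-in for a DOM node; not an HtmlUnit type. */
    static final class Node {
        final String name;
        final List<Node> children = new ArrayList<>();

        Node(String name) {
            this.name = name;
        }

        @Override
        public String toString() {
            return children.isEmpty() ? name : name + children;
        }
    }

    private final Node root = new Node("#document");
    // Open elements, innermost on top (assumed to play a role similar to stack_).
    private final Deque<Node> stack = new ArrayDeque<>();

    @Override
    public void startDocument() {
        stack.push(root);
    }

    @Override
    public void startElement(String uri, String localName, String qName, Attributes atts) {
        // Attach the new node to the innermost open element, then open it.
        Node node = new Node(qName);
        stack.peek().children.add(node);
        stack.push(node);
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        stack.pop();
    }

    /** Parses well-formed XML and returns the root of the resulting tree. */
    static Node build(String xml) {
        SaxTreeBuilderSketch handler = new SaxTreeBuilderSketch();
        try {
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(new StringReader(xml)), handler);
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
        return handler.root;
    }

    public static void main(String[] args) {
        // prints: #document[html[head, body[p]]]
        System.out.println(build("<html><head/><body><p/></body></html>"));
    }
}
```

The handler keeps the parser-facing methods separate from the tree it builds: callers only see the result of build, while the SAX callbacks maintain the stack of open elements.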
-
-
Nested Class Summary
Nested Classes
private static class HtmlUnitNekoDOMBuilder.HeadParsed
-
Field Summary
Fields
private HtmlElement body_
private org.htmlunit.cyberneko.xerces.xni.XMLString characters_
private HtmlForm consumingForm_
private boolean createdByJavascript_
private DomNode currentNode_
private static java.lang.String FEATURE_AUGMENTATIONS
private static java.lang.String FEATURE_PARSE_NOSCRIPT
private boolean formEndingIsAdjusting_
private HtmlUnitNekoDOMBuilder.HeadParsed headParsed_
private static org.htmlunit.cyberneko.HTMLElements HTMLELEMENTS
private static org.htmlunit.cyberneko.HTMLElements HTMLELEMENTS_WITH_CMD
private HTMLParser htmlParser_
private int initialSize_
private boolean insideSvg_
private boolean insideTemplate_
private boolean lastTagWasSynthesized_
private org.xml.sax.Locator locator_
private HtmlPage page_
private boolean snippetStartNodeOverwritten_
    Did the snippet try to overwrite the start node?
private java.util.Deque<DomNode> stack_
-
Constructor Summary
Constructors
HtmlUnitNekoDOMBuilder(HTMLParser htmlParser, DomNode node, java.net.URL url, java.lang.String htmlContent, boolean createdByJavascript)
    Creates a new builder for parsing the specified response contents.
-
Method Summary
Methods
private void addNodeToRightParent(DomNode currentNode, DomElement newElement)
    Adds the new node to the right parent, which is not necessarily the currentNode in case of malformed HTML code.
private static void appendChild(DomNode parent, DomNode child)
void characters(char[] ch, int start, int length)
void comment(char[] ch, int start, int length)
private static void copyAttributes(DomElement to, org.htmlunit.cyberneko.xerces.xni.XMLAttributes attrs)
private static org.htmlunit.cyberneko.xerces.xni.parser.XMLParserConfiguration createConfiguration(BrowserVersion browserVersion)
    Creates the configuration depending on the simulated browser.
void endCDATA()
void endDocument()
void endDTD()
void endElement(java.lang.String namespaceURI, java.lang.String localName, java.lang.String qName)
void endElement(org.htmlunit.cyberneko.xerces.xni.QName element, org.htmlunit.cyberneko.xerces.xni.Augmentations augs)
void endEntity(java.lang.String name)
void endPrefixMapping(java.lang.String prefix)
private DomNode findElementOnStack(java.lang.String... searchedElementNames)
(package private) HtmlElement getBody()
private void handleCharacters()
    Picks up the character data accumulated so far and adds it to the current element as a text node.
void ignorableWhitespace(char[] ch, int start, int length)
void ignoredEndElement(org.htmlunit.cyberneko.xerces.xni.QName element, org.htmlunit.cyberneko.xerces.xni.Augmentations augs)
void ignoredStartElement(org.htmlunit.cyberneko.xerces.xni.QName elem, org.htmlunit.cyberneko.xerces.xni.XMLAttributes attrs, org.htmlunit.cyberneko.xerces.xni.Augmentations augs)
private static boolean isSynthesized(org.htmlunit.cyberneko.xerces.xni.Augmentations augs)
private static boolean isTableCell(java.lang.String nodeName)
private static boolean isTableChild(java.lang.String nodeName)
void parse(org.htmlunit.cyberneko.xerces.xni.parser.XMLInputSource inputSource)
void processingInstruction(java.lang.String target, java.lang.String data)
void pushInputString(java.lang.String html)
    Parses and then inserts the specified HTML content into the HTML content currently being parsed.
void setDocumentLocator(org.xml.sax.Locator locator)
void skippedEntity(java.lang.String name)
void startCDATA()
void startDocument()
void startDTD(java.lang.String name, java.lang.String publicId, java.lang.String systemId)
void startElement(java.lang.String namespaceURI, java.lang.String localName, java.lang.String qName, org.xml.sax.Attributes atts)
void startElement(org.htmlunit.cyberneko.xerces.xni.QName element, org.htmlunit.cyberneko.xerces.xni.XMLAttributes attributes, org.htmlunit.cyberneko.xerces.xni.Augmentations augs)
void startEntity(java.lang.String name)
void startPrefixMapping(java.lang.String prefix, java.lang.String uri)
-
Methods inherited from class org.htmlunit.cyberneko.xerces.parsers.AbstractSAXParser
characters, comment, doctypeDecl, endCDATA, endDocument, endNamespaceMapping, getContentHandler, getDTDHandler, getEntityResolver, getErrorHandler, getFeature, getLexicalHandler, getProperty, parse, parse, processingInstruction, reset, setContentHandler, setDTDHandler, setEntityResolver, setErrorHandler, setFeature, setLexicalHandler, setProperty, startCDATA, startDocument, startNamespaceMapping, xmlDecl
-
-
-
-
Field Detail
-
HTMLELEMENTS
private static final org.htmlunit.cyberneko.HTMLElements HTMLELEMENTS
-
HTMLELEMENTS_WITH_CMD
private static final org.htmlunit.cyberneko.HTMLElements HTMLELEMENTS_WITH_CMD
-
htmlParser_
private final HTMLParser htmlParser_
-
page_
private final HtmlPage page_
-
locator_
private org.xml.sax.Locator locator_
-
stack_
private final java.util.Deque<DomNode> stack_
-
snippetStartNodeOverwritten_
private boolean snippetStartNodeOverwritten_
Did the snippet try to overwrite the start node?
-
initialSize_
private final int initialSize_
-
currentNode_
private DomNode currentNode_
-
createdByJavascript_
private final boolean createdByJavascript_
-
characters_
private final org.htmlunit.cyberneko.xerces.xni.XMLString characters_
-
headParsed_
private HtmlUnitNekoDOMBuilder.HeadParsed headParsed_
-
body_
private HtmlElement body_
-
lastTagWasSynthesized_
private boolean lastTagWasSynthesized_
-
consumingForm_
private HtmlForm consumingForm_
-
formEndingIsAdjusting_
private boolean formEndingIsAdjusting_
-
insideSvg_
private boolean insideSvg_
-
insideTemplate_
private boolean insideTemplate_
-
FEATURE_AUGMENTATIONS
private static final java.lang.String FEATURE_AUGMENTATIONS
- See Also:
- Constant Field Values
-
FEATURE_PARSE_NOSCRIPT
private static final java.lang.String FEATURE_PARSE_NOSCRIPT
- See Also:
- Constant Field Values
-
-
Constructor Detail
-
HtmlUnitNekoDOMBuilder
HtmlUnitNekoDOMBuilder(HTMLParser htmlParser, DomNode node, java.net.URL url, java.lang.String htmlContent, boolean createdByJavascript)
Creates a new builder for parsing the specified response contents.
- Parameters:
node - the location at which to insert the new content
url - the page's URL
createdByJavascript - if true, the (script) tag was created by JavaScript
-
-
Method Detail
-
pushInputString
public void pushInputString(java.lang.String html)
Parses and then inserts the specified HTML content into the HTML content currently being parsed.
- Specified by:
pushInputString in interface HTMLParserDOMBuilder
- Parameters:
html - the HTML content to push
-
createConfiguration
private static org.htmlunit.cyberneko.xerces.xni.parser.XMLParserConfiguration createConfiguration(BrowserVersion browserVersion)
Creates the configuration depending on the simulated browser.
- Returns:
the configuration
-
setDocumentLocator
public void setDocumentLocator(org.xml.sax.Locator locator)
- Specified by:
setDocumentLocator in interface org.xml.sax.ContentHandler
-
startDocument
public void startDocument() throws org.xml.sax.SAXException
- Specified by:
startDocument in interface org.xml.sax.ContentHandler
- Throws:
org.xml.sax.SAXException
-
startElement
public void startElement(org.htmlunit.cyberneko.xerces.xni.QName element, org.htmlunit.cyberneko.xerces.xni.XMLAttributes attributes, org.htmlunit.cyberneko.xerces.xni.Augmentations augs) throws org.htmlunit.cyberneko.xerces.xni.XNIException
- Specified by:
startElement in interface org.htmlunit.cyberneko.xerces.xni.XMLDocumentHandler
- Overrides:
startElement in class org.htmlunit.cyberneko.xerces.parsers.AbstractSAXParser
- Throws:
org.htmlunit.cyberneko.xerces.xni.XNIException
-
startElement
public void startElement(java.lang.String namespaceURI, java.lang.String localName, java.lang.String qName, org.xml.sax.Attributes atts) throws org.xml.sax.SAXException
- Specified by:
startElement in interface org.xml.sax.ContentHandler
- Throws:
org.xml.sax.SAXException
-
addNodeToRightParent
private void addNodeToRightParent(DomNode currentNode, DomElement newElement)
Adds the new node to the right parent, which is not necessarily the currentNode in case of malformed HTML code. The method tries to emulate the behavior of Firefox.
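To make the idea of choosing a parent other than the current node more concrete, here is a minimal sketch that searches a stack of open element names for a suitable ancestor. Everything in it is an assumption made for illustration: the class RightParentSketch, the helper names, and the table rules are invented and are not taken from HtmlUnit's source, which works on DomNode objects and may apply different rules.

```java
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Deque;

public class RightParentSketch {

    /** Returns the innermost open element whose name is one of the given names, or null. */
    static String findOnStack(Deque<String> openElements, String... names) {
        // A Deque used as a stack iterates from the most recently pushed entry.
        for (String candidate : openElements) {
            if (Arrays.asList(names).contains(candidate)) {
                return candidate;
            }
        }
        return null;
    }

    /** Picks a parent for the new element; the rules here are hypothetical. */
    static String chooseParent(Deque<String> openElements, String newName) {
        String found = null;
        if ("td".equals(newName) || "th".equals(newName)) {
            // Assumed rule: table cells belong to the nearest open row.
            found = findOnStack(openElements, "tr");
        } else if ("tr".equals(newName)) {
            // Assumed rule: rows belong to the nearest open table section or table.
            found = findOnStack(openElements, "tbody", "thead", "tfoot", "table");
        }
        // Fall back to the current (innermost) node.
        return found != null ? found : openElements.peek();
    }

    public static void main(String[] args) {
        Deque<String> open = new ArrayDeque<>();
        for (String name : new String[] {"html", "body", "table", "tbody", "tr", "td", "span"}) {
            open.push(name);
        }
        System.out.println(chooseParent(open, "td"));  // prints: tr
        System.out.println(chooseParent(open, "div")); // prints: span
    }
}
```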
-
findElementOnStack
private DomNode findElementOnStack(java.lang.String... searchedElementNames)
-
isTableChild
private static boolean isTableChild(java.lang.String nodeName)
-
isTableCell
private static boolean isTableCell(java.lang.String nodeName)
-
endElement
public void endElement(org.htmlunit.cyberneko.xerces.xni.QName element, org.htmlunit.cyberneko.xerces.xni.Augmentations augs) throws org.htmlunit.cyberneko.xerces.xni.XNIException
- Specified by:
endElement in interface org.htmlunit.cyberneko.xerces.xni.XMLDocumentHandler
- Overrides:
endElement in class org.htmlunit.cyberneko.xerces.parsers.AbstractSAXParser
- Throws:
org.htmlunit.cyberneko.xerces.xni.XNIException
-
endElement
public void endElement(java.lang.String namespaceURI, java.lang.String localName, java.lang.String qName) throws org.xml.sax.SAXException
- Specified by:
endElement in interface org.xml.sax.ContentHandler
- Throws:
org.xml.sax.SAXException
-
characters
public void characters(char[] ch, int start, int length) throws org.xml.sax.SAXException
- Specified by:
characters in interface org.xml.sax.ContentHandler
- Throws:
org.xml.sax.SAXException
-
ignorableWhitespace
public void ignorableWhitespace(char[] ch, int start, int length) throws org.xml.sax.SAXException
- Specified by:
ignorableWhitespace in interface org.xml.sax.ContentHandler
- Throws:
org.xml.sax.SAXException
-
handleCharacters
private void handleCharacters()
Picks up the character data accumulated so far and adds it to the current element as a text node.
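The accumulate-then-flush pattern described here can be sketched with the JDK's SAX API. A SAX parser may deliver one run of text in several characters() calls, so the handler buffers them and emits a single text node at the next element boundary. The class CharacterBufferSketch and its textNodesOf helper are hypothetical names for this example, and plain strings stand in for DOM text nodes; this is not HtmlUnit's implementation.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

public class CharacterBufferSketch extends DefaultHandler {

    // Character data seen since the last element boundary.
    private final StringBuilder buffer = new StringBuilder();
    // Strings stand in for the text nodes a real builder would create.
    private final List<String> textNodes = new ArrayList<>();

    @Override
    public void characters(char[] ch, int start, int length) {
        buffer.append(ch, start, length);
    }

    @Override
    public void startElement(String uri, String localName, String qName, Attributes atts) {
        flush();
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        flush();
    }

    /** Emits the buffered characters as one text node and clears the buffer. */
    private void flush() {
        if (buffer.length() > 0) {
            textNodes.add(buffer.toString());
            buffer.setLength(0);
        }
    }

    /** Parses well-formed XML and returns its text nodes in document order. */
    static List<String> textNodesOf(String xml) {
        CharacterBufferSketch handler = new CharacterBufferSketch();
        try {
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(new StringReader(xml)), handler);
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
        return handler.textNodes;
    }

    public static void main(String[] args) {
        // prints: [Hello , big,  world]
        System.out.println(textNodesOf("<p>Hello <b>big</b> world</p>"));
    }
}
```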
-
endDocument
public void endDocument() throws org.xml.sax.SAXException
- Specified by:
endDocument in interface org.xml.sax.ContentHandler
- Throws:
org.xml.sax.SAXException
-
startPrefixMapping
public void startPrefixMapping(java.lang.String prefix, java.lang.String uri) throws org.xml.sax.SAXException
- Specified by:
startPrefixMapping in interface org.xml.sax.ContentHandler
- Throws:
org.xml.sax.SAXException
-
endPrefixMapping
public void endPrefixMapping(java.lang.String prefix) throws org.xml.sax.SAXException
- Specified by:
endPrefixMapping in interface org.xml.sax.ContentHandler
- Throws:
org.xml.sax.SAXException
-
processingInstruction
public void processingInstruction(java.lang.String target, java.lang.String data) throws org.xml.sax.SAXException
- Specified by:
processingInstruction in interface org.xml.sax.ContentHandler
- Throws:
org.xml.sax.SAXException
-
skippedEntity
public void skippedEntity(java.lang.String name) throws org.xml.sax.SAXException
- Specified by:
skippedEntity in interface org.xml.sax.ContentHandler
- Throws:
org.xml.sax.SAXException
-
comment
public void comment(char[] ch, int start, int length)
- Specified by:
comment in interface org.xml.sax.ext.LexicalHandler
-
endCDATA
public void endCDATA()
- Specified by:
endCDATA in interface org.xml.sax.ext.LexicalHandler
-
endDTD
public void endDTD()
- Specified by:
endDTD in interface org.xml.sax.ext.LexicalHandler
-
endEntity
public void endEntity(java.lang.String name)
- Specified by:
endEntity in interface org.xml.sax.ext.LexicalHandler
-
startCDATA
public void startCDATA()
- Specified by:
startCDATA in interface org.xml.sax.ext.LexicalHandler
-
startDTD
public void startDTD(java.lang.String name, java.lang.String publicId, java.lang.String systemId)
- Specified by:
startDTD in interface org.xml.sax.ext.LexicalHandler
-
startEntity
public void startEntity(java.lang.String name)
- Specified by:
startEntity in interface org.xml.sax.ext.LexicalHandler
-
ignoredEndElement
public void ignoredEndElement(org.htmlunit.cyberneko.xerces.xni.QName element, org.htmlunit.cyberneko.xerces.xni.Augmentations augs)
- Specified by:
ignoredEndElement in interface org.htmlunit.cyberneko.HTMLTagBalancingListener
-
ignoredStartElement
public void ignoredStartElement(org.htmlunit.cyberneko.xerces.xni.QName elem, org.htmlunit.cyberneko.xerces.xni.XMLAttributes attrs, org.htmlunit.cyberneko.xerces.xni.Augmentations augs)
- Specified by:
ignoredStartElement in interface org.htmlunit.cyberneko.HTMLTagBalancingListener
-
copyAttributes
private static void copyAttributes(DomElement to, org.htmlunit.cyberneko.xerces.xni.XMLAttributes attrs)
-
parse
public void parse(org.htmlunit.cyberneko.xerces.xni.parser.XMLInputSource inputSource) throws org.htmlunit.cyberneko.xerces.xni.XNIException, java.io.IOException
- Overrides:
parse in class org.htmlunit.cyberneko.xerces.parsers.XMLParser
- Throws:
org.htmlunit.cyberneko.xerces.xni.XNIException
java.io.IOException
-
getBody
HtmlElement getBody()
-
isSynthesized
private static boolean isSynthesized(org.htmlunit.cyberneko.xerces.xni.Augmentations augs)
-
-