Package org.htmlunit.html.parser.neko
Class HtmlUnitNekoDOMBuilder
java.lang.Object
org.htmlunit.cyberneko.xerces.parsers.XMLParser
org.htmlunit.cyberneko.xerces.parsers.AbstractXMLDocumentParser
org.htmlunit.cyberneko.xerces.parsers.AbstractSAXParser
org.htmlunit.html.parser.neko.HtmlUnitNekoDOMBuilder
- All Implemented Interfaces:
org.htmlunit.cyberneko.HTMLTagBalancingListener, org.htmlunit.cyberneko.xerces.xni.XMLDocumentHandler, HTMLParserDOMBuilder, ContentHandler, LexicalHandler, XMLReader
final class HtmlUnitNekoDOMBuilder
extends org.htmlunit.cyberneko.xerces.parsers.AbstractSAXParser
implements ContentHandler, LexicalHandler, org.htmlunit.cyberneko.HTMLTagBalancingListener, HTMLParserDOMBuilder
INTERNAL API - SUBJECT TO CHANGE AT ANY TIME - USE AT YOUR OWN RISK.
The parser and DOM builder. This class subclasses Xerces's AbstractSAXParser and implements the ContentHandler interface. Thus all parser APIs are kept private. The ContentHandler methods consume SAX events to build the page DOM.
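As a rough illustration of how ContentHandler callbacks can be turned into a tree, the following minimal sketch uses the JDK's own SAX parser on well-formed XML. It is not HtmlUnit's implementation: the Node and TreeBuilder classes are hypothetical stand-ins, and the real builder works on HTML with DomNode instances and many special cases.

```java
import java.io.StringReader;
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

public class SaxTreeSketch {

    /** Hypothetical minimal tree node; not HtmlUnit's DomNode. */
    public static final class Node {
        final String name;
        final List<Node> children = new ArrayList<>();

        Node(String name) {
            this.name = name;
        }

        @Override
        public String toString() {
            return children.isEmpty() ? name : name + children;
        }
    }

    /** Consumes SAX events and keeps a stack of the currently open elements. */
    public static final class TreeBuilder extends DefaultHandler {
        final Node root = new Node("#document");
        private final Deque<Node> stack = new ArrayDeque<>();

        @Override
        public void startDocument() {
            stack.push(root);
        }

        @Override
        public void startElement(String uri, String localName, String qName, Attributes atts) {
            Node node = new Node(qName);
            stack.peek().children.add(node); // attach to the innermost open element
            stack.push(node);                // the new element becomes the current node
        }

        @Override
        public void endElement(String uri, String localName, String qName) {
            stack.pop();                     // the parent becomes the current node again
        }
    }

    public static Node build(String xml) throws Exception {
        TreeBuilder builder = new TreeBuilder();
        SAXParserFactory.newInstance().newSAXParser()
                .parse(new InputSource(new StringReader(xml)), builder);
        return builder.root;
    }

    public static void main(String[] args) throws Exception {
        // prints "#document[html[head, body[p, p]]]"
        System.out.println(build("<html><head/><body><p/><p/></body></html>"));
    }
}
```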
-
Nested Class Summary
Nested Classes
Nested classes/interfaces inherited from class org.htmlunit.cyberneko.xerces.parsers.AbstractSAXParser
org.htmlunit.cyberneko.xerces.parsers.AbstractSAXParser.AttributesProxy, org.htmlunit.cyberneko.xerces.parsers.AbstractSAXParser.LocatorProxy
-
Field Summary
Fields
Modifier and Type, Field, Description
private HtmlElement body_
private final org.htmlunit.cyberneko.xerces.xni.XMLString characters_
private HtmlForm consumingForm_
private final boolean createdByJavascript_
private DomNode currentNode_
private static final String FEATURE_AUGMENTATIONS
private static final String FEATURE_PARSE_NOSCRIPT
private boolean formEndingIsAdjusting_
private static final org.htmlunit.cyberneko.HTMLElements HTMLELEMENTS
private static final org.htmlunit.cyberneko.HTMLElements HTMLELEMENTS_WITH_CMD
private final HTMLParser htmlParser_
private final int initialSize_
private boolean insideSvg_
private boolean insideTemplate_
private boolean lastTagWasSynthesized_
private Locator locator_
private final HtmlPage page_
private boolean snippetStartNodeOverwritten_ Did the snippet try to overwrite the start node?
Fields inherited from class org.htmlunit.cyberneko.xerces.parsers.AbstractSAXParser
fContentHandler, fDTDHandler, fLexicalHandler, fLexicalHandlerParameterEntities, fNamespaceContext, fNamespacePrefixes, fNamespaces, fStandalone, fUseEntityResolver2, fVersion, LEXICAL_HANDLER, NAMESPACES
Fields inherited from class org.htmlunit.cyberneko.xerces.parsers.XMLParser
ERROR_HANDLER, parserConfiguration_
-
Constructor Summary
Constructors
Constructor, Description
HtmlUnitNekoDOMBuilder(HTMLParser htmlParser, DomNode node, URL url, String htmlContent, boolean createdByJavascript) Creates a new builder for parsing the specified response contents.
-
Method Summary
Modifier and Type, Method, Description
private void addNodeToRightParent(DomNode currentNode, DomElement newElement) Adds the new node to the right parent, which is not necessarily the currentNode in case of malformed HTML code.
private static void appendChild(DomNode parent, DomNode child)
void characters(char[] ch, int start, int length)
void comment(char[] ch, int start, int length)
private static void copyAttributes(DomElement to, org.htmlunit.cyberneko.xerces.xni.XMLAttributes attrs)
private static org.htmlunit.cyberneko.xerces.xni.parser.XMLParserConfiguration createConfiguration(BrowserVersion browserVersion) Creates the configuration depending on the simulated browser.
void endCDATA()
void endDocument()
void endDTD()
void endElement(String namespaceURI, String localName, String qName)
void endElement(org.htmlunit.cyberneko.xerces.xni.QName element, org.htmlunit.cyberneko.xerces.xni.Augmentations augs)
void endEntity(String name)
void endPrefixMapping(String prefix)
private DomNode findElementOnStack(String... searchedElementNames)
(package private) HtmlElement getBody()
private void handleCharacters() Picks up the character data accumulated so far and adds it to the current element as a text node.
void ignorableWhitespace(char[] ch, int start, int length)
void ignoredEndElement(org.htmlunit.cyberneko.xerces.xni.QName element, org.htmlunit.cyberneko.xerces.xni.Augmentations augs)
void ignoredStartElement(org.htmlunit.cyberneko.xerces.xni.QName elem, org.htmlunit.cyberneko.xerces.xni.XMLAttributes attrs, org.htmlunit.cyberneko.xerces.xni.Augmentations augs)
private static boolean isSynthesized(org.htmlunit.cyberneko.xerces.xni.Augmentations augs)
private static boolean isTableCell(String nodeName)
private static boolean isTableChild(String nodeName)
void parse(org.htmlunit.cyberneko.xerces.xni.parser.XMLInputSource inputSource)
void processingInstruction(String target, String data)
void pushInputString(String html) Parses and then inserts the specified HTML content into the HTML content currently being parsed.
void setDocumentLocator(Locator locator)
void skippedEntity(String name)
void startCDATA()
void startDocument()
void startDTD(String name, String publicId, String systemId)
void startElement(String namespaceURI, String localName, String qName, Attributes atts)
void startElement(org.htmlunit.cyberneko.xerces.xni.QName element, org.htmlunit.cyberneko.xerces.xni.XMLAttributes attributes, org.htmlunit.cyberneko.xerces.xni.Augmentations augs)
void startEntity(String name)
void startPrefixMapping(String prefix, String uri)
Methods inherited from class org.htmlunit.cyberneko.xerces.parsers.AbstractSAXParser
characters, comment, doctypeDecl, endCDATA, endDocument, endNamespaceMapping, getContentHandler, getDTDHandler, getEntityResolver, getErrorHandler, getFeature, getLexicalHandler, getProperty, parse, parse, processingInstruction, reset, setContentHandler, setDTDHandler, setEntityResolver, setErrorHandler, setFeature, setLexicalHandler, setProperty, startCDATA, startDocument, startNamespaceMapping, xmlDecl
Methods inherited from class org.htmlunit.cyberneko.xerces.parsers.AbstractXMLDocumentParser
emptyElement, getDocumentSource, setDocumentSource
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
Methods inherited from interface org.xml.sax.ContentHandler
declaration
-
Field Details
-
HTMLELEMENTS
private static final org.htmlunit.cyberneko.HTMLElements HTMLELEMENTS -
HTMLELEMENTS_WITH_CMD
private static final org.htmlunit.cyberneko.HTMLElements HTMLELEMENTS_WITH_CMD -
htmlParser_
-
page_
-
locator_
-
stack_
-
snippetStartNodeOverwritten_
private boolean snippetStartNodeOverwritten_
Did the snippet try to overwrite the start node?
-
initialSize_
private final int initialSize_ -
currentNode_
-
createdByJavascript_
private final boolean createdByJavascript_ -
characters_
private final org.htmlunit.cyberneko.xerces.xni.XMLString characters_ -
headParsed_
-
body_
-
lastTagWasSynthesized_
private boolean lastTagWasSynthesized_ -
consumingForm_
-
formEndingIsAdjusting_
private boolean formEndingIsAdjusting_ -
insideSvg_
private boolean insideSvg_ -
insideTemplate_
private boolean insideTemplate_ -
FEATURE_AUGMENTATIONS
-
FEATURE_PARSE_NOSCRIPT
-
-
Constructor Details
-
HtmlUnitNekoDOMBuilder
HtmlUnitNekoDOMBuilder(HTMLParser htmlParser, DomNode node, URL url, String htmlContent, boolean createdByJavascript)
Creates a new builder for parsing the specified response contents.
- Parameters:
node - the location at which to insert the new content
url - the page's URL
createdByJavascript - if true the (script) tag was created by javascript
-
-
Method Details
-
pushInputString
Parses and then inserts the specified HTML content into the HTML content currently being parsed.
- Specified by:
pushInputString in interface HTMLParserDOMBuilder
- Parameters:
html - the HTML content to push
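The idea of inserting content into a stream that is still being parsed can be pictured as a stack of input sources: pushed text is read first, and the interrupted source resumes afterwards. The sketch below is only an illustration of that idea under this assumption. The class PushInputSketch and its Source helper are hypothetical and do not mirror the actual scanner code.

```java
import java.util.ArrayDeque;
import java.util.Deque;

public class PushInputSketch {

    /** Hypothetical input source: a string plus the current read position. */
    private static final class Source {
        final String text;
        int pos;

        Source(String text) {
            this.text = text;
        }
    }

    private final Deque<Source> sources = new ArrayDeque<>();

    public PushInputSketch(String initial) {
        sources.push(new Source(initial));
    }

    /** Makes the given HTML the next thing to be read, ahead of the remaining input. */
    public void pushInputString(String html) {
        sources.push(new Source(html));
    }

    /** Returns the next character, or -1 once every source is exhausted. */
    public int read() {
        while (!sources.isEmpty()) {
            Source top = sources.peek();
            if (top.pos < top.text.length()) {
                return top.text.charAt(top.pos++);
            }
            sources.pop(); // finished: fall back to the interrupted source
        }
        return -1;
    }

    /** Simulates content being pushed right after "<a></a>" has been read. */
    public static String demo() {
        PushInputSketch in = new PushInputSketch("<a></a><b></b>");
        StringBuilder seen = new StringBuilder();
        int c;
        while ((c = in.read()) != -1) {
            seen.append((char) c);
            if (seen.toString().equals("<a></a>")) {
                in.pushInputString("<i></i>");
            }
        }
        return seen.toString();
    }

    public static void main(String[] args) {
        System.out.println(demo()); // prints "<a></a><i></i><b></b>"
    }
}
```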
-
createConfiguration
private static org.htmlunit.cyberneko.xerces.xni.parser.XMLParserConfiguration createConfiguration(BrowserVersion browserVersion)
Creates the configuration depending on the simulated browser.
- Returns:
the configuration
-
setDocumentLocator
- Specified by:
setDocumentLocator in interface ContentHandler
-
startDocument
- Specified by:
startDocument in interface ContentHandler
- Throws:
SAXException
-
startElement
public void startElement(org.htmlunit.cyberneko.xerces.xni.QName element, org.htmlunit.cyberneko.xerces.xni.XMLAttributes attributes, org.htmlunit.cyberneko.xerces.xni.Augmentations augs) throws org.htmlunit.cyberneko.xerces.xni.XNIException
- Specified by:
startElement in interface org.htmlunit.cyberneko.xerces.xni.XMLDocumentHandler
- Overrides:
startElement in class org.htmlunit.cyberneko.xerces.parsers.AbstractSAXParser
- Throws:
org.htmlunit.cyberneko.xerces.xni.XNIException
-
startElement
public void startElement(String namespaceURI, String localName, String qName, Attributes atts) throws SAXException
- Specified by:
startElement in interface ContentHandler
- Throws:
SAXException
-
addNodeToRightParent
Adds the new node to the right parent, which is not necessarily the currentNode in case of malformed HTML code. The method tries to emulate the behavior of Firefox.
-
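To make "the right parent" more concrete, here is a simplified sketch of one kind of browser-like fix-up: a table cell is attached to the nearest open row, and stray non-table content found directly inside a table is moved in front of the table. The rules, the Node class, and the helper names are assumptions chosen for illustration; the actual method handles more cases and may differ in detail.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Deque;
import java.util.List;

public class RightParentSketch {

    /** Hypothetical minimal node type; HtmlUnit's DomNode is far richer. */
    public static final class Node {
        final String name;
        Node parent;
        final List<Node> children = new ArrayList<>();

        public Node(String name) {
            this.name = name;
        }

        void append(Node child) {
            child.parent = this;
            children.add(child);
        }

        /** Inserts newSibling directly before this node in its parent. */
        void insertBefore(Node newSibling) {
            newSibling.parent = parent;
            parent.children.add(parent.children.indexOf(this), newSibling);
        }

        @Override
        public String toString() {
            return children.isEmpty() ? name : name + children;
        }
    }

    // Element names allowed directly inside <table> in this simplified model.
    static final List<String> TABLE_CONTENT =
            Arrays.asList("caption", "colgroup", "thead", "tbody", "tfoot", "tr");

    /** Returns the innermost open element with the given name, or null. */
    static Node findOnStack(Deque<Node> stack, String name) {
        for (Node open : stack) { // iteration starts at the most recently pushed element
            if (open.name.equals(name)) {
                return open;
            }
        }
        return null;
    }

    public static void addToRightParent(Deque<Node> stack, Node newNode) {
        Node current = stack.peek();
        boolean isCell = "td".equals(newNode.name) || "th".equals(newNode.name);
        if (isCell && !"tr".equals(current.name)) {
            Node row = findOnStack(stack, "tr");
            (row != null ? row : current).append(newNode); // cells belong to the open row
        } else if ("table".equals(current.name) && current.parent != null
                && !TABLE_CONTENT.contains(newNode.name)) {
            current.insertBefore(newNode); // stray content is moved in front of the table
        } else {
            current.append(newNode);
        }
    }

    public static String demo() {
        Node body = new Node("body");
        Node table = new Node("table");
        body.append(table);
        Deque<Node> stack = new ArrayDeque<>();
        stack.push(body);
        stack.push(table);
        addToRightParent(stack, new Node("div")); // <div> directly inside <table>
        addToRightParent(stack, new Node("tr"));
        return body.toString();
    }

    public static void main(String[] args) {
        System.out.println(demo()); // prints "body[div, table[tr]]"
    }
}
```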
findElementOnStack
-
isTableChild
-
isTableCell
-
endElement
public void endElement(org.htmlunit.cyberneko.xerces.xni.QName element, org.htmlunit.cyberneko.xerces.xni.Augmentations augs) throws org.htmlunit.cyberneko.xerces.xni.XNIException
- Specified by:
endElement in interface org.htmlunit.cyberneko.xerces.xni.XMLDocumentHandler
- Overrides:
endElement in class org.htmlunit.cyberneko.xerces.parsers.AbstractSAXParser
- Throws:
org.htmlunit.cyberneko.xerces.xni.XNIException
-
endElement
- Specified by:
endElement in interface ContentHandler
- Throws:
SAXException
-
characters
- Specified by:
characters in interface ContentHandler
- Throws:
SAXException
-
ignorableWhitespace
- Specified by:
ignorableWhitespace in interface ContentHandler
- Throws:
SAXException
-
handleCharacters
private void handleCharacters()
Picks up the character data accumulated so far and adds it to the current element as a text node.
-
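SAX parsers may deliver the text of one element in several characters() calls, which is why a builder typically buffers the chunks and emits a single text node once the next structural event arrives. The following sketch shows that buffering pattern in isolation. The class CharacterBufferSketch and its string-based node list are hypothetical and only stand in for real DOM nodes.

```java
import java.util.ArrayList;
import java.util.List;

public class CharacterBufferSketch {

    private final StringBuilder pending = new StringBuilder();
    private final List<String> nodes = new ArrayList<>();

    /** Text may arrive in several chunks, so it is only buffered here. */
    public void characters(char[] ch, int start, int length) {
        pending.append(ch, start, length);
    }

    public void startElement(String name) {
        flushCharacters();
        nodes.add("<" + name + ">");
    }

    public void endElement(String name) {
        flushCharacters();
        nodes.add("</" + name + ">");
    }

    /** Turns the buffered characters into one text node and clears the buffer. */
    private void flushCharacters() {
        if (pending.length() > 0) {
            nodes.add("text:" + pending);
            pending.setLength(0);
        }
    }

    public List<String> nodes() {
        return nodes;
    }

    public static String demo() {
        CharacterBufferSketch builder = new CharacterBufferSketch();
        builder.startElement("p");
        builder.characters("Hel".toCharArray(), 0, 3);
        builder.characters("lo".toCharArray(), 0, 2);
        builder.endElement("p");
        return builder.nodes().toString();
    }

    public static void main(String[] args) {
        System.out.println(demo()); // prints "[<p>, text:Hello, </p>]"
    }
}
```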
endDocument
- Specified by:
endDocument in interface ContentHandler
- Throws:
SAXException
-
startPrefixMapping
- Specified by:
startPrefixMapping in interface ContentHandler
- Throws:
SAXException
-
endPrefixMapping
- Specified by:
endPrefixMapping in interface ContentHandler
- Throws:
SAXException
-
processingInstruction
- Specified by:
processingInstruction in interface ContentHandler
- Throws:
SAXException
-
skippedEntity
- Specified by:
skippedEntity in interface ContentHandler
- Throws:
SAXException
-
comment
public void comment(char[] ch, int start, int length)
- Specified by:
comment in interface LexicalHandler
-
endCDATA
public void endCDATA()
- Specified by:
endCDATA in interface LexicalHandler
-
endDTD
public void endDTD()
- Specified by:
endDTD in interface LexicalHandler
-
endEntity
- Specified by:
endEntity in interface LexicalHandler
-
startCDATA
public void startCDATA()
- Specified by:
startCDATA in interface LexicalHandler
-
startDTD
- Specified by:
startDTD in interface LexicalHandler
-
startEntity
- Specified by:
startEntity in interface LexicalHandler
-
ignoredEndElement
public void ignoredEndElement(org.htmlunit.cyberneko.xerces.xni.QName element, org.htmlunit.cyberneko.xerces.xni.Augmentations augs)
- Specified by:
ignoredEndElement in interface org.htmlunit.cyberneko.HTMLTagBalancingListener
-
ignoredStartElement
public void ignoredStartElement(org.htmlunit.cyberneko.xerces.xni.QName elem, org.htmlunit.cyberneko.xerces.xni.XMLAttributes attrs, org.htmlunit.cyberneko.xerces.xni.Augmentations augs)
- Specified by:
ignoredStartElement in interface org.htmlunit.cyberneko.HTMLTagBalancingListener
-
copyAttributes
private static void copyAttributes(DomElement to, org.htmlunit.cyberneko.xerces.xni.XMLAttributes attrs) -
parse
public void parse(org.htmlunit.cyberneko.xerces.xni.parser.XMLInputSource inputSource) throws org.htmlunit.cyberneko.xerces.xni.XNIException, IOException
- Overrides:
parse in class org.htmlunit.cyberneko.xerces.parsers.XMLParser
- Throws:
org.htmlunit.cyberneko.xerces.xni.XNIException
IOException
-
getBody
HtmlElement getBody() -
isSynthesized
private static boolean isSynthesized(org.htmlunit.cyberneko.xerces.xni.Augmentations augs) -
appendChild
-