Class HTMLConfiguration
java.lang.Object
org.htmlunit.cyberneko.xerces.util.ParserConfigurationSettings
org.htmlunit.cyberneko.HTMLConfiguration
- All Implemented Interfaces:
XMLComponentManager, XMLParserConfiguration
public class HTMLConfiguration
extends ParserConfigurationSettings
implements XMLParserConfiguration
An XNI-based parser configuration that can be used to parse HTML
documents. This configuration can be used directly in order to
parse HTML documents or can be used in conjunction with any XNI
based tools, such as the Xerces2 implementation.
This configuration recognizes the following features:
- http://cyberneko.org/html/features/augmentations
- http://cyberneko.org/html/features/report-errors
- http://cyberneko.org/html/features/report-errors/simple
- and the features supported by the scanner and tag balancer components.
This configuration recognizes the following properties:
- http://cyberneko.org/html/properties/names/elems
- http://cyberneko.org/html/properties/names/attrs
- http://cyberneko.org/html/properties/filters
- http://cyberneko.org/html/properties/error-reporter
- and the properties supported by the scanner and tag balancer.
For complete usage information, refer to the documentation.
-
Nested Class Summary
Nested Classes
- protected class: Defines an error reporter for reporting HTML errors.
-
Field Summary
Fields
- protected static final String AUGMENTATIONS: Include infoset augmentations.
- private boolean closeStream_: Stream opened by parser.
- private XMLDocumentHandler documentHandler_: Document handler.
- (package private) final HTMLScanner documentScanner_: Document scanner.
- protected static final String ERROR_DOMAIN: Error domain.
- protected static final String ERROR_REPORTER: Error reporter.
- (package private) XMLErrorHandler errorHandler_: Error handler.
- static final String FILTERS: Pipeline filters.
- private final List<HTMLComponent> htmlComponents_: Components.
- private final HTMLElements htmlElements_
- protected static final String NAMES_ATTRS: Modify HTML attribute names: { "upper", "lower", "default" }.
- protected static final String NAMES_ELEMS: Modify HTML element names: { "upper", "lower", "default" }.
- private final NamespaceBinder namespaceBinder_: Namespace binder.
- protected static final String NAMESPACES: Namespaces.
- protected static final String REPORT_ERRORS: Report errors.
- protected static final String SIMPLE_ERROR_FORMAT: Simple report format.
- private final HTMLTagBalancer tagBalancer_: HTML tag balancer.
-
Constructor Summary
Constructors -
Method Summary
- protected void addComponent(HTMLComponent component)
- void cleanup(): If the application decides to terminate parsing before the XML document is fully parsed, it should call this method to free any resources allocated during parsing.
- protected HTMLScanner createDocumentScanner
- void evaluateInputSource(XMLInputSource inputSource): EXPERIMENTAL: may change in next release. Immediately evaluates an input source and adds the new content (e.g. the output written by an embedded script).
- boolean parse(boolean complete): Parses the document in a pull parsing fashion.
- void parse(XMLInputSource source): Parses a document.
- void pushInputSource(XMLInputSource inputSource): Pushes an input source onto the current entity stack.
- protected void reset(): Resets the parser configuration.
- void setDocumentHandler(XMLDocumentHandler handler): Sets the document handler to receive information about the document.
- void setErrorHandler(XMLErrorHandler handler): Sets the error handler.
- void setFeature(String featureId, boolean state): Set the state of a feature.
- void setInputSource(XMLInputSource inputSource): Sets the input source for the document to parse.
- void setProperty(String propertyId, Object value): setProperty
Methods inherited from class ParserConfigurationSettings
addRecognizedFeatures, addRecognizedProperties, checkFeature, checkProperty, getFeature, getProperty
Methods inherited from class Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
Methods inherited from interface XMLParserConfiguration
addRecognizedFeatures, addRecognizedProperties, getFeature, getProperty
-
Field Details
-
NAMESPACES
-
AUGMENTATIONS
-
REPORT_ERRORS
-
SIMPLE_ERROR_FORMAT
-
NAMES_ELEMS
Modify HTML element names: { "upper", "lower", "default" }.
-
NAMES_ATTRS
Modify HTML attribute names: { "upper", "lower", "default" }.
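The three values accepted by NAMES_ELEMS and NAMES_ATTRS can be pictured with a small standalone sketch. It does not call the library: the class NameCaseSketch and its modifyName helper are hypothetical, and the sketch assumes that "default" (like any unrecognized value) leaves the name as written in the source document.

```java
import java.util.Locale;

// Illustrative model only; NameCaseSketch is a hypothetical class,
// not part of org.htmlunit.cyberneko.
final class NameCaseSketch {

    // Applies a names/elems or names/attrs style value to a raw name.
    // Assumption: any value other than "upper" or "lower" (including
    // "default") leaves the name unchanged.
    static String modifyName(String name, String mode) {
        switch (mode) {
            case "upper":
                return name.toUpperCase(Locale.ROOT);
            case "lower":
                return name.toLowerCase(Locale.ROOT);
            default:
                return name;
        }
    }

    public static void main(String[] args) {
        System.out.println(modifyName("Div", "upper"));
        System.out.println(modifyName("onClick", "lower"));
        System.out.println(modifyName("Div", "default"));
    }
}
```

Locale.ROOT is used so that the case mapping does not depend on the default locale of the JVM.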
-
FILTERS
-
ERROR_REPORTER
-
ERROR_DOMAIN
-
documentHandler_
Document handler.
-
errorHandler_
XMLErrorHandler errorHandler_
Error handler.
-
closeStream_
private boolean closeStream_
Indicates that the stream was opened by the parser and must therefore be closed manually when parsing terminates.
-
htmlComponents_
Components.
-
documentScanner_
Document scanner.
-
tagBalancer_
HTML tag balancer.
-
namespaceBinder_
Namespace binder.
-
htmlElements_
-
-
Constructor Details
-
HTMLConfiguration
public HTMLConfiguration()
Default constructor.
-
HTMLConfiguration
-
-
Method Details
-
createDocumentScanner
-
pushInputSource
Pushes an input source onto the current entity stack. This enables the scanner to transparently scan new content (e.g. the output written by an embedded script). At the end of the current entity, the scanner returns to where it left off at the time this entity source was pushed.
Hint: To use this feature to insert the output of <SCRIPT> tags, remember to buffer the entire output of the processed instructions before pushing a new input source. Otherwise, events may appear out of sequence.
- Parameters:
inputSource - The new input source to start scanning.
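The entity-stack behavior described for pushInputSource can be illustrated with a toy model built on java.io.Reader. This is only a sketch of the idea, not the scanner's implementation: the class EntityStackSketch, its read method, and the '|' marker standing in for a script are all hypothetical.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayDeque;
import java.util.Deque;

// Toy model of an entity stack; class and method names are hypothetical.
final class EntityStackSketch {
    private final Deque<Reader> stack = new ArrayDeque<>();

    EntityStackSketch(Reader document) {
        stack.push(document);
    }

    // Makes the scanner read from 'source' until it is exhausted.
    void pushInputSource(Reader source) {
        stack.push(source);
    }

    // Returns the next character, or -1 when every source is exhausted.
    int read() {
        try {
            while (!stack.isEmpty()) {
                int c = stack.peek().read();
                if (c != -1) {
                    return c;
                }
                stack.pop().close(); // finished entity: resume the one below
            }
            return -1;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Scans "ab|cd"; when '|' is seen, pushes fully buffered script output.
    static String demo() {
        EntityStackSketch scanner = new EntityStackSketch(new StringReader("ab|cd"));
        StringBuilder out = new StringBuilder();
        int c;
        while ((c = scanner.read()) != -1) {
            if (c == '|') {
                scanner.pushInputSource(new StringReader("<b>x</b>"));
            } else {
                out.append((char) c);
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

In this model the pushed text is complete before it is pushed, which mirrors the hint about buffering the entire script output first. The pushed content is consumed in full, and scanning then resumes with the remainder of the original document.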
-
evaluateInputSource
EXPERIMENTAL: may change in next release
Immediately evaluates an input source and adds the new content (e.g. the output written by an embedded script).
- Parameters:
inputSource - The new input source to start scanning.
-
setFeature
Description copied from class: ParserConfigurationSettings
Set the state of a feature.
Set the state of any feature in a SAX2 parser. The parser might not recognize the feature, and if it does recognize it, it might not be able to fulfill the request.
- Specified by:
setFeature in interface XMLParserConfiguration
- Overrides:
setFeature in class ParserConfigurationSettings
- Parameters:
featureId - The unique identifier (URI) of the feature.
state - The requested state of the feature (true or false).
- Throws:
XMLConfigurationException - If the requested feature is not known.
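The contract above (a feature must be recognized before its state can be set) can be sketched without the library. The class FeatureSettingsSketch below is a hypothetical stand-in for a settings object, and IllegalArgumentException stands in for XMLConfigurationException; the real classes may differ in detail, for example in how unset features are reported.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical model of recognized-feature checking; not the library class.
final class FeatureSettingsSketch {
    private final Set<String> recognized = new HashSet<>();
    private final Map<String, Boolean> features = new HashMap<>();

    // Registers feature identifiers that setFeature will accept.
    void addRecognizedFeatures(String... featureIds) {
        recognized.addAll(Arrays.asList(featureIds));
    }

    // Stores the state, rejecting identifiers that were never registered.
    // IllegalArgumentException stands in for XMLConfigurationException.
    void setFeature(String featureId, boolean state) {
        if (!recognized.contains(featureId)) {
            throw new IllegalArgumentException("Unknown feature: " + featureId);
        }
        features.put(featureId, state);
    }

    // Assumption of this sketch: a feature that was never set reads as false.
    boolean getFeature(String featureId) {
        return features.getOrDefault(featureId, Boolean.FALSE);
    }

    public static void main(String[] args) {
        FeatureSettingsSketch settings = new FeatureSettingsSketch();
        String augmentations = "http://cyberneko.org/html/features/augmentations";
        settings.addRecognizedFeatures(augmentations);
        settings.setFeature(augmentations, true);
        System.out.println(settings.getFeature(augmentations));
    }
}
```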
-
setProperty
Description copied from class: ParserConfigurationSettings
setProperty
- Specified by:
setProperty in interface XMLParserConfiguration
- Overrides:
setProperty in class ParserConfigurationSettings
- Parameters:
propertyId - the property id
value - the value
- Throws:
XMLConfigurationException - If the requested property is not known.
-
setDocumentHandler
Description copied from interface: XMLParserConfiguration
Sets the document handler to receive information about the document.
- Specified by:
setDocumentHandler in interface XMLParserConfiguration
- Parameters:
handler - The document handler.
-
getDocumentHandler
- Specified by:
getDocumentHandler in interface XMLParserConfiguration
- Returns:
- the document handler.
-
setErrorHandler
Description copied from interface: XMLParserConfiguration
Sets the error handler.
- Specified by:
setErrorHandler in interface XMLParserConfiguration
- Parameters:
handler - The error handler.
-
getErrorHandler
- Specified by:
getErrorHandler in interface XMLParserConfiguration
- Returns:
- the error handler.
-
getHtmlElements
- Returns:
- the HTMLElements
-
getHtmlComponents
- Returns:
- the list of HTMLComponents
-
getDocumentScanner
- Returns:
- the DocumentScanner
-
getTagBalancer
- Returns:
- the TagBalancer
-
getNamespaceBinder
- Returns:
- the NamespaceBinder
-
parse
Parses a document.
- Specified by:
parse in interface XMLParserConfiguration
- Parameters:
source - The input source for the top-level of the XML document.
- Throws:
XNIException - Any XNI exception, possibly wrapping another exception.
IOException - An IO exception from the parser, possibly from a byte stream or character stream supplied by the parser.
-
setInputSource
public void setInputSource(XMLInputSource inputSource) throws XMLConfigurationException, IOException
Sets the input source for the document to parse.
- Specified by:
setInputSource in interface XMLParserConfiguration
- Parameters:
inputSource - The document's input source.
- Throws:
XMLConfigurationException - Thrown if there is a configuration error when initializing the parser.
IOException - Thrown on I/O error.
-
parse
Parses the document in a pull parsing fashion.
- Specified by:
parse in interface XMLParserConfiguration
- Parameters:
complete - True if the pull parser should parse the remaining document completely.
- Returns:
- True if there is more document to parse.
- Throws:
XNIException - Any XNI exception, possibly wrapping another exception.
IOException - An IO exception from the parser, possibly from a byte stream or character stream supplied by the parser.
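The calling pattern implied by setInputSource, parse(boolean) and cleanup can be sketched with a toy pull parser. PullParseSketch is a hypothetical stand-in that treats each space-separated word as one parsing step; only the shape of the loop (set the input, call parse(false) while it returns true, call cleanup when stopping early or finishing) is meant to carry over.

```java
import java.util.ArrayList;
import java.util.List;

// Toy pull parser; a hypothetical stand-in used only to show the loop shape.
final class PullParseSketch {
    private String[] chunks = new String[0];
    private int next;
    final List<String> events = new ArrayList<>();
    boolean cleanedUp;

    // Each space-separated word plays the role of one parsing step.
    void setInputSource(String text) {
        chunks = text.isEmpty() ? new String[0] : text.split(" ");
        next = 0;
    }

    // Performs one step, or all remaining steps when 'complete' is true.
    // Returns true if there is more document to parse.
    boolean parse(boolean complete) {
        do {
            if (next >= chunks.length) {
                return false;
            }
            events.add(chunks[next++]);
        } while (complete);
        return next < chunks.length;
    }

    // Frees resources; here it only records that it was called.
    void cleanup() {
        cleanedUp = true;
    }

    // Pulls at most 'limit' steps, then stops early and cleans up.
    static List<String> parseFirst(String text, int limit) {
        PullParseSketch parser = new PullParseSketch();
        parser.setInputSource(text);
        try {
            boolean more = true;
            while (more && parser.events.size() < limit) {
                more = parser.parse(false);
            }
        } finally {
            parser.cleanup();
        }
        return parser.events;
    }

    public static void main(String[] args) {
        System.out.println(parseFirst("a b c", 2));
    }
}
```

Placing cleanup in a finally block reflects the note on cleanup(): an application that stops before the document is fully parsed is expected to release resources itself.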
-
cleanup
public void cleanup()
If the application decides to terminate parsing before the XML document is fully parsed, it should call this method to free any resources allocated during parsing, for example by closing all opened streams.
- Specified by:
cleanup in interface XMLParserConfiguration
-
addComponent
-
reset
Resets the parser configuration.
- Throws:
XMLConfigurationException
-