Package nu.validator.htmlparser.io
Class Driver
java.lang.Object
nu.validator.htmlparser.io.Driver
- All Implemented Interfaces:
EncodingDeclarationHandler
-
Nested Class Summary
Nested Classes -
Field Summary
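Fields
Modifier and Type | Field | Description
private boolean | allowRewinding |
private Encoding | characterEncoding |
private CharacterHandler[] | characterHandlers | Used for NFC checking if non-null, source code capture, etc.
private Confidence | confidence |
private Heuristics | heuristics |
private Reader | reader | The input UTF-16 code unit stream.
private RewindableInputStream | rewindableInputStream | The reference to the rewindable byte stream.
private boolean | swallowBom |
private final Tokenizer | tokenizer |
-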
Constructor Summary
Constructors -
Method Summary
Modifier and Type | Method | Description
void | addCharacterHandler(CharacterHandler characterHandler) |
private void | becomeConfident() |
(package private) void | dontSwallowBom() |
protected Encoding | encodingFromExternalDeclaration(String encoding) | Initializes a decoder from external decl.
 | getCharacterEncoding | Queries the environment for the encoding in use (for error reporting).
boolean | internalEncodingDeclaration(String internalCharset) | Indicates that the parser has found an internal encoding declaration with the charset value charset.
boolean | isAllowRewinding() | Returns the allowRewinding.
boolean | isCheckingNormalization() | Query if checking normalization.
(package private) void | notifyAboutMetaBoundary() |
private void | runStates |
void | setAllowRewinding(boolean allowRewinding) | Sets the allowRewinding.
void | setCheckingNormalization(boolean enable) | Turns NFC checking on or off.
void | setCommentPolicy(XmlViolationPolicy commentPolicy) |
void | setContentNonXmlCharPolicy(XmlViolationPolicy contentNonXmlCharPolicy) |
void | setContentSpacePolicy(XmlViolationPolicy contentSpacePolicy) |
void | setEncoding(Encoding encoding, Confidence confidence) |
void | setErrorHandler |
void | setHeuristics(Heuristics heuristics) | Sets the encoding sniffing heuristics.
void | setHtml4ModeCompatibleWithXhtml1Schemata(boolean html4ModeCompatibleWithXhtml1Schemata) |
void | setMappingLangToXmlLang(boolean mappingLangToXmlLang) |
void | setNamePolicy(XmlViolationPolicy namePolicy) |
void | setTransitionHandler(TransitionHandler transitionHandler) |
void | setXmlnsPolicy(XmlViolationPolicy xmlnsPolicy) |
void | tokenize(InputSource is) | Runs the tokenization.
protected void | warnWithoutLocation(String message) | Reports a warning without line/col
protected Encoding | whineAboutEncodingAndReturnActual(String encoding, Encoding cs) |
-
Field Details
-
reader
The input UTF-16 code unit stream. If a byte stream was given, this object is an instance of HtmlInputStreamReader.
-
rewindableInputStream
The reference to the rewindable byte stream. null if prohibited or no longer needed.
-
swallowBom
private boolean swallowBom -
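The field name suggests that the driver can drop a leading byte order mark (U+FEFF) from the character stream before tokenization. The sketch below illustrates that idea only; `BomSkipper` and its methods are hypothetical names, not part of nu.validator.htmlparser, and the parser's own logic may differ.

```java
import java.io.IOException;
import java.io.PushbackReader;
import java.io.Reader;
import java.io.UncheckedIOException;

// Hypothetical helper for illustration; not the parser's implementation.
final class BomSkipper {
    private BomSkipper() {}

    /** Returns a reader positioned after a leading U+FEFF, if one is present. */
    static Reader skipLeadingBom(Reader in) {
        try {
            PushbackReader pushback = new PushbackReader(in, 1);
            int first = pushback.read();
            if (first != -1 && first != 0xFEFF) {
                pushback.unread(first); // not a BOM: put the code unit back
            }
            return pushback;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Reads the remaining characters into a string (for demonstration only). */
    static String readAll(Reader in) {
        try {
            StringBuilder sb = new StringBuilder();
            for (int c = in.read(); c != -1; c = in.read()) {
                sb.append((char) c);
            }
            return sb.toString();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Only the first code unit is inspected, so a U+FEFF that appears later in the stream is left untouched.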
characterEncoding
-
allowRewinding
private boolean allowRewinding -
heuristics
-
tokenizer
-
confidence
-
characterHandlers
Used for NFC checking if non-null, source code capture, etc.
-
-
Constructor Details
-
Driver
-
-
Method Details
-
isAllowRewinding
public boolean isAllowRewinding()
Returns the allowRewinding.
Returns:
the allowRewinding
-
setAllowRewinding
public void setAllowRewinding(boolean allowRewinding)
Sets the allowRewinding.
Parameters:
allowRewinding - the allowRewinding to set
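One plausible reason to allow rewinding is to decode the byte stream again when an encoding declaration contradicts the first guess. That motive is an assumption here, not something this page states. The sketch uses the JDK's `BufferedInputStream.mark`/`reset` as a stand-in for the parser's `RewindableInputStream`; `RewindSketch` is a hypothetical name.

```java
import java.io.BufferedInputStream;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.Charset;

// Hypothetical illustration of "rewind and decode again"; not the parser's code.
final class RewindSketch {
    private RewindSketch() {}

    /** Decodes with a first guess, rewinds, then decodes with the declared charset. */
    static String redecode(byte[] data, Charset guess, Charset declared) {
        try (BufferedInputStream in =
                new BufferedInputStream(new ByteArrayInputStream(data))) {
            in.mark(data.length + 1);            // remember the start of the stream
            String firstPass = new String(in.readAllBytes(), guess);
            // Assumption: a declaration found in firstPass named another encoding.
            in.reset();                          // rewind to the marked position
            return new String(in.readAllBytes(), declared);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Buffering the whole prefix costs memory, which may be why rewinding is something a caller can turn off.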
-
setCheckingNormalization
public void setCheckingNormalization(boolean enable)
Turns NFC checking on or off.
Parameters:
enable - true if checking on
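To make "NFC checking" concrete, the sketch below tests whether text is already in Unicode Normalization Form C. It uses the JDK's `java.text.Normalizer` as a stand-in; the checker that the parser installs through its character handlers is not shown on this page and may behave differently. `NfcCheck` is a hypothetical name.

```java
import java.text.Normalizer;

// Hypothetical helper; uses the JDK normalizer rather than the parser's checker.
final class NfcCheck {
    private NfcCheck() {}

    /** Returns true if the text is already in Normalization Form C. */
    static boolean isNfc(CharSequence text) {
        return Normalizer.isNormalized(text, Normalizer.Form.NFC);
    }
}
```

For instance, the precomposed character U+00E9 is in NFC, while the sequence U+0065 U+0301 (base letter plus combining acute accent) is not.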
-
addCharacterHandler
-
isCheckingNormalization
public boolean isCheckingNormalization()
Query if checking normalization.
Returns:
true if checking on
-
tokenize
Runs the tokenization. This is the main entry point.
Parameters:
is - the input source
Throws:
SAXException - on fatal error (if configured to treat XML violations as fatal) or if the token handler threw
IOException - if the stream threw
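The input source can carry either a character stream or a byte stream. The sketch below follows the usual SAX convention of preferring the character stream when both are present. `InputSourceDemo` is a hypothetical name, and the fixed UTF-8 decoder is an assumption made to keep the example short; the driver itself determines the encoding by other means.

```java
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.Reader;
import java.nio.charset.StandardCharsets;
import org.xml.sax.InputSource;

// Hypothetical illustration of choosing a reader from a SAX InputSource.
final class InputSourceDemo {
    private InputSourceDemo() {}

    /** Picks the character stream if present, otherwise decodes the byte stream. */
    static Reader openReader(InputSource is) {
        Reader chars = is.getCharacterStream();
        if (chars != null) {
            return chars; // already UTF-16 code units, no decoding needed
        }
        InputStream bytes = is.getByteStream();
        if (bytes == null) {
            throw new IllegalArgumentException(
                    "InputSource has neither a character stream nor a byte stream");
        }
        // Assumption: UTF-8. A real driver would sniff or be told the encoding.
        return new InputStreamReader(bytes, StandardCharsets.UTF_8);
    }
}
```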
-
dontSwallowBom
void dontSwallowBom() -
runStates
Throws:
SAXException
IOException
-
setEncoding
-
internalEncodingDeclaration
Description copied from interface: EncodingDeclarationHandler
Indicates that the parser has found an internal encoding declaration with the charset value charset.
Specified by:
internalEncodingDeclaration in interface EncodingDeclarationHandler
Parameters:
internalCharset - the charset name found.
Returns:
true if the value of charset was an encoding name for a supported ASCII-superset encoding.
Throws:
SAXException - if something went wrong
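The return value depends on whether the declared name denotes a supported ASCII-superset encoding. A rough sketch of such a test with the JDK's `java.nio.charset.Charset` follows. `CharsetLabelCheck` is a hypothetical name, and the probe is a heuristic: it only checks that a short ASCII sample decodes to itself, which is weaker than the parser's own encoding tables.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical heuristic; the parser's own Encoding class is not used here.
final class CharsetLabelCheck {
    private CharsetLabelCheck() {}

    /** Returns true if the label names a supported charset that decodes ASCII as ASCII. */
    static boolean isSupportedAsciiSuperset(String label) {
        Charset cs;
        try {
            if (!Charset.isSupported(label)) {
                return false;
            }
            cs = Charset.forName(label);
        } catch (IllegalArgumentException e) {
            // Covers IllegalCharsetNameException (malformed label) and null labels.
            return false;
        }
        String probe = "<meta charset>";
        byte[] ascii = probe.getBytes(StandardCharsets.US_ASCII);
        return new String(ascii, cs).equals(probe);
    }
}
```

With this probe, UTF-8 and ISO-8859-1 pass, while UTF-16 fails because it reads two bytes per code unit.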
-
becomeConfident
private void becomeConfident() -
setHeuristics
Sets the encoding sniffing heuristics.
Parameters:
heuristics - the heuristics to set
-
warnWithoutLocation
Reports a warning without line/col
Parameters:
message - the message
Throws:
SAXException
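In SAX terms, a warning without line/col can be expressed as a `SAXParseException` built without a locator, which reports -1 for both line and column. The sketch below shows that with the standard `org.xml.sax` types. `LocationlessWarning` is a hypothetical name, and how the driver constructs its own exception is not shown on this page.

```java
import org.xml.sax.ErrorHandler;
import org.xml.sax.SAXException;
import org.xml.sax.SAXParseException;

// Hypothetical illustration using only standard SAX classes.
final class LocationlessWarning {
    private LocationlessWarning() {}

    /** Builds a parse exception that carries a message but no position. */
    static SAXParseException build(String message) {
        // A null locator leaves line and column numbers at -1.
        return new SAXParseException(message, null);
    }

    /** Sends the warning to the handler, if one is installed. */
    static void warn(ErrorHandler handler, String message) throws SAXException {
        if (handler != null) {
            handler.warning(build(message));
        }
    }
}
```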
-
encodingFromExternalDeclaration
Initializes a decoder from external decl.
Throws:
SAXException
-
whineAboutEncodingAndReturnActual
protected Encoding whineAboutEncodingAndReturnActual(String encoding, Encoding cs) throws SAXException
Parameters:
encoding -
cs -
Returns:
Throws:
SAXException
-
notifyAboutMetaBoundary
void notifyAboutMetaBoundary() -
setCommentPolicy
Parameters:
commentPolicy -
See Also:
-
setContentNonXmlCharPolicy
Parameters:
contentNonXmlCharPolicy -
See Also:
-
setContentSpacePolicy
Parameters:
contentSpacePolicy -
See Also:
-
setErrorHandler
Parameters:
eh -
See Also:
-
setTransitionHandler
-
setHtml4ModeCompatibleWithXhtml1Schemata
public void setHtml4ModeCompatibleWithXhtml1Schemata(boolean html4ModeCompatibleWithXhtml1Schemata)
Parameters:
html4ModeCompatibleWithXhtml1Schemata -
See Also:
-
setMappingLangToXmlLang
public void setMappingLangToXmlLang(boolean mappingLangToXmlLang)
Parameters:
mappingLangToXmlLang -
See Also:
-
setNamePolicy
Parameters:
namePolicy -
See Also:
-
setXmlnsPolicy
Parameters:
xmlnsPolicy -
See Also:
-
getCharacterEncoding
Description copied from interface: EncodingDeclarationHandler
Queries the environment for the encoding in use (for error reporting).
Specified by:
getCharacterEncoding in interface EncodingDeclarationHandler
Returns:
the encoding in use
Throws:
SAXException - if something went wrong
-
getDocumentLocator
-