Class NormalizationChecker
java.lang.Object
nu.validator.htmlparser.extra.NormalizationChecker
- All Implemented Interfaces:
CharacterHandler
-
Field Summary
Fields (Modifier and Type, Field, Description):
private boolean alreadyComplainedAboutThisRun: Indicates whether the current run has already caused an error.
private boolean atStartOfRun: Indicates whether the next call to characters() is the first call in a run.
private char[] buf: A buffer for holding sequences that overlap the SAX buffer boundary.
private char[] bufHolder: A holder for the original buffer (for the memory leak prevention mechanism).
private static final com.ibm.icu.text.UnicodeSet COMPOSING_CHARACTERS: A thread-safe set of composing characters as per Charmod Norm.
private ErrorHandler errorHandler
private Locator locator
private int pos: The current used length of the buffer, i.e. the index of the first slot that does not hold current data.
-
Constructor Summary
Constructors -
Method Summary
Methods (Modifier and Type, Method, Description):
private void appendToBuf(char[] ch, int start, int end): Appends a slice of a UTF-16 code unit array to the internal buffer.
void characters(char[] ch, int start, int length): Receive notification of a run of UTF-16 code units.
void end(): Signals the end of the stream.
void err: Emit an error.
private void errAboutTextRun: Emits an error stating that the current text run or the source text is not in NFC.
private static boolean isComposingChar(int c): Returns true if the argument is a composing character and false otherwise.
private static boolean isComposingCharOrSurrogate(char c): Returns true if the argument is a composing BMP character or a surrogate and false otherwise.
void setErrorHandler(ErrorHandler errorHandler)
void start(): Signals the start of the stream.
-
Field Details
-
errorHandler
-
locator
-
COMPOSING_CHARACTERS
private static final com.ibm.icu.text.UnicodeSet COMPOSING_CHARACTERS
A thread-safe set of composing characters as per Charmod Norm.
-
buf
private char[] buf
A buffer for holding sequences that overlap the SAX buffer boundary.
-
bufHolder
private char[] bufHolder
A holder for the original buffer (for the memory leak prevention mechanism).
-
pos
private int pos
The current used length of the buffer, i.e. the index of the first slot that does not hold current data.
-
atStartOfRun
private boolean atStartOfRun
Indicates whether the next call to characters() is the first call in a run.
-
alreadyComplainedAboutThisRun
private boolean alreadyComplainedAboutThisRun
Indicates whether the current run has already caused an error.
-
-
Constructor Details
-
NormalizationChecker
Constructor with mode selection.
- Parameters:
sourceTextMode - whether the source text-related messages should be enabled.
-
-
Method Details
-
err
Emit an error. The locator is used.
- Parameters:
message - the error message
- Throws:
SAXException - if something goes wrong
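The description says only that the locator is used. As a minimal sketch of one common SAX pattern for this, the following class wraps the message and the locator in a SAXParseException and passes it to ErrorHandler.error(). The class name ErrSketch, the String parameter type, and the null check are assumptions for illustration and are not taken from this class.

```java
import org.xml.sax.ErrorHandler;
import org.xml.sax.Locator;
import org.xml.sax.SAXException;
import org.xml.sax.SAXParseException;

// Hypothetical stand-in for the error-reporting part of the checker.
final class ErrSketch {
    private final ErrorHandler errorHandler; // may be null (assumption)
    private final Locator locator;

    ErrSketch(ErrorHandler errorHandler, Locator locator) {
        this.errorHandler = errorHandler;
        this.locator = locator;
    }

    // Reports the message as a recoverable error. The SAXParseException
    // constructor copies the line and column from the locator.
    void err(String message) throws SAXException {
        if (errorHandler != null) {
            errorHandler.error(new SAXParseException(message, locator));
        }
    }
}
```

With this shape, a handler that receives the exception can read the position through getLineNumber() and getColumnNumber().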
-
isComposingCharOrSurrogate
private static boolean isComposingCharOrSurrogate(char c)
Returns true if the argument is a composing BMP character or a surrogate and false otherwise.
- Parameters:
c - a UTF-16 code unit
- Returns:
true if the argument is a composing BMP character or a surrogate and false otherwise
-
isComposingChar
private static boolean isComposingChar(int c)
Returns true if the argument is a composing character and false otherwise.
- Parameters:
c - a Unicode code point
- Returns:
true if the argument is a composing character and false otherwise
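The two predicates above can be approximated with JDK classes alone. This sketch is not the checker's implementation: the real class consults the ICU UnicodeSet COMPOSING_CHARACTERS built per Charmod Norm, whereas the version below assumes that "composing" means any combining mark (general categories Mn, Mc, Me), which is a narrower set. The class name ComposingCharSketch is hypothetical.

```java
// Hypothetical approximation of the two predicates using only the JDK.
final class ComposingCharSketch {

    // Assumption: treat every combining mark as a composing character.
    // The Charmod Norm set used by the real checker is not reproduced here.
    static boolean isComposingChar(int c) {
        int type = Character.getType(c);
        return type == Character.NON_SPACING_MARK
                || type == Character.COMBINING_SPACING_MARK
                || type == Character.ENCLOSING_MARK;
    }

    // A single UTF-16 code unit that is half of a surrogate pair cannot be
    // classified on its own, so surrogates are reported as true.
    static boolean isComposingCharOrSurrogate(char c) {
        return Character.isSurrogate(c) || isComposingChar(c);
    }
}
```

The code-unit variant is useful when scanning a char[] without decoding pairs: a true result means the position needs a closer look, not that the character is definitely composing.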
-
start
public void start()
Description copied from interface: CharacterHandler
Signals the start of the stream. Can be used for setup.
- Specified by:
start in interface CharacterHandler
-
characters
Description copied from interface: CharacterHandler
Receive notification of a run of UTF-16 code units.
- Specified by:
characters in interface CharacterHandler
- Parameters:
ch - the buffer
start - start index in the buffer
length - the number of characters to process starting from start
- Throws:
SAXException - if things go wrong
-
errAboutTextRun
Emits an error stating that the current text run or the source text is not in NFC.
- Throws:
SAXException - if the ErrorHandler throws
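To make the reported condition concrete, here is a small sketch of an NFC test over a slice of a UTF-16 buffer. It uses java.text.Normalizer from the JDK as a stand-in; the checker itself depends on ICU, and its internal logic is not shown in this page. The class and method names are hypothetical.

```java
import java.text.Normalizer;

// Hypothetical helper: tests whether a slice of a char[] is in NFC.
final class NfcCheckSketch {

    // start is the first index of the slice, length the number of code units.
    static boolean isNfc(char[] ch, int start, int length) {
        String run = new String(ch, start, length);
        return Normalizer.isNormalized(run, Normalizer.Form.NFC);
    }
}
```

For example, the two code units U+0065 U+0301 (e followed by a combining acute accent) are not in NFC, while the single precomposed code unit U+00E9 is.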
-
appendToBuf
private void appendToBuf(char[] ch, int start, int end)
Appends a slice of a UTF-16 code unit array to the internal buffer.
- Parameters:
ch - the array from which to copy
start - the index of the first element that is copied
end - the index of the first element that is not copied
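The half-open [start, end) convention and the role of pos can be sketched as follows. The growth policy (doubling, starting from four slots) and the class name AppendBufferSketch are assumptions for illustration; the bufHolder mechanism is not modelled.

```java
import java.util.Arrays;

// Hypothetical growable buffer mirroring the buf/pos pair described above.
final class AppendBufferSketch {
    private char[] buf = new char[4]; // initial capacity is an assumption
    private int pos = 0;              // index of the first slot without current data

    // Copies ch[start] .. ch[end - 1] to the end of the buffer.
    void appendToBuf(char[] ch, int start, int end) {
        int len = end - start;
        if (len <= 0) {
            return; // empty slice, nothing to copy
        }
        if (pos + len > buf.length) {
            // Grow to at least the required size (doubling is an assumption).
            buf = Arrays.copyOf(buf, Math.max(buf.length * 2, pos + len));
        }
        System.arraycopy(ch, start, buf, pos, len);
        pos += len;
    }

    int usedLength() {
        return pos;
    }

    String contents() {
        return new String(buf, 0, pos);
    }
}
```

Because end is exclusive, consecutive slices can be appended as (0, 5) and then (5, 11) without any off-by-one adjustment.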
-
end
Description copied from interface: CharacterHandler
Signals the end of the stream. Can be used for cleanup. Doesn't mean that the stream ended successfully.
- Specified by:
end in interface CharacterHandler
- Throws:
SAXException - if things go wrong
-
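The start(), characters(), end() sequence can be illustrated with a deliberately simplified handler. The interface RunHandler below is a hypothetical stand-in for CharacterHandler with exceptions omitted, and the handler collects the whole stream before testing it with java.text.Normalizer. The real checker instead keeps only a small buffer for sequences that overlap the SAX buffer boundary, so treat this as a sketch of the call protocol and not of the algorithm.

```java
import java.text.Normalizer;
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for the CharacterHandler callbacks.
interface RunHandler {
    void start();
    void characters(char[] ch, int start, int length);
    void end();
}

// Collects every code unit and checks the whole text once at end().
final class CollectingNfcHandler implements RunHandler {
    private final StringBuilder text = new StringBuilder();
    final List<String> errors = new ArrayList<>();

    @Override
    public void start() {
        // Reset state so that the handler can be reused for a new stream.
        text.setLength(0);
        errors.clear();
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        text.append(ch, start, length);
    }

    @Override
    public void end() {
        if (!Normalizer.isNormalized(text, Normalizer.Form.NFC)) {
            errors.add("Text is not in NFC.");
        }
    }
}
```

Delivering "e" and U+0301 in two separate characters() calls still produces one error here, which is the case that a per-call check without buffering would miss.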
setErrorHandler
-