Class CssScanner
java.lang.Object
org.idpf.epubcheck.util.css.CssScanner
A lexical scanner for CSS.
Lexical errors are stored as attributes on the tokens in which they occur. The supplied CssErrorHandler is also invoked whenever a lexical error occurs, so that clients can terminate scanning by rethrowing from the handler.
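The error contract described above can be illustrated with a minimal, self-contained sketch. Every type in it (SketchErrorHandler, ScanAbortedException, SketchToken, ErrorContractSketch) is a hypothetical stand-in invented for illustration; none of them is the actual CssErrorHandler or CssToken API, and the "unterminated string" rule is a toy lexical error rather than the scanner's real logic.

```java
import java.util.List;

/** Hypothetical stand-in for an error callback; not the real CssErrorHandler. */
interface SketchErrorHandler {
    void error(String message) throws ScanAbortedException;
}

/** Hypothetical exception that a handler may throw to stop the scan. */
class ScanAbortedException extends Exception {
    ScanAbortedException(String message) {
        super(message);
    }
}

/** Hypothetical token that carries its own lexical error, if any. */
class SketchToken {
    final String text;
    final String error; // null when the token is well formed

    SketchToken(String text, String error) {
        this.text = text;
        this.error = error;
    }
}

class ErrorContractSketch {
    /**
     * Toy scanner: splits on whitespace and flags a word that opens a
     * double-quoted string without closing it.
     */
    static void scan(String input, SketchErrorHandler handler, List<SketchToken> out)
            throws ScanAbortedException {
        String trimmed = input.trim();
        if (trimmed.isEmpty()) {
            return;
        }
        for (String word : trimmed.split("\\s+")) {
            boolean unterminated = word.startsWith("\"")
                    && (word.length() == 1 || !word.endsWith("\""));
            String error = unterminated ? "unterminated string: " + word : null;
            // The error is stored on the token first ...
            out.add(new SketchToken(word, error));
            // ... and then reported; the handler decides whether scanning continues.
            if (error != null) {
                handler.error(error);
            }
        }
    }
}
```

A handler that only records the message lets the scan run to completion, while a handler that throws ScanAbortedException stops it at the offending token.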
-
Nested Class Summary
Nested Classes
Modifier and Type   Class   Description
(package private) static class CssScanner.CssEscapeMemoizer
    Memoizer for escapes at forward reader positions.
-
Field Summary
Fields
Modifier and Type   Field   Description
private static final com.google.common.base.CharMatcher A
private static final int[] AND_LL
    The three last characters in the AND token
private CssToken.TokenBuilder builder
private static final int[] CDC_LL
    The two last characters in the CDC token
private static final int[] CDO_LL
    The three last characters in the CDO token
private final CssToken.CssTokenConsumer consumer
private char cur
private final boolean debug
private final CssErrorHandler errHandler
private final CssScanner.CssEscapeMemoizer escapes
(package private) static final com.google.common.base.CharMatcher HEXCHAR
private Locale locale
private static final com.google.common.base.CharMatcher N
private static final com.google.common.base.CharMatcher NMCHAR
    {nmchar} excluding {escape}
private static final com.google.common.base.CharMatcher NMSTART
    {nmstart} excluding {escape}
private static final int[] NOT_LL
    The three last characters in the NOT token
private static final com.google.common.base.CharMatcher NOT_WHITESPACE
private static final com.google.common.base.CharMatcher NUM
    {num}
private static final com.google.common.base.CharMatcher NUMEND
    {num} end char cannot be period
private static final com.google.common.base.CharMatcher NUMSTART
    {num} start char can be unary operators
private static final com.google.common.base.CharMatcher O
private static final int[] ONLY_LL
    The three last characters in the ONLY token
private static final int QNT_TOKEN_MAXLENGTH
private static final com.google.common.base.CharMatcher QNTSTART
    Start characters of quantities that follow {num}, excluding {escape}
private static final Map<int[], CssToken.Type> quantities
(package private) static final com.google.common.base.CharMatcher QUOTES
private final CssReader reader
(package private) static final com.google.common.base.CharMatcher TERMINATOR
private static final com.google.common.base.CharMatcher U
private static final com.google.common.base.CharMatcher UNARY
private static final com.google.common.base.CharMatcher URANGECHAR
private static final com.google.common.base.CharMatcher URANGESTART
private static final int[] URI_LL
    The three last characters in the URI start token
(package private) static final com.google.common.base.CharMatcher WHITESPACE
-
Constructor Summary
Constructors
Modifier   Constructor   Description
private CssScanner(Reader in, String systemID, CssErrorHandler errHandler, CssToken.CssTokenConsumer consumer, int pushbackBufferSize, Locale locale)
(package private) CssScanner(Reader in, String systemID, CssErrorHandler errHandler, CssToken.CssTokenConsumer consumer, Locale locale)
-
Method Summary
Modifier and Type   Method   Description
private void _and()
private void _atkeyword
    ATKEYWORD '@'[-]?{nmstart}{nmchar}*; nmstart [_a-z]|{nonascii}|{escape}; nmchar [_a-z0-9-]|{nonascii}|{escape}
private void _cdc()
private void _cdo()
private void _classname
    CLASSNAME "."{name}. This is not part of the formal lexical constructs, but seems to be safe to do at scanner level.
private void _comment()
    Builds a comment token, excluding the leading and trailing comment tokens.
private void _dashmatch
    DASHMATCH |=
private void _function
    FUNCTION {ident}\(
private void _hashname
    HASHNAME "#"{name}; name {nmchar}+; nmchar [_a-z0-9-]|{nonascii}|{escape}
private void _ident()
    IDENT ([-]?{nmstart}|[--]){nmchar}*
private void _important()
    IMPORTANT !{w}important
private void _includes
    INCLUDES ~=
private void _not()
private void _num()
private void _only()
private void _prefixmatch
    PREFIXMATCH ^=
private void _quantity
    Given a builder that contains a valid NUMBER and a next char that is a valid QNTSTART, modifies the type and appends to the builder.
private void _string()
    string1 \"([^\n\r\f\\"]|\\{nl}|{escape})*\"; string2 \'([^\n\r\f\\']|\\{nl}|{escape})*\'
private void _substringmatch
    SUBSTRINGMATCH *=
private void _suffixmatch
    SUFFIXMATCH $=
private void _urange()
    Builds a UNICODE_RANGE token.
private void _uri()
    URI url\({w}{string}{w}\) | url\({w}([!#$%&*-\[\]-~]|{nonascii}|{escape})*{w}\)
private void _ws()
    Whitespace w ::= wc; wc ::= #x9 | #xA | #xC | #xD | #x20
private void append(com.google.common.base.CharMatcher matcher)
    Parse forward and append to the TokenBuilder field all characters that match matcher, or until the next character is EOF.
private void append(com.google.common.base.CharMatcher matcher, CssToken.TokenBuilder builder)
    Parse forward and append to the supplied builder all characters that match matcher, or until the next character is EOF.
private static boolean equals(int[] a, int[] b)
    Like Arrays.equals, but does not return true when both are null.
private static boolean equals(int[] a, int[] b, boolean ignoreAsciiCase)
    Like Arrays.equals, but does not return true when both are null.
private boolean forwardMatch(String match, boolean ignoreCase, boolean resetOnTrue)
    Checks whether a forward scan equals the given match string.
(package private) static int isNewLine(int[] chars)
    Determines whether a sequence of chars begins with a CSS newline.
private boolean isNextEscape
    Returns true if reader next() is the start of a valid escape sequence.
private static boolean matches(int ch, com.google.common.base.CharMatcher matcher)
    Return true if ch matches matcher, false if not or if ch represents EOF (-1).
private static boolean matchesOrEOF(int ch, com.google.common.base.CharMatcher matcher)
    Return true if ch represents EOF (-1), or if it matches matcher.
(package private) void scan()
private String toString
-
Field Details
-
reader
-
consumer
-
escapes
-
builder
-
errHandler
-
debug
private final boolean debug
-
cur
private char cur -
locale
-
QNT_TOKEN_MAXLENGTH
private static final int QNT_TOKEN_MAXLENGTH
-
quantities
-
WHITESPACE
static final com.google.common.base.CharMatcher WHITESPACE -
NOT_WHITESPACE
private static final com.google.common.base.CharMatcher NOT_WHITESPACE -
QUOTES
static final com.google.common.base.CharMatcher QUOTES -
U
private static final com.google.common.base.CharMatcher U -
O
private static final com.google.common.base.CharMatcher O -
N
private static final com.google.common.base.CharMatcher N -
A
private static final com.google.common.base.CharMatcher A -
NMSTART
private static final com.google.common.base.CharMatcher NMSTART
{nmstart} excluding {escape}
-
NMCHAR
private static final com.google.common.base.CharMatcher NMCHAR
{nmchar} excluding {escape}
-
QNTSTART
private static final com.google.common.base.CharMatcher QNTSTART
Start characters of quantities that follow {num}, excluding {escape}
-
NUMEND
private static final com.google.common.base.CharMatcher NUMEND
{num} end char cannot be period
-
NUM
private static final com.google.common.base.CharMatcher NUM
{num}
-
UNARY
private static final com.google.common.base.CharMatcher UNARY -
NUMSTART
private static final com.google.common.base.CharMatcher NUMSTART
{num} start char can be unary operators
-
HEXCHAR
static final com.google.common.base.CharMatcher HEXCHAR -
URANGESTART
private static final com.google.common.base.CharMatcher URANGESTART -
URANGECHAR
private static final com.google.common.base.CharMatcher URANGECHAR -
TERMINATOR
static final com.google.common.base.CharMatcher TERMINATOR -
CDO_LL
private static final int[] CDO_LL
The three last characters in the CDO token
-
CDC_LL
private static final int[] CDC_LL
The two last characters in the CDC token
-
URI_LL
private static final int[] URI_LL
The three last characters in the URI start token
-
ONLY_LL
private static final int[] ONLY_LL
The three last characters in the ONLY token
-
NOT_LL
private static final int[] NOT_LL
The three last characters in the NOT token
-
AND_LL
private static final int[] AND_LL
The three last characters in the AND token
-
-
Constructor Details
-
CssScanner
private CssScanner(Reader in, String systemID, CssErrorHandler errHandler, CssToken.CssTokenConsumer consumer, int pushbackBufferSize, Locale locale) -
CssScanner
CssScanner(Reader in, String systemID, CssErrorHandler errHandler, CssToken.CssTokenConsumer consumer, Locale locale)
-
-
Method Details
-
scan
-
_function
-
_uri
URI url\({w}{string}{w}\) | url\({w}([!#$%&*-\[\]-~]|{nonascii}|{escape})*{w}\) -
_string
string1 \"([^\n\r\f\\"]|\\{nl}|{escape})*\"
string2 \'([^\n\r\f\\']|\\{nl}|{escape})*\'
-
_atkeyword
ATKEYWORD '@'[-]?{nmstart}{nmchar}*
nmstart [_a-z]|{nonascii}|{escape}
nmchar [_a-z0-9-]|{nonascii}|{escape}
-
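As a rough illustration of the ATKEYWORD production, the pattern can be approximated with java.util.regex. This is a sketch under stated assumptions, not the scanner's implementation: the class AtKeywordSketch and its method are hypothetical, {nonascii} is assumed to mean any code point at or above U+0080, {escape} is left out entirely, and matching is assumed to be ASCII case-insensitive.

```java
import java.util.regex.Pattern;

class AtKeywordSketch {
    // Assumption: {nonascii} covers U+0080 and above; {escape} is not modelled.
    private static final String NMSTART = "[_a-z\\x{80}-\\x{10FFFF}]";
    private static final String NMCHAR = "[_a-z0-9\\-\\x{80}-\\x{10FFFF}]";

    // '@'[-]?{nmstart}{nmchar}*
    private static final Pattern ATKEYWORD =
            Pattern.compile("@-?" + NMSTART + NMCHAR + "*", Pattern.CASE_INSENSITIVE);

    /** Returns true if the whole string has the shape of an at-keyword. */
    static boolean isAtKeyword(String s) {
        return ATKEYWORD.matcher(s).matches();
    }
}
```

Under these assumptions, strings such as "@media" and "@-moz-document" are accepted, while "@1abc" is rejected because a digit is not a valid {nmstart}.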
_ident
IDENT ([-]?{nmstart}|[--]){nmchar}* -
_dashmatch
-
_includes
-
_prefixmatch
-
_suffixmatch
-
_substringmatch
-
_hashname
HASHNAME "#"{name}
name {nmchar}+
nmchar [_a-z0-9-]|{nonascii}|{escape}
-
_classname
CLASSNAME "."{name}
This is not part of the formal lexical constructs, but seems to be safe to do at scanner level.
name {nmchar}+
nmchar [_a-z0-9-]|{nonascii}|{escape}
-
_important
private void _important()
IMPORTANT !{w}important
-
_comment
Builds a comment token, excluding the leading and trailing comment tokens. -
_cdo
- Throws:
IOException
-
_num
-
_quantity
Given a builder that contains a valid NUMBER and a next char that is a valid QNTSTART, modifies the type and appends to the builder. -
_and
- Throws:
IOException
-
_not
- Throws:
IOException
-
_only
- Throws:
IOException
-
_cdc
- Throws:
IOException
-
_ws
Whitespace
w ::= wc
wc ::= #x9 | #xA | #xC | #xD | #x20
- Throws:
IOException
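The wc production above translates directly into a character test. The following is a minimal sketch; the class WhitespaceSketch and both method names are hypothetical, and consuming a whole run of whitespace characters is an assumption about how w is applied rather than something the documentation states.

```java
class WhitespaceSketch {
    /** wc ::= #x9 | #xA | #xC | #xD | #x20 */
    static boolean isWhitespaceChar(int ch) {
        return ch == 0x9 || ch == 0xA || ch == 0xC || ch == 0xD || ch == 0x20;
    }

    /** Counts consecutive whitespace characters in s starting at index from. */
    static int whitespaceRunLength(String s, int from) {
        int i = from;
        while (i < s.length() && isWhitespaceChar(s.charAt(i))) {
            i++;
        }
        return i - from;
    }
}
```

Note that vertical tab (#xB) and no-break space (#xA0) are not part of the production, so the sketch does not treat them as whitespace.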
-
_urange
Builds a UNICODE_RANGE token. -
isNextEscape
Returns true if reader next() is the start of a valid escape sequence.
- Returns:
whether or not the reader is at the start of a valid escape sequence.
- Throws:
IOException
-
append
private void append(com.google.common.base.CharMatcher matcher) throws IOException, CssExceptions.CssException
Parse forward and append to the TokenBuilder field all characters that match matcher, or until the next character is EOF. Escapes are included verbatim if they do not match matcher; otherwise they are included as literals.
-
append
private void append(com.google.common.base.CharMatcher matcher, CssToken.TokenBuilder builder) throws IOException, CssExceptions.CssException
Parse forward and append to the supplied builder all characters that match matcher, or until the next character is EOF. Escapes are included verbatim if they do not match matcher; otherwise they are included as literals.
-
forwardMatch
private boolean forwardMatch(String match, boolean ignoreCase, boolean resetOnTrue) throws IOException
Checks whether a forward scan equals the given match string.
- Parameters:
match - The string to match
ignoreCase - Whether case should be ignored
resetOnTrue - Whether the reader should be reset when a match is found
- Throws:
IOException
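The look-ahead behaviour described for forwardMatch can be sketched with the standard mark/reset facility of java.io.Reader. This is only an illustration: the real method works on a CssReader, whereas the hypothetical ForwardMatchSketch below takes any Reader that supports mark(), and it assumes the reader is always reset when the match fails.

```java
import java.io.IOException;
import java.io.Reader;

class ForwardMatchSketch {
    /**
     * Reads ahead and compares against match. The reader is reset on a
     * mismatch (an assumption) and, if resetOnTrue is set, on a match as well.
     */
    static boolean forwardMatch(Reader in, String match, boolean ignoreCase,
                                boolean resetOnTrue) throws IOException {
        in.mark(match.length()); // remember the current position
        boolean matched = true;
        for (int i = 0; i < match.length(); i++) {
            int ch = in.read();
            if (ch == -1 || !sameChar((char) ch, match.charAt(i), ignoreCase)) {
                matched = false;
                break;
            }
        }
        if (!matched || resetOnTrue) {
            in.reset(); // rewind to the marked position
        }
        return matched;
    }

    private static boolean sameChar(char a, char b, boolean ignoreCase) {
        return ignoreCase
                ? Character.toLowerCase(a) == Character.toLowerCase(b)
                : a == b;
    }
}
```

With a java.io.StringReader over "url(x)", a successful call with resetOnTrue set to false leaves the reader positioned at 'x', while resetOnTrue set to true leaves it at the initial 'u'.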
-
toString
-
equals
private static boolean equals(int[] a, int[] b)
Like Arrays.equals, but does not return true when both are null.
-
equals
private static boolean equals(int[] a, int[] b, boolean ignoreAsciiCase)
Like Arrays.equals, but does not return true when both are null.
- Parameters:
ignoreAsciiCase - If true, ASCII case differences are ignored.
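The difference from Arrays.equals is easiest to see in code. The following is a minimal sketch of the documented contract, not the class's actual source; IntArrayEqualsSketch, arraysEqual and toAsciiLower are hypothetical names, and the case folding is assumed to cover only 'A' to 'Z'.

```java
class IntArrayEqualsSketch {
    static boolean arraysEqual(int[] a, int[] b) {
        return arraysEqual(a, b, false);
    }

    static boolean arraysEqual(int[] a, int[] b, boolean ignoreAsciiCase) {
        // Unlike Arrays.equals(null, null), two null arrays are not equal here.
        if (a == null || b == null) {
            return false;
        }
        if (a.length != b.length) {
            return false;
        }
        for (int i = 0; i < a.length; i++) {
            int x = ignoreAsciiCase ? toAsciiLower(a[i]) : a[i];
            int y = ignoreAsciiCase ? toAsciiLower(b[i]) : b[i];
            if (x != y) {
                return false;
            }
        }
        return true;
    }

    // Folds only 'A'..'Z'; all other code points are returned unchanged.
    private static int toAsciiLower(int ch) {
        return (ch >= 'A' && ch <= 'Z') ? ch + ('a' - 'A') : ch;
    }
}
```

Restricting the folding to ASCII means that, for example, 'É' and 'é' still compare as different even when ignoreAsciiCase is true.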
-
matchesOrEOF
private static boolean matchesOrEOF(int ch, com.google.common.base.CharMatcher matcher)
Return true if ch represents EOF (-1), or if it matches matcher.
-
matches
private static boolean matches(int ch, com.google.common.base.CharMatcher matcher)
Return true if ch matches matcher, false if not or if ch represents EOF (-1).
-
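The two helpers differ only in how they treat EOF. The sketch below shows that difference using java.util.function.IntPredicate as a stand-in for Guava's CharMatcher, so that it compiles without external dependencies; the class MatchSketch and the EOF constant are hypothetical names.

```java
import java.util.function.IntPredicate;

class MatchSketch {
    static final int EOF = -1;

    /** True only for a real character accepted by matcher; EOF never matches. */
    static boolean matches(int ch, IntPredicate matcher) {
        return ch != EOF && matcher.test(ch);
    }

    /** True for EOF, or for a real character accepted by matcher. */
    static boolean matchesOrEOF(int ch, IntPredicate matcher) {
        return ch == EOF || matcher.test(ch);
    }
}
```

The first form suits loops that consume characters while a class matches, and the second suits checks for a token terminator, where running out of input also ends the token.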
isNewLine
static int isNewLine(int[] chars)
Determines whether a sequence of chars begins with a CSS newline.
- Parameters:
chars - An array of at least two characters
- Returns:
0 if there is no newline, else 1 or 2, representing the newline length in characters.
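The return convention can be sketched as follows, assuming the CSS 2.1 newline definition (nl is \n, \r\n, \r or \f). The class NewLineSketch is a hypothetical name, and the body is an illustration of the documented contract rather than the actual source.

```java
class NewLineSketch {
    /**
     * Returns the length of the CSS newline at the start of chars, or 0 if
     * chars does not start with one. Expects at least two elements.
     */
    static int isNewLine(int[] chars) {
        // \r\n is the only two-character newline, so test it first.
        if (chars[0] == '\r' && chars[1] == '\n') {
            return 2;
        }
        if (chars[0] == '\n' || chars[0] == '\r' || chars[0] == '\f') {
            return 1;
        }
        return 0;
    }
}
```

Returning a length instead of a boolean lets the caller skip exactly the right number of characters, which matters for the \r\n pair.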
-