Package org.idpf.epubcheck.util.css
Class CssScanner
- java.lang.Object
-
- org.idpf.epubcheck.util.css.CssScanner
-
final class CssScanner extends java.lang.Object
A lexical scanner for CSS.
Lexical errors are stored as attributes on the tokens in which they occurred. The supplied CssErrorHandler is also invoked when a lexical error occurs, so that clients can terminate scanning by rethrowing the error.
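The rethrow mechanism described above can be illustrated with a minimal sketch. Every type here (`LexError`, `ErrorHandler`, the toy `scan` method) is a hypothetical stand-in written for this example; none of it is the real epubcheck API, whose handler signature is not shown on this page.

```java
import java.util.ArrayList;
import java.util.List;

final class RethrowSketch {
    /** Hypothetical stand-in for a lexical error; not the real CssException. */
    static final class LexError extends Exception {
        LexError(String message) {
            super(message);
        }
    }

    /** Hypothetical stand-in for CssErrorHandler. */
    interface ErrorHandler {
        void error(LexError e) throws LexError;
    }

    /**
     * Toy scanner: every character is a "token", and '?' counts as a lexical error.
     * The error is recorded alongside the token and also reported to the handler.
     */
    static List<String> scan(String input, ErrorHandler handler) throws LexError {
        List<String> tokens = new ArrayList<>();
        for (char c : input.toCharArray()) {
            if (c == '?') {
                tokens.add(c + " [error]");
                handler.error(new LexError("unexpected character '?'"));
            } else {
                tokens.add(String.valueOf(c));
            }
        }
        return tokens;
    }

    public static void main(String[] args) throws LexError {
        // A lenient handler swallows the error, so scanning continues to the end.
        System.out.println(scan("a?b", err -> { }));
        // A strict handler rethrows, which aborts the scan at the first error.
        try {
            scan("a?b", err -> { throw err; });
        } catch (LexError e) {
            System.out.println("aborted: " + e.getMessage());
        }
    }
}
```

With this shape the caller chooses between collecting all errors and failing fast, without the scanner needing a separate mode flag.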
-
-
Nested Class Summary
Nested Classes: Modifier and Type, Class, Description
(package private) static class CssScanner.CssEscapeMemoizer
    Memoizer for escapes at forward reader positions.
-
Field Summary
Fields: Modifier and Type, Field, Description
private static com.google.common.base.CharMatcher A
private static int[] AND_LL
    The three last characters in the AND token
private CssToken.TokenBuilder builder
private static int[] CDC_LL
    The two last characters in the CDC token
private static int[] CDO_LL
    The three last characters in the CDO token
private CssToken.CssTokenConsumer consumer
private char cur
private boolean debug
private CssErrorHandler errHandler
private CssScanner.CssEscapeMemoizer escapes
(package private) static com.google.common.base.CharMatcher HEXCHAR
private java.util.Locale locale
private static com.google.common.base.CharMatcher N
private static com.google.common.base.CharMatcher NMCHAR
    {nmchar} excluding {escape}
private static com.google.common.base.CharMatcher NMSTART
    {nmstart} excluding {escape}
private static int[] NOT_LL
    The three last characters in the NOT token
private static com.google.common.base.CharMatcher NOT_WHITESPACE
private static com.google.common.base.CharMatcher NUM
    {num}
private static com.google.common.base.CharMatcher NUMEND
    {num} end char cannot be period
private static com.google.common.base.CharMatcher NUMSTART
    {num} start char can be unary operators
private static com.google.common.base.CharMatcher O
private static int[] ONLY_LL
    The three last characters in the ONLY token
private static int QNT_TOKEN_MAXLENGTH
private static com.google.common.base.CharMatcher QNTSTART
    start of quantities that followed after {num} excluding {escape}
private static java.util.Map<int[],CssToken.Type> quantities
(package private) static com.google.common.base.CharMatcher QUOTES
private CssReader reader
(package private) static com.google.common.base.CharMatcher TERMINATOR
private static com.google.common.base.CharMatcher U
private static com.google.common.base.CharMatcher UNARY
private static com.google.common.base.CharMatcher URANGECHAR
private static com.google.common.base.CharMatcher URANGESTART
private static int[] URI_LL
    The three last characters in the URI start token
(package private) static com.google.common.base.CharMatcher WHITESPACE
-
Constructor Summary
Constructors: Modifier, Constructor, Description
private CssScanner(java.io.Reader in, java.lang.String systemID, CssErrorHandler errHandler, CssToken.CssTokenConsumer consumer, int pushbackBufferSize, java.util.Locale locale)
(package private) CssScanner(java.io.Reader in, java.lang.String systemID, CssErrorHandler errHandler, CssToken.CssTokenConsumer consumer, java.util.Locale locale)
-
Method Summary
Methods: Modifier and Type, Method, Description
private void _and()
private void _atkeyword()
    ATKEYWORD '@'[-]?{nmstart}{nmchar}*
    nmstart [_a-z]|{nonascii}|{escape}
    nmchar [_a-z0-9-]|{nonascii}|{escape}
private void _cdc()
private void _cdo()
private void _classname()
    CLASSNAME "."{name}
    This is not part of formal lexical constructs, but seems to be safe to do at scanner level.
private void _comment()
    Builds a comment token, excluding the leading and trailing comment tokens.
private void _dashmatch()
    DASHMATCH |=
private void _function()
    FUNCTION {ident}\(
private void _hashname()
    HASHNAME "#"{name}
    name {nmchar}+ [_a-z0-9-]|{nonascii}|{escape}
private void _ident()
    IDENT ([-]?{nmstart}|[--]){nmchar}*
private void _important()
    IMPORTANT !{w}important
private void _includes()
    INCLUDES ~=
private void _not()
private void _num()
private void _only()
private void _prefixmatch()
    PREFIXMATCH ^=
private void _quantity()
    With incoming builder containing a valid NUMBER, and next char being a valid QNTSTART, modify the type and append to the builder
private void _string()
    string1 \"([^\n\r\f\\"]|\\{nl}|{escape})*\"
    string2 \'([^\n\r\f\\']|\\{nl}|{escape})*\'
private void _substringmatch()
    SUBSTRINGMATCH *=
private void _suffixmatch()
    SUFFIXMATCH $=
private void _urange()
    Builds a UNICODE_RANGE token.
private void _uri()
    URI url\({w}{string}{w}\) | url\({w}([!#$%&*-\[\]-~]|{nonascii}|{escape})*{w}\)
private void _ws()
    Whitespace
    w ::= wc
    wc ::= #x9 | #xA | #xC | #xD | #x20
private void append(com.google.common.base.CharMatcher matcher)
    Parse forward and append to the TokenBuilder field all characters that match matcher, or until the next character is EOF.
private void append(com.google.common.base.CharMatcher matcher, CssToken.TokenBuilder builder)
    Parse forward and append to the supplied builder all characters that match matcher, or until the next character is EOF.
private static boolean equals(int[] a, int[] b)
    Like Arrays.equals, but does not return true when both are null.
private static boolean equals(int[] a, int[] b, boolean ignoreAsciiCase)
    Like Arrays.equals, but does not return true when both are null.
private boolean forwardMatch(java.lang.String match, boolean ignoreCase, boolean resetOnTrue)
    Checks whether a forward scan equals the given match string.
(package private) static int isNewLine(int[] chars)
    Determines whether a sequence of chars begins with a CSS newline.
private boolean isNextEscape()
    Returns true if reader next() is the start of a valid escape sequence.
private static boolean matches(int ch, com.google.common.base.CharMatcher matcher)
    Return true if ch matches matcher, false if not or if ch represents EOF (-1).
private static boolean matchesOrEOF(int ch, com.google.common.base.CharMatcher matcher)
    Return true if ch represents EOF (-1), or if it matches matcher.
(package private) void scan()
private java.lang.String toString(java.util.List<java.lang.Integer> ints)
-
-
-
Field Detail
-
reader
private final CssReader reader
-
consumer
private final CssToken.CssTokenConsumer consumer
-
escapes
private final CssScanner.CssEscapeMemoizer escapes
-
builder
private CssToken.TokenBuilder builder
-
errHandler
private final CssErrorHandler errHandler
-
debug
private final boolean debug
- See Also:
- Constant Field Values
-
cur
private char cur
-
locale
private java.util.Locale locale
-
QNT_TOKEN_MAXLENGTH
private static final int QNT_TOKEN_MAXLENGTH
- See Also:
- Constant Field Values
-
quantities
private static final java.util.Map<int[],CssToken.Type> quantities
-
WHITESPACE
static final com.google.common.base.CharMatcher WHITESPACE
-
NOT_WHITESPACE
private static final com.google.common.base.CharMatcher NOT_WHITESPACE
-
QUOTES
static final com.google.common.base.CharMatcher QUOTES
-
U
private static final com.google.common.base.CharMatcher U
-
O
private static final com.google.common.base.CharMatcher O
-
N
private static final com.google.common.base.CharMatcher N
-
A
private static final com.google.common.base.CharMatcher A
-
NMSTART
private static final com.google.common.base.CharMatcher NMSTART
{nmstart} excluding {escape}
-
NMCHAR
private static final com.google.common.base.CharMatcher NMCHAR
{nmchar} excluding {escape}
-
QNTSTART
private static final com.google.common.base.CharMatcher QNTSTART
start of quantities that followed after {num} excluding {escape}
-
NUMEND
private static final com.google.common.base.CharMatcher NUMEND
{num} end char cannot be period
-
NUM
private static final com.google.common.base.CharMatcher NUM
{num}
-
UNARY
private static final com.google.common.base.CharMatcher UNARY
-
NUMSTART
private static final com.google.common.base.CharMatcher NUMSTART
{num} start char can be unary operators
-
HEXCHAR
static final com.google.common.base.CharMatcher HEXCHAR
-
URANGESTART
private static final com.google.common.base.CharMatcher URANGESTART
-
URANGECHAR
private static final com.google.common.base.CharMatcher URANGECHAR
-
TERMINATOR
static final com.google.common.base.CharMatcher TERMINATOR
-
CDO_LL
private static final int[] CDO_LL
The three last characters in the CDO token
-
CDC_LL
private static final int[] CDC_LL
The two last characters in the CDC token
-
URI_LL
private static final int[] URI_LL
The three last characters in the URI start token
-
ONLY_LL
private static final int[] ONLY_LL
The three last characters in the ONLY token
-
NOT_LL
private static final int[] NOT_LL
The three last characters in the NOT token
-
AND_LL
private static final int[] AND_LL
The three last characters in the AND token
-
-
Constructor Detail
-
CssScanner
private CssScanner(java.io.Reader in, java.lang.String systemID, CssErrorHandler errHandler, CssToken.CssTokenConsumer consumer, int pushbackBufferSize, java.util.Locale locale)
-
CssScanner
CssScanner(java.io.Reader in, java.lang.String systemID, CssErrorHandler errHandler, CssToken.CssTokenConsumer consumer, java.util.Locale locale)
-
-
Method Detail
-
scan
void scan() throws java.io.IOException, CssExceptions.CssException
- Throws:
java.io.IOException
CssExceptions.CssException
-
_function
private void _function() throws java.io.IOException
FUNCTION {ident}\(
- Throws:
java.io.IOException
-
_uri
private void _uri() throws java.io.IOException, CssExceptions.CssException
URI url\({w}{string}{w}\) | url\({w}([!#$%&*-\[\]-~]|{nonascii}|{escape})*{w}\)
- Throws:
java.io.IOException
CssExceptions.CssException
-
_string
private void _string() throws java.io.IOException, CssExceptions.CssException
string1 \"([^\n\r\f\\"]|\\{nl}|{escape})*\"
string2 \'([^\n\r\f\\']|\\{nl}|{escape})*\'
- Throws:
java.io.IOException
CssExceptions.CssException
-
_atkeyword
private void _atkeyword() throws java.io.IOException, CssExceptions.CssException
ATKEYWORD '@'[-]?{nmstart}{nmchar}*
nmstart [_a-z]|{nonascii}|{escape}
nmchar [_a-z0-9-]|{nonascii}|{escape}
- Throws:
CssExceptions.CssException
java.io.IOException
-
_ident
private void _ident() throws java.io.IOException, CssExceptions.CssException
IDENT ([-]?{nmstart}|[--]){nmchar}*
- Throws:
java.io.IOException
CssExceptions.CssException
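As a rough illustration of the IDENT production quoted for _ident, the grammar can be approximated with a regular expression. This sketch makes three assumptions that the page does not confirm: {escape} is left out entirely, {nonascii} is taken to mean U+0080 and above, and "[--]" is read as a literal double hyphen. The class and method names are hypothetical; the scanner itself works character by character with CharMatcher fields, not with a regex.

```java
import java.util.regex.Pattern;

final class IdentSketch {
    // {nmstart} and {nmchar} without {escape}; {nonascii} is assumed to be U+0080 and above.
    private static final String NMSTART = "[_a-zA-Z\\x{80}-\\x{10FFFF}]";
    private static final String NMCHAR = "[_a-zA-Z0-9\\-\\x{80}-\\x{10FFFF}]";

    // ([-]?{nmstart}|--){nmchar}*  (reading "[--]" as "--" is an assumption)
    static final Pattern IDENT = Pattern.compile("(?:-?" + NMSTART + "|--)" + NMCHAR + "*");

    /** Returns true if the whole string is a single identifier under this approximation. */
    static boolean isIdent(String s) {
        return IDENT.matcher(s).matches();
    }
}
```

Under this approximation `foo`, `-foo` and `--main-color` are accepted, while `1a` and `-1` are rejected because neither starts with a valid {nmstart}.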
-
_dashmatch
private void _dashmatch() throws java.io.IOException
DASHMATCH |=
- Throws:
java.io.IOException
-
_includes
private void _includes() throws java.io.IOException
INCLUDES ~=
- Throws:
java.io.IOException
-
_prefixmatch
private void _prefixmatch() throws java.io.IOException
PREFIXMATCH ^=
- Throws:
java.io.IOException
-
_suffixmatch
private void _suffixmatch() throws java.io.IOException
SUFFIXMATCH $=
- Throws:
java.io.IOException
-
_substringmatch
private void _substringmatch() throws java.io.IOException
SUBSTRINGMATCH *=
- Throws:
java.io.IOException
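The five attribute-match tokens above (DASHMATCH through SUBSTRINGMATCH) are all two-character operators ending in "=". A minimal lookup sketch is shown here for orientation only; the class, the map and the `operatorAt` helper are hypothetical, and the token names are returned as plain strings instead of CssToken.Type values.

```java
import java.util.Map;

final class MatchOperatorSketch {
    // Operator spellings and token names as listed in the method descriptions.
    private static final Map<String, String> OPERATORS = Map.of(
            "|=", "DASHMATCH",
            "~=", "INCLUDES",
            "^=", "PREFIXMATCH",
            "$=", "SUFFIXMATCH",
            "*=", "SUBSTRINGMATCH");

    /** Returns the token name for the operator starting at offset, or null if there is none. */
    static String operatorAt(String css, int offset) {
        if (offset < 0 || offset + 2 > css.length()) {
            return null;
        }
        return OPERATORS.get(css.substring(offset, offset + 2));
    }
}
```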
-
_hashname
private void _hashname() throws java.io.IOException, CssExceptions.CssException
HASHNAME "#"{name}
name {nmchar}+ [_a-z0-9-]|{nonascii}|{escape}
- Throws:
CssExceptions.CssException
java.io.IOException
-
_classname
private void _classname() throws java.io.IOException, CssExceptions.CssException
CLASSNAME "."{name}
This is not part of formal lexical constructs, but seems to be safe to do at scanner level.
name {nmchar}+ [_a-z0-9-]|{nonascii}|{escape}
- Throws:
CssExceptions.CssException
java.io.IOException
-
_important
private void _important()
IMPORTANT !{w}important
-
_comment
private void _comment() throws java.io.IOException, CssExceptions.CssException
Builds a comment token, excluding the leading and trailing comment tokens.
- Throws:
java.io.IOException
CssExceptions.CssException
-
_cdo
private void _cdo() throws java.io.IOException
- Throws:
java.io.IOException
-
_num
private void _num() throws java.io.IOException, CssExceptions.CssException
- Throws:
java.io.IOException
CssExceptions.CssException
-
_quantity
private void _quantity() throws java.io.IOException, CssExceptions.CssException
With incoming builder containing a valid NUMBER, and next char being a valid QNTSTART, modify the type and append to the builder
- Throws:
java.io.IOException
CssExceptions.CssException
-
_and
private void _and() throws java.io.IOException
- Throws:
java.io.IOException
-
_not
private void _not() throws java.io.IOException
- Throws:
java.io.IOException
-
_only
private void _only() throws java.io.IOException
- Throws:
java.io.IOException
-
_cdc
private void _cdc() throws java.io.IOException
- Throws:
java.io.IOException
-
_ws
private void _ws() throws java.io.IOException
Whitespace
w ::= wc
wc ::= #x9 | #xA | #xC | #xD | #x20
- Throws:
java.io.IOException
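The wc production above names exactly five code points. As a sketch, that set can be written as a plain predicate; the class and helper names are hypothetical, and the real scanner uses the WHITESPACE CharMatcher field instead.

```java
final class WhitespaceSketch {
    /** wc ::= #x9 | #xA | #xC | #xD | #x20 */
    static boolean isWhitespace(int ch) {
        return ch == 0x9 || ch == 0xA || ch == 0xC || ch == 0xD || ch == 0x20;
    }

    /** Returns the index of the first non-whitespace character at or after start. */
    static int skipWhitespace(String s, int start) {
        int i = start;
        while (i < s.length() && isWhitespace(s.charAt(i))) {
            i++;
        }
        return i;
    }
}
```

Note that this set is narrower than `Character.isWhitespace`: vertical tab (#xB), for example, is not CSS whitespace under this production.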
-
_urange
private void _urange() throws java.io.IOException, CssExceptions.CssException
Builds a UNICODE_RANGE token.
- Throws:
java.io.IOException
CssExceptions.CssException
-
isNextEscape
private boolean isNextEscape() throws java.io.IOException
Returns true if reader next() is the start of a valid escape sequence.
- Returns:
- whether or not the reader is at the start of a valid escape sequence.
- Throws:
java.io.IOException
-
append
private void append(com.google.common.base.CharMatcher matcher) throws java.io.IOException, CssExceptions.CssException
Parse forward and append to the TokenBuilder field all characters that match matcher, or until the next character is EOF. Escapes are included verbatim if they don't match matcher, else literal.
- Throws:
java.io.IOException
CssExceptions.CssException
-
append
private void append(com.google.common.base.CharMatcher matcher, CssToken.TokenBuilder builder) throws java.io.IOException, CssExceptions.CssException
Parse forward and append to the supplied builder all characters that match matcher, or until the next character is EOF. Escapes are included verbatim if they don't match matcher, else literal.
- Throws:
java.io.IOException
CssExceptions.CssException
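The "parse forward while matching" behaviour of the two append overloads can be sketched as follows. This is a simplified illustration, not the actual implementation: it uses a standard `PushbackReader` and an `IntPredicate` in place of CssReader and CharMatcher, collects into a `StringBuilder` instead of a TokenBuilder, and omits escape handling altogether.

```java
import java.io.IOException;
import java.io.PushbackReader;
import java.io.StringReader;
import java.util.function.IntPredicate;

final class AppendSketch {
    /**
     * Appends characters to sink while they satisfy matcher, stopping at the first
     * non-matching character (which is pushed back) or at EOF. Escapes are not handled.
     */
    static void append(PushbackReader in, IntPredicate matcher, StringBuilder sink)
            throws IOException {
        int ch;
        while ((ch = in.read()) != -1) {
            if (!matcher.test(ch)) {
                in.unread(ch);
                return;
            }
            sink.append((char) ch);
        }
    }

    public static void main(String[] args) throws IOException {
        PushbackReader in = new PushbackReader(new StringReader("123px"));
        StringBuilder digits = new StringBuilder();
        append(in, Character::isDigit, digits);
        System.out.println(digits);            // the matched run of digits
        System.out.println((char) in.read());  // the first character left unread
    }
}
```

Pushing the first non-matching character back keeps the reader positioned at the start of the next token, so the caller can continue scanning without losing input.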
-
forwardMatch
private boolean forwardMatch(java.lang.String match, boolean ignoreCase, boolean resetOnTrue) throws java.io.IOException
Checks whether a forward scan equals the given match string.
- Parameters:
match - The string to match
ignoreCase - Whether case should be ignored
resetOnTrue - Whether the reader should be reset on found match
- Throws:
java.io.IOException
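A forward match with optional reset can be sketched with the standard mark/reset facility of `java.io.Reader`. The sketch assumes that the reader is always rewound on a mismatch, which the parameter description implies but does not state, and it uses a plain `Reader` where the scanner uses its own CssReader.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;

final class ForwardMatchSketch {
    /**
     * Returns true if the next characters of in equal match. The reader is always
     * rewound on a mismatch (an assumption); on a match it is rewound only if
     * resetOnTrue is set. The reader must support mark/reset.
     */
    static boolean forwardMatch(Reader in, String match, boolean ignoreCase, boolean resetOnTrue)
            throws IOException {
        in.mark(match.length());
        char[] buf = new char[match.length()];
        int filled = 0;
        while (filled < buf.length) {
            int n = in.read(buf, filled, buf.length - filled);
            if (n == -1) {
                break; // EOF before enough characters were available
            }
            filled += n;
        }
        String seen = new String(buf, 0, filled);
        boolean matched = ignoreCase ? seen.equalsIgnoreCase(match) : seen.equals(match);
        if (!matched || resetOnTrue) {
            in.reset();
        }
        return matched;
    }

    public static void main(String[] args) throws IOException {
        Reader in = new StringReader("URL(image.png)");
        // Consumes "URL(" on success because resetOnTrue is false.
        System.out.println(forwardMatch(in, "url(", true, false));
    }
}
```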
-
toString
private java.lang.String toString(java.util.List<java.lang.Integer> ints)
-
equals
private static boolean equals(int[] a, int[] b)
Like Arrays.equals, but does not return true when both are null.
-
equals
private static boolean equals(int[] a, int[] b, boolean ignoreAsciiCase)
Like Arrays.equals, but does not return true when both are null.
- Parameters:
ignoreAsciiCase - If true, ascii case differences are ignored.
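The contract described for the three-argument equals can be sketched as below. The class name and the `toAsciiLower` helper are hypothetical, and the sketch assumes that "ascii case differences" means folding only A to Z.

```java
final class IntArrayEqualsSketch {
    /** Like Arrays.equals, except that two null arrays are not considered equal. */
    static boolean equals(int[] a, int[] b, boolean ignoreAsciiCase) {
        if (a == null || b == null || a.length != b.length) {
            return false;
        }
        for (int i = 0; i < a.length; i++) {
            int x = ignoreAsciiCase ? toAsciiLower(a[i]) : a[i];
            int y = ignoreAsciiCase ? toAsciiLower(b[i]) : b[i];
            if (x != y) {
                return false;
            }
        }
        return true;
    }

    /** Folds only 'A'..'Z'; every other code point is returned unchanged. */
    private static int toAsciiLower(int ch) {
        return (ch >= 'A' && ch <= 'Z') ? ch + ('a' - 'A') : ch;
    }
}
```

The difference from `java.util.Arrays.equals(int[], int[])` shows up only for null input: `Arrays.equals(null, null)` returns true, whereas this variant returns false.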
-
matchesOrEOF
private static boolean matchesOrEOF(int ch, com.google.common.base.CharMatcher matcher)
Return true if ch represents EOF (-1), or if it matches matcher.
-
matches
private static boolean matches(int ch, com.google.common.base.CharMatcher matcher)
Return true if ch matches matcher, false if not or if ch represents EOF (-1).
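The two helpers differ only in how they treat the EOF sentinel. The sketch below shows that difference using `java.util.function.IntPredicate` as a stand-in for Guava's CharMatcher, so that it compiles without external dependencies; the class name and the `EOF` constant are hypothetical.

```java
import java.util.function.IntPredicate;

final class MatchSketch {
    static final int EOF = -1;

    /** True only for a real character accepted by matcher; EOF never matches. */
    static boolean matches(int ch, IntPredicate matcher) {
        return ch != EOF && matcher.test(ch);
    }

    /** True at EOF, or for a character accepted by matcher. */
    static boolean matchesOrEOF(int ch, IntPredicate matcher) {
        return ch == EOF || matcher.test(ch);
    }
}
```

The EOF-accepting variant is convenient for terminator checks, where running out of input should end a token in the same way as a delimiter does.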
-
isNewLine
static int isNewLine(int[] chars)
Determines whether a sequence of chars begins with a CSS newline.
- Parameters:
chars - An array with a minimum of two characters
- Returns:
- 0 if there is no newline, else 1 or 2, representing the newline length in characters.
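The return convention (0, 1 or 2) can be sketched as follows, assuming the CSS 2.1 definition of a newline: LF, CR LF, a lone CR, or FF. The class name is hypothetical, and the null and length checks are defensive additions that the documented contract (an array of at least two characters) does not require.

```java
final class NewlineSketch {
    /**
     * Returns the length of the CSS newline at the start of chars: 2 for CR LF,
     * 1 for a lone LF, CR or FF, and 0 if chars does not start with a newline.
     */
    static int isNewLine(int[] chars) {
        if (chars == null || chars.length == 0) {
            return 0;
        }
        if (chars[0] == '\r') {
            // CR LF counts as one two-character newline.
            return (chars.length > 1 && chars[1] == '\n') ? 2 : 1;
        }
        return (chars[0] == '\n' || chars[0] == '\f') ? 1 : 0;
    }
}
```

Returning a length instead of a boolean lets the caller skip the whole newline in one step, which matters for CR LF.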
-
-