Class TextFormat.Tokenizer

  • Enclosing class:
    TextFormat

    private static final class TextFormat.Tokenizer
    extends java.lang.Object
    Represents a stream of tokens parsed from a String.

    The Java standard library provides several classes that look useful for implementing this but turn out not to be. For example:

    • java.io.StreamTokenizer: This comes close to what we need, except for one fatal flaw: it automatically un-escapes strings using Java escape sequences, which do not include every escape sequence we need to support (e.g. '\x').
    • java.util.Scanner: This seems like a great way at least to parse regular expressions out of a stream (so we wouldn't have to load the entire input into a single string before parsing). Sadly, Scanner requires that tokens be delimited with some delimiter. Thus, although the text "foo:" should parse to two tokens ("foo" and ":"), Scanner would recognize it only as a single token. Furthermore, Scanner provides no way to inspect the contents of delimiters, making it impossible to keep track of line and column numbers.
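
    The Scanner limitation described above can be reproduced with a few lines of standalone Java. This is only an illustration; the class and method names (ScannerDelimiterDemo, scannerTokens) are made up for the example and are not part of TextFormat.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Scanner;

// Illustration only: shows how java.util.Scanner splits input on its
// default whitespace delimiter.
class ScannerDelimiterDemo {
    // Collects every token Scanner reports for the given input.
    static List<String> scannerTokens(String input) {
        List<String> tokens = new ArrayList<>();
        try (Scanner scanner = new Scanner(input)) {
            while (scanner.hasNext()) {
                tokens.add(scanner.next());
            }
        }
        return tokens;
    }

    public static void main(String[] args) {
        // "foo:" is reported as one token because no delimiter separates
        // the identifier from the colon.
        System.out.println(scannerTokens("foo: 1"));  // prints [foo:, 1]
    }
}
```

    A text-format tokenizer would need three tokens here ("foo", ":" and "1"), which is why a hand-written scanner over the input is used instead.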
    • Field Detail

      • text

        private final java.lang.CharSequence text
      • currentToken

        private java.lang.String currentToken
      • pos

        private int pos
      • line

        private int line
      • column

        private int column
      • lineInfoTrackingPos

        private int lineInfoTrackingPos
      • previousLine

        private int previousLine
      • previousColumn

        private int previousColumn
      • containsSilentMarkerAfterPrevToken

        private boolean containsSilentMarkerAfterPrevToken
    • Constructor Detail

      • Tokenizer

        private Tokenizer​(java.lang.CharSequence text)
        Constructs a tokenizer that parses tokens from the given text.
    • Method Detail

      • getPreviousLine

        int getPreviousLine()
      • getPreviousColumn

        int getPreviousColumn()
      • getLine

        int getLine()
      • getColumn

        int getColumn()
      • getContainsSilentMarkerAfterCurrentToken

        boolean getContainsSilentMarkerAfterCurrentToken()
      • getContainsSilentMarkerAfterPrevToken

        boolean getContainsSilentMarkerAfterPrevToken()
      • atEnd

        boolean atEnd()
        Returns true if the tokenizer has reached the end of the input.
      • nextToken

        void nextToken()
        Advance to the next token.
      • nextTokenInternal

        private java.lang.String nextTokenInternal()
      • isAlphaUnder

        private static boolean isAlphaUnder​(char c)
      • isDigitPlusMinus

        private static boolean isDigitPlusMinus​(char c)
      • isWhitespace

        private static boolean isWhitespace​(char c)
      • nextTokenSingleChar

        private java.lang.String nextTokenSingleChar()
        Produce a token for the single char at the current position.

        We hardcode the expected single-char tokens to avoid allocating a new string for every such token, which would add garbage-collection pressure. String literals are always loaded from the class constant pool.

        This method must not be called if the current position is after the end-of-text.
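
        A minimal sketch of the literal-per-character idea follows. It is not the library's implementation: the class name SingleCharTokens, the method tokenFor, and the particular set of characters are assumptions chosen for illustration.

```java
// Hypothetical helper, not the library's code: maps common single-character
// tokens to string literals so no new String is allocated for them.
class SingleCharTokens {
    static String tokenFor(char c) {
        switch (c) {
            case ':': return ":";
            case ',': return ",";
            case '[': return "[";
            case ']': return "]";
            case '{': return "{";
            case '}': return "}";
            case '<': return "<";
            case '>': return ">";
            default:
                // Unexpected character: fall back to building a string.
                return String.valueOf(c);
        }
    }

    public static void main(String[] args) {
        // String literals are interned, so repeated calls for a hardcoded
        // character return the very same object.
        System.out.println(tokenFor(':') == tokenFor(':'));  // prints true
    }
}
```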

      • skipWhitespace

        private void skipWhitespace()
        Skips over any whitespace so that the current position (pos) is at the start of the next token.
      • tryConsume

        boolean tryConsume​(java.lang.String token)
        If the next token exactly matches token, consume it and return true. Otherwise, return false without doing anything.
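
        The consume-or-leave-untouched contract can be sketched with a much smaller tokenizer. MiniTokenizer below is hypothetical and recognizes only identifiers and one-character symbols; it is meant to show the semantics of tryConsume and lookingAt, not how TextFormat.Tokenizer is written.

```java
// Hypothetical, heavily simplified tokenizer; only identifiers and
// single-character symbols are recognized.
class MiniTokenizer {
    private final CharSequence text;
    private int pos = 0;
    private String currentToken;

    MiniTokenizer(CharSequence text) {
        this.text = text;
        nextToken();
    }

    // An empty current token marks the end of the input.
    boolean atEnd() {
        return currentToken.isEmpty();
    }

    // Skips whitespace, then reads one identifier or one symbol.
    void nextToken() {
        while (pos < text.length() && Character.isWhitespace(text.charAt(pos))) {
            pos++;
        }
        if (pos == text.length()) {
            currentToken = "";
            return;
        }
        int start = pos;
        if (isAlphaUnder(text.charAt(pos))) {
            while (pos < text.length()
                    && (isAlphaUnder(text.charAt(pos)) || Character.isDigit(text.charAt(pos)))) {
                pos++;
            }
        } else {
            pos++;  // any other character is a one-character token
        }
        currentToken = text.subSequence(start, pos).toString();
    }

    private static boolean isAlphaUnder(char c) {
        return ('a' <= c && c <= 'z') || ('A' <= c && c <= 'Z') || c == '_';
    }

    // Consumes the current token only when it equals the expected one;
    // otherwise the tokenizer state is left unchanged.
    boolean tryConsume(String token) {
        if (currentToken.equals(token)) {
            nextToken();
            return true;
        }
        return false;
    }

    // Peeks at the current token without consuming it.
    boolean lookingAt(String expected) {
        return currentToken.equals(expected);
    }

    public static void main(String[] args) {
        MiniTokenizer t = new MiniTokenizer("foo:");
        System.out.println(t.tryConsume(":"));    // prints false, nothing consumed
        System.out.println(t.tryConsume("foo"));  // prints true
        System.out.println(t.tryConsume(":"));    // prints true
        System.out.println(t.atEnd());            // prints true
    }
}
```

        Note that "foo:" yields two tokens here without any delimiter between them, which is the behavior the class description asks for.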
      • lookingAtInteger

        boolean lookingAtInteger()
        Returns true if the next token is an integer, but does not consume it.
      • lookingAt

        boolean lookingAt​(java.lang.String text)
        Returns true if the current token's text is equal to that specified.
      • tryConsumeIdentifier

        boolean tryConsumeIdentifier()
        If the next token is an identifier, consume it and return true. Otherwise, return false without doing anything.
      • tryConsumeInt64

        boolean tryConsumeInt64()
        If the next token is a 64-bit signed integer, consume it and return true. Otherwise, return false without doing anything.
      • tryConsumeUInt64

        public boolean tryConsumeUInt64()
        If the next token is a 64-bit unsigned integer, consume it and return true. Otherwise, return false without doing anything.
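
        Java has no unsigned long type, so the signed and unsigned variants differ mainly in which range of values they accept. The sketch below illustrates that difference with standard library calls; Int64Ranges and its methods are hypothetical names, and the sketch handles only plain decimal text.

```java
// Hypothetical helpers illustrating signed vs. unsigned 64-bit ranges.
// Only plain decimal input is considered in this sketch.
class Int64Ranges {
    // True if the token parses as a signed 64-bit integer.
    static boolean fitsInt64(String token) {
        try {
            Long.parseLong(token);
            return true;
        } catch (NumberFormatException e) {
            return false;
        }
    }

    // True if the token parses as an unsigned 64-bit integer.
    static boolean fitsUInt64(String token) {
        try {
            Long.parseUnsignedLong(token);
            return true;
        } catch (NumberFormatException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        String max = "18446744073709551615";  // 2^64 - 1
        System.out.println(fitsInt64(max));   // prints false
        System.out.println(fitsUInt64(max));  // prints true
        // The unsigned value is stored in a long with the same bit pattern.
        System.out.println(Long.parseUnsignedLong(max));  // prints -1
    }
}
```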
      • tryConsumeDouble

        public boolean tryConsumeDouble()
        If the next token is a double, consume it and return true. Otherwise, return false without doing anything.
      • tryConsumeFloat

        public boolean tryConsumeFloat()
        If the next token is a float, consume it and return true. Otherwise, return false without doing anything.
      • tryConsumeByteString

        boolean tryConsumeByteString()
        If the next token is a string, consume it and return true. Otherwise, return false.
      • parseExceptionPreviousToken

        TextFormat.ParseException parseExceptionPreviousToken​(java.lang.String description)
        Returns a TextFormat.ParseException with the line and column numbers of the previous token in the description, suitable for throwing.
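
        As a rough illustration of attaching a position to an error description, the sketch below derives a line and column from a character offset and prefixes them to the message. The class PositionedError, its methods, and the one-based "line:column: description" layout are assumptions made for this example and are not taken from the documentation above.

```java
// Hypothetical sketch: computes a line/column pair for an offset and
// formats it in front of an error description.
class PositionedError {
    // Returns {line, column}, both zero-based, for the given offset.
    static int[] lineAndColumn(CharSequence text, int offset) {
        int line = 0;
        int column = 0;
        for (int i = 0; i < offset; i++) {
            if (text.charAt(i) == '\n') {
                line++;
                column = 0;
            } else {
                column++;
            }
        }
        return new int[] {line, column};
    }

    // Formats the position one-based, which is how editors usually show it.
    static String describe(CharSequence text, int offset, String description) {
        int[] lc = lineAndColumn(text, offset);
        return (lc[0] + 1) + ":" + (lc[1] + 1) + ": " + description;
    }

    public static void main(String[] args) {
        String input = "a: 1\nb: x";
        // Offset 8 is the 'x' on the second line.
        System.out.println(describe(input, 8, "Expected integer."));
        // prints 2:4: Expected integer.
    }
}
```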