Class TextFormat.Tokenizer

  • Enclosing class:
    TextFormat

    private static final class TextFormat.Tokenizer
    extends java.lang.Object
    Represents a stream of tokens parsed from a String.

    The Java standard library provides several classes that look useful for implementing this but turn out not to be. For example:

    • java.io.StreamTokenizer: This comes close to what we want, except for one fatal flaw: it automatically un-escapes strings using Java escape sequences, which do not include all the escape sequences we need to support (e.g. '\x').
    • java.util.Scanner: This seems like a great way to at least match regular expressions against a stream (so we wouldn't have to load the entire input into a single string before parsing). Sadly, Scanner requires that tokens be separated by some delimiter. Thus, although the text "foo:" should parse to two tokens ("foo" and ":"), Scanner would recognize it only as a single token. Furthermore, Scanner provides no way to inspect the contents of delimiters, making it impossible to keep track of line and column numbers.
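
    Both limitations can be reproduced with the standard classes alone. The sketch below is illustrative and not part of TextFormat; the class and method names (StandardTokenizerLimits, firstScannerToken, firstQuotedString) are made up for this example.

```java
import java.io.IOException;
import java.io.StreamTokenizer;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Scanner;

final class StandardTokenizerLimits {
    // Scanner splits on its delimiter (whitespace by default),
    // so "foo:" comes back as one token instead of "foo" and ":".
    static String firstScannerToken(String input) {
        try (Scanner scanner = new Scanner(input)) {
            return scanner.next();
        }
    }

    // StreamTokenizer un-escapes quoted strings on its own. It has no
    // rule for "\x", so the backslash is dropped and the characters
    // after it are kept as literal text.
    static String firstQuotedString(String input) {
        try {
            StreamTokenizer tokenizer = new StreamTokenizer(new StringReader(input));
            tokenizer.nextToken();
            return tokenizer.sval;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(firstScannerToken("foo: 1"));     // prints foo:
        System.out.println(firstQuotedString("\"\\x41\""));  // prints x41
    }
}
```

    The second call shows why the escape handling is a problem: the input "\x41" is meant to denote a single byte, but the information needed to decode it is lost before the caller ever sees the string.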
    • Constructor Summary

      Constructors 
      Modifier Constructor Description
      private Tokenizer​(java.lang.CharSequence text)
      Construct a tokenizer that parses tokens from the given text.
    • Method Summary

      Modifier and Type Method Description
      (package private) boolean atEnd()
      Are we at the end of the input?
      (package private) void consume​(java.lang.String token)
      If the next token exactly matches token, consume it.
      boolean consumeBoolean()
      If the next token is a boolean, consume it and return its value.
      (package private) ByteString consumeByteString()
      If the next token is a string, consume it, unescape it as a ByteString, and return it.
      private void consumeByteString​(java.util.List<ByteString> list)
      Like consumeByteString() but adds each token of the string to the given list.
      double consumeDouble()
      If the next token is a double, consume it and return its value.
      float consumeFloat()
      If the next token is a float, consume it and return its value.
      (package private) java.lang.String consumeIdentifier()
      If the next token is an identifier, consume it and return its value.
      (package private) int consumeInt32()
      If the next token is a 32-bit signed integer, consume it and return its value.
      (package private) long consumeInt64()
      If the next token is a 64-bit signed integer, consume it and return its value.
      java.lang.String consumeString()
      If the next token is a string, consume it and return its (unescaped) value.
      (package private) int consumeUInt32()
      If the next token is a 32-bit unsigned integer, consume it and return its value.
      (package private) long consumeUInt64()
      If the next token is a 64-bit unsigned integer, consume it and return its value.
      private TextFormat.ParseException floatParseException​(java.lang.NumberFormatException e)
      Constructs an appropriate TextFormat.ParseException for the given NumberFormatException when trying to parse a float or double.
      (package private) int getColumn()  
      (package private) int getLine()  
      (package private) int getPreviousColumn()  
      (package private) int getPreviousLine()  
      private TextFormat.ParseException integerParseException​(java.lang.NumberFormatException e)
      Constructs an appropriate TextFormat.ParseException for the given NumberFormatException when trying to parse an integer.
      private static boolean isAlphaUnder​(char c)  
      private static boolean isDigitPlusMinus​(char c)  
      private static boolean isWhitespace​(char c)  
      (package private) boolean lookingAt​(java.lang.String text)
      Returns true if the current token's text is equal to that specified.
      (package private) boolean lookingAtInteger()
      Returns true if the next token is an integer, but does not consume it.
      (package private) void nextToken()
      Advance to the next token.
      private java.lang.String nextTokenInternal()  
      private java.lang.String nextTokenSingleChar()
      Produce a token for the single char at the current position.
      (package private) TextFormat.ParseException parseException​(java.lang.String description)
      Returns a TextFormat.ParseException with the current line and column numbers in the description, suitable for throwing.
      (package private) TextFormat.ParseException parseExceptionPreviousToken​(java.lang.String description)
      Returns a TextFormat.ParseException with the line and column numbers of the previous token in the description, suitable for throwing.
      private void skipWhitespace()
      Skip over any whitespace so that the current position (pos) is at the start of the next token.
      (package private) boolean tryConsume​(java.lang.String token)
      If the next token exactly matches token, consume it and return true.
      (package private) boolean tryConsumeByteString()
      If the next token is a string, consume it and return true.
      boolean tryConsumeDouble()
      If the next token is a double, consume it and return true.
      boolean tryConsumeFloat()
      If the next token is a float, consume it and return true.
      (package private) boolean tryConsumeIdentifier()
      If the next token is an identifier, consume it and return true.
      (package private) boolean tryConsumeInt64()
      If the next token is a 64-bit signed integer, consume it and return true.
      boolean tryConsumeUInt64()
      If the next token is a 64-bit unsigned integer, consume it and return true.
      • Methods inherited from class java.lang.Object

        clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
    • Field Detail

      • text

        private final java.lang.CharSequence text
      • currentToken

        private java.lang.String currentToken
      • pos

        private int pos
      • line

        private int line
      • column

        private int column
      • lineInfoTrackingPos

        private int lineInfoTrackingPos
      • previousLine

        private int previousLine
      • previousColumn

        private int previousColumn
    • Constructor Detail

      • Tokenizer

        private Tokenizer​(java.lang.CharSequence text)
        Construct a tokenizer that parses tokens from the given text.
    • Method Detail

      • getPreviousLine

        int getPreviousLine()
      • getPreviousColumn

        int getPreviousColumn()
      • getLine

        int getLine()
      • getColumn

        int getColumn()
      • atEnd

        boolean atEnd()
        Are we at the end of the input?
      • nextToken

        void nextToken()
        Advance to the next token.
      • nextTokenInternal

        private java.lang.String nextTokenInternal()
      • isAlphaUnder

        private static boolean isAlphaUnder​(char c)
      • isDigitPlusMinus

        private static boolean isDigitPlusMinus​(char c)
      • isWhitespace

        private static boolean isWhitespace​(char c)
      • nextTokenSingleChar

        private java.lang.String nextTokenSingleChar()
        Produce a token for the single char at the current position.

        We hardcode the expected single-char tokens to avoid allocating a new String for each one, which would add garbage-collection pressure. String literals are always loaded from the class constant pool, so returning a literal allocates nothing.

        This method must not be called if the current position is after the end-of-text.
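
        A minimal sketch of the literal-per-character idea follows. It is not the actual method body: the class name SingleCharTokens, the helper tokenFor, and the particular set of punctuation characters are assumptions chosen for illustration.

```java
final class SingleCharTokens {
    // Hypothetical helper: returns a shared string literal for each
    // expected punctuation character, so no new String is allocated.
    static String tokenFor(char c) {
        switch (c) {
            case ':': return ":";
            case ',': return ",";
            case '{': return "{";
            case '}': return "}";
            case '[': return "[";
            case ']': return "]";
            case '<': return "<";
            case '>': return ">";
            default:
                // Unexpected character: fall back to building a string.
                return String.valueOf(c);
        }
    }

    public static void main(String[] args) {
        // The same literal instance is returned on every call,
        // so even reference equality holds.
        System.out.println(tokenFor(':') == tokenFor(':')); // prints true
    }
}
```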

      • skipWhitespace

        private void skipWhitespace()
        Skip over any whitespace so that the current position (pos) is at the start of the next token.
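
        As a rough illustration, whitespace skipping over a CharSequence can be combined with line and column bookkeeping as below. This is a simplified sketch under stated assumptions: the class WhitespaceSkipper is hypothetical, the whitespace set (space, tab, newline, carriage return, form feed) is assumed, and line and column are updated eagerly here, which need not match how the real class uses lineInfoTrackingPos.

```java
final class WhitespaceSkipper {
    private final CharSequence text;
    private int pos = 0;
    private int line = 0;    // zero-based line of pos
    private int column = 0;  // zero-based column of pos

    WhitespaceSkipper(CharSequence text) {
        this.text = text;
    }

    // Assumed whitespace set for this sketch.
    static boolean isWhitespace(char c) {
        return c == ' ' || c == '\t' || c == '\n' || c == '\r' || c == '\f';
    }

    // Advance pos past whitespace, counting lines and columns on the way.
    void skipWhitespace() {
        while (pos < text.length() && isWhitespace(text.charAt(pos))) {
            if (text.charAt(pos) == '\n') {
                line++;
                column = 0;
            } else {
                column++;
            }
            pos++;
        }
    }

    int getPos() { return pos; }
    int getLine() { return line; }
    int getColumn() { return column; }

    public static void main(String[] args) {
        WhitespaceSkipper skipper = new WhitespaceSkipper("  \n foo");
        skipper.skipWhitespace();
        // pos now points at 'f', on the second line, one column in.
        System.out.println(skipper.getPos() + " " + skipper.getLine() + " " + skipper.getColumn()); // prints 4 1 1
    }
}
```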
      • tryConsume

        boolean tryConsume​(java.lang.String token)
        If the next token exactly matches token, consume it and return true. Otherwise, return false without doing anything.
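
        The difference between the try-variant and the strict variant can be sketched over a pre-split token list. TokenCursor is a hypothetical stand-in, not the Tokenizer itself, and it throws IllegalStateException where the real class would report a TextFormat.ParseException.

```java
import java.util.Arrays;
import java.util.List;

final class TokenCursor {
    private final List<String> tokens;
    private int index = 0;

    TokenCursor(List<String> tokens) {
        this.tokens = tokens;
    }

    boolean atEnd() {
        return index >= tokens.size();
    }

    // Peek: compare the current token without consuming it.
    boolean lookingAt(String text) {
        return !atEnd() && tokens.get(index).equals(text);
    }

    // Consume on a match and report success; leave state untouched otherwise.
    boolean tryConsume(String token) {
        if (lookingAt(token)) {
            index++;
            return true;
        }
        return false;
    }

    // Strict variant: a mismatch is an error.
    void consume(String token) {
        if (!tryConsume(token)) {
            throw new IllegalStateException("Expected \"" + token + "\".");
        }
    }

    public static void main(String[] args) {
        TokenCursor cursor = new TokenCursor(Arrays.asList("foo", ":", "1"));
        System.out.println(cursor.tryConsume(":"));   // prints false
        System.out.println(cursor.tryConsume("foo")); // prints true
        cursor.consume(":");
        System.out.println(cursor.lookingAt("1"));    // prints true
    }
}
```

        A parser would typically use the try-variant for optional syntax and the strict variant wherever the grammar requires a specific token.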
      • lookingAtInteger

        boolean lookingAtInteger()
        Returns true if the next token is an integer, but does not consume it.
      • lookingAt

        boolean lookingAt​(java.lang.String text)
        Returns true if the current token's text is equal to that specified.
      • tryConsumeIdentifier

        boolean tryConsumeIdentifier()
        If the next token is an identifier, consume it and return true. Otherwise, return false without doing anything.
      • tryConsumeInt64

        boolean tryConsumeInt64()
        If the next token is a 64-bit signed integer, consume it and return true. Otherwise, return false without doing anything.
      • tryConsumeUInt64

        public boolean tryConsumeUInt64()
        If the next token is a 64-bit unsigned integer, consume it and return true. Otherwise, return false without doing anything.
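
        To illustrate why signed and unsigned 64-bit checks differ, the sketch below classifies decimal token text with the standard Long parsing methods. The class UnsignedTokens and its methods are hypothetical, and the sketch handles decimal text only, which is an assumption of the example rather than a statement about what the tokenizer accepts.

```java
final class UnsignedTokens {
    // True if the decimal text fits in an unsigned 64-bit integer.
    static boolean isUInt64(String token) {
        try {
            Long.parseUnsignedLong(token);
            return true;
        } catch (NumberFormatException e) {
            return false;
        }
    }

    // True if the decimal text fits in a signed 64-bit integer.
    static boolean isInt64(String token) {
        try {
            Long.parseLong(token);
            return true;
        } catch (NumberFormatException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        String aboveSignedMax = "9223372036854775808"; // Long.MAX_VALUE + 1
        System.out.println(isInt64(aboveSignedMax));   // prints false
        System.out.println(isUInt64(aboveSignedMax));  // prints true
        System.out.println(isUInt64("-1"));            // prints false
    }
}
```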
      • tryConsumeDouble

        public boolean tryConsumeDouble()
        If the next token is a double, consume it and return true. Otherwise, return false without doing anything.
      • tryConsumeFloat

        public boolean tryConsumeFloat()
        If the next token is a float, consume it and return true. Otherwise, return false without doing anything.
      • tryConsumeByteString

        boolean tryConsumeByteString()
        If the next token is a string, consume it and return true. Otherwise, return false.
      • parseExceptionPreviousToken

        TextFormat.ParseException parseExceptionPreviousToken​(java.lang.String description)
        Returns a TextFormat.ParseException with the line and column numbers of the previous token in the description, suitable for throwing.
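
        The reason for keeping both the current and the previous position can be sketched as follows. PositionTracker is a hypothetical class, and the "line:column: description" message layout with one-based numbers is an assumption made for the example.

```java
final class PositionTracker {
    private int line = 0;
    private int column = 0;
    private int previousLine = 0;
    private int previousColumn = 0;

    // Remember where the token being left behind started,
    // then move to the start of the new token.
    void advanceTo(int newLine, int newColumn) {
        previousLine = line;
        previousColumn = column;
        line = newLine;
        column = newColumn;
    }

    // Message positioned at the current token.
    String describeCurrent(String description) {
        return format(line, column, description);
    }

    // Message positioned at the token consumed just before.
    String describePrevious(String description) {
        return format(previousLine, previousColumn, description);
    }

    // Positions are stored zero-based and shown one-based (assumed layout).
    private static String format(int line, int column, String description) {
        return (line + 1) + ":" + (column + 1) + ": " + description;
    }

    public static void main(String[] args) {
        PositionTracker tracker = new PositionTracker();
        tracker.advanceTo(0, 5);
        tracker.advanceTo(2, 3);
        System.out.println(tracker.describeCurrent("Expected \":\"."));  // prints 3:4: Expected ":".
        System.out.println(tracker.describePrevious("Unknown field."));  // prints 1:6: Unknown field.
    }
}
```

        Reporting at the previous position is useful when an error is only detected after a token has already been consumed, for example when a field name turns out to be unknown.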