Class XMLTokenMaker

All Implemented Interfaces:
TokenMaker

public class XMLTokenMaker extends AbstractMarkupTokenMaker
Scanner for XML. This implementation was generated with JFlex 1.4.1, and the generated file was then modified for performance. To be competitive with the handwritten lexers (subclasses of AbstractTokenMaker), memory allocation had to be removed almost entirely: this class never allocates Strings (via yytext()), and the scanner never has to refill its buffer, which would needlessly copy chars around.

This is possible because RText always scans exactly one line of tokens at a time and hands the scanner that line as an array of characters (a Segment, really). Because tokens hold pointers into char arrays rather than Strings with their contents, no new memory needs to be allocated for Strings.

The actual algorithm generated for scanning has, of course, not been modified.
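The idea of tokens that point into a shared char array can be sketched as follows. This is only an illustration, not the class's actual code: SpanTokenSketch, SpanToken and firstWord are hypothetical names, and the only real API used is javax.swing.text.Segment.

```java
import javax.swing.text.Segment;

/** Hypothetical sketch: a token that references a shared buffer instead of owning a String. */
final class SpanTokenSketch {

    /** A token described only by a buffer reference and inclusive offsets. */
    static final class SpanToken {
        final char[] buffer; // shared with the line being scanned, never copied
        final int start;     // inclusive
        final int end;       // inclusive

        SpanToken(char[] buffer, int start, int end) {
            this.buffer = buffer;
            this.start = start;
            this.end = end;
        }

        int length() {
            return end - start + 1;
        }

        /** Compares against a literal without materializing a String for the token. */
        boolean matches(String literal) {
            if (literal.length() != length()) {
                return false;
            }
            for (int i = 0; i < literal.length(); i++) {
                if (buffer[start + i] != literal.charAt(i)) {
                    return false;
                }
            }
            return true;
        }
    }

    /** Returns a token covering the first run of letters in the line, or null if there is none. */
    static SpanToken firstWord(Segment line) {
        int i = line.offset;
        int limit = line.offset + line.count;
        while (i < limit && !Character.isLetter(line.array[i])) {
            i++;
        }
        if (i == limit) {
            return null;
        }
        int start = i;
        while (i < limit && Character.isLetter(line.array[i])) {
            i++;
        }
        return new SpanToken(line.array, start, i - 1);
    }

    public static void main(String[] args) {
        char[] chars = "<root attr=\"v\">".toCharArray();
        SpanToken token = firstWord(new Segment(chars, 0, chars.length));
        System.out.println(token.length() + " " + token.matches("root")); // prints: 4 true
    }
}
```

The token stays valid only as long as the underlying array is not overwritten, which is the trade-off for avoiding per-token String allocation.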

If you wish to regenerate this file yourself, keep in mind the following:

  • The generated XMLTokenMaker.java file will contain two definitions each of zzRefill and yyreset. You should hand-delete the second definition of each (the ones generated by the lexer), as these generated methods modify the input buffer, which we'll never have to do.
  • You should also change the declaration/definition of zzBuffer to NOT be initialized. This is a needless memory allocation for us since we will be pointing the array somewhere else anyway.
  • You should NOT call yylex() on the generated scanner directly; rather, you should use getTokenList as you would with any other TokenMaker instance.
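The reason for the first two steps can be sketched like this: the buffer field is left uninitialized and simply pointed at the caller's array, and the refill step only reports end of input. The class and member names below (BorrowedBufferSketch, reset, refill, next) are hypothetical and are not the generated scanner's own; the sketch assumes, as described above, that one line is supplied as a Segment.

```java
import javax.swing.text.Segment;

/** Hypothetical sketch: a scanner that borrows the caller's line buffer instead of filling its own. */
final class BorrowedBufferSketch {

    private char[] buffer;  // deliberately not initialized; it is pointed at the caller's array
    private int endRead;    // index one past the last char of the line
    private int currentPos; // index of the next char to read

    /** Points the scanner at the segment's backing array without copying any characters. */
    void reset(Segment line) {
        buffer = line.array;
        currentPos = line.offset;
        endRead = line.offset + line.count;
    }

    /** Never reads more input; returns true once the line has been consumed. */
    boolean refill() {
        return currentPos >= endRead;
    }

    /** Returns the next char of the line, or -1 at end of input. */
    int next() {
        return refill() ? -1 : buffer[currentPos++];
    }

    /** True if the scanner is reading directly from the given array. */
    boolean sharesBufferWith(char[] array) {
        return buffer == array;
    }

    public static void main(String[] args) {
        char[] doc = "<a/>\n<b/>".toCharArray();
        BorrowedBufferSketch scanner = new BorrowedBufferSketch();
        scanner.reset(new Segment(doc, 5, 4)); // the second line only
        StringBuilder seen = new StringBuilder();
        for (int c = scanner.next(); c != -1; c = scanner.next()) {
            seen.append((char) c);
        }
        System.out.println(seen + " " + scanner.sharesBufferWith(doc)); // prints: <b/> true
    }
}
```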
Version:
0.5
Author:
Robert Futrell
  • Field Details

    • YYEOF

      public static final int YYEOF
      This character denotes the end of file.
    • INTAG

      public static final int INTAG
      Lexical state.
    • DTD

      public static final int DTD
    • INATTR_DOUBLE

      public static final int INATTR_DOUBLE
    • YYINITIAL

      public static final int YYINITIAL
    • COMMENT

      public static final int COMMENT
    • CDATA

      public static final int CDATA
    • INATTR_SINGLE

      public static final int INATTR_SINGLE
    • PI

      public static final int PI
    • INTERNAL_ATTR_DOUBLE

      public static final int INTERNAL_ATTR_DOUBLE
      Token type specific to XMLTokenMaker denoting a line ending with an unclosed double-quoted attribute.
    • INTERNAL_ATTR_SINGLE

      public static final int INTERNAL_ATTR_SINGLE
      Token type specific to XMLTokenMaker denoting a line ending with an unclosed single-quoted attribute.
    • INTERNAL_INTAG

      public static final int INTERNAL_INTAG
      Token type specific to XMLTokenMaker denoting a line ending with an unclosed XML tag; thus a new line is beginning still inside of the tag.
    • INTERNAL_DTD

      public static final int INTERNAL_DTD
      Token type specific to XMLTokenMaker denoting a line ending with an unclosed DOCTYPE element.
    • INTERNAL_DTD_INTERNAL

      public static final int INTERNAL_DTD_INTERNAL
      Token type specific to XMLTokenMaker denoting a line ending with an unclosed, locally-defined DTD in a DOCTYPE element.
    • INTERNAL_IN_XML_COMMENT

      public static final int INTERNAL_IN_XML_COMMENT
      Token type specific to XMLTokenMaker denoting a line ending with an unclosed comment. The state to return to when this comment ends is embedded in the token type as well.
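One way a return state could be embedded in a token type is sketched below. The base value and the 8-bit layout are assumptions made for illustration, as are the names EmbeddedStateSketch, IN_COMMENT_BASE, encode, baseOf and returnStateOf; the actual values used by INTERNAL_IN_XML_COMMENT are not stated on this page.

```java
/** Hypothetical sketch: packing a "state to return to" into a negative internal token type. */
final class EmbeddedStateSketch {

    // Assumed base value; its magnitude has the low 8 bits clear, leaving room for a small state.
    static final int IN_COMMENT_BASE = -(1 << 11); // -2048

    /** Packs a return state in the range 0..255 into a token type below the base value. */
    static int encode(int returnState) {
        if (returnState < 0 || returnState > 0xff) {
            throw new IllegalArgumentException("return state must fit in 8 bits: " + returnState);
        }
        return IN_COMMENT_BASE - returnState;
    }

    /** Recovers the base type by masking off the low 8 bits of the magnitude. */
    static int baseOf(int tokenType) {
        return -(-tokenType & 0xffffff00);
    }

    /** Recovers the embedded return state from the low 8 bits of the magnitude. */
    static int returnStateOf(int tokenType) {
        return -tokenType & 0xff;
    }

    public static void main(String[] args) {
        int type = encode(5);
        System.out.println(type + " " + baseOf(type) + " " + returnStateOf(type));
        // prints: -2053 -2048 5
    }
}
```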
  • Constructor Details

    • XMLTokenMaker

      public XMLTokenMaker()
      Constructor. This must be here because JFlex does not generate a no-parameter constructor.
    • XMLTokenMaker

      public XMLTokenMaker(Reader in)
      Creates a new scanner. There is also a java.io.InputStream version of this constructor.
      Parameters:
      in - the java.io.Reader to read input from.
    • XMLTokenMaker

      public XMLTokenMaker(InputStream in)
      Creates a new scanner. There is also a java.io.Reader version of this constructor.
      Parameters:
      in - the java.io.InputStream to read input from.
  • Method Details

    • addToken

      public void addToken(char[] array, int start, int end, int tokenType, int startOffset)
      Adds the token specified to the current linked list of tokens.
      Specified by:
      addToken in interface TokenMaker
      Overrides:
      addToken in class TokenMakerBase
      Parameters:
      array - The character array.
      start - The starting offset in the array.
      end - The ending offset in the array.
      tokenType - The token's type.
      startOffset - The offset in the document at which this token occurs.
    • createOccurrenceMarker

      protected OccurrenceMarker createOccurrenceMarker()
      Description copied from class: TokenMakerBase
      Returns the occurrence marker to use for this token maker. Subclasses can override to use different implementations.
      Overrides:
      createOccurrenceMarker in class TokenMakerBase
      Returns:
      The occurrence marker to use.
    • getCompleteCloseTags

      public boolean getCompleteCloseTags()
      Returns whether markup close tags should be completed. For XML, the default value is true.
      Specified by:
      getCompleteCloseTags in class AbstractMarkupTokenMaker
      Returns:
      Whether closing markup tags are completed.
    • getCompleteCloseMarkupTags

      public static boolean getCompleteCloseMarkupTags()
      Static version of getCompleteCloseTags(). This hack is unfortunately needed for applications to be able to query this value without instantiating this class.
      Returns:
      Whether closing markup tags are completed.
    • getMarkOccurrencesOfTokenType

      public boolean getMarkOccurrencesOfTokenType(int type)
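      Returns whether tokens of the given type should have "mark occurrences" enabled. For XML, this is expected to be the case only for Token.MARKUP_TAG_NAME.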
      Specified by:
      getMarkOccurrencesOfTokenType in interface TokenMaker
      Overrides:
      getMarkOccurrencesOfTokenType in class TokenMakerBase
      Parameters:
      type - The token type.
      Returns:
      Whether tokens of this type should have "mark occurrences" enabled.
    • getTokenList

      public Token getTokenList(Segment text, int initialTokenType, int startOffset)
      Returns the first token in the linked list of tokens generated from text. This method must be implemented by subclasses so they can correctly implement syntax highlighting.
      Parameters:
      text - The text from which to get tokens.
      initialTokenType - The token type we should start with.
      startOffset - The offset into the document at which text starts.
      Returns:
      The first Token in a linked list representing the syntax highlighted text.
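How initialTokenType threads state from one line to the next can be sketched with stand-in types. Everything below is hypothetical (LineStateSketch, TEXT, UNCLOSED_TAG, scanLine, scanDocument) and greatly simplified: it tracks only whether a line ends inside a tag and ignores quotes, comments and DTDs. Only javax.swing.text.Segment is a real API.

```java
import java.util.ArrayList;
import java.util.List;
import javax.swing.text.Segment;

/** Hypothetical sketch: each line starts in the state the previous line ended in. */
final class LineStateSketch {

    static final int TEXT = 0;          // hypothetical type: outside any tag
    static final int UNCLOSED_TAG = -1; // hypothetical internal type: line ended inside a tag

    /** Scans one line starting in initialType; returns the type the next line should start with. */
    static int scanLine(Segment line, int initialType) {
        boolean inTag = initialType == UNCLOSED_TAG;
        for (int i = line.offset; i < line.offset + line.count; i++) {
            char c = line.array[i];
            if (c == '<') {
                inTag = true;
            } else if (c == '>') {
                inTag = false;
            }
        }
        return inTag ? UNCLOSED_TAG : TEXT;
    }

    /** Feeds lines to scanLine one at a time, passing each end-of-line type to the next call. */
    static List<Integer> scanDocument(String[] lines) {
        List<Integer> endTypes = new ArrayList<>();
        int type = TEXT;
        for (String text : lines) {
            char[] chars = text.toCharArray();
            type = scanLine(new Segment(chars, 0, chars.length), type);
            endTypes.add(type);
        }
        return endTypes;
    }

    public static void main(String[] args) {
        String[] lines = {"<root", "  id=\"1\"", ">text</root>"};
        System.out.println(scanDocument(lines)); // prints: [-1, -1, 0]
    }
}
```

Because only the previous line's end type is needed, a single edited line can be rescanned without rescanning the whole document, as long as its end type does not change.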
    • setCompleteCloseTags

      public static void setCompleteCloseTags(boolean complete)
      Sets whether markup close tags should be completed.
      Parameters:
      complete - Whether closing markup tags are completed.
    • yyreset

      public final void yyreset(Reader reader)
      Resets the scanner to read from a new input stream. Does not close the old reader. All internal variables are reset, and the old input stream cannot be reused (its internal buffer is discarded and lost). The lexical state is set to YYINITIAL.
      Parameters:
      reader - the new input stream
    • yyclose

      public final void yyclose() throws IOException
      Closes the input stream.
      Specified by:
      yyclose in class AbstractJFlexTokenMaker
      Throws:
      IOException - If an IO error occurs.
    • yystate

      public final int yystate()
      Returns the current lexical state.
    • yybegin

      public final void yybegin(int newState)
      Enters a new lexical state.
      Specified by:
      yybegin in class AbstractJFlexTokenMaker
      Parameters:
      newState - the new lexical state
    • yytext

      public final String yytext()
      Returns the text matched by the current regular expression.
      Specified by:
      yytext in class AbstractJFlexTokenMaker
    • yycharat

      public final char yycharat(int pos)
      Returns the character at position pos from the matched text. It is equivalent to yytext().charAt(pos), but faster.
      Parameters:
      pos - the position of the character to fetch. A value from 0 to yylength()-1.
      Returns:
      the character at position pos
    • yylength

      public final int yylength()
      Returns the length of the matched text region.
    • yypushback

      public void yypushback(int number)
      Pushes the specified number of characters back into the input stream. They will be read again by the next call of the scanning method.
      Parameters:
      number - the number of characters to be read again. This number must not be greater than yylength()!
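The relationship between yytext(), yycharat(), yylength() and yypushback() can be sketched as a window over a shared buffer. The class below is a hypothetical model (MatchWindowSketch and its member names are not the scanner's own), and the explicit range check is an assumption of this sketch rather than documented behavior.

```java
/** Hypothetical sketch: the matched text as a window [startRead, markedPos) over a shared buffer. */
final class MatchWindowSketch {

    private final char[] buffer;
    private final int startRead; // index of the first char of the current match
    private int markedPos;       // index one past the last char of the current match

    MatchWindowSketch(char[] buffer, int startRead, int markedPos) {
        this.buffer = buffer;
        this.startRead = startRead;
        this.markedPos = markedPos;
    }

    /** Length of the matched region. */
    int length() {
        return markedPos - startRead;
    }

    /** Reads one matched char directly from the buffer; no String is created. */
    char charAt(int pos) {
        return buffer[startRead + pos];
    }

    /** Copies the matched region into a new String; this is the allocating path. */
    String text() {
        return new String(buffer, startRead, length());
    }

    /** Shrinks the match from the right so the last chars are scanned again. */
    void pushback(int number) {
        if (number < 0 || number > length()) {
            throw new IllegalArgumentException("cannot push back " + number + " of " + length());
        }
        markedPos -= number;
    }

    public static void main(String[] args) {
        MatchWindowSketch match = new MatchWindowSketch("<tag>rest".toCharArray(), 0, 5);
        match.pushback(1); // give back the closing '>'
        System.out.println(match.charAt(1) + " " + match.text() + " " + match.length());
        // prints: t <tag 4
    }
}
```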
    • yylex

      public Token yylex() throws IOException
      Resumes scanning until the next regular expression is matched, the end of input is encountered, or an I/O error occurs.
      Returns:
      the next token
      Throws:
      IOException - If an I/O error occurs.