Class IllegalSymbolCheck

All Implemented Interfaces:
Configurable, Contextualizable

public class IllegalSymbolCheck extends AbstractCheck
Checks that the specified symbols (given as Unicode code points or ranges) are not used in code. By default, it blocks the common symbol ranges listed below.

Rationale: This check helps keep emoji and other special characters out of code (they are commonly added by AI tools), enforce coding standards, and forbid specific Unicode characters.

Default ranges cover:

  • U+2190–U+27BF: Arrows, Mathematical Operators, Box Drawing, Geometric Shapes, Miscellaneous Symbols, and Dingbats
  • U+1F600–U+1F64F: Emoticons
  • U+1F680–U+1F6FF: Transport and Map Symbols
  • U+1F700–U+10FFFF: Alchemical Symbols and every code point above them, including other pictographic symbols

For a complete list of Unicode characters and ranges, see: List of Unicode characters

  • Property symbolCodes - Specify the symbols to check for, as Unicode code points or ranges. Format: comma-separated list of hex codes or ranges (e.g., "0x2705, 0x1F600-0x1F64F"). To allow only ASCII characters, use "0x0080-0x10FFFF". Type is java.lang.String. Default value is "0x2190-0x27BF, 0x1F600-0x1F64F, 0x1F680-0x1F6FF, 0x1F700-0x10FFFF".
Since:
13.4.0
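
To illustrate the symbolCodes format described above, here is a minimal sketch of how such a string could be split into inclusive ranges. It is not the check's actual implementation: the class SymbolCodesSketch, its nested Range type, and the parse method are hypothetical names, and the sketch assumes only the "0x" form used in the documented examples.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper for illustration; not part of Checkstyle. */
final class SymbolCodesSketch {

    /** Inclusive code point range; a single code point has start == end. */
    static final class Range {
        final int start;
        final int end;

        Range(int start, int end) {
            this.start = start;
            this.end = end;
        }

        boolean contains(int codePoint) {
            return codePoint >= start && codePoint <= end;
        }
    }

    /** Parses a comma-separated list such as "0x2705, 0x1F600-0x1F64F". */
    static List<Range> parse(String spec) {
        List<Range> ranges = new ArrayList<>();
        for (String part : spec.split(",")) {
            String item = part.trim();
            int dash = item.indexOf('-');
            int start = parseHex(dash < 0 ? item : item.substring(0, dash).trim());
            int end = dash < 0 ? start : parseHex(item.substring(dash + 1).trim());
            if (start > end) {
                throw new IllegalArgumentException("Range start exceeds end: " + item);
            }
            ranges.add(new Range(start, end));
        }
        return ranges;
    }

    /** Parses one "0x"-prefixed hex value; other prefixes are out of scope here. */
    private static int parseHex(String token) {
        if (!token.startsWith("0x")) {
            throw new IllegalArgumentException("Expected 0x prefix: " + token);
        }
        int value = Integer.parseInt(token.substring(2), 16);
        if (!Character.isValidCodePoint(value)) {
            throw new IllegalArgumentException("Not a Unicode code point: " + token);
        }
        return value;
    }
}
```

Under these assumptions, parsing "0x0080-0x10FFFF" yields a single range that contains every non-ASCII code point and no ASCII character, which matches the ASCII-only configuration suggested in the property description.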
  • Field Details

    • MSG_KEY

      public static final String MSG_KEY
      Key pointing to the warning message text in the "messages.properties" file.
    • RANGE_SEPARATOR

      private static final String RANGE_SEPARATOR
      Separator used for defining ranges.
    • DEFAULT_ILLEGAL_CODES

      private static final String DEFAULT_ILLEGAL_CODES
      Default symbol codes to check for.
    • singleCodePoints

      private final Set<Integer> singleCodePoints
      Set of individual Unicode code points to disallow.
    • codePointRanges

      private final Set<IllegalSymbolCheck.CodePointRange> codePointRanges
      Set of Unicode ranges to disallow.
    • symbolCodes

      private String symbolCodes
      Specify the symbols to check for, as Unicode code points or ranges.
  • Constructor Details

    • IllegalSymbolCheck

      public IllegalSymbolCheck()
  • Method Details

    • setSymbolCodes

      public void setSymbolCodes(String symbols)
      Setter to specify the symbols to check for.
      Parameters:
      symbols - the symbols specification
      Throws:
      IllegalArgumentException - if the format is invalid
      Since:
      13.4.0
    • init

      public void init()
      Initializes the check after all properties are set.

      Ensures that the symbolCodes property is parsed for both default configuration and custom user configuration.

      Overrides:
      init in class AbstractCheck
      Throws:
      IllegalArgumentException - if the configured symbol format is invalid
    • getDefaultTokens

      public int[] getDefaultTokens()
      Description copied from class: AbstractCheck
      Returns the default tokens a check is interested in. Only used if the configuration for a check does not define the tokens.
      Specified by:
      getDefaultTokens in class AbstractCheck
      Returns:
      the default tokens
    • getAcceptableTokens

      public int[] getAcceptableTokens()
      Description copied from class: AbstractCheck
      The configurable token set. Used to protect Checks against malicious users who specify an unacceptable token set in the configuration file. The default implementation returns the check's default tokens.
      Specified by:
      getAcceptableTokens in class AbstractCheck
      Returns:
      the token set this check is designed for.
    • getRequiredTokens

      public int[] getRequiredTokens()
      Description copied from class: AbstractCheck
      The tokens that this check must be registered for.
      Specified by:
      getRequiredTokens in class AbstractCheck
      Returns:
      the token set this check must be registered for.
    • isCommentNodesRequired

      public boolean isCommentNodesRequired()
      Description copied from class: AbstractCheck
      Whether comment nodes are required or not.
      Overrides:
      isCommentNodesRequired in class AbstractCheck
      Returns:
      false as a default value.
    • visitToken

      public void visitToken(DetailAST ast)
      Description copied from class: AbstractCheck
      Called to process a token.
      Overrides:
      visitToken in class AbstractCheck
      Parameters:
      ast - the token to process
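
The page does not show how visitToken inspects a token's text. Because most of the default ranges lie above U+FFFF, any scan has to work on code points rather than on individual char values. The sketch below shows one way to do that; the class CodePointScanSketch and the method firstIllegalIndex are hypothetical, and the predicate stands in for whatever lookup the check performs.

```java
import java.util.function.IntPredicate;

/** Hypothetical helper for illustration; not part of Checkstyle. */
final class CodePointScanSketch {

    /**
     * Returns the char index of the first code point accepted by the
     * predicate, or -1 if the text contains none.
     */
    static int firstIllegalIndex(String text, IntPredicate isIllegal) {
        int index = 0;
        while (index < text.length()) {
            // codePointAt joins a surrogate pair into one supplementary code point.
            int codePoint = text.codePointAt(index);
            if (isIllegal.test(codePoint)) {
                return index;
            }
            // Advance by 1 char for BMP code points, 2 for supplementary ones.
            index += Character.charCount(codePoint);
        }
        return -1;
    }
}
```

Comparing each char separately would miss an emoticon such as U+1F600, since it is stored as two surrogate chars and neither of them falls inside the range 0x1F600-0x1F64F.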
    • parseSymbolCodes

      private void parseSymbolCodes()
      Parses the configured symbolCodes string into singleCodePoints and codePointRanges.
      Throws:
      IllegalArgumentException - if the format is invalid
    • isIllegalSymbol

      private boolean isIllegalSymbol(int codePoint)
      Determines whether a code point is illegal.
      Parameters:
      codePoint - Unicode code point
      Returns:
      true if illegal; false otherwise
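
A lookup of this kind can be sketched as a set membership test followed by a scan over inclusive ranges. This is an assumption about the general approach, not the check's actual code: the class IllegalSymbolLookupSketch, its isIllegal method, and the int[][] representation of ranges are hypothetical. The DEFAULT_RANGES constant mirrors the documented default value of symbolCodes.

```java
import java.util.Set;

/** Hypothetical helper for illustration; not part of Checkstyle. */
final class IllegalSymbolLookupSketch {

    /** Mirrors the documented default value of symbolCodes as {start, end} pairs. */
    static final int[][] DEFAULT_RANGES = {
        {0x2190, 0x27BF},
        {0x1F600, 0x1F64F},
        {0x1F680, 0x1F6FF},
        {0x1F700, 0x10FFFF},
    };

    /** Returns true if the code point is listed individually or lies in a range. */
    static boolean isIllegal(int codePoint, Set<Integer> singles, int[][] ranges) {
        if (singles.contains(codePoint)) {
            return true;
        }
        for (int[] range : ranges) {
            // Both bounds are inclusive.
            if (codePoint >= range[0] && codePoint <= range[1]) {
                return true;
            }
        }
        return false;
    }
}
```

With the documented defaults, U+2705 and U+1F600 are flagged, while U+1F650 is not, because it lies in the gap between the second and third default ranges.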
    • parseRange

      private void parseRange(String rangeStr)
      Parses and stores a Unicode range.
      Parameters:
      rangeStr - range definition string (already trimmed by caller)
      Throws:
      IllegalArgumentException - if the format is invalid
    • parseCodePoint

      private static int parseCodePoint(String str)
      Parses a Unicode code point from a trimmed string. Supported formats: 0x1234, \\u1234, U+1234, or plain hex.
      Parameters:
      str - input string (already trimmed by caller)
      Returns:
      parsed code point
      Throws:
      NumberFormatException - if the format is invalid
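
The supported formats listed for parseCodePoint can be handled by stripping a known prefix and parsing the rest as hexadecimal. The sketch below is a guess at that approach rather than the method's real body. The class CodePointFormatSketch is hypothetical, prefixes are matched case-sensitively as an assumption, and the backslash form is taken to mean a literal backslash followed by the letter u.

```java
/** Hypothetical helper for illustration; not part of Checkstyle. */
final class CodePointFormatSketch {

    /** Two-character prefixes accepted before the hex digits. */
    private static final String[] PREFIXES = {"0x", "\\u", "U+"};

    /** Parses a trimmed string into a code point, or throws NumberFormatException. */
    static int parse(String str) {
        String hex = str;
        for (String prefix : PREFIXES) {
            if (str.startsWith(prefix)) {
                hex = str.substring(prefix.length());
                break;
            }
        }
        // With no recognized prefix, the whole string is treated as plain hex.
        int value = Integer.parseInt(hex, 16);
        if (!Character.isValidCodePoint(value)) {
            throw new NumberFormatException("Not a Unicode code point: " + str);
        }
        return value;
    }
}
```

Integer.parseInt already throws NumberFormatException for non-hex input, so only the range check against the valid code point space needs an explicit throw.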