Class Parser<T>
- java.lang.Object
  - org.jparsec.Parser<T>

Direct Known Subclasses:
BestParser, DelimitedParser, EmptyListParser, NestableBlockCommentScanner, ReluctantBetweenParser, RepeatAtLeastParser, RepeatTimesParser, SkipAtLeastParser, SkipTimesParser
public abstract class Parser<T> extends java.lang.Object

Defines grammar and encapsulates parsing logic. A Parser takes as input a CharSequence source and parses it when the parse(CharSequence) method is called. A value of type T is returned if parsing succeeds; otherwise a ParserException is thrown to indicate the parsing error. For example:

    Parser<String> scanner = Scanners.IDENTIFIER;
    assertEquals("foo", scanner.parse("foo"));

Parsers run either on character level to scan the source, or on token level to parse a list of Token objects returned from another parser. This other parser, which returns the list of tokens for token level parsing, is hooked up via the from(Parser, Parser) or from(Parser) method.

The following are important naming conventions used throughout the library:
- A character level parser object that recognizes a single lexical word is called a scanner.
- A scanner that translates the recognized lexical word into a token is called a tokenizer.
- A character level parser object that does lexical analysis and returns a list of Token is called a lexer.
- All index parameters are 0-based indexes in the original source.

To debug a parser, pass Parser.Mode.DEBUG mode to parse(CharSequence, Mode) and inspect the result in ParserException.getParseTree(). All labeled parsers will generate a node in the exception's parse tree, with matched indices in the source.
-
-
Nested Class Summary
Nested Classes (modifier and type, class: description)
- static class Parser.Mode: Defines the mode that a parser should be run in.
- static class Parser.Reference<T>: An atomic mutable reference to Parser used in recursive grammars.
- private static class Parser.Rhs<T>
-
Constructor Summary
Constructors
- Parser()
-
Method Summary
Methods (modifier and type, method: description)
- (package private) abstract boolean apply(ParseContext ctxt)
- private static <T> T applyInfixOperators(T initialValue, java.util.List<? extends java.util.function.Function<? super T,? extends T>> functions)
- private static <T> T applyInfixrOperators(T first, java.util.List<Parser.Rhs<T>> rhss)
- private static <T> T applyPostfixOperators(T a, java.lang.Iterable<? extends java.util.function.Function<? super T,? extends T>> ms)
- private static <T> T applyPrefixOperators(java.util.List<? extends java.util.function.Function<? super T,? extends T>> ms, T a)
- (package private) Parser<T> asDelimiter(): As a delimiter, the parser's error is considered lenient and will only be reported if no other meaningful error is encountered.
- Parser<java.util.Optional<T>> asOptional(): p.asOptional() is equivalent to p? in EBNF.
- Parser<java.util.List<T>> atLeast(int min)
- Parser<T> atomic(): A Parser that undoes any partial match if this fails.
- Parser<T> between(Parser<?> before, Parser<?> after)
- <R> Parser<R> cast()
- Parser<java.util.List<T>> endBy(Parser<?> delim)
- Parser<java.util.List<T>> endBy1(Parser<?> delim)
- Parser<java.lang.Boolean> fails()
- Parser<T> followedBy(Parser<?> parser)
- Parser<T> from(Parser<?> tokenizer, Parser<java.lang.Void> delim): A Parser that takes as input the tokens returned by tokenizer delimited by delim, and runs this to parse the tokens.
- Parser<T> from(Parser<? extends java.util.Collection<Token>> lexer)
- (package private) T getReturn(ParseContext ctxt)
- <R> Parser<R> ifelse(java.util.function.Function<? super T,? extends Parser<? extends R>> consequence, Parser<? extends R> alternative)
- <R> Parser<R> ifelse(Parser<? extends R> consequence, Parser<? extends R> alternative)
- Parser<T> infixl(Parser<? extends java.util.function.BiFunction<? super T,? super T,? extends T>> operator): A Parser for left-associative infix operator.
- Parser<T> infixn(Parser<? extends java.util.function.BiFunction<? super T,? super T,? extends T>> op): A Parser that parses non-associative infix operator.
- Parser<T> infixr(Parser<? extends java.util.function.BiFunction<? super T,? super T,? extends T>> op): A Parser for right-associative infix operator.
- Parser<T> label(java.lang.String name)
- Parser<java.util.List<Token>> lexer(Parser<?> delim): A Parser that greedily runs this repeatedly, and ignores the pattern recognized by delim before and after each occurrence.
- Parser<java.util.List<T>> many(): p.many() is equivalent to p* in EBNF.
- Parser<java.util.List<T>> many1(): p.many1() is equivalent to p+ in EBNF.
- <R> Parser<R> map(java.util.function.Function<? super T,? extends R> map)
- static <T> Parser.Reference<T> newReference(): Creates a new instance of Parser.Reference.
- <To> Parser<To> next(java.util.function.Function<? super T,? extends Parser<? extends To>> map): A Parser that executes this, maps the result using map to another Parser object to be executed as the next step.
- <R> Parser<R> next(Parser<R> parser)
- Parser<?> not(): A Parser that fails if this succeeds.
- Parser<?> not(java.lang.String unexpected): A Parser that fails if this succeeds.
- Parser<T> notFollowedBy(Parser<?> parser)
- Parser<T> optional(): Deprecated. Since 3.0.
- Parser<T> optional(T defaultValue)
- Parser<T> or(Parser<? extends T> alternative): p1.or(p2) is equivalent to p1 | p2 in EBNF.
- Parser<T> otherwise(Parser<? extends T> fallback): a.otherwise(fallback) runs fallback when a matches zero input.
- T parse(java.lang.CharSequence source): Parses source.
- T parse(java.lang.CharSequence source, java.lang.String moduleName): Deprecated. Please use parse(CharSequence) instead.
- T parse(java.lang.CharSequence source, Parser.Mode mode): Parses source under the given mode.
- T parse(java.lang.Readable readable): Parses source read from readable.
- T parse(java.lang.Readable readable, java.lang.String moduleName): Deprecated. Please use parse(Readable) instead.
- ParseTree parseTree(java.lang.CharSequence source): Parses source and returns a ParseTree corresponding to the syntactical structure of the input.
- Parser<T> peek(): A Parser that runs this and undoes any input consumption if it succeeds.
- Parser<T> postfix(Parser<? extends java.util.function.Function<? super T,? extends T>> op)
- Parser<T> prefix(Parser<? extends java.util.function.Function<? super T,? extends T>> op)
- (package private) static java.lang.StringBuilder read(java.lang.Readable from): Reads all content from from into a StringBuilder.
- Parser<T> reluctantBetween(Parser<?> before, Parser<?> after): Deprecated. This method probably only works in the simplest cases.
- <R> Parser<R> retn(R value)
- Parser<java.util.List<T>> sepBy(Parser<?> delim)
- Parser<java.util.List<T>> sepBy1(Parser<?> delim)
- Parser<java.util.List<T>> sepEndBy(Parser<?> delim)
- Parser<java.util.List<T>> sepEndBy1(Parser<?> delim)
- Parser<java.lang.Void> skipAtLeast(int min)
- Parser<java.lang.Void> skipMany(): p.skipMany() is equivalent to p* in EBNF.
- Parser<java.lang.Void> skipMany1(): p.skipMany1() is equivalent to p+ in EBNF.
- Parser<java.lang.Void> skipTimes(int n)
- Parser<java.lang.Void> skipTimes(int min, int max): A Parser that runs this parser for at least min times and up to max times, with all the return values ignored.
- Parser<java.lang.String> source(): A Parser that returns the matched string in the original source.
- Parser<java.lang.Boolean> succeeds()
- Parser<java.util.List<T>> times(int n)
- Parser<java.util.List<T>> times(int min, int max)
- Parser<Token> token()
- Parser<java.util.List<T>> until(Parser<?> parser): A Parser that matches this parser zero or many times until the given parser succeeds.
- Parser<WithSource<T>> withSource(): A Parser that returns both parsed object and matched string.
-
-
-
Method Detail
-
newReference
public static <T> Parser.Reference<T> newReference()
Creates a new instance of Parser.Reference. Used when your grammar is recursive (many grammars are).
-
retn
public final <R> Parser<R> retn(R value)
-
next
public final <To> Parser<To> next(java.util.function.Function<? super T,? extends Parser<? extends To>> map)
A Parser that executes this, maps the result using map to another Parser object to be executed as the next step.
-
until
public final Parser<java.util.List<T>> until(Parser<?> parser)
A Parser that matches this parser zero or many times until the given parser succeeds. The input that matches the given parser will not be consumed. The input that matches this parser will be collected in a list that will be returned by this function.
- Since: 2.2
-
many
public final Parser<java.util.List<T>> many()
p.many() is equivalent to p* in EBNF. The return values are collected and returned in a List.
-
skipMany
public final Parser<java.lang.Void> skipMany()
p.skipMany() is equivalent to p* in EBNF. The return values are discarded.
-
many1
public final Parser<java.util.List<T>> many1()
p.many1() is equivalent to p+ in EBNF. The return values are collected and returned in a List.
-
skipMany1
public final Parser<java.lang.Void> skipMany1()
p.skipMany1() is equivalent to p+ in EBNF. The return values are discarded.
-
atLeast
public final Parser<java.util.List<T>> atLeast(int min)
A Parser that runs this parser greedily for at least min times. The return values are collected and returned in a List.
-
skipAtLeast
public final Parser<java.lang.Void> skipAtLeast(int min)
-
skipTimes
public final Parser<java.lang.Void> skipTimes(int n)
-
times
public final Parser<java.util.List<T>> times(int min, int max)
A Parser that runs this parser for at least min times and up to max times. The return values are collected and returned in a List.
-
skipTimes
public final Parser<java.lang.Void> skipTimes(int min, int max)
A Parser that runs this parser for at least min times and up to max times, with all the return values ignored.
-
or
public final Parser<T> or(Parser<? extends T> alternative)
p1.or(p2) is equivalent to p1 | p2 in EBNF.
- Parameters:
  alternative - the alternative parser to run if this fails.
-
otherwise
public final Parser<T> otherwise(Parser<? extends T> fallback)
a.otherwise(fallback) runs fallback when a matches zero input. This is different from a.or(alternative), where alternative is run whenever a fails to match.

One should usually use or(org.jparsec.Parser<? extends T>).
- Parameters:
  fallback - the parser to run if this matches no input.
- Since: 3.1
-
optional
@Deprecated public final Parser<T> optional()
Deprecated. Since 3.0. Use optional(null) or asOptional() instead.

p.optional() is equivalent to p? in EBNF. null is the result when this fails with no partial match.
-
asOptional
public final Parser<java.util.Optional<T>> asOptional()
p.asOptional() is equivalent to p? in EBNF. Optional.empty() is the result when this fails with no partial match. Note that Optional prohibits nulls, so make sure this does not result in null.
- Since: 3.0
-
not
public final Parser<?> not()
A Parser that fails if this succeeds. Any input consumption is undone.
-
not
public final Parser<?> not(java.lang.String unexpected)
A Parser that fails if this succeeds. Any input consumption is undone.
- Parameters:
  unexpected - the name of what we don't expect.
-
peek
public final Parser<T> peek()
A Parser that runs this and undoes any input consumption if it succeeds.
-
atomic
public final Parser<T> atomic()
A Parser that undoes any partial match if this fails. In other words, the parser either fully matches, or matches none.
-
succeeds
public final Parser<java.lang.Boolean> succeeds()
-
fails
public final Parser<java.lang.Boolean> fails()
-
ifelse
public final <R> Parser<R> ifelse(Parser<? extends R> consequence, Parser<? extends R> alternative)
-
ifelse
public final <R> Parser<R> ifelse(java.util.function.Function<? super T,? extends Parser<? extends R>> consequence, Parser<? extends R> alternative)
-
cast
public final <R> Parser<R> cast()
Casts this to a Parser of type R. Use it only if you know the parser actually returns a value of type R.
-
between
public final Parser<T> between(Parser<?> before, Parser<?> after)
A Parser that runs this between before and after. The return value of this is preserved.

Equivalent to Parsers.between(Parser, Parser, Parser), which preserves the natural order of the parsers in the argument list, but is a bit more verbose.
-
reluctantBetween
@Deprecated public final Parser<T> reluctantBetween(Parser<?> before, Parser<?> after)
Deprecated. This method probably only works in the simplest cases. And it's a character-level parser only. Use it at your own risk. It may be deleted later when we find a better way.

A Parser that first runs before from the input start, then runs after from the input's end, and only then runs this on what's left from the input. In effect, this behaves reluctantly, giving after a chance to grab input that would have been consumed by this otherwise.
-
sepBy1
public final Parser<java.util.List<T>> sepBy1(Parser<?> delim)
A Parser that runs this 1 or more times separated by delim.

The return values are collected in a List.
-
sepBy
public final Parser<java.util.List<T>> sepBy(Parser<?> delim)
A Parser that runs this 0 or more times separated by delim.

The return values are collected in a List.
-
endBy
public final Parser<java.util.List<T>> endBy(Parser<?> delim)
A Parser that runs this for 0 or more times delimited and terminated by delim.

The return values are collected in a List.
-
endBy1
public final Parser<java.util.List<T>> endBy1(Parser<?> delim)
A Parser that runs this for 1 or more times delimited and terminated by delim.

The return values are collected in a List.
-
sepEndBy1
public final Parser<java.util.List<T>> sepEndBy1(Parser<?> delim)
A Parser that runs this for 1 or more times separated and optionally terminated by delim. For example: "foo;foo;foo" and "foo;foo;" both match foo.sepEndBy1(semicolon).

The return values are collected in a List.
-
sepEndBy
public final Parser<java.util.List<T>> sepEndBy(Parser<?> delim)
A Parser that runs this for 0 or more times separated and optionally terminated by delim. For example: "foo;foo;foo" and "foo;foo;" both match foo.sepEndBy(semicolon).

The return values are collected in a List.
-
prefix
public final Parser<T> prefix(Parser<? extends java.util.function.Function<? super T,? extends T>> op)
A Parser that runs op for 0 or more times greedily, then runs this. The Function objects returned from op are applied from right to left to the return value of p.

p.prefix(op) is equivalent to op* p in EBNF.
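The right-to-left application order can be illustrated without the library. The following is a plain-Java sketch only, not jparsec's implementation; the class name, the applyPrefix helper and the string-building operators are hypothetical and exist purely for illustration.

```java
import java.util.Arrays;
import java.util.List;
import java.util.function.Function;

// Plain-Java sketch of the application order only; not jparsec's implementation.
final class PrefixOrderSketch {

  // Applies the operators from right to left, so the operator written
  // closest to the operand is applied first.
  static <T> T applyPrefix(List<Function<T, T>> ops, T operand) {
    T result = operand;
    for (int i = ops.size() - 1; i >= 0; i--) {
      result = ops.get(i).apply(result);
    }
    return result;
  }

  // Models an input such as "- ! x": two prefix operators followed by an operand.
  static String demo() {
    Function<String, String> negate = s -> "neg(" + s + ")";
    Function<String, String> not = s -> "not(" + s + ")";
    // Operators are listed in source order: "-" first, then "!".
    return applyPrefix(Arrays.asList(negate, not), "x");
  }
}
```

Here demo() yields "neg(not(x))": the operator nearest the operand wraps it first, and the leftmost operator ends up outermost.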
-
postfix
public final Parser<T> postfix(Parser<? extends java.util.function.Function<? super T,? extends T>> op)
A Parser that runs this and then runs op for 0 or more times greedily. The Function objects returned from op are applied from left to right to the return value of p.

This is the preferred API to avoid StackOverflowError in left-recursive parsers. For example, to parse array types in the form of "T[]" or "T[][]", the following left recursive grammar will fail:

    Terminals terms = Terminals.operators("[", "]");
    Parser.Reference<Type> ref = Parser.newReference();
    ref.set(Parsers.or(leafTypeParser,
        Parsers.sequence(ref.lazy(), terms.phrase("[", "]"), new Unary<Type>() {...})));
    return ref.get();

A correct implementation is:

    Terminals terms = Terminals.operators("[", "]");
    return leafTypeParser.postfix(terms.phrase("[", "]").retn(new Unary<Type>() {...}));

A not-so-obvious example is to parse the expr ? a : b ternary operator. It too is a left recursive grammar. And un-intuitively it can also be thought of as a postfix operator. Basically, we can parse "? a : b" as a whole into a unary operator that accepts the condition expression as input and outputs the full ternary expression:

    Parser<Expr> ternary(Parser<Expr> expr) {
      return expr.postfix(
          Parsers.sequence(
              terms.token("?"), expr, terms.token(":"), expr,
              (question, then, colon, orelse) -> cond -> new TernaryExpr(cond, then, orelse)));
    }

OperatorTable also handles left recursion transparently.

p.postfix(op) is equivalent to p op* in EBNF.
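The left-to-right application order for postfix operators can be shown with a small plain-Java sketch. It is not jparsec's implementation; the class name, the applyPostfix helper and the two string-building operators are hypothetical names chosen for illustration.

```java
import java.util.Arrays;
import java.util.List;
import java.util.function.Function;

// Plain-Java sketch of the application order only; not jparsec's implementation.
final class PostfixOrderSketch {

  // Applies the operators from left to right, so the operator written
  // immediately after the operand is applied first.
  static <T> T applyPostfix(T operand, List<Function<T, T>> ops) {
    T result = operand;
    for (Function<T, T> op : ops) {
      result = op.apply(result);
    }
    return result;
  }

  // Models an input such as "T [] ?": an operand followed by two postfix operators.
  static String demo() {
    Function<String, String> array = s -> "array(" + s + ")";
    Function<String, String> nullable = s -> "nullable(" + s + ")";
    return applyPostfix("T", Arrays.asList(array, nullable));
  }
}
```

Here demo() yields "nullable(array(T))". Because the operators are collected in a loop and then folded, no recursive descent into the left operand is needed, which is why this shape avoids the left-recursion problem described above.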
-
infixn
public final Parser<T> infixn(Parser<? extends java.util.function.BiFunction<? super T,? super T,? extends T>> op)
A Parser that parses non-associative infix operator. Runs this for the left operand, and then runs op and this for the operator and the right operand optionally. The BiFunction objects returned from op are applied to the return values of the two operands, if any.

p.infixn(op) is equivalent to p (op p)? in EBNF.
-
infixl
public final Parser<T> infixl(Parser<? extends java.util.function.BiFunction<? super T,? super T,? extends T>> operator)
A Parser for left-associative infix operator. Runs this for the left operand, and then runs operator and this for the operator and the right operand for 0 or more times greedily. The BiFunction objects returned from operator are applied from left to right to the return values of this, if any. For example: a + b + c + d is evaluated as (((a + b) + c) + d).

p.infixl(op) is equivalent to p (op p)* in EBNF.
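Left associativity is easiest to see with an operator where grouping changes the result, such as subtraction. The following plain-Java sketch folds operands from left to right; it is not jparsec's implementation, and the class name and the foldLeft helper are hypothetical.

```java
import java.util.Arrays;
import java.util.List;
import java.util.function.BinaryOperator;

// Plain-Java sketch of left-associative folding; not jparsec's implementation.
final class InfixlOrderSketch {

  // ops.get(i) combines the running result with rest.get(i),
  // so "a op0 b op1 c" is evaluated as ((a op0 b) op1 c).
  static <T> T foldLeft(T first, List<BinaryOperator<T>> ops, List<T> rest) {
    T acc = first;
    for (int i = 0; i < ops.size(); i++) {
      acc = ops.get(i).apply(acc, rest.get(i));
    }
    return acc;
  }

  // Models "10 - 4 - 3" grouped as (10 - 4) - 3.
  static int demo() {
    BinaryOperator<Integer> minus = (a, b) -> a - b;
    return foldLeft(10, Arrays.asList(minus, minus), Arrays.asList(4, 3));
  }
}
```

Here demo() returns 3, the value of (10 - 4) - 3.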
-
infixr
public final Parser<T> infixr(Parser<? extends java.util.function.BiFunction<? super T,? super T,? extends T>> op)
A Parser for right-associative infix operator. Runs this for the left operand, and then runs op and this for the operator and the right operand for 0 or more times greedily. The BiFunction objects returned from op are applied from right to left to the return values of this, if any. For example: a + b + c + d is evaluated as a + (b + (c + d)).

p.infixr(op) is equivalent to p (op p)* in EBNF.
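For comparison, a right-associative fold over the same kind of input groups from the right. This is again a plain-Java sketch under stated assumptions, not jparsec's implementation; the class name and the foldRight helper are hypothetical.

```java
import java.util.Arrays;
import java.util.List;
import java.util.function.BinaryOperator;

// Plain-Java sketch of right-associative folding; not jparsec's implementation.
final class InfixrOrderSketch {

  // "a op0 b op1 c" is evaluated as (a op0 (b op1 c)):
  // the rightmost operator is applied first.
  static <T> T foldRight(T first, List<BinaryOperator<T>> ops, List<T> rest) {
    if (ops.isEmpty()) {
      return first; // a lone operand, no operator to apply
    }
    // Start with the rightmost operand and work back toward the first one.
    T acc = rest.get(rest.size() - 1);
    for (int i = ops.size() - 1; i >= 1; i--) {
      acc = ops.get(i).apply(rest.get(i - 1), acc);
    }
    return ops.get(0).apply(first, acc);
  }

  // Models "10 - 4 - 3" grouped as 10 - (4 - 3).
  static int demo() {
    BinaryOperator<Integer> minus = (a, b) -> a - b;
    return foldRight(10, Arrays.asList(minus, minus), Arrays.asList(4, 3));
  }
}
```

Here demo() returns 9, the value of 10 - (4 - 3), whereas left-associative grouping of the same input gives 3.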
-
token
public final Parser<Token> token()
A Parser that runs this and wraps the return value in a Token.

It is normally not necessary to call this method explicitly. lexer(Parser) and from(Parser, Parser) both do the conversion automatically.
-
source
public final Parser<java.lang.String> source()
A Parser that returns the matched string in the original source.
-
withSource
public final Parser<WithSource<T>> withSource()
A Parser that returns both parsed object and matched string.
-
from
public final Parser<T> from(Parser<? extends java.util.Collection<Token>> lexer)
A Parser that takes as input the Token collection returned by lexer, and runs this to parse the tokens. Most parsers should use the simpler from(Parser, Parser) instead.

this must be a token level parser.
-
from
public final Parser<T> from(Parser<?> tokenizer, Parser<java.lang.Void> delim)
A Parser that takes as input the tokens returned by tokenizer delimited by delim, and runs this to parse the tokens. A common misunderstanding is that tokenizer has to be a parser of Token. It doesn't need to be, because Terminals already takes care of wrapping your logical token objects into physical Token with correct source location information tacked on for free. Your token object can literally be anything, as long as your token level parser can recognize it later.

The following example uses Terminals.tokenizer():

    Terminals terminals = ...;
    return parser.from(terminals.tokenizer(), Scanners.WHITESPACES.optional()).parse(str);

Here, tokens are optionally delimited by whitespace.

Optionally, you can skip comments using an alternative scanner than WHITESPACES:

    Terminals terminals = ...;
    Parser<?> delim = Parsers.or(
        Scanners.WHITESPACES,
        Scanners.JAVA_LINE_COMMENT,
        Scanners.JAVA_BLOCK_COMMENT).skipMany();
    return parser.from(terminals.tokenizer(), delim).parse(str);

In both examples, it's important to make sure the delimiter scanner can accept the empty string (either through optional() or skipMany()), unless adjacent operator characters shouldn't be parsed as separate operators, e.g. "((" as two left parenthesis operators.

this must be a token level parser.
-
lexer
public Parser<java.util.List<Token>> lexer(Parser<?> delim)
A Parser that greedily runs this repeatedly, and ignores the pattern recognized by delim before and after each occurrence. The result tokens are wrapped in Token and are collected and returned in a List.

It is normally not necessary to call this method explicitly. from(Parser, Parser) is more convenient for simple uses that just need to connect a token level parser with a lexer that produces the tokens. When more flexible control over the token list is needed, for example, to parse an indentation sensitive language, a pre-processor of the token list may be needed.

this must be a tokenizer that returns a token value.
-
asDelimiter
final Parser<T> asDelimiter()
As a delimiter, the parser's error is considered lenient and will only be reported if no other meaningful error is encountered. The delimiter's logical step is also considered 0, which means it won't ever stop repetition combinators such as many().
-
parse
public final T parse(java.lang.CharSequence source)
Parses source.
-
parse
public final T parse(java.lang.Readable readable) throws java.io.IOException
Parses source read from readable.
- Throws:
  java.io.IOException
-
parse
public final T parse(java.lang.CharSequence source, Parser.Mode mode)
Parses source under the given mode. For example:

    try {
      parser.parse(text, Mode.DEBUG);
    } catch (ParserException e) {
      ParseTree parseTree = e.getParseTree();
      ...
    }

- Since: 2.3
-
parseTree
public final ParseTree parseTree(java.lang.CharSequence source)
Parses source and returns a ParseTree corresponding to the syntactical structure of the input. Only labeled parser nodes are represented in the parse tree.

If parsing failed, ParserException.getParseTree() can be inspected for the parse tree at error location.
- Since: 2.3
-
parse
@Deprecated public final T parse(java.lang.CharSequence source, java.lang.String moduleName)
Deprecated. Please use parse(CharSequence) instead.

Parses source.
- Parameters:
  source - the source string
  moduleName - the name of the module; this name appears in error message
- Returns:
  the result
-
parse
@Deprecated public final T parse(java.lang.Readable readable, java.lang.String moduleName) throws java.io.IOException
Deprecated. Please use parse(Readable) instead.

Parses source read from readable.
- Parameters:
  readable - where the source is read from
  moduleName - the name of the module; this name appears in error message
- Returns:
  the result
- Throws:
  java.io.IOException
-
apply
abstract boolean apply(ParseContext ctxt)
-
read
static java.lang.StringBuilder read(java.lang.Readable from) throws java.io.IOException
Reads all content from from into a StringBuilder.
- Throws:
  java.io.IOException
-
getReturn
final T getReturn(ParseContext ctxt)
-
applyPrefixOperators
private static <T> T applyPrefixOperators(java.util.List<? extends java.util.function.Function<? super T,? extends T>> ms, T a)
-
applyPostfixOperators
private static <T> T applyPostfixOperators(T a, java.lang.Iterable<? extends java.util.function.Function<? super T,? extends T>> ms)
-
applyInfixOperators
private static <T> T applyInfixOperators(T initialValue, java.util.List<? extends java.util.function.Function<? super T,? extends T>> functions)
-
applyInfixrOperators
private static <T> T applyInfixrOperators(T first, java.util.List<Parser.Rhs<T>> rhss)
-
-