feat: implement Pratt parsing #614
Draft: 39555 wants to merge 26 commits into winnow-rs:main from 39555:pratt
fed8c90 feat: implement Pratt parser (39555)
ee4459d commit suggestion (39555)
4b1499d remove spaces from #[doc(alias = "...")] (39555)
acf4577 remove `UnaryOp` and `BinaryOp` in favor of `Fn` (39555)
a816a1c remove redundant trait impl (39555)
2a80e65 remove `allow_unused`, move `allow(non_snake_case)` to where it shoul… (39555)
29fe18d stop dumping pratt into `combinator` namespace (39555)
5a4f4b4 move important things to go first (39555)
919a1cb strip fancy api for now (39555)
0273a29 remove wrong and long doc for now (39555)
f218911 fix: precedence for associativity, remove `trace()` (39555)
3d7ef41 switch from `&dyn Fn(O) -> O` to `fn(O) -> O` (39555)
a6cbc1a feat: pass Input into operator closures (39555)
29b64fa add `trace` for `tests` parser (39555)
b31a3a3 feat: operator closures must return PResult (39555)
33c82f3 feat: allow the user to specify starting power (39555)
040dd85 feat: enum `Assoc` for infix operators. Add `Neither` associativity (39555)
6d88dff fix: switch to i64, fix precedence checking (39555)
161f9da fix: remove 'static constraint from `Operand` (39555)
d53a32e refactor: rename to `expression.rs` (39555)
4f690db refactor: rename to `fn expression` (39555)
44546f2 feat: new api (39555)
a583d24 fix: MSRV (39555)
431b6f6 fix: clippy (39555)
482a162 style: no need to specify the input type in the `fold` closure (39555)
81ba185 feat: fn current_precedence_level(self, level: i64) (39555)
@@ -0,0 +1,349 @@
use core::marker::PhantomData;

use crate::{
    combinator::{opt, trace},
    error::{ErrMode, ParserError},
    stream::{Stream, StreamIsPartial},
    PResult, Parser,
};

use super::{empty, fail};

/// Parses an expression based on operator precedence.
#[doc(alias = "pratt")]
#[doc(alias = "separated")]
#[doc(alias = "shunting_yard")]
#[doc(alias = "precedence_climbing")]
#[inline(always)]
pub fn expression<I, ParseOperand, O, E>(
    parse_operand: ParseOperand,
) -> Expression<
    I,
    O,
    ParseOperand,
    impl Parser<I, Prefix<I, O, E>, E>,
    impl Parser<I, Postfix<I, O, E>, E>,
    impl Parser<I, Infix<I, O, E>, E>,
    E,
>
where
    I: Stream + StreamIsPartial,
    ParseOperand: Parser<I, O, E>,
    E: ParserError<I>,
{
    Expression {
        precedence_level: 0,
        parse_operand,
        parse_prefix: fail,
        parse_postfix: fail,
        parse_infix: fail,
        i: Default::default(),
        o: Default::default(),
        e: Default::default(),
    }
}

pub struct Expression<I, O, ParseOperand, Pre, Post, Pix, E>
where
    I: Stream + StreamIsPartial,
    ParseOperand: Parser<I, O, E>,
    E: ParserError<I>,
{
    precedence_level: i64,
    parse_operand: ParseOperand,
    parse_prefix: Pre,
    parse_postfix: Post,
    parse_infix: Pix,
    i: PhantomData<I>,
    o: PhantomData<O>,
    e: PhantomData<E>,
}

impl<I, O, ParseOperand, Pre, Post, Pix, E> Expression<I, O, ParseOperand, Pre, Post, Pix, E>
where
    ParseOperand: Parser<I, O, E>,
    I: Stream + StreamIsPartial,
    E: ParserError<I>,
{
    #[inline(always)]
    pub fn prefix<NewParsePrefix>(
        self,
        parser: NewParsePrefix,
    ) -> Expression<I, O, ParseOperand, NewParsePrefix, Post, Pix, E>
    where
        NewParsePrefix: Parser<I, Prefix<I, O, E>, E>,
    {
        Expression {
            precedence_level: self.precedence_level,
            parse_operand: self.parse_operand,
            parse_prefix: parser,
            parse_postfix: self.parse_postfix,
            parse_infix: self.parse_infix,
            i: Default::default(),
            o: Default::default(),
            e: Default::default(),
        }
    }

    #[inline(always)]
    pub fn postfix<NewParsePostfix>(
        self,
        parser: NewParsePostfix,
    ) -> Expression<I, O, ParseOperand, Pre, NewParsePostfix, Pix, E>
    where
        NewParsePostfix: Parser<I, Postfix<I, O, E>, E>,
    {
        Expression {
            precedence_level: self.precedence_level,
            parse_operand: self.parse_operand,
            parse_prefix: self.parse_prefix,
            parse_postfix: parser,
            parse_infix: self.parse_infix,
            i: Default::default(),
            o: Default::default(),
            e: Default::default(),
        }
    }

    #[inline(always)]
    pub fn infix<NewParseInfix>(
        self,
        parser: NewParseInfix,
    ) -> Expression<I, O, ParseOperand, Pre, Post, NewParseInfix, E>
    where
        NewParseInfix: Parser<I, Infix<I, O, E>, E>,
    {
        Expression {
            precedence_level: self.precedence_level,
            parse_operand: self.parse_operand,
            parse_prefix: self.parse_prefix,
            parse_postfix: self.parse_postfix,
            parse_infix: parser,
            i: Default::default(),
            o: Default::default(),
            e: Default::default(),
        }
    }

    #[inline(always)]
    pub fn current_precedence_level(
        mut self,
        level: i64,
    ) -> Expression<I, O, ParseOperand, Pre, Post, Pix, E> {
        self.precedence_level = level;
        self
    }
}

impl<I, O, Pop, Pre, Post, Pix, E> Parser<I, O, E> for Expression<I, O, Pop, Pre, Post, Pix, E>
where
    I: Stream + StreamIsPartial,
    Pop: Parser<I, O, E>,
    Pix: Parser<I, Infix<I, O, E>, E>,
    Pre: Parser<I, Prefix<I, O, E>, E>,
    Post: Parser<I, Postfix<I, O, E>, E>,
    E: ParserError<I>,
{
    #[inline(always)]
    fn parse_next(&mut self, input: &mut I) -> PResult<O, E> {
        trace("expression", move |i: &mut I| {
            expression_impl(
                i,
                &mut self.parse_operand,
                &mut self.parse_prefix,
                &mut self.parse_postfix,
                &mut self.parse_infix,
                self.precedence_level,
            )
        })
        .parse_next(input)
    }
}

fn expression_impl<I, O, Pop, Pre, Post, Pix, E>(
    i: &mut I,
    parse_operand: &mut Pop,
    prefix: &mut Pre,
    postfix: &mut Post,
    infix: &mut Pix,
    min_power: i64,
) -> PResult<O, E>
where
    I: Stream + StreamIsPartial,
    Pop: Parser<I, O, E>,
    Pix: Parser<I, Infix<I, O, E>, E>,
    Pre: Parser<I, Prefix<I, O, E>, E>,
    Post: Parser<I, Postfix<I, O, E>, E>,
    E: ParserError<I>,
{
    let operand = opt(trace("operand", parse_operand.by_ref())).parse_next(i)?;
    let mut operand = if let Some(operand) = operand {
        operand
    } else {
        // Prefix unary operators
        let len = i.eof_offset();
        let Prefix(power, fold_prefix) = trace("prefix", prefix.by_ref()).parse_next(i)?;
        // infinite loop check: the parser must always consume
        if i.eof_offset() == len {
            return Err(ErrMode::assert(i, "`prefix` parsers must always consume"));
        }
        let operand = expression_impl(i, parse_operand, prefix, postfix, infix, power)?;
        fold_prefix(i, operand)?
    };

    // A variable to stop the `'parse` loop when `Assoc::Neither` with the same
    // precedence is encountered e.g. `a == b == c`. `Assoc::Neither` has similar
    // associativity rules as `Assoc::Left`, but we stop parsing when the next operator
    // is the same as the current one.
    let mut prev_op_is_neither = None;
    'parse: while i.eof_offset() > 0 {
        // Postfix unary operators
        let start = i.checkpoint();
        if let Some(Postfix(power, fold_postfix)) =
            opt(trace("postfix", postfix.by_ref())).parse_next(i)?
        {
            // control precedence over the prefix e.g.:
            // `--(i++)` or `(--i)++`
            if power < min_power {
                i.reset(&start);
                break 'parse;
            }
            operand = fold_postfix(i, operand)?;

            continue 'parse;
        }

        // Infix binary operators
        let start = i.checkpoint();
        let parse_result = opt(trace("infix", infix.by_ref())).parse_next(i)?;
        if let Some(infix_op) = parse_result {
            let mut is_neither = None;
            let (lpower, rpower, fold_infix) = match infix_op {
                Infix::Right(p, f) => (p, p - 1, f),
                Infix::Left(p, f) => (p, p + 1, f),
                Infix::Neither(p, f) => {
                    is_neither = Some(p);
                    (p, p + 1, f)
                }
            };
            if lpower < min_power
                // MSRV: `is_some_and`
                || match prev_op_is_neither {
                    None => false,
                    Some(p) => lpower == p,
                }
            {
                i.reset(&start);
                break 'parse;
            }
            prev_op_is_neither = is_neither;
            let rhs = expression_impl(i, parse_operand, prefix, postfix, infix, rpower)?;
            operand = fold_infix(i, operand, rhs)?;

            continue 'parse;
        }

        break 'parse;
    }

    Ok(operand)
}

pub struct Prefix<I, O, E>(i64, fn(&mut I, O) -> PResult<O, E>);

impl<I, O, E> Clone for Prefix<I, O, E> {
    #[inline(always)]
    fn clone(&self) -> Self {
        Prefix(self.0, self.1)
    }
}

impl<I: Stream, O, E: ParserError<I>> Parser<I, Prefix<I, O, E>, E> for Prefix<I, O, E> {
    #[inline(always)]
    fn parse_next(&mut self, input: &mut I) -> PResult<Prefix<I, O, E>, E> {
        empty.value(self.clone()).parse_next(input)
    }
}

pub struct Postfix<I, O, E>(i64, fn(&mut I, O) -> PResult<O, E>);

impl<I, O, E> Clone for Postfix<I, O, E> {
    #[inline(always)]
    fn clone(&self) -> Self {
        Postfix(self.0, self.1)
    }
}

impl<I: Stream, O, E: ParserError<I>> Parser<I, Postfix<I, O, E>, E>
    for (i64, fn(&mut I, O) -> PResult<O, E>)
{
    #[inline(always)]
    fn parse_next(&mut self, input: &mut I) -> PResult<Postfix<I, O, E>, E> {
        empty.value(Postfix(self.0, self.1)).parse_next(input)
    }
}

pub enum Infix<I, O, E> {
    Left(i64, fn(&mut I, O, O) -> PResult<O, E>),
    Right(i64, fn(&mut I, O, O) -> PResult<O, E>),
    Neither(i64, fn(&mut I, O, O) -> PResult<O, E>),
}

impl<I, O, E> Clone for Infix<I, O, E> {
    #[inline(always)]
    fn clone(&self) -> Self {
        match self {
            Infix::Left(p, f) => Infix::Left(*p, *f),
            Infix::Right(p, f) => Infix::Right(*p, *f),
            Infix::Neither(p, f) => Infix::Neither(*p, *f),
        }
    }
}

impl<I: Stream, O, E: ParserError<I>> Parser<I, Infix<I, O, E>, E> for Infix<I, O, E> {
    #[inline(always)]
    fn parse_next(&mut self, input: &mut I) -> PResult<Infix<I, O, E>, E> {
        empty.value(self.clone()).parse_next(input)
    }
}

#[cfg(test)]
mod tests {
    use crate::ascii::digit1;
    use crate::combinator::fail;
    use crate::dispatch;
    use crate::error::ContextError;
    use crate::token::any;

    use super::*;

    fn parser<'i>() -> impl Parser<&'i str, i32, ContextError> {
        move |i: &mut &str| {
            use Infix::*;
            expression(digit1.parse_to::<i32>())
                .current_precedence_level(0)
                .prefix(dispatch! {any;
                    '+' => Prefix(12, |_, a| Ok(a)),
                    '-' => Prefix(12, |_, a: i32| Ok(-a)),
                    _ => fail
                })
                .infix(dispatch! {any;
                    '+' => Left(5, |_, a, b| Ok(a + b)),
                    '-' => Left(5, |_, a, b| Ok(a - b)),
                    '*' => Left(7, |_, a, b| Ok(a * b)),
                    '/' => Left(7, |_, a, b| Ok(a / b)),
                    '%' => Left(7, |_, a, b| Ok(a % b)),
                    '^' => Left(9, |_, a, b| Ok(a ^ b)),
                    _ => fail
                })
                .parse_next(i)
        }
    }

    #[test]
    fn test_expression() {
        assert_eq!(parser().parse("-3+-3*4"), Ok(-15));
        assert_eq!(parser().parse("+2+3*4"), Ok(14));
        assert_eq!(parser().parse("2*3+4"), Ok(10));
    }
}
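To make the binding-power rules in `expression_impl` easier to follow, here is a minimal, std-only sketch of the same loop. It does not use winnow and is not the PR's API: the names `Assoc`, `infix_power`, `apply`, `expr` and `eval` are hypothetical, operands are single digits, and `^` is treated as right-associative exponentiation and `=` as a non-associative comparison purely for illustration (the PR's test uses `^` as a left-associative XOR). Precedence levels are spaced two apart so that `p - 1` and `p + 1` never collide with a neighbouring level.

```rust
/// Associativity plus precedence level, loosely mirroring the PR's `Infix` enum.
#[derive(Clone, Copy)]
enum Assoc {
    Left(i64),
    Right(i64),
    Neither(i64),
}

/// Hypothetical operator table for this sketch.
fn infix_power(op: u8) -> Option<Assoc> {
    match op {
        b'=' => Some(Assoc::Neither(3)),
        b'+' | b'-' => Some(Assoc::Left(5)),
        b'*' => Some(Assoc::Left(7)),
        b'^' => Some(Assoc::Right(9)),
        _ => None,
    }
}

/// Folds two operands; the exponent of `^` is assumed to be non-negative.
fn apply(op: u8, a: i64, b: i64) -> i64 {
    match op {
        b'+' => a + b,
        b'-' => a - b,
        b'*' => a * b,
        b'^' => a.pow(b as u32),
        _ => i64::from(a == b), // b'='
    }
}

/// Parses and evaluates an expression whose operators bind at least `min_power`.
fn expr(input: &[u8], pos: &mut usize, min_power: i64) -> Option<i64> {
    // Operand, or a prefix operator (power 12) followed by a sub-expression.
    let first = *input.get(*pos)?;
    *pos += 1;
    let mut lhs = match first {
        c @ b'0'..=b'9' => i64::from(c - b'0'),
        b'-' => -expr(input, pos, 12)?,
        b'+' => expr(input, pos, 12)?,
        _ => return None,
    };

    // Level of the previous operator if it was `Neither`, so `a = b = c` stops.
    let mut prev_neither = None;
    while let Some(&op) = input.get(*pos) {
        let assoc = match infix_power(op) {
            Some(assoc) => assoc,
            None => break,
        };
        let (lpower, rpower, neither) = match assoc {
            Assoc::Left(p) => (p, p + 1, None),
            Assoc::Right(p) => (p, p - 1, None),
            Assoc::Neither(p) => (p, p + 1, Some(p)),
        };
        // Leave the operator unconsumed for the caller.
        if lpower < min_power || prev_neither == Some(lpower) {
            break;
        }
        prev_neither = neither;
        *pos += 1;
        let rhs = expr(input, pos, rpower)?;
        lhs = apply(op, lhs, rhs);
    }
    Some(lhs)
}

/// Evaluates `s`, returning `None` on a parse error or leftover input.
fn eval(s: &str) -> Option<i64> {
    let mut pos = 0;
    let value = expr(s.as_bytes(), &mut pos, 0)?;
    if pos == s.len() { Some(value) } else { None }
}
```

With this table, `eval("-3+-3*4")` follows the same path as the PR's first test case, `eval("2^3^2")` groups as `2^(3^2)` because the right-hand side is parsed with `p - 1`, and `eval("1=1=1")` is rejected because the second `=` is left unconsumed.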
If this is going to start off unstable, then it's fine to note most of my feedback in the "tracking" issue and not resolve all of it here.
@@ -166,6 +166,8 @@ mod multi;
mod parser;
mod sequence;

pub mod expression;

#[cfg(test)]
mod tests;
This looks to be agnostic of streaming support, like `separated` is.