-
Notifications
You must be signed in to change notification settings - Fork 53
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
add caret and spans like Trifecta (#234) #238
base: main
Are you sure you want to change the base?
Conversation
@@ -668,6 +668,18 @@ class ParserTest extends munit.ScalaCheckSuite { | |||
parseTest(Parser.stringIn(List("foo", "foobar", "foofoo", "foobat")).string, "foobal", "foo") | |||
} | |||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
please add generators that create any new elements to the AST (in this PR that would be input
, but in a follow up it might be WithCaret
or something). That way we can check all the other properties in the presence of this new node.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I still want to address this to make sure all the previous laws pass.
see for instance:
def defer(g: GenT[Parser]): GenT[Parser] = |
as an example of how we want to transform some generators to make withSpan
and spanOf
and add them to here:
(1, rec.map { gen => GenT(!gen.fa) }), |
and:
(1, rec.map(defer(_))), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see that you added the generators, but I don't see that they were added to the root generator so we don't exercise all the existing laws in the presence of these operations.
Thanks for sending this PR! I think we will find a way to merge something close to this. |
I am thinking about |
/** return the result of the parser together | ||
* with the position of the starting of the consumption | ||
*/ | ||
def careted0[A](pa: Parser0[A]): Parser0[(A, Caret)] = |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wonder if this is confusing. The caret point before the A, but we return it after. Also, I wonder if we should bother to make this method since it is really just (caret ~ pa).map(_.swap)
.
For instance, if we change the order, as I would suggest as a mnemonic to remember the caret comes first, then it is just (caret ~ pa)
which is probably too small for a method.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My idea was that pa
was primary and the caret was secondary so the result of pa
comes first, but yes that would cause confusion about where is the caret exactly. If you prefer putting the caret at the first element I'll change it.
I think pa.careted
is more readable than (caret ~ pa)
in the sense that it associates tighter in sight which indicates it is somehow more of a single parsing unit. Indeed we can leave this out and let the user make the choice whether to write this helper instead since it is really short.
override def parseMut(state: State): Caret = { | ||
val (row, col) = state.locmap | ||
.toLineCol(state.offset) | ||
.getOrElse(throw new IllegalStateException("the offset is larger than the input size")) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this will throw when you are at the EOF right?
I think we need to fix that somehow.
I guess we can say that EOF should return toLineCol(eof - 1).map { case (row, col) => (row, col + 1) }
What do you think?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes. I'll take a look at how I can solve that first.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added GenT
for spanning combinators but not yet added to the list in gen(0)
. Will add after this is solved.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
#286 fixes this issue.
@@ -668,6 +668,18 @@ class ParserTest extends munit.ScalaCheckSuite { | |||
parseTest(Parser.stringIn(List("foo", "foobar", "foofoo", "foobat")).string, "foobal", "foo") | |||
} | |||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I still want to address this to make sure all the previous laws pass.
see for instance:
def defer(g: GenT[Parser]): GenT[Parser] = |
as an example of how we want to transform some generators to make withSpan
and spanOf
and add them to here:
(1, rec.map { gen => GenT(!gen.fa) }), |
and:
(1, rec.map(defer(_))), |
@@ -668,6 +668,18 @@ class ParserTest extends munit.ScalaCheckSuite { | |||
parseTest(Parser.stringIn(List("foo", "foobar", "foofoo", "foobat")).string, "foobal", "foo") | |||
} | |||
|
|||
test("Parser.careted and spanned works") { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
also, I'd like to see a law like pa.parse(str) == pa.withSpan.map(_._1).parse(str)
and pa.spanOf.parse(str) == pa.withSpan.map(_._2).parse(str)
something like that.
|
||
case class Caret(offset: Int, row: Int, col: Int) | ||
object Caret { | ||
implicit val eqCatsCaret: Eq[Caret] = Eq.fromUniversalEquals |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
what about making an Order
instead and order by the offset
? We might want to sort by carets.
|
||
case class Span(from: Caret, to: Caret) | ||
object Span { | ||
implicit val eqCatsCaret: Eq[Span] = Eq.fromUniversalEquals |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
also we could make an Order
here where we compare first by from, then to.
implicit val eqCatsCaret: Eq[Caret] = Eq.fromUniversalEquals | ||
} | ||
|
||
case class Span(from: Caret, to: Caret) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can we add a comment if from and to are inclusive/exclusive? I assume to
is exclusive.
/** return the result of the parser together | ||
* with the position of the starting of the consumption | ||
*/ | ||
def withCaret: Parser0[(A, Caret)] = |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
what about call this def caretBefore: Parser0[(Caret, A)]
or something?
I'm really nervous withCaret
and the fact that the caret is on the right is going to suggest to people that the caret is after the parse.
@@ -668,6 +668,18 @@ class ParserTest extends munit.ScalaCheckSuite { | |||
parseTest(Parser.stringIn(List("foo", "foobar", "foofoo", "foobat")).string, "foobal", "foo") | |||
} | |||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see that you added the generators, but I don't see that they were added to the root generator so we don't exercise all the existing laws in the presence of these operations.
do you think you will be able to address the comments I left? |
Sorry, I think I've been busy in August. I may be able to address them in September. |
I'm planning to publish a new version soon if you have time to pick this back up. |
Resolves #234; adds
Parser(0)#withCaret, span, withSpan
andParser.caret
.I'm unsure about the performance of these combinators but I expect it to be not very good. This is due to recomputation in
LocationMap
each time the combinators are used. If this is much of a concern, this PR may be incapable. A probable optimization is to store the currentPosition
(incl. row, col) in theState
and update by need when the combinators are called (this is the approach of Megaparsec).From Megaparsec's
getSourcePos
combinator which functions alike: