add caret and spans like Trifecta (#234) #238

re-xyr · 2021-06-12T14:59:28Z

Resolves #234; adds Parser(0)#withCaret, span, withSpan and Parser.caret.

I'm unsure about the performance of these combinators, but I expect it to be poor, because LocationMap recomputes the position each time a combinator is used. If that is a serious concern, this PR may not be adequate as is. A likely optimization is to store the current Position (including row and col) in the State and update it on demand when the combinators are called (this is Megaparsec's approach).

From the documentation of Megaparsec's getSourcePos combinator, which functions similarly:

Return the current source position. This function is not cheap, do not call it e.g. on matching of every token, that's a bad idea. Still you can use it to get SourcePos to attach to things that you parse.

core/shared/src/main/scala/cats/parse/Parser.scala

johnynek · 2021-06-14T18:35:19Z

core/shared/src/test/scala/cats/parse/ParserTest.scala

@@ -668,6 +668,18 @@ class ParserTest extends munit.ScalaCheckSuite {
    parseTest(Parser.stringIn(List("foo", "foobar", "foofoo", "foobat")).string, "foobal", "foo")
  }



please add generators that create any new elements to the AST (in this PR that would be input, but in a follow up it might be WithCaret or something). That way we can check all the other properties in the presence of this new node.


johnynek · 2021-06-14T18:36:37Z

Thanks for sending this PR! I think we will find a way to merge something close to this.

re-xyr · 2021-06-15T14:24:27Z

I am thinking about Span. Do you think (Caret, Caret) will be better than a separate datatype?

core/shared/src/main/scala/cats/parse/Parser.scala

johnynek · 2021-06-15T17:59:38Z

core/shared/src/main/scala/cats/parse/Parser.scala

+  /** return the result of the parser together
+    * with the position of the starting of the consumption
+    */
+  def careted0[A](pa: Parser0[A]): Parser0[(A, Caret)] =


I wonder if this is confusing. The caret points before the A, but we return it after. Also, I wonder if we should bother to make this method since it is really just (caret ~ pa).map(_.swap).

For instance, if we change the order, as I would suggest as a mnemonic to remember the caret comes first, then it is just (caret ~ pa) which is probably too small for a method.

My idea was that pa was primary and the caret was secondary, so the result of pa comes first, but yes, that would cause confusion about where exactly the caret is. If you prefer putting the caret first, I'll change it.

I think pa.careted is more readable than (caret ~ pa) because it reads as a single unit, which signals that it is one parsing step. That said, we can leave this out and let users decide whether to write this helper themselves, since it is really short.

core/shared/src/main/scala/cats/parse/Parser.scala

johnynek · 2021-06-15T18:05:00Z

core/shared/src/main/scala/cats/parse/Parser.scala

+      override def parseMut(state: State): Caret = {
+        val (row, col) = state.locmap
+          .toLineCol(state.offset)
+          .getOrElse(throw new IllegalStateException("the offset is larger than the input size"))


this will throw when you are at the EOF right?

I think we need to fix that somehow.

I guess we can say that EOF should return toLineCol(eof - 1).map { case (row, col) => (row, col + 1) }

What do you think?

Yes. I'll take a look at how I can solve that first.

Added GenT for spanning combinators but not yet added to the list in gen(0). Will add after this is solved.

#286 fixes this issue.
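
For readers following this thread, the EOF rule suggested above can be sketched roughly as below. This is only an illustration under stated assumptions, not the LocationMap implementation and not necessarily what #286 does: `EofCaretSketch`, `LineIndex` and `toLineColOrEof` are hypothetical names, rows and columns are assumed to be 0-based, and the empty-input case is an added assumption that the thread does not discuss.

```scala
object EofCaretSketch {
  // Hypothetical stand-in for LocationMap: maps an offset to a 0-based (row, col).
  final class LineIndex(input: String) {
    // Offsets at which each line starts; line 0 always starts at offset 0.
    private val lineStarts: Array[Int] =
      (0 +: input.indices.filter(i => input.charAt(i) == '\n').map(_ + 1)).toArray

    // Defined only for offsets that index a character of the input,
    // so the EOF offset (input.length) yields None here.
    def toLineCol(offset: Int): Option[(Int, Int)] =
      if (offset < 0 || offset >= input.length) None
      else {
        val row = lineStarts.lastIndexWhere(_ <= offset)
        Some((row, offset - lineStarts(row)))
      }

    // The rule from the review: at EOF, take the position of the last
    // character and advance the column by one.
    def toLineColOrEof(offset: Int): Option[(Int, Int)] =
      if (offset == input.length) {
        // Assumption: an empty input reports (0, 0) at EOF.
        if (input.isEmpty) Some((0, 0))
        else toLineCol(offset - 1).map { case (row, col) => (row, col + 1) }
      } else toLineCol(offset)
  }
}
```

With the input "ab\ncd", `toLineCol(5)` is `None` while `toLineColOrEof(5)` is `Some((1, 2))`. Note that with this rule an input ending in a newline, such as "ab\n", reports EOF at (0, 3) rather than (1, 0); the thread does not settle which of the two is wanted.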

johnynek · 2021-06-15T18:08:00Z

core/shared/src/test/scala/cats/parse/ParserTest.scala

@@ -668,6 +668,18 @@ class ParserTest extends munit.ScalaCheckSuite {
    parseTest(Parser.stringIn(List("foo", "foobar", "foofoo", "foobat")).string, "foobal", "foo")
  }



I still want to address this to make sure all the previous laws pass.

see for instance:

cats-parse/core/shared/src/test/scala/cats/parse/ParserTest.scala

Line 167 in 8c9a271

def defer(g: GenT[Parser]): GenT[Parser] =

as an example of how we want to transform some generators to make withSpan and spanOf and add them to here:

cats-parse/core/shared/src/test/scala/cats/parse/ParserTest.scala

Line 527 in 8c9a271

(1, rec.map { gen => GenT(!gen.fa) }),

and:

cats-parse/core/shared/src/test/scala/cats/parse/ParserTest.scala

Line 555 in 8c9a271

(1, rec.map(defer(_))),

johnynek · 2021-06-15T18:23:02Z

core/shared/src/test/scala/cats/parse/ParserTest.scala

@@ -668,6 +668,18 @@ class ParserTest extends munit.ScalaCheckSuite {
    parseTest(Parser.stringIn(List("foo", "foobar", "foofoo", "foobat")).string, "foobal", "foo")
  }

+  test("Parser.careted and spanned works") {


also, I'd like to see a law like pa.parse(str) == pa.withSpan.map(_._1).parse(str) and pa.spanOf.parse(str) == pa.withSpan.map(_._2).parse(str)

something like that.
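
A rough sketch of what such a law checks, written against a toy parser so that it stands alone. Everything here is hypothetical: `P`, `string`, and the two law helpers are illustrative names, and the span is a plain pair of offsets (end exclusive) rather than the Span of Carets in this PR. The real test would use the existing ScalaCheck generators instead.

```scala
object SpanLawSketch {
  // Toy parser: on success, returns the value and the offset after the consumed input.
  final case class P[A](run: (String, Int) => Option[(A, Int)]) {
    def map[B](f: A => B): P[B] =
      P((s, i) => run(s, i).map { case (a, j) => (f(a), j) })

    def parse(s: String): Option[A] = run(s, 0).map(_._1)

    // The result paired with the (start, end) offsets it consumed; end is exclusive.
    def withSpan: P[(A, (Int, Int))] =
      P((s, i) => run(s, i).map { case (a, j) => ((a, (i, j)), j) })

    // Only the consumed (start, end) offsets.
    def spanOf: P[(Int, Int)] =
      P((s, i) => run(s, i).map { case (_, j) => ((i, j), j) })
  }

  // Matches a literal string at the current offset.
  def string(lit: String): P[String] =
    P((s, i) => if (s.startsWith(lit, i)) Some((lit, i + lit.length)) else None)

  // Law 1: withSpan must not change the parsed value.
  def withSpanKeepsResult[A](pa: P[A], str: String): Boolean =
    pa.parse(str) == pa.withSpan.map(_._1).parse(str)

  // Law 2: spanOf must agree with the span reported by withSpan.
  def spanOfMatchesWithSpan[A](pa: P[A], str: String): Boolean =
    pa.spanOf.parse(str) == pa.withSpan.map(_._2).parse(str)
}
```

Both helpers should hold for succeeding and failing parses alike, since a failure yields `None` on both sides of the comparison.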

johnynek · 2021-06-30T19:20:53Z

core/shared/src/main/scala/cats/parse/LocationMap.scala

+
+case class Caret(offset: Int, row: Int, col: Int)
+object Caret {
+  implicit val eqCatsCaret: Eq[Caret] = Eq.fromUniversalEquals


what about making an Order instead and order by the offset? We might want to sort by carets.

johnynek · 2021-06-30T19:21:35Z

core/shared/src/main/scala/cats/parse/LocationMap.scala

+
+case class Span(from: Caret, to: Caret)
+object Span {
+  implicit val eqCatsCaret: Eq[Span] = Eq.fromUniversalEquals


also we could make an Order here where we compare first by from, then to.
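
The two ordering suggestions can be sketched together. To keep the example free of dependencies it uses `scala.math.Ordering` from the standard library as a stand-in; the PR itself would presumably define cats `Order` instances instead. `CaretOrderingSketch`, `caretOrdering` and `spanOrdering` are hypothetical names.

```scala
object CaretOrderingSketch {
  // Same shapes as in the diff under review.
  final case class Caret(offset: Int, row: Int, col: Int)
  final case class Span(from: Caret, to: Caret)

  // Carets are ordered by offset alone.
  implicit val caretOrdering: Ordering[Caret] =
    Ordering.by[Caret, Int](_.offset)

  // Spans compare by their start first, then by their end.
  implicit val spanOrdering: Ordering[Span] =
    Ordering.by[Span, (Caret, Caret)](s => (s.from, s.to))
}
```

Ordering by offset alone treats two carets with the same offset but different row or col as equal. For carets taken from the same input that cannot happen, because the offset determines row and col.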

johnynek · 2021-06-30T19:21:54Z

core/shared/src/main/scala/cats/parse/LocationMap.scala

+  implicit val eqCatsCaret: Eq[Caret] = Eq.fromUniversalEquals
+}
+
+case class Span(from: Caret, to: Caret)


can we add a comment if from and to are inclusive/exclusive? I assume to is exclusive.

johnynek · 2021-06-30T19:23:15Z

core/shared/src/main/scala/cats/parse/Parser.scala

+  /** return the result of the parser together
+    * with the position of the starting of the consumption
+    */
+  def withCaret: Parser0[(A, Caret)] =


what about calling this def caretBefore: Parser0[(Caret, A)] or something?

I'm really nervous that withCaret, with the caret on the right, is going to suggest to people that the caret is after the parse.

johnynek · 2021-06-30T19:26:05Z

core/shared/src/test/scala/cats/parse/ParserTest.scala

@@ -668,6 +668,18 @@ class ParserTest extends munit.ScalaCheckSuite {
    parseTest(Parser.stringIn(List("foo", "foobar", "foofoo", "foobat")).string, "foobal", "foo")
  }



I see that you added the generators, but I don't see that they were added to the root generator so we don't exercise all the existing laws in the presence of these operations.

johnynek · 2021-08-18T18:46:19Z

do you think you will be able to address the comments I left?

re-xyr · 2021-08-19T01:10:22Z

Sorry, I think I've been busy in August. I may be able to address them in September.

johnynek · 2021-10-29T23:58:35Z

I'm planning to publish a new version soon if you have time to pick this back up.

johnynek reviewed Jun 14, 2021


johnynek reviewed Jun 15, 2021


re-xyr added 2 commits June 16, 2021 05:29

add caret and spans like Trifecta (#234)

bfc6100

add GenT for spanning combinators (WIP)

b65624a

johnynek reviewed Jun 30, 2021


johnynek mentioned this pull request Nov 11, 2021

Add Parser.caret #301

Merged
