Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Unusual syntax for tuple patterns #192

Closed
alicelogos opened this issue Nov 17, 2024 · 3 comments · Fixed by #193
Closed

Unusual syntax for tuple patterns #192

alicelogos opened this issue Nov 17, 2024 · 3 comments · Fixed by #193

Comments

@alicelogos
Copy link
Contributor

The current syntax for tuple patterns is defined as follows in Syntax.cf:

PatternTuple.     Pattern ::= "(" Pattern "," Pattern "," [Pattern] ")" ;

The next line defines the separator for patterns:

separator nonempty Pattern "" ;

As a result, tuple patterns are written like (a , b , c d e). In my opinion, this syntax is unusual and potentially confusing.

  • It deviates from the standard tuple syntax commonly seen in other languages, such as (a, b, c, d, e) or (a , b , c , d , e).
  • It might be harder for users to read or understand, especially if they are unfamiliar with this syntax.

I would prefer a syntax where the elements are consistently separated by commas, such as (a , b , c , d , e).

I couldn’t find any discussion in #183 justifying the current design choice. Could you provide a rationale for this design choice? Alternatively, would you consider revising the syntax?

Example code:

Using the current syntax, the following code type-checks successfully:

#lang rzk-1

#define current-syntax
  ( A : U)
  ( B : (a : A) → U)
  ( C : (a : A) → (b : B a) → U)
  ( D : (a : A) → (b : B a) → (c : C a b) → U)
  ( E : (a : A) → (b : B a) → (c : C a b) → (d : D a b c) → U)
  ( ( a , b , c d e) : Σ (a : A , b : B a , c : C a b , d : D a b c) , E a b c d)
  : A
  := a

Using the proposed syntax, however, results in a syntax error at ( a , b , c , d , e):

#lang rzk-1

#define proposed-syntax
  ( A : U)
  ( B : (a : A) → U)
  ( C : (a : A) → (b : B a) → U)
  ( D : (a : A) → (b : B a) → (c : C a b) → U)
  ( E : (a : A) → (b : B a) → (c : C a b) → (d : D a b c) → U)
  ( ( a , b , c , d , e) : Σ (a : A , b : B a , c : C a b , d : D a b c) , E a b c d)
  : A
  := a
@fizruk
Copy link
Member

fizruk commented Nov 17, 2024

@alicelogos Thanks a lot for spotting this inconsistency! I believe, this problem is unintentional, just slipped past me somehow (I think, it was because we only tested with 3-tuples).

The Patterns are separated by spaces when they are introduced in a $\lambda$-abstraction. However, in a TuplePattern they of course need to be separated by commas. To fix that, you might want to introduce a wrapper for Pattern:

APatternInTuple. PatternInTuple ::= Pattern ;
separator PatternInTuple "," ;
...
PatternTuple.     Pattern ::= "(" Pattern "," Pattern "," [PatternInTuple] ")" ;

There are other possible fixes, but I think the one above might be the simplest. Would you be interested in making a PR with a fix along these lines?

@alicelogos
Copy link
Contributor Author

Thank you for clarifying. I believe you mean the following fix:

PatternTuple.     Pattern ::= "(" Pattern "," Pattern "," [Pattern1] ")" ;

_. Pattern1 ::= Pattern ;
separator nonempty Pattern1 "," ;

This approach doesn’t introduce new types or constructors, so no changes are needed in the Haskell code. I tested this fix with the example code I provided earlier and it works. However, I’m not familiar with the full codebase, so I’m unsure if other code need modification or if this fix is the optimal solution. If you think this fix is good, I’m happy to create a pull request with it.

Additionally, I want to mention the error message Rzk without the fix reports on proposed-syntax:

rzk: syntax error at line 9, column 17 before `,'
CallStack (from HasCallStack):
  error, called at src/Rzk/Main.hs:87:11 in rzk-0.7.5-EJ2uiUYTFnEGW7fnYMZZSu:Rzk.Main

And the error message Rzk with the fix reports on current-syntax:

rzk: syntax error at line 9, column 17 before `d'
CallStack (from HasCallStack):
  error, called at src/Rzk/Main.hs:87:11 in rzk-0.7.5-inplace:Rzk.Main

These messages don’t provide hints about the reason of the syntax error or possible fixes. When I encountered a syntax error with something like (a , b , c , d , e), I checked Syntax.cf to see if my understanding of the grammar was correct, and then I found the “correct but unintentional” syntax. While this approach works, it might be more user-friendly if the error messages include hints about the reason of the syntax error or possible fixes.

I suspect that this lack of clarity in error messages might be related to Emily’s report on a “weird error message” in #141, though I’m not sure because I don’t know the exact error message Emily encountered. (The message “expected a pattern” was different with what I encountered, but I think was more informative than what I encountered.)

I couldn't figure out what rzk was trying to tell me.

Comment by Emily Riehl

I’m not familiar with BNFC, so I’m unsure if adding hints should be addressed there or within Rzk. I understand that implementing better hints may require significant effort and isn’t critical since users can refer to Syntax.cf if needed. However, I’d like to suggest that enhancing error messages could be a valuable improvement for the future.

@fizruk
Copy link
Member

fizruk commented Nov 19, 2024

This approach doesn’t introduce new types or constructors, so no changes are needed in the Haskell code.

Excellent! Too bad I haven't thought of this myself :) Your PR is most welcome, of course!

While this approach works, it might be more user-friendly if the error messages include hints about the reason of the syntax error or possible fixes.

This is very true. I do not think that BNFC (at least in its current form) facilitates good error messages. I also think that at the grammar level we actually might not want to split expressions and patterns. Keeping expressions and patterns as one syntactic group, BNFC has more chances to parse user's input without issues and we then have the opportunity to produce a nicer error message when going from such loose syntax to something more strict.

Basically, the idea is to parse "preterms" (with a high chance of successful parsing) and check well-formedness when converting into more strict terms (or directly into the scope-safe AST). If conversion fails, we can give a much better response to the user compared to a stock parser generator.

I understand that implementing better hints may require significant effort and isn’t critical since users can refer to Syntax.cf if needed. However, I’d like to suggest that enhancing error messages could be a valuable improvement for the future.

I think better error messages and localization is in fact critical to the usability of the tool (alongside performance and feature set). Also, I wouldn't want (regular) users to go to Syntax.cf for consultations on the syntax :)

However, as you say, for the prototype (which Rzk still is), this was of low priority due to extra effort in the implementation. Still, thanks to @aabounegm we were able to hook VS Code extension to at least try to highlight exactly the correct places in the code the error messages we get. That said, a future rewrite of Rzk should definitely have better error message support in mind, since there are tons of nuances to hit, especially for beginners.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants