Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Formatter - initial basic implementation #144

Merged
merged 36 commits into from
Dec 8, 2023
Merged

Formatter - initial basic implementation #144

merged 36 commits into from
Dec 8, 2023

Conversation

aabounegm
Copy link
Member

@aabounegm aabounegm commented Nov 19, 2023

This implements a very basic formatter that scans the lexical tokens for some known patterns and applies text edits to have them adhere to the style guide. The code is still in an early stage and can definitely be improved a lot, so a code review would be tremendously helpful.

Also, an issue was discovered with LSP's VFS wherein it includes carriage return (\r) character on Windows, causing issues with the parser. A workaround has been added (filter (/= '\r')), but can probably be improved.

Notable features

  • Adds a space after the opening parenthesis to help with the code tree structure
  • Puts the definition conclusion (type, starting with :) and construction (body, starting with :=) on new lines
  • Adds a space after the \ of a lambda term and around binary operators (like ,)
  • Moves binary operators to the beginning of the next line if they appear at the end of the previous one.
  • Replaces common ASCII sequences with their Unicode equivalent
  • A CLI subcommand (rzk format) with --check and --write options

Known limitations

  • The 80 character line limit is currently not enforced due to the difficulty of determining where to add line breaks (for reference, check out this post by a Dart fmt engineer)
  • Fixing indentation is not yet implemented due to the need for more semantics than the lexer provides to determine indentation level.
  • Moving binary operators from the end of the line to the beginning of the next is not yet implemented. Edit: Implemented for , and ->. What other binary operators are needed?
  • No commands other than #define (and its alias #def) are handled yet.
  • There are rare edge cases in which applying the formatter twice would result in additional edits that were not detected the first time.

I will open separate issues to work on these in the future.

Possible inconsistencies in the style guide

These may not be inconsistencies as much as a lack of understand on my side, but I'd like to understand either way:

  • About the spaces after opening parenthesis, it seems that 2 styles have been used when it comes to nested parentheses. One example has ( ( a-term-of-a-type) ...) (with a space after the inner paren), while another is written as ( (x : A) ...) (without a space before the x)
  • Regarding the same recommendation about space in opening parentheses, it seems that it only applies when that open paren is the first character in its line. For example, it appears as : (x = z) in the type. However, in the sHoTT repo itself, a definition has : ( concat A w x z p ...) with a space after the paren. Which one should it be? (mostly clarified in Auto-format all files sHoTT#142 (comment) and partially implemented)

Planned improvements

  • Formatting configuration settings (preferably in rzk.yaml, but possible also .vscode/settings.json) for controlling options like auto-replacement of ASCII sequences, line width, ...

A pull request to the sHoTT repo (and others) will follow after more work is done into stabilizing the formatter and implementing more rules.

@fizruk fizruk merged commit 46e80df into develop Dec 8, 2023
7 checks passed
@fizruk fizruk deleted the formatter branch December 8, 2023 11:17
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants