Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Signature formatter creates very long lines when many parameters share a type #5424

Closed
david-christiansen opened this issue Sep 23, 2024 · 5 comments · Fixed by #5513
Closed
Labels
bug Something isn't working P-medium We may work on this issue if we find the time

Comments

@david-christiansen
Copy link
Contributor

david-christiansen commented Sep 23, 2024

Description

The signature formatter is used to display signatures for constants. If consecutive parameters share a type, then they are grouped, but these groups never get line breaks. This can lead to extremely long lines in the output of #check as well as in other contexts.

Context

This was discovered when looking at Verso docstring output for Lean.Meta.Simp.Config.mk.

image

While this particular issue is easy to work around by simply omitting constructors from the documentation output (as doc-gen does) and by setting a CSS overflow property, it could very well come up again elsewhere.

Steps to Reproduce

  1. Create a structure with many fields, each of which has the same type
  2. Run #check on its constructor, and note that the grouped parameters include no line breaks.

Here's an example:

structure ManyFields where
  nonNat : Bool
  a : Nat
  b : Nat
  a1 : Nat
  b1 : Nat
  a2 : Nat
  b2 : Nat
  a3 : Nat
  b3 : Nat
  a4 : Nat
  b4 : Nat
  a5 : Nat
  b5 : Nat
  a6 : Nat
  b6 : Nat
  a7 : Nat
  b7 : Nat
  a8 : Nat
  b8 : Nat
  a10 : Nat
  b10 : Nat
  a11 : Nat
  b11 : Nat
  a12 : Nat
  b12 : Nat
  a13 : Nat
  b13 : Nat
  a14 : Nat
  b14 : Nat
  a15 : Nat
  b15 : Nat
  a16 : Nat
  b16 : Nat
  a17 : Nat
  b17 : Nat
  a18 : Nat
  b18 : Nat
  a01 : Nat
  b01 : Nat
  a101 : Nat
  b101 : Nat
  a102 : Nat
  b102 : Nat
  a103 : Nat
  b103 : Nat
  a104 : Nat
  b104 : Nat
  a105 : Nat
  b105 : Nat
  a106 : Nat
  b106 : Nat
  a107 : Nat
  b107 : Nat
  a108 : Nat
  b108 : Nat
  a110 : Nat
  b110 : Nat
  a111 : Nat
  b111 : Nat
  a112 : Nat
  b112 : Nat
  a113 : Nat
  b113 : Nat
  anotherNonNat : String
  a114 : Nat
  b114 : Nat
  a115 : Nat
  b115 : Nat
  a116 : Nat
  b116 : Nat
  a117 : Nat
  b117 : Nat
  a118 : Nat
  b118 : Nat

#check ManyFields.mk

Expected behavior:

I would expect line breaks to be inserted in the parameters as necessary to avoid long lines, e.g. via Std.Format.fill.

Something like this:

ManyFields.mk (nonNat : Bool)
  (a b a1 b1 a2 b2 a3 b3 a4 b4 a5 b5 a6 b6 a7 b7 a8 b8 a10 b10 a11
   b11 a12 b12 a13 b13 a14 b14 a15 b15 a16 b16 a17 b17 a18 b18 a01
   b01 a101 b101 a102 b102 a103 b103 a104 b104 a105 b105 a106
   b106 a107 b107 a108 b108 a110 b110 a111 b111 a112 b112 a113
   b113 : Nat)
  (anotherNonNat : String)
  (a114 b114 a115 b115 a116 b116 a117 b117 a118 b118 : Nat) : ManyFields

Actual behavior:

Extremely long lines of output.

ManyFields.mk (nonNat : Bool)
  (a b a1 b1 a2 b2 a3 b3 a4 b4 a5 b5 a6 b6 a7 b7 a8 b8 a10 b10 a11 b11 a12 b12 a13 b13 a14 b14 a15 b15 a16 b16 a17 b17 a18 b18 a01 b01 a101 b101 a102 b102 a103 b103 a104 b104 a105 b105 a106 b106 a107 b107 a108 b108 a110 b110 a111 b111 a112 b112 a113 b113 :
    Nat)
  (anotherNonNat : String) (a114 b114 a115 b115 a116 b116 a117 b117 a118 b118 : Nat) : ManyFields

Versions

"4.12.0, commit bff638ef479b4f1feec3af92b2d9d4a29a06d4f8"

Mac OS 14.6.1

Additional Information

Impact

Add 👍 to issues you consider important. If others are impacted by this issue, please ask them to add 👍 to it.

@david-christiansen david-christiansen added the bug Something isn't working label Sep 23, 2024
@kmill
Copy link
Collaborator

kmill commented Sep 24, 2024

I believe this is from the way binders are formatted — the many1 binderIdent in explicitBinder doesn't appear to create any place for newlines to be inserted.

Here's an example of no wrapping:

syntax "baz " num* : term
#eval show Lean.CoreM _ from
  Lean.PrettyPrinter.formatTerm =<<
    `(term| baz 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
                1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
                1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1)

Changing this to syntax "baz" (ppSpace num)* : term causes it to wrap.

Syntax definitions like Lean.Parser.Command.declSig use ppSpace between binders, which ensures they wrap. I guess we need to put ppSpace between binder identifiers too.

@digama0
Copy link
Collaborator

digama0 commented Sep 24, 2024

many1 binderIdent is already a problem because it causes identifiers to run together, as in (x✝y✝ : Nat). I thought I already fixed all of these a while ago...

@kmill
Copy link
Collaborator

kmill commented Sep 24, 2024

I'm not seeing any running-together identifiers @digama0 even though explicit binders are using many1 binderIdent. I believe this is because Lean.PrettyPrinter.Formatter.pushToken inserts spaces to prevent tokens from merging.

def foo (a0 a1 a2 a3 a4 a5 a6 a7 a8 a9 a10 a11 a12 a13 a14 a15 a16 a17 a18 a19 a20
         a21 a22 a23 a24 a25 a26 a27 a28 a29 a30 a31 a32 a33 a34 a35 a36 a37 a38 a39 a40
         a41 a42 a43 a44 a45 a46 a47 a48 a49 a50 a51 a52 a53 a54 a55 a56 a57 a58 a59 a60
         a61 a62 a63 a64 a65 a66 a67 a68 a69 a70 a71 a72 a73 a74 a75 a76 a77 a78 a79 a80
         a81 a82 a83 a84 a85 a86 a87 a88 a89 a90 a91 a92 a93 a94 a95 a96 a97 a98 a99 : Nat) : Nat := 0

#check foo
/-
foo
  (a0 a1 a2 a3 a4 a5 a6 a7 a8 a9 a10 a11 a12 a13 a14 a15 a16 ... and so on ... :
    Nat) :
  Nat
-/

@digama0
Copy link
Collaborator

digama0 commented Sep 24, 2024

This is only true for regular identifiers, when the identifiers are hygienic they are suffixed with a non-identifier character which makes the pretty printer think that it is better not to put a space because no space recommendation was used. (Note the daggers in my example.) It seems though that it is difficult to come up with easy examples now to demonstrate this behavior because e.g. the signature printer no longer shows hygienic names.

Here's a slightly contrived example based on the formatTerm.lean test taking advantage of the fact that the default ppTerm implementation adds daggers everywhere:

import Lean
open Lean PrettyPrinter

def fmt (stx : CoreM Syntax) : CoreM Format := do PrettyPrinter.ppCommand ⟨← stx⟩
#eval fmt `(command| def foo (x y : Nat) := x)
-- def foo✝ (x✝y✝ : Nat✝) :=
--   x✝

@leanprover-bot leanprover-bot added the P-medium We may work on this issue if we find the time label Sep 27, 2024
kmill added a commit to kmill/lean4 that referenced this issue Sep 29, 2024
…tokens

The formatter was using `tk ++ " "` to separate tokens from tokens they would merge with, but `" "` is not whitespace that could merge. This affected large binder lists, which wouldn't pretty print with any line breaks. Now they can be flowed across multiple lines.

Closes leanprover#5424
@kmill
Copy link
Collaborator

kmill commented Sep 29, 2024

I'm not sure yet what the best way to solve this inaccessible name issue is, but at least there's an easy fix for the long signature issue (#5513).

This fixes it at the formatter level instead of adding missing ppSpaces, which can be challenging to do without doubling up spacing. It seems like ppSpace ought to be idempotent...

github-merge-queue bot pushed a commit that referenced this issue Sep 29, 2024
…tokens (#5513)

The formatter was using `tk ++ " "` to separate tokens from tokens they
would merge with, but `" "` is not whitespace that could merge. This
affected large binder lists, which wouldn't pretty print with any line
breaks. Now they can be flowed across multiple lines.

Closes #5424
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working P-medium We may work on this issue if we find the time
Projects
None yet
Development

Successfully merging a pull request may close this issue.

4 participants