Skip to content

Dev meeting 16 01 2024

Sonja Heinze edited this page Jan 18, 2024 · 1 revision

Agenda

  • Getting rid of the embed-errors=true/false logic in favor of always embedding errors.
    • Do we really want that?
    • What would be the performance impact on builds with PPXs?
  • Pros and cons of the two different approaches to make ppxlib a weak dependency only:
  • The PRs on error reporting.
  • The upcoming 0.32.0 release.
    • Reason for the release: Attribute API addition
    • Caution for the release: Caml -> Stdlib change
    • Heads-up for the release procedure: We'll need to write release notes for the OCaml changelog

Present

  • Sonja / @pitag-ha
  • Nathan / @NathanReb
  • Paul-Elliot / @panglesd
  • Carl / @ceastlund

Notes

Welcome back, Nathan!! 🎉🚀

Error reporting PRs

Nathan has reviewed Burnley's two PRs improving the error reporting. The first one is merged. Some of Paul-Elliot's comments won't be very easy to implement, such as the proposal to refactor.

The errors end up the wrong way around. The order got messed up with a small change. The current ordering is fragile. That's what the sorting errors by location PR is fixing.

Question about the second PR (which embeds errors of exceptions during the context-free phase, i.e. extension nodes and derivers): There's a choice to make where to embed the errors: At the beginning of the AST, or at the node / subtree where they show up (i.e. for extension nodes, replacing the extension node; for derivers, adding the error node where the deriver would have generated nodes).

Nathan is pointing out main advantages of the latter: Coherence with what we ask PPX authors to do (embed the errors themselves) and no duplication of error messages in case of extension nodes, where the compiler will also emit an error if the node doesn't get replaced.

We all agree on that. Additionally, Sonja and Paul-Elliot also point out a few advantages of the other approach: Coherence with error collecting from other phases, less potential for later coming whole-file transformations to mess up embedded errors. We all agree that the latter shouldn't be taken too much into consideration.

Paul-Elliot has already talked to Burnley and he's happy with either decision we make.

Also, we all agree that there are more small improvements regarding error reporting that can be done. But for now, we want to merge the big improvements in the two PRs. More details can always be improved later.

Action-point Nathan: Finish reviewing the second PR.

Action-point Nathan: Decide if the current chosen approach of embedding the collected errors at the beginning of the AST is good or if he wants to change the PR to embed them on the spot.

Action-point Paul-Elliot and Sonja: Be reactive on Nathan's comments on the PR.

Action-point Nathan: Move the new docs section regarding error reporting to the driver specifications section.

The error-embedding=true/false logic

Nathan had already mentioned before the meeting that he'd like to discuss simplifying the code of the ppxlib driver by getting rid of the error-embedding=true/false logic.

The opinion on that is mixed. Everyone agrees that that would considerably simplify the code, so most maintainers are in favor of it. However, Sonja isn't convinced that the code simplification would pay off for the performance cost on builds with PPXs: Getting rid of the distinction would mean always embedding errors (needed for merlin). However, for compilation, always embedding errors would add unnecessary performance overhead without adding any extra value. Concretely, whenever there's an error, instead of raising it, the driver would finish all its work and return an AST with embedded error(s). We've clarified that that would even be the case for parse errors!

In large projects (at Janestreet for example, but also others) where build performance matters, this kind of behaviour can have a significant impact and having the driver exit early can save build time.

We've also discussed whether embedding parse errors on embed-errors=true is the right choice for the driver and have come to the conclusion that it is: When a caller, such as the compiler with -pp option, asks the driver to embed errors, it expects the driver to embed all errors. If we changed that, we'd break the current behavior of embedding the PPX driver in the compilation.

Finally, Nathan made a very good point pointing out that encouraging ppx authors to embed errors does not help with build times there as it "prevents" the driver from exiting early in case of errors. To have the best of both worlds we could add an API entry that builds an error extension point when embed-errors is true and raises a located exception when it's false. Ppx-es following the guideline would then both have good error reporting for merlin and help limit driver's running time in erroneous builds. This is an improvement over the long run though as it would require the ecosystem to upgrade to benefit from it.

In summary, both embedding errors and exiting early have their advantages and should be supported by the driver so no changes there. We might consider refactoring things a bit to make sure things are consistent but nothing pressing.

No action points for now.

ppx_deriving_inline vs dune repo approach

Nathan is working on allowing dune to use ppx without using ppx_deriving_inline. For details on how this all works here are pointers to the issue and PR.

As a quick summary, the dune repo will contain preprocessed versions of the source files that use ppx. The dune bootstrapper knows to use those versions in place of the original source file if they are available. Building dune in development requires having ppxlib and the set of ppx installed as for any other projects. The preprocessed versions are kept up to date by dune's developpers and CI.

Dune is a very specific use case because it cannot depend on ppxlib. This approach is not meant to be made accessible to other users. The reason why Sonja wants to discuss this is that she's already heard twice the opinion (not from ppxlib maintainers) that it would be nice to make it accessible to others (once implemented) and she's quite skeptical.

Convenience

ppx_deriving_inline would be more convenient if it had specific tooling, such as better editor support (to fold the generated code) both in the editor and on GitHub, as well as specific diffs and greps which can ignore the generated code. The dune maintainers find ppx_deriving_inline without those specific tools too inconvenient.

The dune repo approach doesn't need any of that.

Scope

ppx_deriving_inline only works for derivers and the approach of inlining generated code can only work for derivers. On one hand that's limiting. However, on the other hand, it's a safeguard: It's technically not possible to use ppx_deriving_inline on expansion nodes with e.g. conditional compilation. With the dune repo approach, those things would be possible and the responsibility is with the maintainers not to mess things up.

Reliability and synchronization

Both approaches pretty print the AST. The compiler doesn't give any guarantee for the AST pretty printer to be reliable. There's no guarantee that it won't change the semantics of the program. It's quite unlikely that that happens, but theoretically it could.

At least, with ppx_deriving_inline, the project's structure is the same during development and in releases. So the pretty printed code is already tested during development and in CI. However, the dune repo approach establishes a breach between the project's PPX structure during development and for releases. So if that approach was adapted on other repos (without bootstrapper), problems would only show up in releases, not during development/CI.

Nathan explained how the dune bootstrapper works and why the structure breach is not a big problem in the particular use case of dune: The bootstrapper uses the same PPX structure as the releases. Also, the bootstrapper is built in CI and frequently in development.

In any case, no action points, just discussions :)

Release

We've discussed a few points about the release. Nathan has asked whether to build a PPX universe for that release. We think there's no need for that: There's only one small change that could theoretically be breaking (renaming Caml to Stdlib) and on the old universe nobody breaks. Nathan will look closely at the opam-repo CI.

Sonja has also mentioned that Platform tool releases nowadays come with an entry on the OCaml changelog.

Action-point Nathan: Cut the 0.32.0 release

Action-point Nathan: Write an OCaml changelog entry for the release

Clone this wiki locally