Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Outreachy Summer 2023: Improving error reporting in existing ppxlib-based PPXs #389

Closed
panglesd opened this issue Feb 22, 2023 · 10 comments
Closed

Comments

@panglesd
Copy link
Collaborator

panglesd commented Feb 22, 2023

Outreachy Summer 2023: Improving error reporting in existing ppxlib-based PPXs

This issue is about a general problem in the PPX ecosystem in error reporting. The action points are listed at the end, with links to the relevant issues.

The issue is targeted at Outreachy applicants. Welcome, we are glad to have you!

First, I link to many documentations specially related to the issue, that should help a lot to understand what is needed to solve. Then, I present the general issue. Finally, I include a section on how to contribute.

Ressources to understand the issue

The first step is to have a working setup for OCaml. The Get Up and Running with OCaml guide will help you with that.

The general metaprogramming guide on OCaml.org is a must-read if you are not comfortable with what a PPX is, what is an AST (Abstract Syntax Tree), etc.

The ppxlib manual presents information that are more specific to ppxlib. Some sections are particularly interesting to understand this issue: the general explanation of the different phases of rewriting, as well as the guide on handling errors, especially the migration part!
Other parts of the ppxlib manual, such as code generation and destruction, will be useful when contributing to PPXs, to understand the code, but not directly to understand the issue.

Where to ask

The documentation might not always be that straight-forward. That's even more of a reason that you should never hesitate to ask! If you have a question concerning one concrete issue, you can just ask on the issue. For general questions, you can ask on discord, which you can access via this invitation link. Once you've joined discord, if you have a question about PPX that's not related to some issue, you can ask on the PPX channel. If you have a general question about something else, you can ask on another discord channel, for example on the Outreachy channel.
You can also contact me privately as a mentor, using the information found in the outreachy project.

Error reporting with ppxlib, current version

ppxlib supports several ways of reporting errors:

  1. Raising a located error
  2. Embedding the error in the AST, in the form of an error node.

The first one, raising, is for critical failures of the rewriter that should stop the rewriting process. In such case, ppxlib will output the last valid AST, adding the caught error, and stop the rewriting process. This is necessary in very rare conditions, and considerably degrades the user experience.

In contrast, embedding errors in the AST is transparent for ppxlib. The rewriting process will continue in the subsequent stages. Not only the error reporting can be finer-grained (there can be more than one error reported) but part that worked can still be outputted. Finally, it does not prevent other rewriters that would be run in the same stage to be run.

So, from a user point of view, it is really better when errors are embedded. As the rewriting process is not stopped, not only it potentially reports more errors (those from other PPXs) but also less false positives (not running a PPX that was expected to run will certainly generate errors, reported by Merlin).

Problems with ppxlib's current behaviour in case of error raised

Most PPXs were written using the raising error reporting. They were written at a point where the semantics of raising was not clearly stated, and the other option (embedding errors) not given any visibility.

Since, in all currently written PPXs, raising was not meant to interrupt the rewriting process, we need to adapt ppxlib's behaviour to this reality.

The new behaviour, in case of a caught exception, should be that the exception is turned into an error node by ppxlib, embedded into the most meaningful AST that ppxlib can use. So, for instance, if a whole file transformation is raising, the error can be embedded at the beginning of the last valid AST. For a context-free rewriting, the error can be embedded locally in the AST.

Problems with PPXs raising instead of embedding errors

Even if we change ppxlib's behaviour regarding exceptions, there is still values in turning the raising exceptions into embedding errors:

  • Multiple errors can be reported
  • Parts of the AST that can be written won't be discarded

For instance, if a deriver can only derive from record types, and output an error on non-record types, the following code

type invalid1 = int
and invalid2 = int
and valid = { a : int } [@@deriving only_records]

can only report one error, and won't be able to generate the part of the AST corresponding to valid.

We expect newly written PPX to adopt the "embedding error" style of reporting errors, since many move have been made in ppxlib to promote it:

However, many existing PPX would benefit from switching to this way of reporting errors.

What to do

The plan has two parts:

  • Modify ppxlib itself to adopt the new behaviour in case of caught exceptions. This is one of the main task of this project. It also includes documenting the change!
  • Contribute directly to the error reporting of some important PPXs of the ecosystem. Not only that would improve their UX, but also give good examples for other PPXs, ensuring good practices.

This issue centralizes a list of PPXs where we want to turn the raised error into error extension nodes that are embedded in the AST. Their status will be updated as PR are merged!

Contributing

This Outreachy project includes both contributing to the ppxlib repository as well as contributing to PPXs.

A decent first issue in the ppxlib repository is #391. All other issues for the application period live in other repositories!

Most PPX that use the raise_errorf would benefit to be updated to embed errors, in order to report multiple errors. I include a non-exhaustive list of such PPXs, with links to the relevant issue (I will update the links as I open issues on the repositories). I tried to make a distinction between the PPXs where it is easy to do error embedding instead of raising, from those where it is hard. But, this is just a guess from my side.

Before making a contribution, present yourself in the linked issue, and make sure the repository owner is active and willing to accept contributions!

"Easier" to migrate PPXs:

"Harder" to migrate PPXs:

@marrious11
Copy link
Contributor

Hello to everyone 👋🏻.

I'm curious if there are any Outreachy Summer Interns working on the Improving error reporting in existing ppxlib-based PPXs issues as well. 💭 💭 💭

I'd appreciate it if we worked as a team. 💚
Please do not hesitate to contact me via email. 💌
Perhaps you can find more information to reach out to me here 😊

@panglesd
Copy link
Collaborator Author

I'll add those two issues, which are not really related to the project, but are good first issues to ppxlib: #404 and #405.

If you are interested working on one of them, comment in the linked issue and I can give a little it more information.

@Annosha
Copy link

Annosha commented Mar 23, 2023

@panglesd I would like to work on issue #405 please. TY!

@panglesd
Copy link
Collaborator Author

@Annosha you can of course (but aren't you already working on janestreet/ppx_expect#45?)

@Annosha
Copy link

Annosha commented Mar 23, 2023

@panglesd Yes, I'm working on it too. Was planning to work on both in parallel. Anyway, when I'm done working on the first issue I will come to this one.

@pitag-ha
Copy link
Member

pitag-ha commented Apr 5, 2023

Hi @marrious11, let me follow up on our conversation about growing/learning via open-source contributions on this issue dedicated to Outreachy-related conversations.

The skills you're mentioning sound great and are indeed very important to learn. About whether and how to learn them by contributing to OCaml community projects: First of all, my experience is that the OCaml community is generally very nice and welcoming to newcomers, so it's a good community to grow in. Whether the OCaml community concretely or open-source communities, in general, are a good fit for you to learn those kind of skills depends on your level of independence. What I mean is the following: For the Outreachy periods (first contribution period and then internship period), we have dedicated mentors who help and guide the mentees (i.e. first the applicants and later the interns) through their contributions. Outside the Outreachy context, the project maintainers also help and guide any contributor, but the help is far less intense and less directly responsive. So if you are a self-directed and independent learner, only needing a little bit of help and guidance from time to time, then learning those skills via open-source contributions sounds like a great idea! However, if you like learning in a structured and guided way, then it's probably better to first look into some courses or similar. Don't hesitate to ask more about this, if you have more questions (you or anyone else).

@marrious11
Copy link
Contributor

marrious11 commented Apr 5, 2023

Hi @pitag-ha 👋 !

Though this actually my ever first encounter with Open Source and the Ocaml Org as a community, thanks to Outreachy summer internship organizers.

Just to note, I have always learned in a structured and guided way, being a freshman in college in this case. One of my very first desire was to be able to have experience in real world implementation of concepts learned in college, which I believe to achieve that here within the community, as I have already started doing so.

However, as an applicant, I still find it worthy to want to continue working with my mentors and maintainers on an Ocaml project.

Nevertheless, I just want to continue to grow my technical skills as earlier mentioned, be it whether I will receive a lesser intense guide, I just want to improve my role with the open-source community by learning to writing more code and other related skills that matter as per the choice of project.

@pitag-ha
Copy link
Member

Hi Outreachy applicants,

I hope you all had a great time contributing to OCaml! Some folks working on improving the OCaml user experience for newcomers have recently started a survey to discover the main pain points for newcomers. If you could take a sec to fill it in, that would be super helpful: https://docs.google.com/forms/d/e/1FAIpQLSe4ZwOEQ_gj0o9vzCRJnKJqiQTZ9gI0qYE4QFsprAoltQUKlg/viewform

@marrious11
Copy link
Contributor

Hi @pitag-ha, yes, we did have an amazing time while contributing to OCaml. Okay, no problem, I'll give the form a full try.

@NathanReb
Copy link
Collaborator

I'm closing this since this outreachy round is over!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

5 participants