Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add SARIF output #579

Open
wants to merge 7 commits into
base: master
Choose a base branch
from
Open

Conversation

mathroule
Copy link

@mathroule mathroule commented Jun 16, 2023

Description

Add SARIF output and fixes #578.

Changes

  • Refactor formatters to support SARIF output (line by line output was not suitable)
  • Create a dedicated formatters module
  • Add rule to syntax lint problem

@mathroule mathroule force-pushed the add-sarif-output branch 2 times, most recently from 99d4277 to 0913f5d Compare June 16, 2023 18:15
@mathroule mathroule marked this pull request as draft June 16, 2023 18:15
@mathroule mathroule marked this pull request as ready for review June 16, 2023 18:18
@mathroule mathroule force-pushed the add-sarif-output branch 2 times, most recently from 0cfcac0 to c297186 Compare June 16, 2023 19:08
@mathroule mathroule force-pushed the add-sarif-output branch 2 times, most recently from c297186 to cc23c14 Compare June 26, 2023 14:41
Signed-off-by: Mathieu Rul <[email protected]>
Signed-off-by: Mathieu Rul <[email protected]>
Signed-off-by: Mathieu Rul <[email protected]>
Signed-off-by: Mathieu Rul <[email protected]>
Signed-off-by: Mathieu Rul <[email protected]>
@mathroule
Copy link
Author

@adrienverge do you think SARIF support can be added to YAML lint?

I saw your comment #441 (comment), indeed SARIF can be seen as third-party, but it's intended to be a standard compared to JUnit or other formats.

@adrienverge
Copy link
Owner

Hello Mathieu, please excuse my late reply.

You understood it right in #441 (and other discussions): new output formats are not planned for addition. There would be too many of them. I'm sorry.

@mathroule
Copy link
Author

That's really unfortunate and I can understand your point for not adding a new format. But I would like to insist on the fact that SARIF is standard, from the doc it says:

The Static Analysis Results Interchange Format (SARIF) has been approved as an OASIS standard. The information and tools on this web site apply to SARIF Version 2.1.0, the version approved by the OASIS.

Since SARIF is a standard, it is supported by GitHub and SonarQube out of the box.

@adrienverge
Copy link
Owner

Hello Mathieu, thanks for insisting. About the pull request itself, I have a few comments.

Is the first commit (Refactor problems output) really needed? I'm afraid that it postpones the printing of problems to yamllint's output, which is not optimal: it delays results that could be printed on the fly; also, in case of a crash on the last YAML file, problems of previous files won't be shown. Another approach could be to use instances of formatter classes (see my proposal below), and each problem found could be immediately printed with print(sarifFormatter.format(file, problems)). For formatters (like SARIF) that need to print {"$schema": ... and ...]} before/after yamllint's results, it could be done with print(sarifFormatter.header()) and print(sarifFormatter.footer()).

About the second commit (Move formatters to dedicated module): given the small size of each formatter code, I'd rather avoid creating a new module yamllint.formatters. What about just a new file formatters.py, possibly with classes StandardFormatter, GitHubFormatter etc. that implement format(file, problems)?
With function names please keep ..._problems() (not ..._results()), for consistency. Otherwise there's an unclear mix of _results and _problem.
There's not much sense that format_results() returns max_level: format_results() is just about printing text. The error level logic would be better kept outside formatters.

In the third commit (Add support for SARIF format), please set the copyright of sarif.py to Mathieu Rul 😉
It looks like the output data has many extra informations. Are there needed? Something minimalist would be better in my opinion. Can you keep only what's really required? For example, help and helpUri if they are not mandatory, because the documentation URI is subject to change; same for properties.queryUri which doesn't look intended for this; same for most empty fields like properties.tags; same for ruleIndex; etc.
Why changing rule names in the output, e.g. document-start to DocumentStart? These are not the real rule IDs, and will probably create confusion for users.

Commit Fix null rule for syntax error: it looks like a breaking change for the colored and github formatters, and for Python bindings that would use LintProblem object. Moreover this change doesn't look related to SARIF. Can you remove it?

In the tests, I suggest removing test_run_format_sarif_warning(). It doesn't add much value (the test_..._warning() variant is useful for the colored format, but not for others).

Finally please review contribution guidelines to make sure eveything's OK. In particular the SARIF formatter commit should explain what is SARIF and why it's a nice addition. The 2 last commits are just fixups and should be squashed in their relative commit. Tests should go in the same commit as the SARIF feature itself. I don't see any update to the documentation, but it's fine to me (github format doesn't have neither).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Add SARIF support
2 participants