Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add a simplified machine parsable output for reuse lint #925

Open
nogweii opened this issue Feb 28, 2024 · 9 comments
Open

Add a simplified machine parsable output for reuse lint #925

nogweii opened this issue Feb 28, 2024 · 9 comments
Assignees
Labels
enhancement New feature or request good first issue Good for newcomers

Comments

@nogweii
Copy link

nogweii commented Feb 28, 2024

A middle ground between the full JSON output and the human-focused plain output. My idea here is integrating reuse lint with tools like reviewdog where it excels with output along the lines of:

scripts/foo1.sh: missing license 'MPL-2.0'
scripts/foo2.sh: no licensing information
scripts/foo3.sh: no copyright information

One can generate such an output using the JSON version and processing it, but I think it would be beneficial to include it upstream.

@KlfJoat
Copy link

KlfJoat commented Mar 6, 2024

+1 to this. I'd also suggest allowing us to pick from a few different output formats. And those output formats should be existing output formats.

One of the output formats that's simplified and machine-parseable could be TAP format. Since TAP parsers abound, the direct output of reuse lint could be used in CI/CD pipelines.

Another format, which wouldn't be simplified and human-readable, but is widely used could be JUnit format, which also has a lot of parsers and could be consumed by CI/CD pipelines, including built-in reporting on sites like GitLab.

@carmenbianca carmenbianca added the enhancement New feature or request label Apr 8, 2024
@carmenbianca
Copy link
Member

Thanks for the issue. This seems quite doable to me. Off the top of my head, an implementation would need:

  • A new function akin to format_plain and format_json in lint.py.
  • A new argument that can be passed to reuse lint, à la reuse lint --tap or some such.
  • --json and --tap must be mutually exclusive. argparse already supports something like that.
  • Some tests.

The TAP format looks decent to me as far as machine-readable specs go. A vague proposal:

1..n
ok - ./CHANGELOG.md
ok - ./README.md
not ok - ./pyproject.toml
  ---
  failed:
    - no-copyright
    - no-license
  ...
# global tests
ok - no missing licenses
ok - no unused licenses
ok - no bad licenses
not ok - no licenses without extension
  ---
  failed:
    - ./LICENSES/MIT
  ...
ok - no read errors

The YAML is a bit verbose, but there's no other way to document why the test failed.

Figuring out the value of n in advance may be a little tricky. I suppose it's number of files (computed in advance) + number of global tests (static).

@carmenbianca carmenbianca added the good first issue Good for newcomers label Apr 8, 2024
@carmenbianca
Copy link
Member

I also had a look at reviewdog linked in the first comment. The README is very shouty/busy so I didn't make good progress skimming through it. Would the above work for reviewdog?

@nicorikken
Copy link
Member

nicorikken commented Apr 8, 2024

To elaborate on the reviewdog example, here is suggested input for reviewdog in the video from the README:
image

Details of accepting input formats and an example are described in the Input Format section.

{file}:{line number}:{column number}: {message} is suggested.

The definitions are constructed according to the Vim errorformat.

Another option for different uses might be the SARIF format. This, together with the JUnit.xml were already mentioned in an earlier issue: #320

@nicorikken
Copy link
Member

nicorikken commented Apr 8, 2024

One question is how to deal with global errors like missing licenses.

  • Maybe .: Missing MIT license in LICENSES/ would be an option, so . as the filename.
  • Maybe LICENSES: Missing MIT license in LICENSES/ to refer to the licenses directory.

@nicorikken nicorikken self-assigned this Apr 8, 2024
@KlfJoat
Copy link

KlfJoat commented Apr 8, 2024

IMO, a missing license is a file-specific error. The license referenced by somefile.ext does not exist in LICENSES/. If a line number is needed, it's the line where the license is referenced.

nicorikken added a commit that referenced this issue Apr 10, 2024
Adds an '-line' or '-l' option to the 'lint' command. Prints a line for
each error, starting with the file to which the error belongs.

This output can be a starting point for some parsers, in particular ones
that implement something similar to Vim errorformat parsing.

Related to #925

Additional work needed:

- [ ] Needs tests
- [ ] Error messages might have to be improved
- [ ] Not all errors are found, as some issues aren't in the
  FileReports. This requires additional investigation.

Signed-off-by: Nico Rikken <[email protected]>
@nicorikken

This comment was marked as resolved.

@nicorikken

This comment was marked as resolved.

@nicorikken
Copy link
Member

nicorikken commented Apr 22, 2024

@nogweii @KlfJoat I have a working PR at #956 Can you please take a look to see if it meets your expectations?
I'm trying to run the output via reviewdog, but am not successful. Perhaps I'm doing something wrong.

nicorikken added a commit that referenced this issue Apr 22, 2024
Adds an '-line' or '-l' option to the 'lint' command. Prints a line for
each error, starting with the file to which the error belongs.

This output can be a starting point for some parsers, in particular ones
that implement something similar to Vim errorformat parsing.

Related to #925

Additional work needed:

- [ ] Needs tests
- [ ] Error messages might have to be improved
- [ ] Not all errors are found, as some issues aren't in the
  FileReports. This requires additional investigation.

Signed-off-by: Nico Rikken <[email protected]>
carmenbianca pushed a commit that referenced this issue May 28, 2024
Adds an '-line' or '-l' option to the 'lint' command. Prints a line for
each error, starting with the file to which the error belongs.

This output can be a starting point for some parsers, in particular ones
that implement something similar to Vim errorformat parsing.

Related to #925

Additional work needed:

- [ ] Needs tests
- [ ] Error messages might have to be improved
- [ ] Not all errors are found, as some issues aren't in the
  FileReports. This requires additional investigation.

Signed-off-by: Nico Rikken <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request good first issue Good for newcomers
Projects
None yet
Development

No branches or pull requests

4 participants