Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Remove the need to use register_tool, and add it automatically. #283

Merged

Conversation

Veetaha
Copy link
Contributor

@Veetaha Veetaha commented Oct 8, 2023

This is a bit of a hack, because rustc's code wasn't designed for such a use case, so the code looks a bit ugly, but it works.

The solution is to add a query override for registered_tools that adds marker automatically.

I updated the book, but I figured out that the book needs to be updated only after the release. I wonder if we should make the book deployment on release workflow and not on each push to master?

@Veetaha Veetaha requested a review from xFrednet October 8, 2023 17:44
Comment on lines 79 to 81
// Make it possible to use `#[allow(marker::{lint_name})]` without
// having to add `#![feature(register_tool)]` and `#![register_tool(marker)]`.
config.override_queries = Some(|_sess, providers, _extern_providers| {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I didn't even know that it was possible to override quarries o.O.

In red pen, I found another hack, that adds the attributes via console arguments. Have you considered that as well?

https://github.com/estebank/redpen/blob/b4008d2092ff4fdbb3f735d1892a88a63c956d97/commands/wrapper.rs#L39-L40

I'm not sure if that's more or less sustainable than this option.

The best would be to have an official interface in rustc for this.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, that may be a good idea. I wonder if it could be configured at workspace level, if that's a cli option it could be part of rustflags, I'll think what it may give us

Copy link
Contributor Author

@Veetaha Veetaha Oct 10, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I thought that we could have a thin rustc wrapper script e.g. cargo marker rustc that users could put into RUSTC_WRAPPER to avoid writing cfg_attr and any other ugly boilerplate. That wrapper would add the -Zcrate-attr=feature(register_tool), and -Zcrate-attr=register_tool(marker) automatically to any rustc invocation. But.. there is a problem with this approach:

-Zcrate-attr is an unstable option, so users can't have it on stable toolchain. We could potentially add RUSTC_BOOTSTRAP=1 env variable to workaround that, but that env variable affects the entire rustc CLI invocation, meaning users can now use any unstable options, because our wrapper script adds that env variable for them. That's quite ugly..

So for having this inside of the marker rustc driver only and requiring cfg_attr is the best option for now.

@xFrednet
Copy link
Member

xFrednet commented Oct 9, 2023

I wonder if we should make the book deployment on release workflow and not on each push to master?

That might be good and would avoid confusion in the future. 👍

@Veetaha
Copy link
Contributor Author

Veetaha commented Oct 10, 2023

I changed the approach to use the CLI options. I actually supposed there could be an option like this, but I failed to find it in rustc docs and help messages. The problem is that I didn't run rustc -Z help to see the help message for the unstable options.
But luckily, you pointed out that it exists, and it looks much better than a crutch query override with a thread-local.

@Veetaha Veetaha requested a review from xFrednet October 10, 2023 17:49
Copy link
Member

@xFrednet xFrednet left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The implementation looks good to me, but parts of the documentation are confusing to me 😅

docs/book/src/usage/setting-lint-levels.md Outdated Show resolved Hide resolved
docs/book/src/usage/setting-lint-levels.md Outdated Show resolved Hide resolved
@xFrednet
Copy link
Member

I wonder if we should make the book deployment on release workflow and not on each push to master?

I've crated #285 to only deploy the book on tags or when it was manually triggered :)

@Veetaha
Copy link
Contributor Author

Veetaha commented Oct 11, 2023

I've been trying to preserve the previous sentences from the documentation, and as you consequently noticed it made it repetitive and confusing.

Instead, I rewrote the document and also included the requirement of adding the crate name into the lint name which is going to be part of #288


Additionaly, I didn't mention --cfg=marker="lint_crate" in the docs. I can't find a real use case where one would want to use it. The example that was already in the docs there is already covered by simple cfg_attr(marker). I'm thinking we should actually remove this --cfg to simplify the API and the docs as well.

WDYT @xFrednet. How do you feel about removing --cfg=marker="lint_crate"?

@xFrednet
Copy link
Member

WDYT @xFrednet. How do you feel about removing --cfg=marker="lint_crate"?

I feel rather strongly about keeping it. This cfg is targeted towards lint crates, which can be time intensive, like architecture checks. In those cases, users would probably not want to run them on their local machine before committing, but let the CI handle it. With this feature, it's possible to only apply the attribute if the lint crate is actually loaded.

I can also imagine that there could be lints for specific circumstances, like releases. Clippy has a few settings/lints which are allow-by-default, but can be worth checking before a release. This could also be a thing in Marker, with the caveat, that these lints/lint crates could be in a separate lint crate to safe performance.

If we don't provide a way to check for lint crate, it would mean that users can't set those lint levels at all or that they have to manually add a feature to check modify the lint levels again.

@Veetaha
Copy link
Contributor Author

Veetaha commented Oct 12, 2023

I think heavy lints will be "allow" by default, and then on CI their level will be overridden with RUSTFLAGS="--warn {lint_name}".
Could you show a specific example which one couldn't solve nicely without having the --cfg=marker="{lint_crate}"?

To my mind people will have a single cargo marker CI job with a bunch of heavy lints enabled via RUSTFLAGS. Locally they will be ignored because they are "allow" by default. So there will never be a case of "unknown lint" warning.

@xFrednet
Copy link
Member

The lint level doesn't prevent the lint logic from being run. Rustc still calls every lint pass, regardless of if the lint is currently allowed or not. There is some discussion going on, if this can be optimized, but there are several edge cases which make this difficult. For example, a lint could be emitted at a different node than the one that is currently being checked. Marker has it especially difficult, since we currently force all lints to be implemented in the same lint pass.

Users can of course, always include a check, if the lint is currently allowed, but that adds more burden on them. Then there are also lint expectations, using the #[expect] attribute, which require the lint emission, even if it's not shown to the user.

I assume that architecture lints also first need to collect a lot of information, to then do one final analysis at the end. Meaning that not running any logic in them, if they are allowed, might be unexpected behavior.

All this is to say, that having a lint be allowed, still makes it run and doesn't save the CPU/Memory usage.


I once worked at a company, where we had some architecture lints using ArchUnit for Java. I would like to have something similar built on top of Marker later down the line. But such checks require building an entire call graph, dependency graph... Maybe even across crate boundaries.

This is a case, where it's better to only run the lints in the CI. The lints have a low hit rate, since they're quite specific or only focus on a subset of structs. Running them on the user dev machines, will result in nothing in 99% of the cases. Running them on CI if any .rs files have been changed will still have a low hit rate, but won't slow down the end users.

So this would be an example, where I would only load the lint crate in the CI. Does this example make sense? I'm happy to give some more details or provide other examples :)


We might be able to add an option to Marker, to only load the lint names, from a lint crate without actually running them. But IDK, if that's the best solution and even then, if that would cover every use case.

@Veetaha
Copy link
Contributor Author

Veetaha commented Oct 12, 2023

I added the docs --cfg=marker="lint_crate". Although, while writing them and thinking about this I found a potential problem with the --lints CLI parameter in light of #251. If we allow specifying lints in the CLI then how do these lints get into the Marker.lock? Maybe there won't be a problem here if we use the lock file of the lint itself and not have a Marker.lock. That is a good argument against Marker.lock

@Veetaha Veetaha force-pushed the feat/register-marker-tool-automatically branch from d13dae2 to 1a967f9 Compare October 12, 2023 20:45
@Veetaha
Copy link
Contributor Author

Veetaha commented Oct 12, 2023

I rebased after #288. You may check the changes after the commit Add docs about --cfg=marker="lint_crate"

@xFrednet
Copy link
Member

xFrednet commented Oct 12, 2023

With the current behavior of --lints, which only loads the lint crate specified via console args, it can cause the unknown_lints lint to trigger from rustc. I wish there was a way to easily intercept that lint from our side, but it doesn't seem that way, from the rustc code I checked. If Marker gains more traction, we should be able to add some special handling to rustc for it. A hacky solution we could do, is just allow the lint, when running marker, and have our own unknown_lints version which is lint crate aware. Then we could remove the --cfg=marker="{lint_crate}" again 🤔 This might fail if users have a #[warn(unknown_lints)] attribute in their code, but I've never seen that, so it should be fine.

I actually like this idea a lot. One question would be, where we store buildin lints (Maybe a lint crate or marker_adapter?) but that should not be a blocker.

Do you want to remove the added section in favor of this solution, or wait for the implementation instead? Note that such a lint implementation is currently blocked, since attributes are not supported yet.


For Marker.lock, you're right that this could be a problem. I have two ideas, which both feel hacky, but they might be a good inspiration:

  1. Compile each lint crate individually. And combine them all somehow, for example in a directory or by concatenating the Cargo.lock files in a Marker.lock file. This would support command line lints as well. However, it would require us to manually separate the lock file part for every compilation and will make it difficult to run simple cargo comments on the dummy fetch crate.
  2. Store the Marker.lock file for command line lints separately, maybe under a global ~/.marker folder?

Copy link
Member

@xFrednet xFrednet left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, I think we should have the last paragraph about the lint logic being run regardless of the lint level in this documentation, even if we decide to use a custom unknown_lints lint for Marker instead :)

@xFrednet
Copy link
Member

I've moved our Marker.lock comments to #251 to have them all in one place :)

@Veetaha
Copy link
Contributor Author

Veetaha commented Oct 12, 2023

Do you want to remove the added section in favor of this solution, or wait for the implementation instead? Note that such a lint implementation is currently blocked, since attributes are not supported yet.

Yeah the current --cfg=marker="{lint_crate}" looks a bit ugly, but at least not magical. I would like to keep it in the docs at least because I just want to get this PR merged 😅. We can decide and change how it works separately. So I'm going to merge it.


As for the problem of triggering unknown_lints. Overriding the default unknown_lints lint and providing a custom impl seems like it would be too much of a hack. Besides... what will this custom impl do? Will it never report an unknown lint warning? It is still quite a nice feature to be warned if you made a typo in the lint name.

Maybe a lesser hack could be having a specialized feature for lint "profiles". This is similar to cargo build profiles or cargo-nextest profiles.

Users could define several profiles that would extend the default list of lints. They will then be able to select additional lints configured in Cargo.toml metadata using a --profile {profile_name} argument. This solution will then deprecate the --lints CLI parameter to avoid problems with Marker.lock as well.

Having everything specified in a config file allows Marker to know which lint crates the project may ever expect to be run with. So when marker is run with the default profile that doesn't include heavy lints Marker will still know that there are other lint crates that could participate in linting, and it could provide noop lint implementations for lints from those crates.

One problem with this is that Marker would need to compile all lint crates specified in the config including lint crates from other profiles to just list the lints that they have. Maybe if lint crates also had some metadata that could list their lints without compiling them this could solve this problem.

@Veetaha Veetaha added this pull request to the merge queue Oct 12, 2023
Merged via the queue into rust-marker:master with commit bb3b874 Oct 12, 2023
23 checks passed
@Veetaha Veetaha deleted the feat/register-marker-tool-automatically branch October 12, 2023 22:03
@xFrednet
Copy link
Member

xFrednet commented Oct 12, 2023

As for the problem of triggering unknown_lints. Overriding the default unknown_lints lint and providing a custom impl seems like it would be too much of a hack. Besides... what will this custom impl do? Will it never report an unknown lint warning? It is still quite a nice feature to be warned if you made a typo in the lint name.

The lint would check lint level attributes. With #288 we always know where a lint comes from just by looking at the identifier. The lint would basically do the following:

  1. Ignore any lint, which doesn't have the marker:: prefix. Rustc can handle that during normal compilation.
  2. If it encounters a lint with the marker:: prefix, it checks if the specified lint crate is loaded.
    • If it's not loaded everything is fine, and we just accept the name, without any comment, since it might be in the lint crate
    • If the lint crate is loaded, we check if the name is included in the list of registered lints, we got from the lint crate during startup. If it's included, awesome. Otherwise, we emit a warning with the marker::unknown_lint lint.

Having lint profiles sounds like a good and flexible idea 👍

I copied that part to the Marker.lock issue as well :)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants