new lint to detect inefficient `iter().any()` #13817

lapla-cogito · 2024-12-12T01:13:43Z

Using contains() on numeric slices is more efficient than using iter().any().

changelog: [slice_iter_any]: new lint

rustbot · 2024-12-12T01:13:49Z

rustbot has assigned @Manishearth.
They will have a look at your PR within the next two weeks and either review your PR or reassign to another reviewer.

Use r? to explicitly pick a reviewer

clippy_lints/src/contains_for_slice.rs

lapla-cogito · 2024-12-12T02:35:36Z

On second thought, I think it would be more appropriate to place the lint under the methods directory. Therefore, I am sorry, but ~~I'm going to convert this PR to a draft and modify it.~~ now fixed

clippy_lints/src/methods/contains_for_slice.rs

CHANGELOG.md

samueltardieu · 2024-12-13T17:14:50Z

clippy_lints/src/slice_iter_any.rs

+    /// Checks for usage of `iter().any()` on slices of `u8` or `i8` and suggests using `contains()` instead.
+    ///
+    /// ### Why is this bad?
+    /// `iter().any()` on slices of `u8` or `i8` is optimized to use `memchr`.


Don't you mean .contains()? Anyway, I think this is too specific, on my machine it seems to use compiler intrinsics and make no calls to memchr.

You're right about performance, though: I benchmarked both .iter().any(==) and .contains(), and the latter runs more than 10 times faster on large areas.
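
For reference, a minimal sketch of the two forms being compared. The function names `has_byte_any` and `has_byte_contains` are made up for illustration and are not part of the PR or of the benchmark mentioned above. The sketch only shows that both forms return the same result; it says nothing about how either one compiles.

```rust
/// Linear search written with an iterator adaptor and a closure.
/// (Hypothetical helper name, for illustration only.)
fn has_byte_any(haystack: &[u8], needle: u8) -> bool {
    haystack.iter().any(|&b| b == needle)
}

/// The same search expressed with `slice::contains`.
/// (Hypothetical helper name, for illustration only.)
fn has_byte_contains(haystack: &[u8], needle: u8) -> bool {
    haystack.contains(&needle)
}
```

Calling both with the same slice and needle, for example `has_byte_any(&data, 42)` and `has_byte_contains(&data, 42)`, should always give matching answers.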

Oh, I had assumed from the implementation that memchr would be used in any environment (including my local environment). Also, this godbolt does so as far as the assembly is concerned: https://rust.godbolt.org/z/Kxrzr81Me

However, if it may differ depending on the environment, the description should indeed be modified, so I'll make a change. For reference, could you please tell me which environment you checked this in?

I think the memchr itself has been replaced by an intrinsic. I'm using the latest nightly compiler on x86_64.

Thank you! In any case, it looks like this lint description should be modified.

I think the memchr() exists for larger integer types too, yes?

oh, nope, it doesn't. It could be extended, I think.

I found that contains() is now faster for certain types (u16, u32, u64, i16, i32, i64, f32, f64, usize, isize) compared to before (see: rust-lang/rust#130991). Therefore, this performance lint should be extended to cover these types starting from Rust 1.84.0 and I'll make a change for this.

clippy_lints/src/slice_iter_any.rs

samueltardieu · 2024-12-13T17:18:23Z

tests/ui/slice_iter_any.rs

+    let vec: Vec<u32> = vec![1, 2, 3, 4, 5, 6];
+    let values = &vec[..];
+    let _ = values.iter().any(|&v| v == 4);
+    // no error, because it's not a slice of u8/i8


Wouldn't it be more readable to use .contains() here too? I thought that was what @Manishearth was suggesting.

samueltardieu · 2024-12-13T17:19:00Z

tests/ui/slice_iter_any.rs

+
+    let values: [u8; 6] = [3, 14, 15, 92, 6, 5];
+    let _ = values.iter().any(|&v| v == 10);
+    // no error, because it's an array


Wouldn't it be more efficient to lint there, even though this is an array?

The optimization doesn't seem to work for arrays, because of the implementation I mentioned in this.

But isn't it clearer anyway to use .contains() rather than .any(==)?
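
For reference, `contains()` can be called on an array directly, because method lookup coerces the array to a slice. A minimal sketch with hypothetical function names, showing only that the two spellings agree on an array; it makes no claim about their relative speed:

```rust
/// Iterator form, as in the test case under discussion.
/// (Hypothetical helper name, for illustration only.)
fn array_any(values: &[u8; 6], needle: u8) -> bool {
    values.iter().any(|&v| v == needle)
}

/// `contains` is a slice method; the array is coerced to a slice at the call.
/// (Hypothetical helper name, for illustration only.)
fn array_contains(values: &[u8; 6], needle: u8) -> bool {
    values.contains(&needle)
}
```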

If you are suggesting that clippy should modify this code style, you may be right. I'll try to make changes.

However, after thinking about it, what this lint should do is suggest performance improvements for u8 and i8 slices, so it seems appropriate to make the style suggestion a separate lint. What do you think? If this is a good idea, I'll implement it as a new lint in another PR.

Wouldn't a lint which is more efficient for some types (u8/i8) and not less efficient for some others deserve to be in the performance category? And by the way, I see the same 10x+ performance boost in u32 as well.

It looks like a second lint would also cover this one, I'm not sure two lints are needed. I'll let others weigh in.

I see the same 10x+ performance boost in u32 as well

I hadn't checked that. Thank you very much. If so, it seems more reasonable to combine them into a single lint.

In case you don't have one, here is the one I used, for testing func1 and func2, the two versions I wanted to compare.

lapla-cogito · 2024-12-14T02:25:06Z

clippy_lints/src/methods/iter_any.rs

+            ty::Ref(_, inner_type, _) if inner_type.is_slice() => {
+                // check if the receiver is a u8/i8 slice
+                if let ty::Slice(slice_type) = inner_type.kind()
+                    && (slice_type.to_string() == "u8" || slice_type.to_string() == "i8")


In my environment, I could only see the speedup for u8 and i8 slices (about 5x to 7x), while @samueltardieu says he has been able to confirm it with other slice types, so the changes in this PR cover only u8 and i8.
I think the speedups for u8 and i8 slices, at least, are correct for reasons that come from the Rust implementation.

clippy_lints/src/methods/iter_any.rs

Manishearth

A thing I'm unsure about is the mixing of two lints like this. I'm going to ask on Zulip if we should be doing two lints here.

https://rust-lang.zulipchat.com/#narrow/channel/257328-clippy/topic/iter.2Eany.28.29.20lint.3A.20one.20or.20two.20lints.3F

Manishearth · 2024-12-19T02:10:04Z

clippy_lints/src/methods/iter_any.rs

+    && let Some((name, recv, _, _, _)) = method_call(recv)
+    && name == "iter"
+    {
+        let ref_type = cx.typeck_results().expr_ty(recv);


issue: Probably should be expr_ty_adjusted to handle autoderef. Add a test that ensures this works on a vector vec.iter().any(...).

Thank you! I changed to use expr_ty_adjusted and added a test for this in e964cbc.
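
A sketch of the shape such a case takes, written as plain Rust rather than as a Clippy UI test. `Vec<u32>` has no `iter` method of its own; the call resolves through `Deref` to the slice method, which is why the adjusted receiver type is the relevant one. The function name is hypothetical, and this is not the test added in e964cbc.

```rust
/// Returns the result of both forms for the same `Vec`.
/// (Hypothetical helper name, for illustration only.)
fn vec_any_and_contains(needle: u32) -> (bool, bool) {
    let vec: Vec<u32> = vec![1, 2, 3, 4, 5, 6];
    // `iter` is a slice method reached through `Deref<Target = [u32]>`.
    let via_any = vec.iter().any(|&v| v == needle);
    // `contains` is reached the same way.
    let via_contains = vec.contains(&needle);
    (via_any, via_contains)
}
```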

lapla-cogito

@Manishearth TBH, I added a style lint (unnecessary_iter_any) during the review process, which I consider unnecessary, at least for this change.
Certainly there will be some cases where replacing iter().any() with contains() will be more readable, but I think it could also lead to false positives. Not only that, I found that adding the unnecessary_iter_any lint required a lot of modifications to Clippy's existing code base.
Therefore, I think it is appropriate to limit this change to adding the slice_iter_any lint for slices of numeric types. What do you think?

edit: What I meant is e964cbc. If the changes that follow this course are acceptable, I'll squash the previous commits as appropriate.

lapla-cogito · 2024-12-22T05:05:15Z

clippy_lints/src/methods/slice_iter_any.rs

+fn can_replace_with_contains(op: Spanned<BinOpKind>, lhs: &Expr<'_>, rhs: &Expr<'_>) -> bool {
+    matches!(
+        (op.node, &lhs.kind, &rhs.kind),
+        (
+            BinOpKind::Eq,
+            ExprKind::Path(_) | ExprKind::Unary(_, _),
+            ExprKind::Lit(_) | ExprKind::Path(_)
+        ) | (
+            BinOpKind::Eq,
+            ExprKind::Lit(_),
+            ExprKind::Path(_) | ExprKind::Unary(_, _)
+        )
+    )
+}


For example, the following code uses == inside closures, but cannot simply be replaced by contains():

let _ = values.iter().any(|&v| v % 2 == 0);

Therefore, this function excludes such cases.
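
To make the distinction concrete, here is a small sketch with hypothetical function names. The first closure compares each element directly against a value and has a `contains()` equivalent, while the last one compares a derived expression and does not.

```rust
/// `any(|&v| v == 4)` compares each element to a fixed value...
fn has_four_any(values: &[u32]) -> bool {
    values.iter().any(|&v| v == 4)
}

/// ...so it can be written with `contains`.
fn has_four_contains(values: &[u32]) -> bool {
    values.contains(&4)
}

/// `v % 2 == 0` tests a property of each element rather than equality
/// with a value, so there is no direct `contains` form.
fn has_even(values: &[u32]) -> bool {
    values.iter().any(|&v| v % 2 == 0)
}
```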

Manishearth · 2024-12-23T20:18:20Z

What do you think?

I think that we can add a single numeric-only lint to the nursery but we should have the clippy team figure out whether we wish to

1. add two lints (one perf, one style)
2. add a single perf lint with a config for how expansive it is
3. add a single perf lint that is expansive
4. add a single perf lint that is numeric-only

before the numeric lint is moved out of the nursery. Currently 2 and 3 are options that would be made harder by adding a numeric-only lint outside of the nursery. Clippy is allowed to expand lints after release but when it does so it can be annoying to people so we try to limit that.

rustbot assigned Manishearth Dec 12, 2024

rustbot added the S-waiting-on-review Status: Awaiting review from the assignee but also interested parties label Dec 12, 2024

lapla-cogito force-pushed the contains_for_u8i8 branch 2 times, most recently from acb27cb to 8eb9d35 Compare December 12, 2024 01:30

lapla-cogito commented Dec 12, 2024

clippy_lints/src/contains_for_slice.rs

lapla-cogito marked this pull request as draft December 12, 2024 02:35

lapla-cogito force-pushed the contains_for_u8i8 branch from 8eb9d35 to 14694b1 Compare December 12, 2024 02:55

lapla-cogito marked this pull request as ready for review December 12, 2024 02:59

new lint to use contains() instead of iter().any() for u8 and i8 slices

c52ebf2

lapla-cogito force-pushed the contains_for_u8i8 branch from 14694b1 to c52ebf2 Compare December 12, 2024 08:49

Manishearth reviewed Dec 12, 2024

clippy_lints/src/methods/contains_for_slice.rs

Manishearth reviewed Dec 12, 2024

CHANGELOG.md

rename contains_for_slice to slice_iter_any

c1d5108

lapla-cogito force-pushed the contains_for_u8i8 branch from 149e701 to c1d5108 Compare December 12, 2024 23:45

samueltardieu reviewed Dec 13, 2024

clippy_lints/src/slice_iter_any.rs

samueltardieu reviewed Dec 13, 2024

lapla-cogito force-pushed the contains_for_u8i8 branch 2 times, most recently from fe5f2c4 to 041f76b Compare December 14, 2024 01:49

correct description for slice_iter_any

b3a1693

lapla-cogito force-pushed the contains_for_u8i8 branch 3 times, most recently from 2b9d742 to b3837dc Compare December 14, 2024 02:10

lapla-cogito changed the title ~~new lint to use contains() instead of iter().any() for u8 and i8 slices~~ new lints to detect inefficient iter().any() Dec 14, 2024

lapla-cogito commented Dec 14, 2024

lapla-cogito commented Dec 14, 2024

clippy_lints/src/methods/iter_any.rs

add lint for checking unnecessary iter().any()

19357ca

lapla-cogito force-pushed the contains_for_u8i8 branch from a956f34 to 19357ca Compare December 14, 2024 02:45

lapla-cogito requested a review from Manishearth December 15, 2024 13:34

Manishearth reviewed Dec 19, 2024

lapla-cogito commented Dec 22, 2024

lapla-cogito force-pushed the contains_for_u8i8 branch from f67d408 to d54475b Compare December 22, 2024 04:38

add slice_iter_any lint

e964cbc

lapla-cogito force-pushed the contains_for_u8i8 branch from d54475b to e964cbc Compare December 22, 2024 04:55

lapla-cogito changed the title ~~new lints to detect inefficient iter().any()~~ new lint to detect inefficient iter().any() Dec 22, 2024

lapla-cogito requested a review from Manishearth December 22, 2024 05:01

lapla-cogito commented Dec 22, 2024
