Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Implement a lint for .clone().into_iter() #13595

Open
wants to merge 1 commit into
base: master
Choose a base branch
from

Conversation

GnomedDev
Copy link
Contributor

@GnomedDev GnomedDev commented Oct 23, 2024

Closes #3302

changelog: [unnecessary_collection_clone] Add a lint for .clone().into_iter().

@rustbot
Copy link
Collaborator

rustbot commented Oct 23, 2024

r? @xFrednet

rustbot has assigned @xFrednet.
They will have a look at your PR within the next two weeks and either review your PR or reassign to another reviewer.

Use r? to explicitly pick a reviewer

@rustbot rustbot added the S-waiting-on-review Status: Awaiting review from the assignee but also interested parties label Oct 23, 2024
Copy link
Contributor

@samueltardieu samueltardieu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This will lint this code and the suggestion will fail:

fn main() {
    let mut v = vec![1, 2, 1, 10];
    for (i, e) in v.clone().into_iter().enumerate() {
        for j in ((i+1)..v.len()).rev() {
            if v[j] == e {
                v.remove(j);
            }
        }
    }
    assert_eq!(v, vec![1, 2, 10]);
}

Maybe restricting the lint to immutable clone() receivers would be safer.

(yes, this code is ugly, but that is not the point)

@samueltardieu
Copy link
Contributor

In fact, it will probably be necessary to check whether any immutable component exist in the receiver ref chain if any because the lifetime of the receiver will be extended for the lifetime of the loop. This one will also cause a wrong suggestion for the time being, I'll let it here so that you can check that it is handled as well:

struct S {
    a: Vec<i32>,
}

fn main() {
    let mut s = S { a: vec![1, 2, 3] };
    (&s.a).clone().into_iter().for_each(|x| s.a[0] += x);
    assert_eq!(s.a[0], 7);
}

@GnomedDev
Copy link
Contributor Author

Sorted out those issues, I think at least

@samueltardieu
Copy link
Contributor

Sorted out those issues, I think at least

I think the example in #13595 (comment) still gives an incorrect suggestion:

warning: using clone on collection to own iterated items
 --> src/main.rs:7:5
  |
7 |     (&s.a).clone().into_iter().for_each(|x| s.a[0] += x);
  |     ^^^^^^^^^^^^^^^^^^^^^^^^^^ help: replace with: `(&s.a).iter().cloned()`
  |
  = help: for further information visit https://rust-lang.github.io/rust-clippy/master/index.html#unnecessary_collection_clone
  = note: `#[warn(clippy::unnecessary_collection_clone)]` on by default

Applying the suggestion doesn't compile:

error[E0502]: cannot borrow `s.a` as mutable because it is also borrowed as immutable
 --> src/main.rs:7:37
  |
7 |     (&s.a).iter().cloned().for_each(|x| s.a[0] += x);
  |     ------                 -------- ^^^ --- second borrow occurs due to use of `s.a` in closure
  |     |                      |        |
  |     |                      |        mutable borrow occurs here
  |     |                      immutable borrow later used by call
  |     immutable borrow occurs here

@GnomedDev
Copy link
Contributor Author

Okay, I'm pretty sure that issue is now fixed and I have added more tests for partial borrow situations.

@bors
Copy link
Contributor

bors commented Oct 29, 2024

☔ The latest upstream changes (presumably #13437) made this pull request unmergeable. Please resolve the merge conflicts.

@GnomedDev GnomedDev force-pushed the unnecessary-collection-clone branch 2 times, most recently from 5180ebf to b92c43d Compare October 29, 2024 12:15
@bors
Copy link
Contributor

bors commented Oct 29, 2024

☔ The latest upstream changes (presumably #13034) made this pull request unmergeable. Please resolve the merge conflicts.

Copy link
Member

@xFrednet xFrednet left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Generally LGTM, some high-level comments, and questions and that's it :D

use super::{UNNECESSARY_COLLECTION_CLONE, method_call};

// FIXME: This does not check if the iter method is actually compatible with the replacement, but
// you have to be actively evil to have an `IntoIterator` impl that returns one type and an `iter`
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I really like this comment xD

I wonder if there is actually a reasonable use case for this, but let's not open that can of worms ^^

Comment on lines +4264 to +4265
/// Cloning a collection requires allocating space for all elements and cloning each element into this new space,
/// whereas using `Iterator::cloned` does not allocate any more space and only requires cloning each element as they are consumed.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder if this suggestion is always better. Have you benchmarked that individual clones are actually faster, or at least equivalent?

With a linked list (I know everyone's favorite structure) it might be legit better to clone it, thereby allocating all nodes close together and loading everything into cache.

This is not an argument against the lint, I still think it's a good idea, just a question about the category

Copy link
Contributor Author

@GnomedDev GnomedDev Nov 4, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I haven't benchmarked this, and I may be falling into the "it's somewhat obvious" trap. It's possible we could maintain a list of collections it would be quicker to do this, or even have a #[clippy::iter_cloned_instead_of_clone)] to mark collections that would benefit from this lint. But it will be quite the low FP rate for collections that are faster to clone than iterate and clone.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm guessing this highly depends on the iterator usage. For a vector, it makes a difference of allocating a big chunk of memory once against several small allocations. For vectors it might actually be the other way around (meaning .clone().into_iter() being faster), but this is just guesstimating, since I haven't benchmarked this.

One thing that's safe to say, is that cloning each item individually will reduce the memory footprint, but only if the items are dropped and not stored somewhere afterward.

Could you try to benchmark this?

I've just checked the issue, and that one was suggestion to lint on cloned().into_iter().filter(). Linting on that should be safe again

// and the type has an `iter` method
&& has_iter_method(cx, typeck_results.expr_ty(collection_expr))
{
let mut applicability = Applicability::MachineApplicable;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is this lint trigger from Lintcheck:

Added clippy::unnecessary_collection_clone at reqwest-0.12.5/src/dns/resolve.rs:100

warning: using clone on collection to own iterated items
   --> target/lintcheck/sources/reqwest-0.12.5/src/dns/resolve.rs:100:45
    |
100 |                 let addrs: Addrs = Box::new(dest.clone().into_iter());
    |                                             ^^^^^^^^^^^^^^^^^^^^^^^^ help: replace with: `dest.iter().cloned()`
    |
    = note: `--force-warn clippy::unnecessary-collection-clone` implied by `--force-warn clippy::all`

These cases might not be MachineApplicable as they change the type of the expression. To fully avoid FPs, we probably would need to track how the value is finally used, which would be hard and also prone to errors. I think this is an argument for changing the Applicability to MaybeIncorrect and having it in pedantic instead.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, that's a shame, I hate relegating useful lints to pedantic where they won't be hit by the beginners this lint is targeting.

Would it be good if I swap the suggestion to MaybeIncorrect but keep in the perf category? Since we now have lint reasons, it would be good for a codebase to document "we don't do Iter::cloned due to XYZ" with an allow/expect.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The applicability is only given to tools like rustfix and not directly visible to the user. So changing the applicability, shouldn't have an effect on the lint category.

The category usually depends on the FP rate and the trigger rate when we introduce the lint. If a lint is reasonable, but has too many lint triggers, it will sadly still often land in pedantic. We can leave the lint group discussion open for the RFC. But the applicability should still be changed.

@bors
Copy link
Contributor

bors commented Nov 13, 2024

☔ The latest upstream changes (presumably b829d53) made this pull request unmergeable. Please resolve the merge conflicts.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
S-waiting-on-review Status: Awaiting review from the assignee but also interested parties
Projects
None yet
Development

Successfully merging this pull request may close these issues.

lint vec.clone().into_iter().stuff().collect() to suggest vec.iter().stuff().cloned().collect()
5 participants