Skip to content

copied().min() performance regression on nightly #140207

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
DaniPopes opened this issue Apr 23, 2025 · 6 comments
Open

copied().min() performance regression on nightly #140207

DaniPopes opened this issue Apr 23, 2025 · 6 comments
Labels
C-bug Category: This is a bug. I-slow Issue: Problems and improvements with respect to performance of generated code. P-high High priority regression-from-stable-to-nightly Performance or correctness regression from stable to nightly. T-libs Relevant to the library team, which will review and decide on the PR/issue.

Comments

@DaniPopes
Copy link
Contributor

DaniPopes commented Apr 23, 2025

Code

I tried this code (godbolt, from this video):

#[no_mangle]
pub fn f(a: &[u16; 8]) -> u16 {
    a.iter().copied().min().unwrap()
}

I expected to see this happen: compiles to vphminposuw x86 instruction (+avx feature)

Instead, this happened: compiles to unrolled cmp + cmov

Note that a manual min loop is unaffected, even with the slice iterator:

#[no_mangle]
pub fn f(a: &[u16; 8]) -> u16 {
    let mut min = u16::MAX;
    for &x in a {
        if x < min {
            min = x;
        }
    }
    min
}

Version it worked on

It most recently worked on: 1.86

Version with regression

rustc --version --verbose:

rustc 1.88.0-nightly (6bc57c6bf 2025-04-22)
binary: rustc
commit-hash: 6bc57c6bf7d0024ad9ea5a2c112f3fc9c383c8a4
commit-date: 2025-04-22
host: x86_64-unknown-linux-gnu
release: 1.88.0-nightly
LLVM version: 20.1.2

@rustbot modify labels: +regression-from-stable-to-nightly -regression-untriaged


searched nightlies: from nightly-2025-02-15 to nightly-2025-04-23
regressed nightly: nightly-2025-04-16
searched commit range: 2da29db...38c560a
regressed commit: 58c2dd9

Weirdly: #139745 cc @thaliaarchi

bisected with cargo-bisect-rustc v0.6.9

Host triple: x86_64-unknown-linux-gnu
Reproduce with:

cargo bisect-rustc --start 1.86.0 -vv --without-cargo --script ./asm.sh 
@DaniPopes DaniPopes added C-bug Category: This is a bug. regression-untriaged Untriaged performance or correctness regression. labels Apr 23, 2025
@rustbot rustbot added I-prioritize Issue: Indicates that prioritization has been requested for this issue. needs-triage This issue may need triage. Remove it if it has been sufficiently triaged. regression-from-stable-to-nightly Performance or correctness regression from stable to nightly. and removed regression-untriaged Untriaged performance or correctness regression. labels Apr 23, 2025
@thaliaarchi
Copy link
Contributor

Thanks for the bisect. It indeed looks to be caused by my patch. As the linked video indicates, this instruction is limited to min of [u16; 8] and not generalizable. Are there other affected cases?

This case is the oppose of what I was attempting to optimize Copied<T> for—where the iterated data is smaller than a pointer.

For Cloned<T>, perhaps we should specialize on T: TrivialClone (#135634, cc @joboet) to match whatever Copied<T> is changed to do.

@saethlin saethlin added I-slow Issue: Problems and improvements with respect to performance of generated code. T-libs Relevant to the library team, which will review and decide on the PR/issue. and removed needs-triage This issue may need triage. Remove it if it has been sufficiently triaged. labels Apr 23, 2025
@the8472
Copy link
Member

the8472 commented Apr 23, 2025

Are there other affected cases?

u8, u32, usize no longer vectorize either, probably others too. They just use(ed) different vector instructions.

It's strange that #139745 got merged considering that my similar proposal to elide side-effects in #90209 got rejected.

@joboet
Copy link
Member

joboet commented Apr 24, 2025

@rustbot label +I-libs-nominated

I merged #139745 assuming that the precise behaviour around side-effects of the iterator methods was an implementation detail, but that should probably have been discussed beforehand. Also, it was the cause of this performance regression. T-libs, do you want to revert this change?

@rustbot rustbot added the I-libs-nominated Nominated for discussion during a libs team meeting. label Apr 24, 2025
@Amanieu
Copy link
Member

Amanieu commented Apr 30, 2025

We discussed this in the @rust-lang/libs meeting. We would like #139745 to be reverted for now and re-submitted with a libs-api nomination so that we can properly discuss the exact implications.

I believe the reason this optimizes differently is that before #139745 the compare closure was passed references to copies of the values yielded by the iterator, whereas #139745 changes this to pass the original references yielded by the iterator instead.

@Amanieu Amanieu added P-high High priority and removed I-libs-nominated Nominated for discussion during a libs team meeting. I-prioritize Issue: Indicates that prioritization has been requested for this issue. labels Apr 30, 2025
@thaliaarchi
Copy link
Contributor

By the way, I am not planning to submit the PRs for the revert or selective reintroduction, even though it's my bug. Sorry. The revert should be easy, though. I'm currently not looking to add any more PRs to my current stack.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
C-bug Category: This is a bug. I-slow Issue: Problems and improvements with respect to performance of generated code. P-high High priority regression-from-stable-to-nightly Performance or correctness regression from stable to nightly. T-libs Relevant to the library team, which will review and decide on the PR/issue.
Projects
None yet
Development

No branches or pull requests

7 participants