
Benchmark hang with lto = true and target-cpu=native #49766


Closed
TheIronBorn opened this issue Apr 7, 2018 · 17 comments
Labels
A-codegen Area: Code generation A-LLVM Area: Code generation parts specific to LLVM. Both correctness bugs and optimization-related issues. C-enhancement Category: An issue proposing an enhancement or a PR with one. O-macos Operating system: macOS T-compiler Relevant to the compiler team, which will review and decide on the PR/issue.

Comments

@TheIronBorn

TheIronBorn commented Apr 7, 2018

I first posted this in Cargo at rust-lang/cargo#5312

I'm benchmarking some code and often get hangs with $ RUSTFLAGS='-C target-cpu=native' cargo bench.

My Cargo.toml:

[package]
name = "test_hang"
version = "0.1.0"

[profile.bench]
lto = true

It only happens when both LTO and target-cpu=native are enabled.

A typical output looks like this, where it hangs for as long as I've left it:

$ RUST_BACKTRACE=1 RUSTFLAGS='-C target-cpu=native' cargo bench -vv
       Fresh <crates>
    Finished release [optimized] target(s) in 0.0 secs
     Running `<path to crate>target/release/deps/<crate name>-86f63a8b1891302e --bench`

running 19 tests
<ignored tests>

test result: ok. 0 passed; 0 failed; 19 ignored; 0 measured; 0 filtered out

     Running `<path to crate>/target/release/deps/bench-3dd5f8be084d72c9 --bench`

I'm using cargo 1.26.0-nightly (b70ab13b3 2018-04-04). The code requires nightly so I can't test whether this happens on stable.
$ rustc --print target-cpus says my processor is sandybridge.

Meta

$ rustc --version --verbose
rustc 1.27.0-nightly (eeea94c11 2018-04-06)
binary: rustc
commit-hash: eeea94c11d02ff62fb011d1afdda9301fdf9726b
commit-date: 2018-04-06
host: x86_64-apple-darwin
release: 1.27.0-nightly
LLVM version: 6.0
@TheIronBorn
Author

TheIronBorn commented Apr 7, 2018

It doesn't seem to happen when using target-cpu=westmere, the preceding Intel CPU microarchitecture.
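One difference between the two targets that may matter here: Sandy Bridge is the first Intel microarchitecture with AVX (256-bit `ymm` registers), while Westmere stops at SSE4.2. The sketch below checks whether the host reports AVX and lists the AVX status of the CPUs named in this thread. `target_cpu_has_avx` is a hypothetical helper written for illustration, not a rustc API, and whether AVX is actually involved in the hang is an assumption.

```rust
// Sketch: compare AVX availability between the CPUs mentioned in this thread.

/// Returns true when the running CPU reports AVX support.
#[cfg(any(target_arch = "x86", target_arch = "x86_64"))]
fn host_has_avx() -> bool {
    is_x86_feature_detected!("avx")
}

/// On non-x86 hosts there is no AVX to detect.
#[cfg(not(any(target_arch = "x86", target_arch = "x86_64")))]
fn host_has_avx() -> bool {
    false
}

/// Hypothetical lookup covering only the target CPUs named in the thread.
/// Returns None for any CPU name it does not know about.
fn target_cpu_has_avx(cpu: &str) -> Option<bool> {
    match cpu {
        "westmere" => Some(false),
        "sandybridge" | "haswell" | "znver3" => Some(true),
        _ => None,
    }
}

fn main() {
    println!("host reports AVX: {}", host_has_avx());
    for cpu in ["westmere", "sandybridge", "haswell", "znver3"] {
        println!("{}: AVX = {:?}", cpu, target_cpu_has_avx(cpu));
    }
}
```

If the hang tracks AVX, building with `-C target-cpu=native` on any AVX-capable macOS host would be expected to behave like `sandybridge` rather than `westmere`.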

@TheIronBorn
Author

I'm trying to find a minimal code example.

@TheIronBorn
Author

Turns out it actually happens with a totally empty project:

├── Cargo.lock
├── Cargo.toml
├── benches
│   └── bench.rs
└── src
    └── lib.rs

@neachdainn

I have also experienced this issue.

rustc 1.27.0-nightly (e5f80f2a4 2018-05-09)
binary: rustc
commit-hash: e5f80f2a4f016bf724a1cfb580619d71c8fd39ec
commit-date: 2018-05-09
host: x86_64-apple-darwin
release: 1.27.0-nightly
LLVM version: 6.0

@TheIronBorn
Author

@neachdainn what is your microarchitecture?

@neachdainn

Core i7-4770HQ (Haswell)

@TheIronBorn
Author

Could you also test with a couple other target CPUs?

@TheIronBorn
Author

Specifically Sandy Bridge and Westmere.

@neachdainn

neachdainn commented May 13, 2018

Here are some tests on a different system:

rustc 1.27.0-nightly (acd3871ba 2018-05-10)
binary: rustc
commit-hash: acd3871ba17316419c644e17547887787628ec2f
commit-date: 2018-05-10
host: x86_64-unknown-linux-gnu
release: 1.27.0-nightly
LLVM version: 6.0

Core i5-2500k (Sandybridge)

I am not seeing the bug on this system, so it may be macOS specific. I'll be able to test this on my Mac on Monday.

@Mark-Simulacrum added the A-LLVM and A-codegen labels May 28, 2018
@Mark-Simulacrum
Member

Can someone post a minimal example with code? Or at least some example?

This is probably a case of #50422.

@TheIronBorn
Author

TheIronBorn commented May 28, 2018

As I mentioned here #49766 (comment), it happens with a totally empty project.
bench.rs is completely empty.
main.rs has:

fn main() {
    println!("Hello, world!");
}

@Mark-Simulacrum added the O-macos label May 28, 2018
@Mark-Simulacrum
Member

I've confirmed that I cannot reproduce this on Linux. I believe the hang occurs here, but I've not tracked it down further.

* thread #1, queue = 'com.apple.main-thread', stop reason = signal SIGSTOP
    frame #0: 0x0000000103f80550 bench-d5493d9f2fa74606`std::sys::unix::stack_overflow::imp::signal_handler::h373dcc530442aa18 (.llvm.16322965540172889217) + 112
bench-d5493d9f2fa74606`std::sys::unix::stack_overflow::imp::signal_handler::h373dcc530442aa18:
->  0x103f80550 <+112>: vmovdqa (%rax), %ymm0
    0x103f80554 <+116>: vmovaps 0x45c24(%rip), %ymm1
    0x103f8055c <+124>: vmovaps %ymm1, (%rax)
    0x103f80560 <+128>: movq   0x20(%rax), %rax
Target 0: (bench-d5493d9f2fa74606) stopped.
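For context on the disassembly above: `vmovdqa` is the aligned form of the AVX move, and with a 256-bit `ymm` register its memory operand must be 32-byte aligned, otherwise the CPU raises a fault. Whether a misaligned address in `%rax` is what happens here is an assumption; the thread never establishes it. A minimal sketch of the alignment condition itself, with `is_aligned` and `Aligned32` as illustrative names:

```rust
// Sketch: the 32-byte alignment condition that an aligned 256-bit load relies on.

/// Returns true if `addr` is a multiple of `align`, which must be a power of two.
fn is_aligned(addr: usize, align: usize) -> bool {
    assert!(align.is_power_of_two());
    addr & (align - 1) == 0
}

/// A buffer the compiler is required to place on a 32-byte boundary.
#[repr(align(32))]
struct Aligned32([u8; 32]);

fn main() {
    let buf = Aligned32([0u8; 32]);
    let base = buf.0.as_ptr() as usize;

    // The start of the buffer satisfies the requirement of an aligned ymm load.
    println!("base aligned to 32: {}", is_aligned(base, 32)); // true

    // An address 16 bytes in is 16-byte aligned but not 32-byte aligned.
    println!("base + 16 aligned to 32: {}", is_aligned(base + 16, 32)); // false
}
```

In lldb, printing the value of `rax` at the stop address and checking its low five bits would show whether this condition holds for the hanging binary.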

@TheIronBorn
Author

Confirmed: it does not happen with -C lto=thin.

$ RUSTFLAGS='-C target-cpu=native -C lto=thin' cargo bench

@TheIronBorn
Author

It does hang with both of these:
-C lto=fat: yes
-C lto: yes
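Since the failure mode is a hang rather than an error, checking several flag combinations by hand is tedious. Below is a rough sketch of a harness that runs `cargo bench` under a timeout for the LTO modes tried above. The function names, the 120-second limit and the 200 ms polling interval are all arbitrary choices for illustration, and it assumes it is run from the crate directory.

```rust
use std::io;
use std::process::{Command, ExitStatus};
use std::thread;
use std::time::{Duration, Instant};

/// Builds a RUSTFLAGS value like the ones used in this thread.
fn rustflags(target_cpu: &str, lto: Option<&str>) -> String {
    let mut flags = format!("-C target-cpu={}", target_cpu);
    if let Some(mode) = lto {
        flags.push_str(&format!(" -C lto={}", mode));
    }
    flags
}

/// Spawns `cmd` and polls it until it exits or `timeout` elapses.
/// Returns Ok(Some(status)) on exit, Ok(None) if the child was killed.
fn run_with_timeout(cmd: &mut Command, timeout: Duration) -> io::Result<Option<ExitStatus>> {
    let mut child = cmd.spawn()?;
    let deadline = Instant::now() + timeout;
    loop {
        if let Some(status) = child.try_wait()? {
            return Ok(Some(status));
        }
        if Instant::now() >= deadline {
            // Ignore kill errors: the child may have exited in the meantime.
            let _ = child.kill();
            child.wait()?; // reap the process
            return Ok(None);
        }
        thread::sleep(Duration::from_millis(200));
    }
}

fn main() {
    let timeout = Duration::from_secs(120); // arbitrary limit, includes build time
    for lto in ["thin", "fat"] {
        let flags = rustflags("native", Some(lto));
        let mut cmd = Command::new("cargo");
        cmd.arg("bench").env("RUSTFLAGS", &flags);
        match run_with_timeout(&mut cmd, timeout) {
            Ok(Some(status)) => println!("{}: exited with {}", flags, status),
            Ok(None) => println!("{}: no exit within {:?}, treated as a hang", flags, timeout),
            Err(e) => eprintln!("{}: could not run cargo: {}", flags, e),
        }
    }
}
```

Note that killing `cargo` does not necessarily kill the benchmark binary it launched, so a hung `bench-*` process may need to be cleaned up separately.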

@bluejekyll

I'm seeing a more general problem as well: a binary run inside a benchmark almost always hangs with LTO, but when run independently it hangs only ~50% of the time. Thin LTO does not appear to have this issue.

I haven't investigated yet.

@XAMPPRocky added the T-compiler and C-enhancement labels Jul 7, 2018
@steveklabnik
Member

I'm on Windows, but I cannot reproduce this today.

@workingjubilee
Member

Compiling from x86_64-unknown-linux-gnu to

  • x86_64-unknown-linux-gnu
  • x86_64-apple-darwin (cross)

No repro with -Ctarget-cpu=sandybridge or -Ctarget-cpu=znver3 (native).
Closing, but feel free to reopen this if it reproduces on another x86_64-apple-darwin host.

@workingjubilee closed this as not planned Jul 6, 2022

7 participants