You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
After enabling the new pass manager in rustc, one particular crate showed a large increase in compile time, but only when targeting s390x. I attached the bitcode from rustc, and I can reproduce the problem with opt -O1 on main.
There are two instances of that longest 4.5M line length, and both are huge phi nodes just before invoking a function ending in 17hde8a472161ebd31bE (so you can search for that). The predecessors look like a mass of loopexit.split-lp expansions gone rogue.
The same input behaves reasonably with the old PM, both optimizing faster and resulting in a smaller bitcode.
@nikic pointed out that SystemZ has an inlining threshold multiplier of 3, which would explain why we didn't see this on other targets. rust-lang/rust#89609 (comment)
Extended Description
xref: rust-lang/rust#89609
After enabling the new pass manager in rustc, one particular crate showed a large increase in compile time, but only when targeting s390x. I attached the bitcode from rustc, and I can reproduce the problem with opt -O1 on main.
$ llvm-dis <rustc_ast_lowering-cgu.0.rcgu.thin-lto-after-patch.bc | wc -lL
157863 2306
$ time opt <rustc_ast_lowering-cgu.0.rcgu.thin-lto-after-patch.bc -O1 -S | wc -lL
794414 4515800
real 2m4.580s
user 2m4.675s
sys 0m0.221s
There are two instances of that longest 4.5M line length, and both are huge phi nodes just before invoking a function ending in 17hde8a472161ebd31bE (so you can search for that). The predecessors look like a mass of loopexit.split-lp expansions gone rogue.
The same input behaves reasonably with the old PM, both optimizing faster and resulting in a smaller bitcode.
$ time opt <rustc_ast_lowering-cgu.0.rcgu.thin-lto-after-patch.bc -O1 -S --enable-new-pm=0 | wc -lL
148220 2306
real 0m2.266s
user 0m2.322s
sys 0m0.024s
If you edit out the "target-cpu"="z10" attributes in the IR, this reproducer can also be forced to -mtriple x86_64, and that runs fine with new PM.
The text was updated successfully, but these errors were encountered: