1.30 beta.7 fails to build on ppc64el, "compiler unexpectedly panicked. this is a bug." #54545

Closed
infinity0 opened this issue Sep 25, 2018 · 9 comments
Labels
C-bug Category: This is a bug. O-PowerPC Target: PowerPC processors regression-from-stable-to-stable Performance or correctness regression from one stable version to another.

Comments

@infinity0
Contributor

See https://buildd.debian.org/status/fetch.php?pkg=rustc&arch=ppc64el&ver=1.30.0%7Ebeta.7%2Bdfsg1-1%7Eexp1&stamp=1537786559&raw=0

query stack during panic:
#0 [evaluate_obligation] evaluating trait selection obligation `isize: marker::Freeze`
#1 [is_freeze_raw] computing whether `isize` is freeze
   --> libcore/cmp.rs:300:12
    |
300 |     Less = -1,
    |            ^^
#2 [const_eval] const-evaluating `cmp::Ordering::Less::{{constant}}`
end of query stack
error: aborting due to previous error


note: the compiler unexpectedly panicked. this is a bug.

note: we would appreciate a bug report: https://github.com/rust-lang/rust/blob/master/CONTRIBUTING.md#bug-reports

note: rustc 1.30.0-beta running on powerpc64le-unknown-linux-gnu

note: compiler flags: -Z force-unstable-if-unmarked -C opt-level=2 -C link-args=-Wl,-z,relro -C prefer-dynamic -C debug-assertions=n --crate-type lib

note: some of the compiler flags provided by cargo are hidden

error: Could not compile `core`.
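
For context, the span in the query stack points at the explicit discriminant `Less = -1` at libcore/cmp.rs:300. The compiler evaluates such discriminant expressions at build time, which is the `const_eval` query shown at #2. The sketch below uses a hypothetical stand-in enum (`MyOrdering` is not the libcore definition, and any attributes libcore applies are omitted) to show what value that evaluation is expected to produce. With no explicit `repr`, the discriminant type defaults to `isize`, which is consistent with the `isize` named in queries #0 and #1.

```rust
// Hypothetical stand-in for core::cmp::Ordering. Only the explicit
// discriminants mirror what the log shows at libcore/cmp.rs:300.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum MyOrdering {
    Less = -1, // the compiler const-evaluates this expression
    Equal = 0,
    Greater = 1,
}

/// Returns the discriminant value that was computed at compile time.
fn discriminant_of(o: MyOrdering) -> isize {
    o as isize
}

fn main() {
    for &o in &[MyOrdering::Less, MyOrdering::Equal, MyOrdering::Greater] {
        println!("{:?} = {}", o, discriminant_of(o));
    }
}
```

On a working compiler this builds without incident; the report above is that the stage0-built compiler panicked while evaluating the equivalent constant in `core`.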
@infinity0
Contributor Author

infinity0 commented Sep 25, 2018

The error did not occur on ppc64 big-endian or 32-bit powerpc, though the latter failed for a different reason:

thread '<unnamed>' panicked at 'called `Option::unwrap()` on a `None` value', libcore/option.rs:345:21
error: aborting due to worker thread failure

stack backtrace:
   0: std::sys::unix::backtrace::tracing::imp::unwind_backtrace
error: Could not compile `rustc`.

@infinity0
Contributor Author

This error still occurs with 1.30.0 on Debian powerpc64le: https://buildd.debian.org/status/fetch.php?pkg=rustc&arch=ppc64el&ver=1.30.0%2Bdfsg1-1%7Eexp2&stamp=1540973484&raw=0

Strangely, it is fine on Fedora powerpc64le: https://kojipkgs.fedoraproject.org//packages/rust/1.30.0/6.fc30/data/logs/ppc64le/build.log

I'm not sure whether this is related to #42778. In that bug, the problem occurs on ppc64 big-endian while little-endian is fine. In this bug, the problem occurs on little-endian while big-endian is fine.

@infinity0
Contributor Author

@gnzlbg @cuviper @alexcrichton any help on how I can debug this?

The problem happens during the step "Building stage1 std artifacts (powerpc64le-unknown-linux-gnu -> powerpc64le-unknown-linux-gnu)".

@cuviper
Member

cuviper commented Oct 31, 2018

Strangely, it is fine on Fedora powerpc64le:

It's fine on Fedora 29 and 30 (rawhide), which both have LLVM 7. Fedora 28 with LLVM 6 failed, as did Fedora 27 and EPEL7 with LLVM 5. They're all similar to what you report -- ppc64le hitting errors in Building stage1 std artifacts, while the other arches were fine.

I'm trying to debug this myself. Bisecting Rust got me down to commits 1ec8670 and 8053f63 of PR #54032 (cc @oli-obk), but I'm not sure why this would be a problem. The fact that it's only ppc64le and only with some LLVM versions (at least in my case) points pretty strongly to a codegen issue, but it's going to be hard to try and bisect LLVM when the issue seems to be in what stage0 produced.

For my next step, I'm going to try bootstrapping with upstream binaries (w/ LLVM ~8). If the problem shifts to stage1 rustc (w/ external LLVM) building stage2 std, then I can bisect LLVM more easily.

@cuviper cuviper added C-bug Category: This is a bug. regression-from-stable-to-stable Performance or correctness regression from one stable version to another. O-PowerPC Target: PowerPC processors labels Oct 31, 2018
@cuviper
Member

cuviper commented Oct 31, 2018

For my next step, I'm going to try bootstrapping with upstream binaries (w/ LLVM ~8). If the problem shifts to stage1 rustc (w/ external LLVM) building stage2 std, then I can bisect LLVM more easily.

Note that testing stage2 std requires --enable-full-bootstrap, otherwise it's just a copy from stage1.

Anyway, this worked! No errors at all. Furthermore, I used that bootstrap build as stage0 for a whole new build (local-rebuild style), and this worked too. Note that this is still landing on the older external LLVM in the end, so it appears just to be a transitional issue. I'm now testing a full rpmbuild bootstrapped in this manner, and if this works I'll just go with it.

While it would be nice to have a root cause for this, I can't justify spending much more time on it if there's a reasonable way to get past it. I suspect it is related to the stage0 conditionals in #54032 around applying the new range information (since 1.29 doesn't have that attribute), and this possibly affects codegen in weird ways.

@infinity0 If I'm reading correctly, your rustc-1.29 depends on libllvm6.0. This might explain why stage0 is causing problems for you even though your new build is with llvm-7-dev, assuming this is indeed a codegen issue fixed in later LLVM. Maybe you could get away with just rebuilding 1.29 with llvm-7, and then do the update to 1.30.

@oli-obk
Contributor

oli-obk commented Nov 1, 2018

Bisecting Rust got me down to commits 1ec8670 and 8053f63 of PR #54032 (cc @oli-obk), but I'm not sure why this would be a problem. The fact that it's only ppc64le and only with some LLVM versions (at least in my case) points pretty strongly to a codegen issue, but it's going to be hard to try and bisect LLVM when the issue seems to be in what stage0 produced.

My changes together with a codegen bug can definitely result in such behavior. You can paper over this issue by cfg-ing out

#[rustc_layout_scalar_valid_range_end($max)]
on the relevant systems. Note that the resulting compiler will probably regress somewhat in performance and a lot in peak memory usage.
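
To make that trade-off concrete, here is a sketch on stable Rust; it is not the rustc-internal code. It assumes the valid-range attribute exists to give index types a layout niche, so that an `Option` around them needs no separate tag. `NonZeroU32` serves as a stand-in because it has the same kind of niche. The helper `is_affected_target` is hypothetical and only shows the kind of cfg predicate one might gate on for powerpc64le.

```rust
use std::mem::size_of;
use std::num::NonZeroU32;

/// Hypothetical predicate for "the relevant systems": powerpc64le shares
/// target_arch = "powerpc64" with big-endian ppc64 and differs only in
/// endianness. In the compiler source the gating would presumably be a
/// cfg_attr around the attribute itself (untested assumption).
fn is_affected_target() -> bool {
    cfg!(all(target_arch = "powerpc64", target_endian = "little"))
}

/// Size of an optional value when the inner type has a niche
/// (stand-in: NonZeroU32, where 0 is never a valid value).
fn optional_size_with_niche() -> usize {
    size_of::<Option<NonZeroU32>>()
}

/// Size of an optional value when every bit pattern of the inner type
/// is valid, so `None` needs extra storage.
fn optional_size_without_niche() -> usize {
    size_of::<Option<u32>>()
}

fn main() {
    println!("affected target: {}", is_affected_target());
    println!("Option with niche:    {} bytes", optional_size_with_niche());
    println!("Option without niche: {} bytes", optional_size_without_niche());
}
```

If the compiler's many index types lose their niche in the same way, each optional index grows accordingly, which would be one plausible source of the peak-memory regression mentioned above.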

@infinity0
Contributor Author

I managed to cross-compile 1.30.0 (with LLVM 7) from amd64 to ppc64el, with no errors there. I'm about to start a rebuild of 1.30.0 on ppc64el using itself; hopefully that works too.

@infinity0
Contributor Author

The rebuild worked, so I guess there indeed was some bug with 1.29.0+llvm6 that is no longer present in 1.30.0+llvm7. I didn't try it with 1.29.0+llvm7 because it was easier to just cross-compile 1.30.0 on Debian.

@cuviper
Member

cuviper commented Nov 2, 2018

My rebuilds completed successfully too.
