Refactor git change detection in bootstrap #138591
This PR changes how GCC is built. Consider updating src/bootstrap/download-ci-gcc-stamp. These commits modify the If this was unintentional then you should revert the changes before this PR is merged. This PR changes how LLVM is built. Consider updating src/bootstrap/download-ci-llvm-stamp. Some changes occurred in src/tools/compiletest cc @jieyouxu This PR modifies If appropriate, please update |
LGTM on the Ferrocene side. There is nothing here that would break our downstream usage. On the Rust side, I recommend opening this PR against stable and beta too, and running a full bors try on it. We had issues in past releases where changes to this code would unexpectedly break stable or beta CI, and I'd love for those to be caught before merging. |
Yes, I planned to do that, it's a good idea. Actually, I can try that right away. |
[do not merge] beta test for git change detection (rust-lang#138591) Opening to test CI/bootstrap changes. r? `@ghost` try-job: x86_64-gnu-stable try-job: x86_64-gnu try-job: x86_64-gnu-llvm-19-1 try-job: dist-x86_64-linux
The changes look good, but I am not sure if they will break the if-unchanged tests and logic in the following cases:
I think it's safer to make sure these won't be a problem before merging this. |
[do not merge] beta test for git change detection (rust-lang#138591) Opening to test CI/bootstrap changes from rust-lang#138591. r? `@ghost` try-job: x86_64-gnu-aux
@bors try |
Refactor git change detection in bootstrap
While working on rust-lang#138395, I finally found the courage to delve into the insides of git path change detection in bootstrap, which is used (amongst other things) to detect if we should rebuild or download `[llvm|rustc|gcc]`. I found it a bit hard to understand, and given that this code was historically quite fragile, I thought that it would be better to rebuild it from scratch.
The previous approach had a bunch of limitations:
- It separated the computation of "are there local changes?" and "what upstream SHA should we use?" even though these two things are intertwined.
- It used hacks to work around what happens on CI.
- It had special cases for CI scattered throughout the codebase, rather than centralized in one place.
- It wasn't documented enough and didn't have tests for the git behavior.
The current approach should hopefully resolve all of that. I implemented a single entrypoint called `check_path_modifications` (naming bikeshed pending; half of the time I spent on this PR went into thinking about names, as it's quite tricky here) that explicitly receives a mode of operation (in CI or outside CI) and accordingly figures out the upstream SHA that we should use for downloading artifacts, as well as whether there are any local changes. Users of this function can then use this unified output to implement `download-ci-X` and other functionality.
I also added a bunch of integration tests that literally spawn a git repository on disk and then check that the function can deal with various situations (PR CI, auto/try CI, local builds). The tests are super fast and run in parallel, as they are currently in `build_helper` and not in `bootstrap`.
After I built this inner layer, I used it for downloading GCC, LLVM and rustc. The latter two (and especially rustc) were using the `last_modified_commit` function before, but in all cases but one this function was actually only used to check if there are any local changes, which was IMO confusing. The LLVM handling would deserve a bit of refactoring, but that's a larger change that can be done as a follow-up. In the future we could cache the results of `check_path_modifications` to reduce the number of git invocations, but I don't think that it should be excessive even now.
I hope that the implementation is now clear and easy to understand, so that in combination with the tests we can have more confidence that it does what we want. I tried to include a lot of documentation in the code, so I won't repeat the actual implementation details here. If there are any questions, I'll add the answers to the documentation too :)
The new approach explicitly supports three scenarios:
- Running on PR CI, where we have one upstream bors parent commit and one PR merge commit made by GitHub.
- Running on try/auto CI, where we have one upstream bors parent commit and one PR merge commit made by bors.
- Running locally, where we assume that we have at least one upstream bors parent commit in our git history.
I removed the handling of upstreams on CI, as I think that it shouldn't be needed and I considered it to be a hack. However, it's possible that there are other use-cases that I haven't considered, so I want to ask around if people have other situations than the three use-cases described above. If there are other such use-cases, I would like to include them in the new centralized implementation and add them to the git test suite, rather than going back to the old ways :)
In particular, the code before relied on `git merge-base`, but I don't see why we can't just look up the most recent bors commit and assume that it is a merge commit that is also upstream? I might be running into Chesterton's Fence here :)
CC `@pietroalbini` to make sure that this won't break downstream users of Rust's CI.
Best reviewed commit by commit.
Companion PRs:
- For testing beta: rust-lang#138597
r? `@onur-ozkan`
try-job: x86_64-gnu-aux
Did a bunch of follow-up clean-ups. Let me know if you want me to split this into multiple PRs! :) |
Force-pushed from 63be8ba to e1fe7f2
Hmm, weird, @bors try |
☀️ Try build successful - checks-actions |
Hmm, looks like it might have been spurious. @bors r=Mark-Simulacrum |
💡 This pull request was already approved, no need to approve it again.
|
☀️ Test successful - checks-actions |
What is this? This is an experimental post-merge analysis report that shows differences in test outcomes between the merged PR and its parent PR.
Comparing 1a5bf12 (parent) -> 645d0ad (this PR)
Test differences: 18 test diffs
Test dashboard: run `cargo run --manifest-path src/ci/citool/Cargo.toml -- test-dashboard 645d0ad2a4f145ae576e442ec5c73c0f8eed829b --output-dir test-dashboard` to generate it.
Job duration changes: job durations can vary a lot, based on the actual runner instance |
Finished benchmarking commit (645d0ad): comparison URL.
Overall result: ❌ regressions - please read the text below
Our benchmarks found a performance regression caused by this PR.
@rustbot label: +perf-regression
Instruction count: This is the most reliable metric that we have; it was used to determine the overall result at the top of this comment. However, even this metric can sometimes exhibit noise.
Max RSS (memory usage): Results (primary 2.0%, secondary 0.1%). This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.
Cycles: Results (secondary 0.1%). This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.
Binary size: This benchmark run did not return any relevant results for this metric.
Bootstrap: 774.578s -> 776.892s (0.30%) |
I checked the CI logs and didn't find anything suspicious. Eyeballing the historical charts of the regressed benchmarks, my only conclusion is that this has to be noise, as this PR shouldn't have affected dist builders nor the compiler in any way. |
…git-config, r=jieyouxu Remove git repository from git config It is no longer needed after rust-lang#138591. We could even remove the `nightly_branch` field, but it still has one usage. r? `@jieyouxu`
When I run the bootstrap tests locally, I get new failures. Could this PR be responsible? |
Yes, that is definitely caused by this PR, because it introduced the test. I can't seem to reproduce it locally though. Does it fail always or only sometimes? |
It always fails: with a clean tree, with uncommitted changes, or with committed changes. |
Rollup merge of rust-lang#140191 - Kobzol:remove-git-repository-from-git-config, r=jieyouxu Remove git repository from git config It is no longer needed after rust-lang#138591. We could even remove the `nightly_branch` field, but it still has one usage. r? ``@jieyouxu``
Just to clarify, the test runs git commands in a temporary directory, so it shouldn't be affected in any way by the git state of your actual rustc checkout (otherwise it would be very brittle, of course). The test fails when it tries to do git commit, but the working area is clean. That is weird, because right before that we essentially do |
I'm using git 2.45.2 on Ubuntu 24.10. |
While working on #138395, I finally found the courage to delve into the insides of git path change detection in bootstrap, which is used (amongst other things) to detect if we should rebuild or download `[llvm|rustc|gcc]`. I found it a bit hard to understand, and given that this code was historically quite fragile, I thought that it would be better to rebuild it from scratch.
The previous approach had a bunch of limitations:
- It separated the computation of "are there local changes?" and "what upstream SHA should we use?" even though these two things are intertwined.
- It used hacks to work around what happens on CI.
- It had special cases for CI scattered throughout the codebase, rather than centralized in one place.
- It wasn't documented enough and didn't have tests for the git behavior.
The current approach should hopefully resolve all of that. I implemented a single entrypoint called `check_path_modifications` (naming bikeshed pending; half of the time I spent on this PR went into thinking about names, as it's quite tricky here) that explicitly receives a mode of operation (in CI or outside CI) and accordingly figures out the upstream SHA that we should use for downloading artifacts, as well as whether there are any local changes. Users of this function can then use this unified output to implement `download-ci-X` and other functionality. Notably, this change detection no longer uses `git merge-base`, which makes it easier to use and doesn't require setting up remotes.
I also added a bunch of integration tests that literally spawn a git repository on disk and then check that the function can deal with various situations (PR CI, auto/try CI, local builds).
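To make the "unified output" idea more concrete, here is a minimal sketch of what a single answer combining "which upstream SHA?" and "are there local changes?" could look like. Only `check_path_modifications` is named in this PR; `PathStatus`, `classify`, and their parameters are hypothetical names invented for this illustration, and the sketch takes the list of changed files as input instead of invoking git.

```rust
use std::path::Path;

/// Hypothetical unified result: the upstream commit and the local-change
/// answer travel together instead of being computed separately.
#[derive(Debug, PartialEq, Eq)]
enum PathStatus {
    /// None of the watched paths differ from `upstream`, so prebuilt
    /// artifacts for that commit could be downloaded.
    Unmodified { upstream: String },
    /// At least one watched path differs from `upstream`.
    Modified { upstream: String },
    /// No upstream commit could be determined at all.
    MissingUpstream,
}

/// Classify a set of watched paths (hypothetical helper).
/// `changed_files` stands in for the files that differ from the upstream
/// commit; a real implementation would obtain them from git.
fn classify(upstream: Option<&str>, changed_files: &[&str], watched: &[&str]) -> PathStatus {
    let Some(upstream) = upstream else {
        return PathStatus::MissingUpstream;
    };
    // `Path::starts_with` compares whole path components, so
    // "src/llvm-project-extra" does not match "src/llvm-project".
    let touched = changed_files
        .iter()
        .any(|file| watched.iter().any(|dir| Path::new(file).starts_with(dir)));
    if touched {
        PathStatus::Modified { upstream: upstream.to_string() }
    } else {
        PathStatus::Unmodified { upstream: upstream.to_string() }
    }
}
```

A caller implementing something like `download-ci-llvm` would then match on the result once: download artifacts for `upstream` in the `Unmodified` case, and build locally otherwise.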
After I built this inner layer, I used it for downloading GCC, LLVM and rustc. The latter two (and especially rustc) were using the `last_modified_commit` function before, but in all cases but one this function was actually only used to check if there are any local changes, which was IMO confusing. The LLVM handling would deserve a bit of refactoring, but that's a larger change that can be done as a follow-up.
I hope that the implementation is now clear and easy to understand, so that in combination with the tests we can have more confidence that it does what we want. I tried to include a lot of documentation in the code, so I won't repeat the actual implementation details here. If there are any questions, I'll add the answers to the documentation too :)
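The new approach explicitly supports three scenarios:
- Running on PR CI, where we have one upstream bors parent commit and one PR merge commit made by GitHub.
- Running on try/auto CI, where we have one upstream bors parent commit and one PR merge commit made by bors.
- Running locally, where we assume that we have at least one upstream bors parent commit in our git history.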
I removed the handling of upstreams on CI, as I think that it shouldn't be needed and I considered it to be a hack. However, it's possible that there are other use-cases that I haven't considered, so I want to ask around if people have other situations than the three use-cases described above. If there are other such use-cases, I would like to include them in the new centralized implementation and add them to the git test suite, rather than going back to the old ways :)
In particular, the code before relied on `git merge-base`, but I don't see why we can't just look up the most recent bors commit and assume that it is a merge commit that is also upstream? I might be running into Chesterton's Fence here :)
CC @pietroalbini to make sure that this won't break downstream users of Rust's CI.
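As a rough illustration of "look up the most recent bors commit", the sketch below builds a `git rev-list` invocation filtered by author instead of calling `git merge-base`. It is not the code from this PR: `Mode`, `base_revision`, `rev_list_args`, `parse_sha`, and `latest_upstream_commit` are hypothetical names, the bot email is passed in by the caller, and the sketch assumes that on CI the upstream commit is the first parent of the merge commit at `HEAD`.

```rust
use std::path::Path;
use std::process::Command;

/// Hypothetical mode of operation: in CI or outside CI.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Mode {
    Ci,
    Local,
}

/// Assumption: on CI, HEAD is a merge commit (made by GitHub or bors), so the
/// search starts from its first parent. Locally, it starts from HEAD itself.
fn base_revision(mode: Mode) -> &'static str {
    match mode {
        Mode::Ci => "HEAD^1",
        Mode::Local => "HEAD",
    }
}

/// Arguments for `git` that ask for the newest commit authored by the merge bot.
fn rev_list_args(mode: Mode, bot_email: &str) -> Vec<String> {
    vec![
        "rev-list".to_string(),
        format!("--author={bot_email}"),
        "--max-count=1".to_string(),
        base_revision(mode).to_string(),
    ]
}

/// Accept the output only if it looks like a single hexadecimal commit hash.
fn parse_sha(stdout: &str) -> Option<String> {
    let sha = stdout.trim();
    if !sha.is_empty() && sha.chars().all(|c| c.is_ascii_hexdigit()) {
        Some(sha.to_string())
    } else {
        None
    }
}

/// Run git in `repo` and return the most recent bot-authored commit, if any.
fn latest_upstream_commit(repo: &Path, mode: Mode, bot_email: &str) -> Option<String> {
    let output = Command::new("git")
        .current_dir(repo)
        .args(rev_list_args(mode, bot_email))
        .output()
        .ok()?;
    if !output.status.success() {
        return None;
    }
    parse_sha(&String::from_utf8_lossy(&output.stdout))
}
```

Because this only walks the history reachable from the base revision, it needs no configured remote, which matches the motivation given above for dropping `git merge-base`.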
Best reviewed commit by commit.
Companion PRs:
- For testing beta: #138597
r? @onur-ozkan
Fixes: #101907
try-job: x86_64-gnu-aux
try-job: aarch64-gnu
try-job: dist-x86_64-apple