ARMageddon: When Bootloader Configuration Goes Rogue #1331

Merged
merged 4 commits into from
Jan 29, 2025

@MichalHe MichalHe commented Jan 16, 2025

What needs to be done:

  • Modify the code so that /boot/efi/EFI/redhat/grub.cfg is patched to load the grubenv of the correct BLS entry
  • Create leapp's own BLS directory /boot/upgrade-loader, mimicking /boot/loader, and place only leapp's BLS entry there
  • Modify the grubenv in /boot/efi/EFI/leapp/grubenv (which is now properly loaded) so that blsdir points to upgrade-loader. This way, booting into the upgrade EFI entry means that the user can only continue with the upgrade and cannot boot into RHEL 8 (with the new bootloader)
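
The last step relies on grubenv being a fixed-size GRUB environment block: a 1024-byte file that starts with a header line and is padded with `#` characters. As a minimal sketch, and not the code from this PR, the following shows how a variable such as blsdir could be set in such a block without changing its size. The helper name `set_grubenv_var` and the value `/upgrade-loader/entries` are assumptions made for illustration; on a real system, grub2-editenv is the supported tool for editing this file.

```python
GRUBENV_SIZE = 1024
GRUBENV_HEADER = b"# GRUB Environment Block\n"


def set_grubenv_var(block, name, value):
    """Return a new environment block with name=value set (hypothetical helper)."""
    if not block.startswith(GRUBENV_HEADER):
        raise ValueError("not a GRUB environment block")

    # Drop the trailing '#' padding, then split the remainder into lines.
    body = block[len(GRUBENV_HEADER):].rstrip(b"#")
    entries = [line for line in body.split(b"\n") if line]

    # Replace any existing assignment of the same variable.
    prefix = name.encode() + b"="
    entries = [entry for entry in entries if not entry.startswith(prefix)]
    entries.append(prefix + value.encode())

    result = GRUBENV_HEADER + b"\n".join(entries) + b"\n"
    if len(result) > GRUBENV_SIZE:
        raise ValueError("environment block would exceed 1024 bytes")

    # Pad back to the fixed block size.
    return result + b"#" * (GRUBENV_SIZE - len(result))
```

Working on bytes and re-padding keeps the file at the size GRUB expects, which is the main thing a naive text edit would break.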

jira: RHEL-41193

Thank you for contributing to the Leapp project!

Please note that every PR needs to comply with the Leapp Guidelines and must pass all tests in order to be mergeable.
If you want to request a review or rebuild a package in copr, you can use following commands as a comment:

  • review please @oamg/developers to notify leapp developers of the review request
  • /packit copr-build to submit a public copr build using packit

Packit will automatically schedule regression tests for this PR's build and latest upstream leapp build.
However, here are additional useful commands for packit:

  • /packit test to re-run manually the default tests
  • /packit retest-failed to re-run failed tests manually
  • /packit test oamg/leapp#42 to run tests with leapp builds for the leapp PR#42 (default is latest upstream - main - build)

Note that first time contributors cannot run tests automatically - they need to be started by a reviewer.

It is possible to schedule specific on-demand tests as well. Currently 2 test sets are supported, beaker-minimal and kernel-rt, both can be used to be run on all upgrade paths or just a couple of specific ones.
To launch on-demand tests with packit:

  • /packit test --labels kernel-rt to schedule kernel-rt tests set for all upgrade paths
  • /packit test --labels beaker-minimal-8.10to9.4,kernel-rt-8.10to9.4 to schedule kernel-rt and beaker-minimal test sets for 8.10->9.4 upgrade path

See other labels for particular jobs defined in the .packit.yaml file.

Please open ticket in case you experience technical problem with the CI. (RH internal only)

Note: In case there are problems with tests not being triggered automatically on new PR/commit or pending for a long time, please contact leapp-infra.

@pirat89 pirat89 added the bug Something isn't working label Jan 17, 2025
@pirat89 pirat89 added this to the 8.10/9.6 milestone Jan 17, 2025
@MichalHe MichalHe force-pushed the arm_custom_grubcfg branch 2 times, most recently from 1f2aad3 to 4b16b71 Compare January 20, 2025 15:14
Use the grub.cfg bundled with leapp if we detect that the
system's grub.cfg contains problematic configuration that
will not load the grubenv of the upgrade BLS entry. We need
to ensure that this grubenv is loaded, as without it we
cannot guarantee a successful boot into the upgrade environment.
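
The decision described in this commit message can be sketched roughly as follows. This is a hedged illustration only: the function names and the marker string used in the usage note are hypothetical, and the actual check performed by leapp may look quite different.

```python
def find_problematic_lines(grubcfg_text, markers):
    """Return (line number, line) pairs for non-comment lines containing a marker.

    `markers` is supplied by the caller; what counts as problematic is an
    assumption of this sketch, not something taken from the PR.
    """
    hits = []
    for lineno, line in enumerate(grubcfg_text.splitlines(), start=1):
        stripped = line.strip()
        if not stripped or stripped.startswith("#"):
            continue  # ignore blank lines and comments
        if any(marker in stripped for marker in markers):
            hits.append((lineno, stripped))
    return hits


def select_grubcfg(system_cfg, bundled_cfg, markers):
    """Pick the bundled config only when the system one looks problematic."""
    if find_problematic_lines(system_cfg, markers):
        return bundled_cfg
    return system_cfg
```

For example, calling `select_grubcfg(text, bundled, ("configfile",))` would fall back to the bundled file whenever the system config chains to another config file; the `configfile` marker here is purely an example.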
@MichalHe MichalHe force-pushed the arm_custom_grubcfg branch 5 times, most recently from 4097f19 to 730fccc Compare January 28, 2025 14:29
Michal Hecko added 3 commits January 28, 2025 15:42
Use a separate BLS directory '/boot/upgrade-loader/entries'
that mimics '/boot/loader/entries'. This allows fine-grained
control over which boot entries are available when booting
into the upgrade environment via a separate EFI entry.
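
As a rough sketch of the idea in this commit message, the snippet below renders a Boot Loader Specification entry and writes it into a dedicated entries directory instead of the default one. The function names, the `options` value, and the exact file layout are assumptions for illustration; the kernel and initramfs file names are taken from the /boot listing in the review logs.

```python
import os


def render_bls_entry(title, version, linux, initrd, options):
    """Render a minimal BLS entry as 'key value' lines (hypothetical helper)."""
    fields = [
        ("title", title),
        ("version", version),
        ("linux", linux),
        ("initrd", initrd),
        ("options", options),
    ]
    return "".join("{} {}\n".format(key, value) for key, value in fields)


def write_upgrade_entry(boot_dir, machine_id, entry_text,
                        bls_subdir="upgrade-loader/entries"):
    """Write the entry into a separate BLS directory under boot_dir."""
    entries_dir = os.path.join(boot_dir, bls_subdir)
    os.makedirs(entries_dir, exist_ok=True)
    path = os.path.join(entries_dir, "{}-upgrade.aarch64.conf".format(machine_id))
    with open(path, "w") as entry_file:
        entry_file.write(entry_text)
    return path
```

Because the upgrade entry lives in its own directory, the regular entries under loader/entries do not need to be touched at all.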
Move the model used to implement ARM bootloader workarounds to
common, as this model will also be used when adding/removing
kernel entries to use a custom blsdir.
@pirat89 pirat89 added the changelog-checked The merger/reviewer checked the changelog draft document and updated it when relevant label Jan 29, 2025
@pirat89 pirat89 left a comment

The overall solution is technically pretty awesome! I haven't found any functional problems in the code, just minor nitpicks, mainly doc-related. For example, someone could be confused that ArmWorkaroundEFIBootloaderInfo is defined and consumed in the common repo but produced only in the el8toel9 repo. In general this is a valid scenario, as with tasks and similar models that can be produced from various repositories but are processed in the common one. In this particular case, however, the use case is very specific, so we should document it better in the future, or possibly move the rest of the related actor code to the common repo so that everything lives in one repo. But as said earlier, this is a nitpick and has no real impact.

Tested manually on AWS 8.10 -> 9.5, works as expected

  • the original EFI boot entry is unaffected before the reboot
  • the upgrade boot entry is separate: it is present only in the upgrade EFI boot entry
  • the system after the upgrade seems clean regarding the bootloader and /boot content

logs

  • prior to the reboot
[root@ip-172-31-41-18 ~]# cd /boot
[root@ip-172-31-41-18 boot]# ls
config-4.18.0-553.34.1.el8_10.aarch64  initramfs-0-rescue-ffffffffffffffffffffffffffffffff.img  loader                                     vmlinuz-0-rescue-ffffffffffffffffffffffffffffffff
dtb-4.18.0-553.34.1.el8_10.aarch64     initramfs-4.18.0-553.34.1.el8_10.aarch64.img             symvers-4.18.0-553.34.1.el8_10.aarch64.gz  vmlinuz-4.18.0-553.34.1.el8_10.aarch64
efi                                    initramfs-4.18.0-553.34.1.el8_10.aarch64kdump.img        System.map-4.18.0-553.34.1.el8_10.aarch64  vmlinuz-upgrade.aarch64
grub2                                  initramfs-upgrade.aarch64.img                            upgrade-loader
[root@ip-172-31-41-18 boot]# find upgrade-loader/
upgrade-loader/
upgrade-loader/ec29cb44830e5764d8db201b554ccb9c-upgrade.aarch64.conf
[root@ip-172-31-41-18 boot]# find loader/
loader/
loader/entries
loader/entries/ffffffffffffffffffffffffffffffff-0-rescue.conf
loader/entries/ffffffffffffffffffffffffffffffff-4.18.0-553.34.1.el8_10.aarch64.conf
[root@ip-172-31-41-18 boot]# grubby --info ALL | grep vmlinuz
kernel="/boot/vmlinuz-4.18.0-553.34.1.el8_10.aarch64"
kernel="/boot/vmlinuz-0-rescue-ffffffffffffffffffffffffffffffff"
[root@ip-172-31-41-18 boot]# efibootmgr
BootNext: 0003
BootCurrent: 0001
Timeout: 0 seconds
BootOrder: 0003,0002,0000,0001
Boot0000* UiApp
Boot0001* UEFI Amazon Elastic Block Store vol059f500503552513d 1
Boot0002* Red Hat Enterprise Linux
Boot0003* Leapp Upgrade
[root@ip-172-31-41-18 boot]# 
### I forgot to take output of the upgrade grub config
  • after the upgrade (before manual post-upgrade steps)
[root@ip-172-31-41-18 ~]# uname -r
5.14.0-503.22.1.el9_5.aarch64

[root@ip-172-31-41-18 ~]# efibootmgr 
BootCurrent: 0001
Timeout: 0 seconds
BootOrder: 0002,0000,0001
Boot0000* UiApp
Boot0001* UEFI Amazon Elastic Block Store vol059f500503552513d 1
Boot0002* Red Hat Enterprise Linux

[root@ip-172-31-41-18 ~]# grubby --info ALL | grep vmlinuz
kernel="/boot/vmlinuz-4.18.0-553.34.1.el8_10.aarch64"
kernel="/boot/vmlinuz-0-rescue-ffffffffffffffffffffffffffffffff"
kernel="/boot/vmlinuz-5.14.0-503.22.1.el9_5.aarch64"
kernel="/boot/vmlinuz-0-rescue-ec29cb44830e5764d8db201b554ccb9c"

[root@ip-172-31-41-18 boot]# find /boot/efi/EFI/
/boot/efi/EFI/
/boot/efi/EFI/BOOT
/boot/efi/EFI/BOOT/BOOTAA64.EFI
/boot/efi/EFI/BOOT/fbaa64.efi
/boot/efi/EFI/redhat
/boot/efi/EFI/redhat/grubaa64.efi
/boot/efi/EFI/redhat/BOOTAA64.CSV
/boot/efi/EFI/redhat/mmaa64.efi
/boot/efi/EFI/redhat/shim.efi
/boot/efi/EFI/redhat/shimaa64-redhat.efi
/boot/efi/EFI/redhat/shimaa64.efi
/boot/efi/EFI/redhat/grub.cfg
/boot/efi/EFI/redhat/grubenv.rpmsave
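
The boot-entry observations in these logs can also be checked mechanically. As a rough sketch (the function name `parse_efibootmgr` is hypothetical, and the parsing only covers the fields that appear in the output shown here), one could parse the efibootmgr output and compare the state before and after the upgrade:

```python
import re


def parse_efibootmgr(output):
    """Parse efibootmgr text output into a dict (illustrative helper only)."""
    info = {"entries": {}}
    for line in output.splitlines():
        line = line.strip()
        # Boot entries look like 'Boot0003* Leapp Upgrade'.
        match = re.match(r"^Boot([0-9A-Fa-f]{4})(\*?)\s+(.*)$", line)
        if match:
            info["entries"][match.group(1)] = match.group(3)
            continue
        # Remaining lines are 'Key: value' pairs such as 'BootNext: 0003'.
        if ":" in line:
            key, _, value = line.partition(":")
            info[key.strip()] = value.strip()
    if "BootOrder" in info:
        info["BootOrder"] = info["BootOrder"].split(",")
    return info
```

With the pre-reboot output, `BootNext` resolves to the "Leapp Upgrade" entry; with the post-upgrade output, that entry is no longer listed.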

Comment on lines +195 to +197
def _notify_user_to_check_grub2_cfg():
    # Or maybe rather ask a question in a dialog? But this is rare, so maybe continuing is fine.
    pass

@MichalHe forgotten dead code? :)

pirat89 commented Jan 29, 2025

The failed test is an internal test error.

@pirat89 pirat89 merged commit 31af8f4 into oamg:main Jan 29, 2025
22 of 23 checks passed