CM5 without wifi hangs on reboot #6647
Comments
I did some further research and noticed that
What could be the reason that It also seems that if the After I performed a firmware update to A downgrade to Handover to OS is about 8-9 seconds, so I don't think that the difference is resulted by something like this. So it seems to me that this might be a firmware related issue or at least it has some influence. Did also some testing on a CM4 without wifi and there |
Hi Nicolai, we'll look into disabling SDIO2 from the firmware for non-WiFi-enabled parts. |
Thanks Phil for the update |
pieeprom_cm5nowifi.zip |
Give me some minutes and I will test it, I have modules at hand ... |
I can confirm,
Reboot is also working without hang / delay. |
Great. We'll get that merged, then into a release at some point. |
Thanks. In the meantime I will do some thinking and come up with some tooling for our end of line tests, so we can update the modules in place. |
Just a note for others who might need to work around the issue that the first reboot after the firmware update still hangs (which is fine, as we're still running the old firmware at that point):
|
It's odd that a non-WiFi CM5 is rebooting without issue for me. I've tried rebooting before the The power mode difference is just an indicator of whether or not the kernel has given up on there being something on that SDIO bus - it turns off the power when it loses hope. |
Yes, at some point the device is rebooting (after the driver gives up on |
The patch to disable sdio2 has been merged, so future EEPROM builds will include it. I do wonder though if the kernel retry mechanism can be adjusted to not take quite so long. |
That was also what I was initially thinking when I raised this issue. I haven't had the time to dig deeper into what the differences between bcm2711 and 2712 are here, but from a first look they share at least the same driver for |
* recovery: Walk partitions to delete recovery.bin
  Previously, recovery.bin would fail to delete itself if the bootrom loaded recovery.bin where there are multiple FAT partitions and the first partition does not contain recovery.bin. Update the rename code to walk the partition table to find the recovery.bin file to delete.
* pi5: Add config filter for simple boot variable expressions (experimental)
  Add support for a new bootloader/config.txt conditional filter which tests the partition, boot_count and boot_arg1 variables.
  Syntax (no spaces):
  ARG: boot_arg1, boot_count or partition (EEPROM config stage only)
  [ARG=VALUE] selected if (ARG == VALUE)
  [ARG&MASK] selected if ((ARG & MASK) != 0)
  [ARG&MASK=VALUE] selected if ((ARG & MASK) == VALUE)
  [ARG<VALUE] selected if (ARG < VALUE)
  [ARG>VALUE] selected if (ARG > VALUE)
  where VALUE and MASK are unsigned integer constants and ARG corresponds to the value in the reset register before the config file is parsed.
* pi5: Add a boot-count bootloader variable (experimental)
  Store the boot-count in a reset register and increment just before the boot-order state-machine. The boot-count variable is visible via device-tree /proc/device-tree/chosen/bootloader/count and can be read/set via vcmailbox
  GET: sudo vcmailbox 0x0003008d 4 4 0
  SET to N: sudo vcmailbox 0x0003808d 4 4 N
* pi5: Add user-defined reboot argument (boot_arg1) (experimental)
  Add support for a user-defined boot parameter stored in a reset-safe scratch register on BCM2712. This is visible via device-tree at /proc/device-tree/chosen/bootloader/arg1 and via vcmailboxes
  GET arg1: sudo vcmailbox 0x0003008c 8 8 1 0
  SET arg1 to 42: sudo vcmailbox 0x0003808c 8 8 1 42
  or via config.txt set_reboot_arg1=42
  The variable is NOT cleared automatically and will persist until a power-on-reset.
* Enable overriding of high partition numbers
  Previously, the PARTITION=N bootloader config setting would only be used at power on reset or if the partition number passed to reboot was zero. Change the behaviour so that the bootloader config PARTITION property can override the reboot partition number if the reboot parameter is > 31.
* Disable WiFi PMIC output on CM5 modules without WiFi
  Disable the 3.7V WiFi power supply on CM5 modules which do not have a WiFi module fitted. This fixes some stability issues where a CM5 would shutdown due to a spurious over-voltage condition on the non-connected WiFi power supply.
* Add memory barrier to the mbox handler
  Firmware issue 1944 reports receiving kernel warnings about firmware requests where the status return code is 0. This should not be possible, as handle_mbox_property always sets the top bit of the return code, with the bottom bit indicating success or failure. If the firmware had died, the firmware driver would report a timeout due to the lack of a mailbox interrupt, and that isn't happening.
  See: raspberrypi/firmware#1944
* support dts files with size-cells of 2
  DTS files with a top-level #size-cells of 2 make a lot of sense for systems with a lot of RAM, but the firmware is currently inconsistent in its support for that. Fix up the other cases to honor #size-cells and #address-cells.
* Disable SDIO2 for CM5s without WiFi
  It has been observed that CM5s without WiFi hang on reboot. To prevent that, disable the sdio2 node on those devices.
  See: raspberrypi/linux#6647
* arm_dt: Use dtoverlay_enable_node
  Convert the open-coded DT node status changes to use the new dtoverlay method dtoverlay_enable_node.
* dtoverlay: Add dtoverlay_enable_node
  Add a helper function for setting the status of a node.
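For readers trying to follow the experimental config filter syntax in the release notes above, here is a rough Python sketch of how the five forms could be evaluated. It is not the firmware's code: the function name `filter_selected`, the regular expression, and the acceptance of hex constants are assumptions made for illustration, and the `[ARG&MASK]` form is read as a test against MASK.

```python
import re

# Hypothetical parser for one filter such as "[boot_count>3]".
# Groups: variable name, optional mask, optional operator, optional value.
_FILTER = re.compile(r"^\[(\w+)(?:&(\w+))?(?:([=<>])(\w+))?\]$")


def filter_selected(expr, variables):
    """Return True if the filter expression selects, given the variable values."""
    m = _FILTER.match(expr)
    if m is None:
        raise ValueError(f"unrecognised filter: {expr!r}")
    name, mask, op, value = m.groups()
    arg = variables[name]  # e.g. boot_arg1, boot_count or partition

    if mask is not None:
        masked = arg & int(mask, 0)  # base 0 accepts "16" and "0x10" (assumption)
        if op is None:
            return masked != 0                # [ARG&MASK]
        if op == "=":
            return masked == int(value, 0)    # [ARG&MASK=VALUE]
        raise ValueError(f"unsupported masked comparison: {expr!r}")

    if op is None:
        raise ValueError(f"missing comparison: {expr!r}")
    rhs = int(value, 0)
    if op == "=":
        return arg == rhs                     # [ARG=VALUE]
    if op == "<":
        return arg < rhs                      # [ARG<VALUE]
    return arg > rhs                          # [ARG>VALUE]
```

For example, `filter_selected("[boot_count>3]", {"boot_count": 5})` selects, while `filter_selected("[partition=2]", {"partition": 1})` does not.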
The rescan code tries 3 different card types at 4 different clock frequencies. All of those tests involve timeouts of specific durations, so they shouldn't simply be shortened. The other approach would be to make the scanning interruptible at some granularity - at least between frequencies. There may be a way to mark that the interface is being shut down - perhaps using the |
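To make the idea of "interruptible between frequencies" concrete, here is a small Python model of such a loop. It is only a sketch of the control flow, not kernel code: the names `rescan`, `probe` and `shutting_down`, the card-type labels, and the frequency values are all placeholders chosen for illustration.

```python
import threading

# Placeholder values: 4 clock frequencies and 3 card types, as described above.
FREQS_HZ = (400_000, 300_000, 200_000, 100_000)
CARD_TYPES = ("sdio", "sd", "mmc")


def rescan(probe, shutting_down):
    """Try every card type at every frequency, checking for shutdown between frequencies.

    probe(card_type, freq_hz) stands in for a probe attempt that may block
    until its own timeout expires; those timeouts are left untouched.
    Returns ((card_type, freq_hz) or None, number_of_probe_attempts).
    """
    attempts = 0
    for freq in FREQS_HZ:
        # The only interruption point: once per frequency, never mid-probe.
        if shutting_down.is_set():
            return None, attempts
        for card in CARD_TYPES:
            attempts += 1
            if probe(card, freq):
                return (card, freq), attempts
    return None, attempts
```

With an empty slot and no shutdown request, this runs all 12 probe attempts. If the shutdown flag is raised during the first frequency, the loop stops after 3 attempts instead, which is the saving the comment above is pointing at.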
same issue with Pi5
|
This issue smells very similar to this: https://forums.raspberrypi.com/viewtopic.php?t=288866
After running these two commands (with mmc0), I am able to shutdown my CM5Lite, booted off NVMe, no SD card inserted, with no hang. Though, the unbind takes ~24s intermittently (sometimes <100ms, sometimes 20-50s) which is not ideal. My current workaround is to just disable the interface entirely with a dtoverlay...but it would be nice to be able to still have an SD card work.
|
There is already an overlay for this (it's called disable-wifi or wlan, I think). But this shouldn't be necessary with the firmware update. Did you already update the EEPROM on your CM5? If not: sudo rpi-eeprom-update -a |
Still having this problem with the Raspberry Pi Compute Module 5 with Linux 6.6.51, 6.6.74 and 6.12.19 from rpi-update. I also have the latest EEPROM via sudo rpi-eeprom-update -a. It works once after updating the Linux version, but after rebooting once it goes back to the same issue, where it is stuck on watchdog0 or systemd halt on both reboot and halt. This is from a fresh install of Raspberry Pi OS Lite 64-bit from Raspberry Pi Imager. Can anybody help me with this issue? |
Report output of |
root@raspberrypi:~# vcgencmd bootloader_version |
Okay, that should contain the fix referenced here. |
Whenever I try to shut down or reboot the Pi, it still has the same issue of stalling for between 20 and 50 seconds, and dmesg still reports mmc errors even when the SD card is disabled in config.txt. I don't know how to fix this issue, and I have also tried multiple IO boards with the same result. |
What does |
tc@raspberrypi:~ $ sudo vclog -m |
Thanks.
These lines show that the firmware has detected your no-WiFi CM5 and disabled Bluetooth and WiFi (or at least attempted to). The rest shows that you have several other overlays and parameters in there. Please remove them (or comment them out) for testing purposes. |
I am sorry, these parameters were from a non-fresh install. I have now done a fresh install to remove any extra variables: everything is at default settings, and I only did a sudo apt update and upgrade. When rebooting, the issue persists. Here is my sudo vclog -m:
This is my version of raspberry pi os lite 64 bit:
And here is my bootloader version:
still hanging after watchdog 0: 1/10 times it reboots instantly but 9/10 times hanging between 20-50 secs. No SD card is inserted. |
sorry i don't know how to fix the layout issue to make it more readable i am quite new to github |
also just found this in dmesg: |
What do these commands report?
|
[ I've added to the list of things to try ] |
Do you know how I can entirely disable mmc0 for now? I have an SD card slot, which is nice, but I am using the NVMe drive for booting, so I am currently not using the SD card slot. At least I think the problem is caused by mmc0, because when an SD card is inserted the issue is gone, but without one it is there. Also, it isn't disabling wifi as it did for the person who originally had the problem and updated the EEPROM of the Raspberry Pi.
|
when inserting an sd card dmesg shows this.
With this it reboots and shuts down properly :/ When testing with it inserted when booting and removing it before halting or rebooting, it hangs again.
When testing with the card removed when booting and adding the card while rebooting, it also reboots and halts just fine. Also, I noticed that it sometimes just hung even with the SD card, but adding dtoverlay=disable-wifi fixed that part of the issue. I still have issues with mmc0, I believe. |
I see you've gone back and added a lot of important information to your comments - this isn't great, because we don't get any notification that you've done so. The most significant addition is this one:
It seems as though your issue isn't really about the WiFi and BT interfaces any more, but rather it's the normal SD card that must be present in order to guarantee a prompt reboot. This is a variant of the same problem - that SD/MMC probing can be slow to interrupt - but the difference is that the firmware doesn't know you won't be using an SD card and therefore doesn't (and currently can't) disable the SD interface. This isn't an issue on Pi 5 because there is a card detect signal, but not so on CM5, so some other approach is required. It would be simple to add a dtparam or dtoverlay to disable the mmc0 SD interface - getting the SD interface to give up quicker when rebooting is likely to be more difficult. |
Yes, sorry for editing my posts and not making new ones; I was in the middle of testing and did not want to make 10 new posts under here. There should already exist a dtparam to disable the SD card, but for me it didn't detect it. Could you guide me in disabling the SD card, because it is non-essential? Should I make a new issue on this topic or can we continue in this thread? I also always notice that the problem goes away after a fresh install or after removing the wifi module in config.txt, but after a maximum of two reboots it returns immediately. The Compute Module 5 should also have an SD detect pin going directly to the SD card slot, so I am confused as to why this issue exists. Also, I am currently testing with the Raspberry Pi CM5IO board and the Waveshare CM5IO PoE board. The results are the same on both. |
The CM5 lacks a card detect signal, so it can be useful to be able to disable the external SD card slot (or onboard EMMC on a non-Lite board). See: raspberrypi#6647 Signed-off-by: Phil Elwell <phil@raspberrypi.com>
The new (to CM5) dtparams You would test it with |
okay i will prepare a fresh drive and try it on it then. i will edit my post with it working or not. |
on a fresh install i keep getting this error after doing sudo apt update not upgrade and afterwards sudo rpi-update pulls/6744
edit: NVM noticed my mistake and put the wrong number after pulls/ i should just copy it instead of typing over |
okay after updating i am at kernel version 6.12.20-v8-16k+
wifi is still not being disabled by default which i find weird. i am also getting a weird bug that i had earlier when running regular sudo raspi-update where randomly when running certain commands like nano cpu load spikes to 25% from 0 and stays there and hangs my ssh connection. also when booting it hangs at two spots for around 30 seconds i did confirm that using dtparam=sd=off removed the mmc0 module and it reboots and shutdowns without hanging. However because of the instability and long boot time i cannot use this build currently for regular use.
If this issue can be fixed for the next stable branch, patch update, or how i can add it myself will be a huge help. |
it seems the hang after 3 seconds was fixed by disabling wifi and bluetooth. It still liked to randomly hang when using commands like dmesg and nano. when trying the command it hangs and cpu load goes to 25 percent. probably something corrupt with the drive after updating. that it is taking this long. after idling for a while i get this in dmesg.
it does have something to do with the boot drive after updating. |
Describe the bug
We stumbled over an issue where all CM5 without wifi seem to hang when rebooted. After some waiting the reboot is completed, whereas all CM5 with wifi show no such error (same base boards, same software). As in some rare cases the reboot even worked on CM5 without wifi, I started to debug it further.
When reboot hangs:
When reboot works immediately:
The culprit seems to be (always present when the reboot works):
So it looks like there might be an issue with the unused sdio/mmc1 which is not used on the wifi less variant of CM5. In order to verify my suspicion I've created a simple overlay which deactivates sdio2 completely:
With this the reboot works reliably in all tests so far. Even though it kinda works with a custom overlay it looks wrong. It also is not a reliable solution for production as during first boot only the cm5io dt loaded by the firmware is present and a subsequent reboot will fail very often.
Same works on CM4 with / without wifi (different overlay though, but should be irrelevant as it also happens with pure CM dt).
Any ideas / insights on this?
Steps to reproduce the behaviour
sudo reboot
Device (s)
Raspberry Pi CM5
System
EEPROM release: 1727096576
Kernel:
6.6.74+rpt-rpi-v8
Logs
No response
Additional context
No response