diskful provision cannot boot from disk #449

Open
gustavoberman opened this issue Feb 14, 2025 · 8 comments

@gustavoberman

Hello there!

I'm testing in a VM setup,
following the instructions here: https://docs.clustervision.com/howtos/disk-full-node/

I can see the node booting, downloading the image, partitioning, formatting, and then switching to that installation.
But after a reboot it does exactly the same: it reinstalls again.

I was expecting it to boot from disk,
so I set netboot to no for the node:
luna node change node003 -nb n

That changed the boot process, and it no longer tries to reinstall everything.
But it does not boot the system.
It stays on a blank screen with just one line saying:

Boot from SAN device 0x80

In the BIOS, the boot order is net->disk.

I also tried selecting the menu option

Exit and continue BIOS boot order

But the result is similar, a blank screen saying:

Continuing BIOS boot order...

No more network devices

Booting from Hard Disk...

And it stays like that.

I also booted from an ISO to check the boot flags of the partitions, and I think they look fine:

Image

Any ideas?
Thanks!

@aphmschonewille
Member

aphmschonewille commented Feb 14, 2025

Probably a stupid question, but is local disk boot configured in the BIOS, as in the list of bootable devices? Apologies if I ask a redundant question, but I just have to make sure.

-A

@gustavoberman
Author

> Probably a stupid question, but is local disk boot configured in the BIOS, as in the list of bootable devices? Apologies if I ask a redundant question, but I just have to make sure.
>
> -A

Yes, it is.
The boot order in the BIOS is:
1- network
2- disk

@aphmschonewille
Member

aphmschonewille commented Feb 16, 2025

There might be a compatibility issue with the VMware approach. Not all VMware instances handle sanboot requests for 0x80 well. We try to support all possible combinations, including UEFI disks, but sometimes a setup falls through the cracks. The good news is that, since we use templates, these can easily be changed to what you need.

I'd try the following:

  • In the file /trinity/local/luna/daemon/templates/templ_boot_disk.cfg, replace the section under ':disk' with the following:
echo ${cls}
echo Continuing BIOS boot order...
sleep 1
exit

You can keep the original lines by commenting them out with a leading #.

  • Restart the luna2 daemon.
  • Boot the node.
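
The template edit described in this comment can also be scripted. The following is only a minimal sketch. It assumes the template uses plain iPXE-style labels (lines starting with ':') and that every line under ':disk' can safely be disabled with a leading '#'; neither is confirmed for the actual file. `patch_disk_section` is a hypothetical helper, not part of luna. It reads the template on stdin and writes the patched text to stdout, so the original file is never modified in place.

```shell
#!/bin/sh
# Hypothetical helper (not part of luna): comment out the existing body of
# the ':disk' section and insert the replacement lines suggested above.
# Assumption: sections are delimited by lines that start with ':'.
patch_disk_section() {
    awk '
        # Any label line starts a new section; remember whether it is :disk.
        /^:/ {
            in_disk = ($0 == ":disk")
            print
            if (in_disk) {
                print "echo ${cls}"
                print "echo Continuing BIOS boot order..."
                print "sleep 1"
                print "exit"
            }
            next
        }
        # Keep the original :disk body, but commented out.
        in_disk && NF > 0 { print "#" $0; next }
        # Everything else passes through unchanged.
        { print }
    '
}
```

One possible use is `patch_disk_section < /trinity/local/luna/daemon/templates/templ_boot_disk.cfg > /tmp/templ_boot_disk.cfg.new`, followed by a manual review of the result before copying it into place and restarting the luna2 daemon.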

@gustavoberman
Author

>   • In the file /trinity/local/luna/daemon/templates/templ_boot_disk.cfg, replace the section under ':disk' with the following:
> echo ${cls}
> echo Continuing BIOS boot order...
> sleep 1
> exit
>
> You can keep the original lines by commenting them out with a leading #.
>
>   • Restart the luna2 daemon.
>   • Boot the node.

Sadly, it still does not boot. It shows the following but does not start the OS:

Image

What's interesting is that the CPU is doing something on node003:

Image

@aphmschonewille
Member

Do you use UEFI and/or Secure Boot?
It's hard to judge what goes wrong exactly.

What if you temporarily change the boot order in the BIOS and boot from disk? Does that work?
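
On a Linux system that does boot (for example the rescue ISO mentioned earlier in the thread), the firmware mode can be checked from a shell. A minimal sketch, relying on the fact that the kernel exposes /sys/firmware/efi only when it was started through UEFI; the function name `firmware_mode` and its optional sysfs-root argument are illustrative choices, added here only so the check can be exercised against a scratch directory.

```shell
#!/bin/sh
# Report whether the running system was started via UEFI or legacy BIOS.
# /sys/firmware/efi is present only on UEFI boots.
# The optional first argument overrides the sysfs root (for testing only).
firmware_mode() {
    sysfs_root="${1:-/sys}"
    if [ -d "$sysfs_root/firmware/efi" ]; then
        echo "UEFI"
    else
        echo "BIOS"
    fi
}
```

Calling `firmware_mode` without arguments inspects the live system. Where the mokutil package is installed, `mokutil --sb-state` reports whether Secure Boot is enabled.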

@gustavoberman
Author

gustavoberman commented Feb 17, 2025

> Do you use UEFI and/or Secure Boot? It's hard to judge what goes wrong exactly.
>
> What if you temporarily change the boot order in the BIOS and boot from disk? Does that work?

This was BIOS, with neither UEFI nor Secure Boot.
A plain KVM/QEMU VM.

I then tried this:
I created a new UEFI VM.
After installing, I changed the boot order to disk first and rebooted.
I got the GRUB menu:

Image

But it gives me this error:

Image

The config is:

Image

@aphmschonewille
Member

aphmschonewille commented Feb 17, 2025

I am not sure why QEMU does this, but that's different from the VMware attempts, right? The above gave me the feeling it was VMware, not QEMU. My guess is that some QEMU setting prevents a UEFI boot. Maybe a regular BIOS boot solves this, though I do not know whether QEMU supports it. This will probably result in a lot of trial and error. I can report, however, that this approach, diskless through scripts plugin, does work in vSphere.

The last one, not finding the kernel, might be solved by creating a symlink to /boot inside /boot:

cd /boot
ln -s /boot boot

But you need to boot first. That can be done by removing '/boot' from the GRUB menu entry as a one-time workaround.
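
To illustrate why such a symlink can help: if /boot is a separate partition (an assumption, the thread does not confirm the layout), the bootloader sees the kernel at the top of that partition, while the menu entry asks for it under /boot/. A `boot` entry that points back to its own directory makes both paths resolve. The sketch below demonstrates the effect in a scratch directory instead of the real /boot. It uses a relative target (`.`) instead of the absolute /boot shown above; that is a commonly used variant, chosen here because an absolute target is interpreted against whichever root the reader of the filesystem uses. The file name vmlinuz-example is hypothetical.

```shell
#!/bin/sh
# Demonstrate a self-referencing "boot" symlink in a scratch directory that
# stands in for the root of a separate boot partition.
demo_boot_symlink() {
    part_root="$1"                      # stand-in for the partition root
    : > "$part_root/vmlinuz-example"    # hypothetical kernel file name
    # Relative target: "boot" resolves to the directory that contains it.
    ln -s . "$part_root/boot"
    # Succeeds only if the kernel is reachable with and without the prefix.
    [ -e "$part_root/vmlinuz-example" ] && [ -e "$part_root/boot/vmlinuz-example" ]
}
```

On a real node the equivalent would be `cd /boot && ln -s . boot`, which can only be run once the system has been booted by other means.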

@gustavoberman
Author

> But you need to boot first. That can be done by removing '/boot' from the GRUB menu entry as a one-time workaround.

This worked and the node booted from disk!
