Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix: set readyToUse: false only when the machine is dirty #22

Closed

Conversation

utkuozdemir
Copy link
Member

@utkuozdemir utkuozdemir commented Jan 22, 2025

Change the logic of infraMachineStatus.readyToUse flag to represent the dirtiness state, i.e., if there is a pending wipe for that machine.

Depends on siderolabs/omni#868 to pass tests.

Closes siderolabs/omni#850.


infraMachineStatus.TypedSpec().Value.ReadyToUse = false

return controller.NewRequeueInterval(1 * time.Second)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It will get persisted if we return nil here too. Why do we need to requeue?

Copy link
Member Author

@utkuozdemir utkuozdemir Jan 23, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is an overall improvement of sorts - I thought it's probably a better idea to only persist the new state in the happy path. To avoid unwanted modifications.

Nvm, didn't read the question properly. The actual reason is, after we mark the machine as "not ready to use", we trigger a reconcile, so that the next reconcile will not fall into this if condition, and the pending wipes/reboots will be processed.

But because we want the "not ready to use" flag to be updated immediately so in can be observed in Omni, we do "save & requeue".

Comment on lines 214 to 213
// the changes will trigger a new reconciliation, we can simply return here
return nil
return controller.NewRequeueErrorf(h.requeueInterval, "wiped the machine, requeue")
Copy link
Member

@Unix4ever Unix4ever Jan 23, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The comment no longer reflects the code.
Why do we need to requeue here?

Copy link
Member Author

@utkuozdemir utkuozdemir Jan 24, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Removed the comment ✅ .
The reason is: I thought not persisting the changes if we have pending tasks would be "safer". But actually we don't need to return here at all, so I removed the return now.

@utkuozdemir utkuozdemir force-pushed the fix-ready-to-use-flag branch 2 times, most recently from c0960ed to 38c20e8 Compare January 24, 2025 10:22
Change the logic of `infraMachineStatus.readyToUse` flag to represent the dirtiness state, i.e., if there is a pending wipe for that machine.

Signed-off-by: Utku Ozdemir <[email protected]>
@utkuozdemir utkuozdemir force-pushed the fix-ready-to-use-flag branch from 38c20e8 to 8227a3d Compare January 24, 2025 10:25
@utkuozdemir
Copy link
Member Author

pulled in in #25, closing.

@utkuozdemir utkuozdemir deleted the fix-ready-to-use-flag branch January 26, 2025 13:06
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

🐛 machines provisioned by the bare-metal infra provider get stuck in reconfiguring
3 participants