feat: implement secure node join flow #924
0f385a4
to
75734bf
Compare
message LinkStatusSpec {
  string node_subnet = 1;
  string node_public_key = 2;
  string virtual_addrport = 3;
Why virtual?
That's for WireGuard over gRPC. We keep it there to track when it gets updated.
internal/backend/runtime/omni/controllers/omni/pending_machine_status.go
func getClient(
	ctx context.Context,
	r controller.Reader,
	pendingMachine *siderolink.PendingMachine,
) (*client.Client, error) {
We have more or less the same code in many places I think - should we consider moving them to a central place?
I've extracted it and reused it in the place I originally copy-pasted it from. The other places are slightly different.
0a4f343
to
f72cfa6
Compare
f72cfa6
to
45ba1e5
Compare
Fixes: siderolabs#840

This PR changes the Talos machine join flow drastically:

- a newly joined machine is first put into a limbo state where Omni creates a temporary WireGuard connection to it.
- the controller picks it up and tries to write a unique machine token to the newly joined machine; in the meantime it also resolves UUID conflicts automatically and writes a UUID override to the META partition.
- the machine re-joins Omni, now with the unique token.
- the unique token is saved in the `siderolink.Link` resource, and any subsequent join checks that the `siderolink.Link` has a matching unique token.

The Siderolink manager was refactored, as it was a huge, monolithic, poorly testable chunk. It was split into:

- LinkStatus controller, which creates/removes WireGuard peers.
- PendingMachineStatus controller, which ensures all joined machines have unique node tokens.
- Provision handler, which implements the gRPC server and now holds all logic related to machine acceptance.
- PeersPool, which is used by the LinkStatus controllers to deduplicate peer creation and reuse peers when possible.

Additionally, the siderolink loghandler was updated to not accept logger connections for machines which do not have corresponding log buffers.

Nodes which do not support the secure flow are still able to join by default. The secure join flow can be forced by setting the `--disable-legacy-join-tokens` flag.

Signed-off-by: Artem Chernyshev <[email protected]>
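
For readers skimming the thread, the "save the unique token on the link, then check it on every later join" step can be sketched roughly as below. This is only an illustration under stated assumptions, not the code from this PR: `Link`, `NodeUniqueToken`, `VerifyOrBind`, `ErrNoToken` and `ErrTokenMismatch` are hypothetical names, and the real `siderolink.Link` resource and Provision handler may be structured quite differently.

```go
package main

import (
	"crypto/subtle"
	"errors"
)

// Link is a hypothetical, simplified stand-in for the siderolink.Link
// resource. Only the field needed for the token check is modelled.
type Link struct {
	// NodeUniqueToken is empty until a machine has joined with a unique token.
	NodeUniqueToken string
}

var (
	// ErrNoToken means the machine presented no unique token. In the flow
	// described above such a machine would stay pending until a token is written.
	ErrNoToken = errors.New("machine did not present a unique token")

	// ErrTokenMismatch means the link is already bound to a different token.
	ErrTokenMismatch = errors.New("presented token does not match the stored link token")
)

// VerifyOrBind stores the presented token on the first secure join and
// requires an exact match on every subsequent join (assumed behaviour).
func VerifyOrBind(link *Link, presented string) error {
	if presented == "" {
		return ErrNoToken
	}

	if link.NodeUniqueToken == "" {
		// First join with a unique token: remember it on the link.
		link.NodeUniqueToken = presented

		return nil
	}

	// Constant-time comparison avoids leaking how many leading bytes matched.
	if subtle.ConstantTimeCompare([]byte(link.NodeUniqueToken), []byte(presented)) != 1 {
		return ErrTokenMismatch
	}

	return nil
}
```

With this sketch, the first call for a fresh `Link` binds the token, a repeated call with the same token succeeds, and a call with any other token returns `ErrTokenMismatch`. Legacy nodes that cannot present a token are deliberately left out here; how they are admitted when `--disable-legacy-join-tokens` is not set is decided elsewhere in the actual change.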
45ba1e5
to
9bb85f8
Compare