kops 1.31 - kops update cluster instance-group-role targeting causes side-effects #17294

Open
mkoepke-xion opened this issue Feb 28, 2025 · 0 comments
Labels
kind/bug Categorizes issue or PR as related to a bug.

mkoepke-xion commented Feb 28, 2025

/kind bug

1. What kops version are you running? The command kops version, will display
this information.

1.31.0

2. What Kubernetes version are you running? kubectl version will print the
version if a cluster is running or provide the Kubernetes version specified as
a kops flag.

1.31.6

4. What commands did you run? What is the simplest way to reproduce this issue?

kops update cluster --instance-group-role=control-plane --yes
kops rolling-update cluster --instance-group-role=control-plane --yes
kops update cluster --instance-group-role=node --yes
kops rolling-update cluster --instance-group-role=node --yes
kops rolling-update cluster
NAME             STATUS       NEEDUPDATE  READY  MIN  TARGET  MAX  NODES
bastion-zone-01  Ready        0           1      1    1       2    0
master-zone-01   Ready        1           1      1    1       1    1
master-zone-02   Ready        1           1      1    1       1    1
master-zone-03   Ready        1           1      1    1       1    1
node-zone-01     NeedsUpdate  0           3      3    3       6    3
node-zone-02     NeedsUpdate  0           3      3    3       6    3
node-zone-03     NeedsUpdate  0           3      3    3       6    3

kops update cluster --instance-group-role=control-plane

I0228 11:25:46.077157    2426 executor.go:113] Tasks: 0 done / 123 total; 53 can run
I0228 11:25:47.337779    2426 executor.go:113] Tasks: 53 done / 123 total; 45 can run
I0228 11:25:47.642155    2426 executor.go:113] Tasks: 98 done / 123 total; 9 can run
I0228 11:25:47.960137    2426 executor.go:113] Tasks: 107 done / 123 total; 2 can run
I0228 11:25:48.263454    2426 executor.go:113] Tasks: 109 done / 123 total; 5 can run
I0228 11:25:49.489246    2426 executor.go:113] Tasks: 114 done / 123 total; 6 can run
I0228 11:25:49.936979    2426 executor.go:113] Tasks: 120 done / 123 total; 3 can run
I0228 11:25:51.164291    2426 executor.go:113] Tasks: 123 done / 123 total; 0 can run
Will modify resources:
  ManagedFile/xi-paas-staging.xiaas2.k8s.local-addons-bootstrap
  	Contents
  	                    	...
  	                    	    - id: k8s-1.16
  	                    	      manifest: networking.cilium.io/k8s-1.16-v1.15.yaml
  	                    	+     manifestHash: e36e05e73bb68c69546064c016187d99c57bbe154167a58555bc93d16844604a
  	                    	-     manifestHash: 9133199d404c1951330e82c5fb81441e512f6dbb64c31f9e537f9a8595a1565b
  	                    	      name: networking.cilium.io
  	                    	      needsRollingUpdate: all
  	                    	...


  ManagedFile/xi-paas-staging.xiaas2.k8s.local-addons-networking.cilium.io-k8s-1.16
  	Contents
  	                    	...
  	                    	    namespace: kube-system
  	                    	  spec:
  	                    	+   replicas: 2
  	                    	-   replicas: 1
  	                    	    selector:
  	                    	      matchLabels:
  	                    	...

5. What happened after the commands executed?

Control-plane nodes were marked as NeedsUpdate.

6. What did you expect to happen?

No node in the cluster should need an update, since we had already done a complete update of the cluster, with update and rolling-update targeting all nodes.

9. Anything else do we need to know?

We are running Cilium with an HA control plane. Running kops update cluster --instance-group-role=node seems to make the Cilium templating conclude that the control plane is not HA, and it changes the replica settings accordingly.

Probably in this code:

if tf.HasHighlyAvailableControlPlane() {
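
To make the suspicion concrete, here is a minimal sketch in Go of how role targeting could flip an HA check. It is not the kOps implementation: the InstanceGroup type, filterByRole, hasHighlyAvailableControlPlane, and renderedReplicas are all hypothetical names, and the assumption that HA is detected by counting the control-plane instance groups visible after role filtering is mine. It would at least be consistent with the diff above, where a control-plane-targeted update moves replicas from 1 back to 2.

```go
package main

import "fmt"

// InstanceGroup is a simplified, hypothetical stand-in for a kOps instance group.
type InstanceGroup struct {
	Name string
	Role string // "control-plane", "node", or "bastion"
}

// filterByRole keeps only the groups with the given role; "" keeps everything.
func filterByRole(igs []InstanceGroup, role string) []InstanceGroup {
	if role == "" {
		return igs
	}
	var out []InstanceGroup
	for _, ig := range igs {
		if ig.Role == role {
			out = append(out, ig)
		}
	}
	return out
}

// hasHighlyAvailableControlPlane reports whether more than one control-plane
// group is visible. ASSUMPTION: this counting rule is illustrative only.
func hasHighlyAvailableControlPlane(igs []InstanceGroup) bool {
	count := 0
	for _, ig := range igs {
		if ig.Role == "control-plane" {
			count++
		}
	}
	return count > 1
}

// renderedReplicas mirrors the replicas values seen in the diff: 2 when the
// control plane looks HA, otherwise 1.
func renderedReplicas(igs []InstanceGroup) int {
	if hasHighlyAvailableControlPlane(igs) {
		return 2
	}
	return 1
}

func main() {
	all := []InstanceGroup{
		{Name: "bastion-zone-01", Role: "bastion"},
		{Name: "master-zone-01", Role: "control-plane"},
		{Name: "master-zone-02", Role: "control-plane"},
		{Name: "master-zone-03", Role: "control-plane"},
		{Name: "node-zone-01", Role: "node"},
		{Name: "node-zone-02", Role: "node"},
		{Name: "node-zone-03", Role: "node"},
	}
	fmt.Println(renderedReplicas(all))                       // 2: all groups visible
	fmt.Println(renderedReplicas(filterByRole(all, "node"))) // 1: control plane filtered out
}
```

If the real check behaves anything like this, evaluating it against the unfiltered set of instance groups would keep the rendered manifest independent of the role being targeted.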

I understand that --instance-group-role targeting for kops update was introduced in 1.31 because of upstream changes in Kubernetes, and that it is a fairly recent addition meant to solve a specific issue (letting reconcile target the control plane first).

We have some automation around kops for our own updates. We use instance-group and instance-group-role targeting to roll out our changes in a controlled manner.

Comparing our (1.31-adjusted) workflow to kOps reconcile yields one simple difference:

  • reconcile updates the control plane and then updates everything
  • we update the control plane and then update only the nodes

While it is easy to change our side to work like reconcile, it makes me wonder:
What is the expectation from the kOps side here? Is kops update --instance-group-role expected to work without side effects when only the node role is targeted, or is targeting only nodes not expected to work?

@k8s-ci-robot k8s-ci-robot added the kind/bug Categorizes issue or PR as related to a bug. label Feb 28, 2025