Sync MTU and check MTU valid values #149

Open
wants to merge 7 commits into
base: master

Member

@w13915984028 w13915984028 commented Jan 30, 2025

Problem:

A change of the MTU on a VlanConfig uplink is not propagated to the NAD.

Solution:

  1. Check the input MTU on VlanConfig
  2. Update the MTU in the labels of the ClusterNetwork when a change happens
  3. Sync the NAD's MTU with the labels of the ClusterNetwork when a change happens

This solution does not change the APIs or the UI. Instead, it propagates the MTU from the VlanConfig to the ClusterNetwork when a change happens, and makes sure that all VlanConfigs under the same ClusterNetwork have the same MTU.
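The "same MTU for all VlanConfigs under one ClusterNetwork" rule can be sketched roughly as follows. This is a minimal illustration, not the PR's actual `validateMTU` code: the `vlanConfig` struct and the `checkSameMTU` helper are hypothetical stand-ins for `networkv1.VlanConfig` and the webhook logic.

```go
package main

import "fmt"

// vlanConfig is a hypothetical, trimmed-down stand-in for networkv1.VlanConfig.
type vlanConfig struct {
	Name           string
	ClusterNetwork string
	MTU            int
}

// checkSameMTU returns an error if current's MTU differs from that of any
// other VlanConfig on the same cluster network. The entry with the same name
// as current is skipped, so updating the only VlanConfig is allowed.
func checkSameMTU(current vlanConfig, existing []vlanConfig) error {
	for _, vc := range existing {
		if vc.Name == current.Name || vc.ClusterNetwork != current.ClusterNetwork {
			continue
		}
		if vc.MTU != current.MTU {
			return fmt.Errorf("vlanconfig %s MTU %d differs from vlanconfig %s MTU %d",
				current.Name, current.MTU, vc.Name, vc.MTU)
		}
	}
	return nil
}

func main() {
	existing := []vlanConfig{{Name: "vc-3-1", ClusterNetwork: "cn3", MTU: 1440}}

	// A second VlanConfig with a different MTU is rejected.
	fmt.Println(checkSameMTU(vlanConfig{Name: "vc-3-2", ClusterNetwork: "cn3", MTU: 1450}, existing))

	// The same MTU, or an update of the only VlanConfig, passes.
	fmt.Println(checkSameMTU(vlanConfig{Name: "vc-3-2", ClusterNetwork: "cn3", MTU: 1440}, existing))
	fmt.Println(checkSameMTU(vlanConfig{Name: "vc-3-1", ClusterNetwork: "cn3", MTU: 1400}, existing))
}
```

Skipping the entry with the same name is what lets the last remaining VlanConfig change its MTU, which the test plan relies on.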

Related Issue:
harvester/harvester#4355
harvester/harvester#4752

HEP: harvester/harvester#6385
Doc PR: harvester/docs#640

Test plan:

  1. MTU propagation
    1.1 Create a clusternetwork
    1.2 Create a vlanconfig; the MTU from the vlanconfig is synced to the clusternetwork
    1.3 Create another vlanconfig; if its MTU is different, it is denied by the webhook
    1.4 Create a NAD; the MTU is inherited from the clusternetwork/vlanconfig

  2. Change the MTU of a clusternetwork
    2.1 Stop all VMs attached to a specific clusternetwork
    2.2 Delete/migrate vlanconfigs until only one is left; otherwise changing the MTU on any of them is denied per 1.3
    2.3 Change the last vlanconfig's MTU; it is then synced to the clusternetwork, and the MTU of any NADs attached to this clusternetwork is changed automatically
    2.4 Add more vlanconfigs; each needs to use the same MTU per 1.3

  3. Change the MTU of a mgmt network
    3.1 The mgmt network has no vlanconfig instances. You may edit the MTU annotation manually; make sure it really matches the NICs' MTU, as there is no strong check at the moment.
    3.2 If there are VMIs or storagenetwork on mgmt network, the change is denied.

More detailed tests about network related enhancements:

#149 (comment)

@w13915984028
Member Author

@mergify backport v0.5.x v0.6.x

mergify bot commented Jan 30, 2025

backport v0.5.x v0.6.x

🟠 Waiting for conditions to match

  • merged [📌 backport requirement]

@@ -264,8 +270,8 @@ func (v *Validator) validateMTU(current *networkv1.VlanConfig) error {
if vc.Name == current.Name {
Contributor

would it make sense to change it to just fetch the related cluster object crd and check the MTU from there?

Member Author

A few considerations:

  1. The current code assumes that the clusternetwork may be created after the vlanconfig, for legacy or other reasons

    func (h Handler) ensureClusterNetwork(name string) error {

  2. The MTU is synced to the clusternetwork as a label; in extreme cases, if a user tampers with it and sets an invalid value, the check may miss it.

@w13915984028 w13915984028 force-pushed the enh4355 branch 2 times, most recently from 5b393ff to dd757ea Compare January 31, 2025 08:26
@w13915984028 w13915984028 force-pushed the enh4355 branch 2 times, most recently from fd2db83 to 560c8b7 Compare January 31, 2025 08:55
@rrajendran17
Contributor

A few points:
1. Similar to VMs, if the MTU is updated after the storage network is configured, it is propagated to the storage network NAD, but the LH pods' interfaces will not pick up the new MTU.
A webhook check is needed to prevent updating the MTU while the storage network is configured. The user has to disable the storage network config before the MTU change and re-enable it afterwards.
2. In the storage network case, the vlan network under the vlanconfig also has to be deleted if the vlanconfig has to be deleted because of an MTU mismatch. Just a point to note while adding test cases.
3. It would be good to include a manual traffic test after the MTU change in the test plan (either between VMs or between LH pods, with jumbo frames).

@w13915984028
Member Author

w13915984028 commented Mar 4, 2025

With a lot of enhancements on ClusterNetwork, VlanConfig, NAD and StorageNetwork, a detailed test plan is listed.

test plan:

ClusterNetwork: non-mgmt

Create: 

(1) Create a clusternetwork with MTU annotation: 
  can't create cluster network cntst because label network.harvesterhci.io/uplink-mtu can't be added

Update:

(1) Update a clusternetwork's MTU annotation to another valid value when there is at least 1 vlanconfig
  can't update cluster network cn3 because clusternetwork has MTU 1430, but the vlanconfigs vc-3-1 has another MTU 1420

(2) Update a clusternetwork's MTU annotation to another valid value when there is no vlanconfig: this is OK, but when a vlanconfig is added later, the value is overwritten by the vlanconfig's MTU
  network.harvesterhci.io/uplink-mtu: '1550'

Delete:

(1) A clusternetwork has vlanconfigs
  can't delete cluster network cn4 because vlanconfig(s) [vc-2-1] under this clusternetwork are still existing

(2) A clusternetwork has no vlanconfigs but NADs (NAD is not blocking the deletion of VlanConfig)
  can't delete cluster network cn4 because nads(s) default/nad-2-1 under this clusternetwork are still existing


ClusterNetwork: mgmt

You can't create or delete the mgmt clusternetwork.

Update

(1) create a NAD, no VM, change mgmt MTU annotation
  OK, mtu is updated to nad: {"cniVersion":"0.3.1","name":"mgmt-nad-1","type":"bridge","bridge":"mgmt-br","promiscMode":true,"vlan":11,"ipam":{},"mtu":1480}

(2) create a NAD, no VM, change mgmt MTU annotation to an invalid value
  can't update cluster network mgmt because annotation 1470a value is not an integer, error strconv.Atoi: parsing "1470a": invalid syntax

(3) create a NAD, run a VM, change mgmt MTU annotation to another valid value
  can't update cluster network mgmt because the MTU can't be changed from 1480 to 1490 as following VMs must be stopped at first: default/vm2
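The annotation check in case (2) above can be sketched like this. It is only an illustration under stated assumptions: `parseMTUAnnotation` is a hypothetical helper, and the `minMTU`/`maxMTU` bounds are assumed values chosen for the example, since the PR text does not state the accepted range.

```go
package main

import (
	"fmt"
	"strconv"
)

// Assumed bounds, for illustration only; the PR text does not state them.
const (
	minMTU = 576
	maxMTU = 9000
)

// parseMTUAnnotation converts the value of the uplink-mtu annotation to an
// integer and rejects values outside the assumed range.
func parseMTUAnnotation(value string) (int, error) {
	mtu, err := strconv.Atoi(value)
	if err != nil {
		return 0, fmt.Errorf("annotation %s value is not an integer, error %w", value, err)
	}
	if mtu < minMTU || mtu > maxMTU {
		return 0, fmt.Errorf("MTU %d is out of range [%d, %d]", mtu, minMTU, maxMTU)
	}
	return mtu, nil
}

func main() {
	for _, v := range []string{"1480", "1470a", "100"} {
		mtu, err := parseMTUAnnotation(v)
		if err != nil {
			fmt.Println("rejected:", err)
			continue
		}
		fmt.Println("accepted:", mtu)
	}
}
```

Wrapping the `strconv.Atoi` error with `%w` keeps the original parse error visible in the webhook message, as in the output shown for case (2).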






note: a vlanconfig can only be added to a non-mgmt cluster network

VlanConfig: Create

(1) Create the first VlanConfig with an input/default MTU:
  The MTU is propagated to its parent clusternetwork's annotation `network.harvesterhci.io/uplink-mtu: '1350'`
  If there are dangling NADs, the NADs' MTU is updated

(2) Create a second VlanConfig with a different MTU
  can't create vlanConfig cn-3-2 because the vlanconfig cn-3-2 MTU 1600 is different with another vlanconfig vc-3-1 MTU 0, all vlanconfigs on one clusternetwork need to have same MTU

(3) Create a second VlanConfig with same MTU, OK


VlanConfig: Update

(1) Update: via kubectl, set spec.clusterNetwork to a non-existent CN (note: from the WebUI, the action is `Migrate`, from one clusternetwork to another, and the destination CN must exist):
  can't update vlanConfig vc-2-2 because it refers to a none-existing cluster network cn8 or error clusternetworks.network.harvesterhci.io "cn8" not found;  

(2) Update: via kubectl, set spec.clusterNetwork to `mgmt` CN: 
  can't update vlanConfig vc-2-2 because cluster network can't be mgmt;   

(3.1) Update uplink-mtu to 1440 when there is only 1 vlanconfig; no running VMs
  The updated MTU is propagated to the cluster network annotations: network.harvesterhci.io/uplink-mtu: "1440"
  All NADs under this cluster network are also updated to:
     {"cniVersion":"0.3.1","name":"nad-cn3-3","type":"bridge","bridge":"cn3-br","promiscMode":true,"vlan":300,"ipam":{},"mtu":1440}
     {"cniVersion":"0.3.1","name":"nad-cn3-2","type":"bridge","bridge":"cn3-br","promiscMode":true,"vlan":200,"ipam":{},"mtu":1440}
     {"cniVersion":"0.3.1","name":"nad-cn3-1","type":"bridge","bridge":"cn3-br","promiscMode":true,"vlan":100,"ipam":{},"mtu":1440}

(3.2) Update uplink-mtu to 1440 when there is only 1 vlanconfig; with running VMs
  can't update vlanConfig vc-3-1 because it is blocked by VM(s) default/vm3 which must be stopped at first

(4) Update uplink-mtu to 1450  when there are > 1 vlanconfigs
  can't update vlanConfig vc-3-2 because the vlanconfig vc-3-2 MTU 1450 is different with another vlanconfig vc-3-1 MTU 1440, all vlanconfigs on one clusternetwork need to have same MTU

(5.1) Update selector to add nodes (no overlap): OK, no error, the annotation is updated, new node `harv2` is added
   network.harvesterhci.io/matched-nodes: '["harv2","harv41"]'

(5.2) Update selector to remove nodes (not affecting VMIs): OK, no error, the annotation is updated, node `harv2` is removed
   network.harvesterhci.io/matched-nodes: '["harv41"]'

(5.3) Update selector to replace the current node: running VMIs are affected
  can't update vlanConfig vc-3-1 because it is blocked by VM(s) default/vm3 which must be stopped at first

(6) Migrate to another CN when VMIs are running on related nodes
  can't update vlanConfig vc-3-1 because it is blocked by VM(s) default/vm3 which must be stopped at first  


(7) Update uplink (e.g. NICs) when there is a storage network configured

  can't update vlanConfig cn3-2 because the storage network nad storagenetwork-k8trh is still attached
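The NAD sync seen in case (3.1) above amounts to rewriting the `mtu` field of the NAD's CNI config JSON. A minimal sketch, assuming the config is handled as a generic JSON object; `setNadMTU` and `getNadMTU` are hypothetical helper names and not taken from the PR:

```go
package main

import (
	"encoding/json"
	"fmt"
)

// setNadMTU returns a copy of a CNI config JSON string with its "mtu" field
// set to the given value. Other fields are preserved, but key order is not.
func setNadMTU(config string, mtu int) (string, error) {
	var conf map[string]interface{}
	if err := json.Unmarshal([]byte(config), &conf); err != nil {
		return "", fmt.Errorf("invalid NAD config: %w", err)
	}
	if conf == nil {
		// json "null" unmarshals to a nil map, which cannot be written to.
		return "", fmt.Errorf("invalid NAD config: not a JSON object")
	}
	conf["mtu"] = mtu
	out, err := json.Marshal(conf)
	if err != nil {
		return "", err
	}
	return string(out), nil
}

// getNadMTU reads the "mtu" field back from a CNI config JSON string.
func getNadMTU(config string) (int, error) {
	var conf struct {
		MTU int `json:"mtu"`
	}
	if err := json.Unmarshal([]byte(config), &conf); err != nil {
		return 0, err
	}
	return conf.MTU, nil
}

func main() {
	nad := `{"cniVersion":"0.3.1","name":"nad-cn3-1","type":"bridge","bridge":"cn3-br","promiscMode":true,"vlan":100,"ipam":{},"mtu":1440}`
	updated, err := setNadMTU(nad, 1450)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(updated)
}
```

Because `encoding/json` marshals map keys in sorted order, the output fields appear in a different order than in the input. A typed struct would keep a stable order at the cost of dropping unknown fields.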



VlanConfig: Delete

(1) No running VMs: the VlanConfig can be deleted; the NADs may get the annotation `network.harvesterhci.io/ready: 'false'` when all VlanConfigs are gone
  This is allowed, as the user may rebuild the VlanConfig to set other NICs/MTUs...

(2) With running VMs
  can't delete vlanConfig vc-3-1 because it is blocked by VM(s) default/vm3 which must be stopped at first

(3) No running VMs, with storagenetwork
  can't delete vlanConfig cn3-2 because the storage network nad storagenetwork-k8trh is still attached




NAD: Create

(1) create a NAD that refers to a non-existent clusternetwork
  can't create nad default/nad-cn5-1 because clusternetworks.network.harvesterhci.io "cn5" not found

(2) create a NAD, set MTU to a different value than the clusternetwork's
  the MTU is amended to the same value as the clusternetwork's (via mutator)


NAD: Update

NAD on non-mgmt network: Update
(1) create a NAD, run a VM, update nad
  can't update nad default/nad-cn3-1 because it's still used by VM(s) default/vm3 which must be stopped at first

(2) create a NAD, run a VM, stop VM, update nad: change MTU
  can't update nad default/nad-cn3-3 because nad MTU 1430 does not match cluster network MTU 1420

(3) create a NAD, run a VM, stop VM, update nad: change bridge name
  can't update nad default/nad-cn3-1 because nad bridge name can't be changed from cn3-br to cn4-br
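Cases (2) and (3) above suggest two checks on a NAD update once no VM is using it: the bridge name must stay the same, and the MTU must match the cluster network's. A rough sketch follows; the `nadConf` type, the `validateNadUpdate` function, and the order of the checks are assumptions for illustration, and the VM check from case (1) is left out.

```go
package main

import "fmt"

// nadConf is a hypothetical subset of the fields in a NAD's CNI config.
type nadConf struct {
	Bridge string
	MTU    int
}

// validateNadUpdate rejects an update that changes the bridge name or sets an
// MTU that differs from the cluster network's MTU (cnMTU).
func validateNadUpdate(name string, oldConf, newConf nadConf, cnMTU int) error {
	if oldConf.Bridge != newConf.Bridge {
		return fmt.Errorf("can't update nad %s because nad bridge name can't be changed from %s to %s",
			name, oldConf.Bridge, newConf.Bridge)
	}
	if newConf.MTU != cnMTU {
		return fmt.Errorf("can't update nad %s because nad MTU %d does not match cluster network MTU %d",
			name, newConf.MTU, cnMTU)
	}
	return nil
}

func main() {
	current := nadConf{Bridge: "cn3-br", MTU: 1420}

	fmt.Println(validateNadUpdate("default/nad-cn3-3", current, nadConf{Bridge: "cn3-br", MTU: 1430}, 1420))
	fmt.Println(validateNadUpdate("default/nad-cn3-1", current, nadConf{Bridge: "cn4-br", MTU: 1420}, 1420))
	fmt.Println(validateNadUpdate("default/nad-cn3-1", current, current, 1420))
}
```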


storagenetwork on non-mgmt network:

(1) Update storagenetwork nad's VID via `kubectl edit NetworkAttachmentDefinition.k8s.cni.cncf.io -n harvester-system`
  can't update nad harvester-system/storagenetwork-k8trh because it is used by storagenetwork



storagenetwork on mgmt network:
(1) edit storagenetwork nad directly via `kubectl edit NetworkAttachmentDefinition.k8s.cni.cncf.io -n harvester-system storagenetwork-c9brt`
  can't update nad harvester-system/storagenetwork-c9brt because nad MTU 1480 does not match cluster network MTU 1470

(2) edit clusternetwork mgmt via `kubectl edit clusternetwork mgmt`
  can't update cluster network mgmt because the MTU can't be changed from 1470 to 1480 as storage network nad storagenetwork-c9brt is still attached




NAD: Delete

NAD on mgmt network: Delete

(1) create a NAD, run a VM, delete nad
  can't delete nad default/mgmt-nad-1 because it's still used by VM(s) default/vm2 which must be stopped at first

(2) create a NAD, stop a VM, delete nad
  can't delete nad default/mgmt-nad-1 because it's still used by VM(s) default/vm2 which must remove the related networks and interfaces

NAD on non-mgmt network: Delete

(1) create a NAD, run a VM, delete nad
  can't delete nad default/nad-cn3-1 because it's still used by VM(s) default/vm3 which must be stopped at first

(2) create a NAD, stop a VM, delete nad
  can't delete nad default/nad-cn3-1 because it's still used by VM(s) default/vm3 which must remove the related networks and interfaces


storagenetwork on mgmt network:
(1) Disable storagenetwork from the Harvester UI: it works.
  The storagenetwork NAD is deleted by the Harvester controller successfully.


storagenetwork on non-mgmt network:
(1) Update storagenetwork nad via `kubectl edit NetworkAttachmentDefinition.k8s.cni.cncf.io -n harvester-system`
  can't update nad harvester-system/storagenetwork-k8trh because it is used by storagenetwork

Enforce the validator webhook
Add test code for all cases with a couple of fake clients
Differentiate mgmt network as it has no VlanConfig
Add processing of StorageNetwork
Fix some minor bugs
@w13915984028
Member Author

@ibrokethecloud @ihcsim @mschiu77 @starbops @rrajendran17

This PR is ready for final review, thanks. This commit ef611ee excludes the vendor bump.

With a lot of enhancements on ClusterNetwork, VlanConfig, NAD, StorageNetwork, a detailed test plan is listed #149 (comment).

cc @bk201

@@ -83,16 +83,49 @@ func (h Handler) SetClusterNetworkUnready(_ string, vs *networkv1.VlanStatus) (*
return vs, nil
}

func (h Handler) ensureClusterNetwork(name string) error {
if _, err := h.cnCache.Get(name); err != nil && !apierrors.IsNotFound(err) {
func (h Handler) ensureClusterNetwork(vc *networkv1.VlanConfig) error {
Contributor

We sync the MTU from the vlanconfig to the clusternetwork. Due to the additional logic in the validating webhook, we only allow the MTU set on the clusternetwork to be applied to subsequent vlanconfigs. Does this mean the only way to change the MTU is to remove all vlanconfigs / the cluster network and recreate the objects?

Member Author

Yes, this is described in PR description:

2. Change the MTU of a clusternetwork
2.1 Stop all VMs attached to a specific clusternetwork
2.2 Delete/Migrate all vlanconfigs until there is a last one; otherwise change MTU from any will be denied per 1.3
2.3 Change the last vlanconfig's MTU, then it is synced to clusternetwork; if there are any NADs attached to this clusternetwork, their MTU is changed automatically
2.4 Add more vlanconfig, each needs to fill the same MTU per 2.3

This strictly enforces consistency, so that users don't change some of them accidentally and then spend a lot of effort figuring out what went wrong.

return nil
}

// for none-mgmt cluster network, this annotation can only be operated by controller
Contributor

minor typo

Suggested change
- // for none-mgmt cluster network, this annotation can only be operated by controller
+ // for non-mgmt cluster network, this annotation can only be operated by controller


// Get the vm name list who uses a group of nads,
// note: duplicated names are removed
func (v *VMGetter) VMNamesWhoUseNads(nads []*nadv1.NetworkAttachmentDefinition) ([]string, error) {
Contributor

I could not find this method used anywhere

Member Author

It was originally prepared for func (c *CnValidator) Delete(_ *admission.Request, oldObj runtime.Object) error {, but ended up unused, per

https://github.com/w13915984028/network-controller-harvester/blob/4db16c13a674513a3103d0d4153a0a32515fc98e/pkg/webhook/clusternetwork/validator.go#L110

// no need to check vmi, vm, both of them need to be stopped &/ removed related interfaces/networks when deleting nads

It is kept so that vm.go follows the same pattern as vmi.go.

Comment on lines +105 to +110
if cnt == 1 {
return generateVmiNameList(vmis), nil
}

// use mapset to remove duplicated names
return mapset.NewSet[string](generateVmiNameList(vmis)...).ToSlice(), nil
Contributor

Is there a specific reason to check cnt == 1? We could just return the mapset here.

Suggested change
- if cnt == 1 {
- return generateVmiNameList(vmis), nil
- }
- // use mapset to remove duplicated names
- return mapset.NewSet[string](generateVmiNameList(vmis)...).ToSlice(), nil
+ return mapset.NewSet[string](generateVmiNameList(vmis)...).ToSlice(), nil

Member Author

When there is only 1 element, it cannot contain duplicated names. mapset.NewSet... is a heavy object and operation, so the code only calls it when necessary.

On the other hand, I am still considering whether we should return only the first N (e.g. 10) names: some clusters run 300+ VMs, and expanding all the names may overflow the warning window.
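Both ideas in this reply, skipping deduplication for a single element and capping the number of names shown, can be sketched as below. This is not the PR's code: it swaps `mapset` for a plain Go map, and `uniqueNames`, `summarize`, and the limit value are hypothetical.

```go
package main

import (
	"fmt"
	"strings"
)

// uniqueNames removes duplicated names while keeping first-seen order.
// With zero or one element there can be no duplicates, so the input is
// returned as-is without allocating a set.
func uniqueNames(names []string) []string {
	if len(names) <= 1 {
		return names
	}
	seen := make(map[string]struct{}, len(names))
	out := make([]string, 0, len(names))
	for _, n := range names {
		if _, ok := seen[n]; ok {
			continue
		}
		seen[n] = struct{}{}
		out = append(out, n)
	}
	return out
}

// summarize joins at most limit names and notes how many were left out, so a
// warning message stays short on clusters with many VMs.
func summarize(names []string, limit int) string {
	if limit <= 0 || len(names) <= limit {
		return strings.Join(names, ", ")
	}
	return fmt.Sprintf("%s and %d more", strings.Join(names[:limit], ", "), len(names)-limit)
}

func main() {
	names := uniqueNames([]string{"default/vm3", "default/vm2", "default/vm3", "default/vm1"})
	fmt.Println(summarize(names, 2))
}
```

Unlike a set's `ToSlice()`, this variant keeps the names in a deterministic order, which makes the resulting message stable across calls.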

4 participants