Handle bridge vlan deletion during NAD update #155

Open · wants to merge 1 commit into base: master

Conversation

@rrajendran17 (Contributor) commented Feb 22, 2025:

Problem:
When a VM VLAN network (NAD) is updated, for example its cluster network or VLAN ID, the corresponding VLAN is added to the new cluster network but is not removed from the old cluster network.

a. The network controller skips deleting the VLAN from the bridge (removeLocalArea) when LastKeyNetworkType is empty.
b. There is a bug in checking the length of the existing NAD count in the NAD cache.

Solution:
LastKeyNetworkType can be empty even though the cluster network or VLAN ID has been updated, so the controller should still remove the VLAN from the old cluster network's bridge in that case.
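
For illustration, a minimal Go sketch of the fixed decision (placeholder names only, not the controller's actual code): removal of the old VLAN is triggered by a cluster network or VLAN ID change, regardless of the last recorded network type.

    package main

    import "fmt"

    // nadState captures the NAD fields that matter for bridge VLAN programming.
    type nadState struct {
        NetworkType    string
        ClusterNetwork string
        VlanID         int
    }

    // needsLocalAreaRemoval reports whether the VLAN should be removed from the
    // old cluster network's bridge. The buggy logic bailed out whenever the old
    // network type was empty, leaking the VLAN on the old bridge.
    func needsLocalAreaRemoval(oldCfg, newCfg nadState) bool {
        if oldCfg.ClusterNetwork == "" {
            return false // nothing was programmed before
        }
        return oldCfg.ClusterNetwork != newCfg.ClusterNetwork || oldCfg.VlanID != newCfg.VlanID
    }

    func main() {
        oldCfg := nadState{ClusterNetwork: "cluster2", VlanID: 2013}
        newCfg := nadState{ClusterNetwork: "cluster1", VlanID: 2013}
        // true: VLAN 2013 must be removed from cluster2's bridge on the node.
        fmt.Println(needsLocalAreaRemoval(oldCfg, newCfg))
    }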

Related Issue:
harvester/harvester#7676

Test plan:

Case 1:
a. Create cluster1 -> vlanconfig1 on node1.
b. Create cluster2 -> vlanconfig2 on node2.
c. Create NAD1 with VLAN 2013 in cluster1.
d. Create NAD2 with VLAN 2013 in cluster2.

Node 1: the output of "bridge vlan show" should show VLAN 2013 as part of cluster1.

Node 2: the output of "bridge vlan show" should show VLAN 2013 as part of cluster2.

e. Update NAD2 from cluster2 to cluster1.

Expected behavior:
Node 1: the output of "bridge vlan show" should show VLAN 2013 as part of cluster1.

Node 2: the output of "bridge vlan show" should not show VLAN 2013 as part of cluster2, since VLAN 2013 is no longer part of cluster2.

Case 2:
a. Update NAD2 back to cluster2.

Node 1: the output of "bridge vlan show" should show VLAN 2013 as part of cluster1.

Node 2: the output of "bridge vlan show" should show VLAN 2013 as part of cluster2.

Case 3:
a. Update the VLAN ID of NAD1 to 2012.

Node 1: the output of "bridge vlan show" should show VLAN 2012 as part of cluster1 and should not show VLAN 2013 as part of cluster1.

Node 2: the output of "bridge vlan show" should show VLAN 2013 as part of cluster2.

Case 4:
a. Update the VLAN ID of NAD2 to 2012.

Node 1: the output of "bridge vlan show" should show VLAN 2012 as part of cluster1 and should not show VLAN 2013 as part of cluster1.

Node 2: the output of "bridge vlan show" should show VLAN 2012 as part of cluster2 and should not show VLAN 2013 as part of cluster2.

Case 5:
a. Update the VLAN ID of NAD1 back to 2013.
b. Update the VLAN ID of NAD2 back to 2013.

Node 1: the output of "bridge vlan show" should show VLAN 2013 as part of cluster1.

Node 2: the output of "bridge vlan show" should show VLAN 2013 as part of cluster2.

The following PR should take care of preventing NAD update/deletion when VMs (not in a running state) are using those NADs:
#131

@rrajendran17 (Contributor, Author) commented:

This fix uncovers another issue. Example scenario:

The mgmt interface is part of a custom VLAN (for example 2011; mgmt-br and mgmt-bo will be part of 2011).
a. Create a NAD (NAD1) with VLAN ID 2011 in the mgmt cluster network.
b. Step a adds mgmt-bo to VLAN 2011 again.
c. Now update NAD1 to a non-mgmt cluster network.
d. Step c removes VLAN 2011 from mgmt-bo, which should not happen, since 2011 must always stay on mgmt-bo.

Need to find a solution to prevent the above scenario.

@w13915984028 (Member) commented:

@rrajendran17 What do you think about denying the switching of clusternetwork?

The NAD's clusternetwork is converted into the VM's affinity; even when VMs are off, a change to the NAD's clusternetwork is hard to backfill into the VM's affinity.

At the moment, this looks to be an implicit rule.

    spec:
      affinity:
        nodeAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
            nodeSelectorTerms:
              - matchExpressions:
                  - key: network.harvesterhci.io/mgmt
                    operator: In
                    values:
...
      networks:
        - name: default
          pod: {}
        - multus:
            networkName: default/vmn33
          name: nic-1

@rrajendran17 (Contributor, Author) commented:

> @rrajendran17 What do you think about denying the switching of clusternetwork?
>
> The NAD's clusternetwork is converted into the VM's affinity; even when VMs are off, a change to the NAD's clusternetwork is hard to backfill into the VM's affinity.
>
> At the moment, this looks to be an implicit rule.
>
>     spec:
>       affinity:
>         nodeAffinity:
>           requiredDuringSchedulingIgnoredDuringExecution:
>             nodeSelectorTerms:
>               - matchExpressions:
>                   - key: network.harvesterhci.io/mgmt
>                     operator: In
>                     values:
> ...
>       networks:
>         - name: default
>           pod: {}
>         - multus:
>             networkName: default/vmn33
>           name: nic-1

@w13915984028 I think it is ok for users to change the cluster network on a NAD before VM creation.
For already created VMs, PR #131 should take care of preventing any update/delete on NAD objects.

@rrajendran17 (Contributor, Author) commented Feb 24, 2025:

> This fix uncovers another issue. Example scenario:
>
> The mgmt interface is part of a custom VLAN (for example 2011; mgmt-br and mgmt-bo will be part of 2011). a. Create a NAD (NAD1) with VLAN ID 2011 in the mgmt cluster network. b. Step a adds mgmt-bo to VLAN 2011 again. c. Now update NAD1 to a non-mgmt cluster network. d. Step c removes VLAN 2011 from mgmt-bo, which should not happen, since 2011 must always stay on mgmt-bo.
>
> Need to find a solution to prevent the above scenario.

Currently we do not act on the mgmt cluster network during NAD updates, so this PR's changes are good to go.

I am working on harvester/harvester#7650, which will also consider the mgmt cluster network for NAD updates. I will handle the issue mentioned above with that PR.

@w13915984028 (Member) commented Feb 25, 2025:

I added some enhancements in PR #149.

In commit 9f06c54#diff-900397cb9002ed992e8565ae266dc4ec231a187b1d3cf016f5fa0cd6ebf82d04, I add two additional checks to the nad webhook:

checkClusterNetwork: a nad can't refer to a non-existing clusternetwork

checkClusterNetworkUnchanged: a nad can't change the clusternetwork or networktype it refers to

In commit 7dfe2d6#diff-900397cb9002ed992e8565ae266dc4ec231a187b1d3cf016f5fa0cd6ebf82d04:

checkStorageNetwork: a storagenetwork nad can't be changed or deleted by the user

For the nad update scenario, #131 needs rework; we can't ask the user to stop and delete the VM or the VM's interface just to change the nad spec, e.g. the VLAN ID (that would be a breaking change). Instead, stopping the VMs and then modifying the nad is acceptable.

On the other hand, for changes like the nad's referred clusternetwork and networktype, the better solution is to deny them. In practice this happens rarely in production: the clusternetwork is an abstraction of the physical network, and the VM's interfaces and related networks are planned in advance, not set or changed at random.

Also, Harvester does not have a NAD controller that can reflect a nad's clusternetwork change into the VM's affinity. I will raise an issue for this: harvester/harvester#7701
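
For reference, a minimal Go sketch of the kind of webhook validation described above (illustrative placeholders only, not the code in those commits): one check rejects a NAD that points at a clusternetwork that does not exist, the other rejects any change to the clusternetwork or networktype on update.

    package main

    import (
        "errors"
        "fmt"
    )

    // nadSpec holds the two NAD fields these checks care about.
    type nadSpec struct {
        ClusterNetwork string
        NetworkType    string
    }

    // checkClusterNetwork rejects a NAD referring to a clusternetwork that does
    // not exist; existingCNs stands in for a clusternetwork cache lookup.
    func checkClusterNetwork(spec nadSpec, existingCNs map[string]bool) error {
        if !existingCNs[spec.ClusterNetwork] {
            return fmt.Errorf("clusternetwork %q does not exist", spec.ClusterNetwork)
        }
        return nil
    }

    // checkClusterNetworkUnchanged rejects an update that changes the NAD's
    // clusternetwork or networktype.
    func checkClusterNetworkUnchanged(oldSpec, newSpec nadSpec) error {
        if oldSpec.ClusterNetwork != newSpec.ClusterNetwork || oldSpec.NetworkType != newSpec.NetworkType {
            return errors.New("changing a NAD's clusternetwork or networktype is not allowed")
        }
        return nil
    }

    func main() {
        cns := map[string]bool{"mgmt": true, "cluster1": true}
        fmt.Println(checkClusterNetwork(nadSpec{ClusterNetwork: "cluster2"}, cns)) // error: unknown clusternetwork
        fmt.Println(checkClusterNetworkUnchanged(
            nadSpec{ClusterNetwork: "cluster1", NetworkType: "L2VlanNetwork"},
            nadSpec{ClusterNetwork: "mgmt", NetworkType: "L2VlanNetwork"})) // error: clusternetwork changed
    }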

@rrajendran17 (Contributor, Author) commented Feb 25, 2025:

> I added some enhancements in PR #149.
>
> In commit 9f06c54#diff-900397cb9002ed992e8565ae266dc4ec231a187b1d3cf016f5fa0cd6ebf82d04, I add two additional checks to the nad webhook:
>
> checkClusterNetwork: a nad can't refer to a non-existing clusternetwork
>
> checkClusterNetworkUnchanged: a nad can't change the clusternetwork or networktype it refers to
>
> In commit 7dfe2d6#diff-900397cb9002ed992e8565ae266dc4ec231a187b1d3cf016f5fa0cd6ebf82d04:
>
> checkStorageNetwork: a storagenetwork nad can't be changed or deleted by the user
>
> For the nad update scenario, #131 needs rework; we can't ask the user to stop and delete the VM or the VM's interface just to change the nad spec, e.g. the VLAN ID (that would be a breaking change). Instead, stopping the VMs and then modifying the nad is acceptable.
>
> On the other hand, for changes like the nad's referred clusternetwork and networktype, the better solution is to deny them. In practice this happens rarely in production: the clusternetwork is an abstraction of the physical network, and the VM's interfaces and related networks are planned in advance, not set or changed at random.
>
> Also, Harvester does not have a NAD controller that can reflect a nad's clusternetwork change into the VM's affinity. I will raise an issue for this: harvester/harvester#7701

True, but the use case I am thinking of is more the scenario where a NAD has been created but is not attached to any VMs, and the user attached it to a cluster network by mistake and wants to move it to another cluster network. I think this should at least work, so the user does not have to delete and recreate the NAD for every update.

I think it would not be right to deny it all the time.
It is something like: the update is allowed if no VMs are using the NAD, and denied if it is already attached to a VM.
Also, this code has been allowing updates; there was just a leak where VLANs were not removed from the bridge when the cluster network changed, due to a small bug in the code.

@starbops (Member) commented:

> It is something like: the update is allowed if no VMs are using the NAD, and denied if it is already attached to a VM.

I agree. Users should be able to change a NAD's associated ClusterNetwork only when no VMs are attached to it. We should not block such an action entirely for no reason.
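
A hedged sketch of the policy discussed here (illustrative only; the listVMs helper is hypothetical and would be backed by a VM cache or indexer in a real webhook): allow the cluster network change only when no VM references the NAD.

    package main

    import "fmt"

    // listVMsUsingNAD stands in for a VM cache lookup returning the names of
    // VMs whose interfaces reference the given NAD.
    type listVMsUsingNAD func(nadNamespace, nadName string) ([]string, error)

    // validateClusterNetworkChange denies a clusternetwork change while any VM
    // still uses the NAD, and allows it otherwise.
    func validateClusterNetworkChange(oldCN, newCN, nadNamespace, nadName string, listVMs listVMsUsingNAD) error {
        if oldCN == newCN {
            return nil // no clusternetwork change to validate
        }
        vms, err := listVMs(nadNamespace, nadName)
        if err != nil {
            return err
        }
        if len(vms) > 0 {
            return fmt.Errorf("cannot move NAD %s/%s from %q to %q: still used by VMs %v",
                nadNamespace, nadName, oldCN, newCN, vms)
        }
        return nil
    }

    func main() {
        noVMs := func(ns, name string) ([]string, error) { return nil, nil }
        // <nil>: allowed, since no VM references the NAD.
        fmt.Println(validateClusterNetworkChange("cluster2", "cluster1", "default", "nad2", noVMs))
    }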

@@ -119,7 +119,7 @@ func (h Handler) existDuplicateNad(vlanIDStr, cn string) (bool, error) {
 		return false, err
 	}
 
-	return len(nads) > 1, nil
+	return len(nads) > 0, nil
Member:

I don't understand why we need to change this. Maybe I lost the context. Could you add a comment explaining it? Thanks.

Contributor (Author):

The check should be: if at least one NAD still exists with the same cluster network and VLAN ID that is about to be deleted, then we should skip deleting the local area network, because we update the NAD with the new cluster network (ensureLabels) even before this check runs.

Example scenario:
1. NAD1 (VLAN 2013) is part of cluster-1 and NAD2 (VLAN 2013) is part of cluster-2.
2. Update NAD2 to cluster-1.
3. No changes to NAD1, but NAD2 is now part of cluster-1 and the bridge VLANs change accordingly.
4. Now update NAD2 back to cluster-2.
NAD2 is part of cluster-2 again, but VLAN 2013 on cluster-1 should not be removed from the bridge, since NAD1 still references it.

Contributor (Author):

Please let me know if I am missing any other scenarios. Thanks.

Member:

Moving a NAD, or a group of NADs, back and forth between clusternetworks frequently may leave the agent in an abnormal state.

For example, if one NAD is changed from cn1 to cn2 to cn3 to cn4 continuously, the client's OnChange may not see every change in sequence; it may get cn1->cn3->cn4, or cn1->cn2->cn4, etc. I have not tested to get the full log yet.

I added this example as a scenario for harvester/harvester#7701.

Contributor (Author):

Thanks @w13915984028. I hope this sequence is possible with VLAN ID and network type changes as well.

@w13915984028 (Member) left a comment:

Some questions, thanks.

@@ -119,7 +119,7 @@ func (h Handler) existDuplicateNad(vlanIDStr, cn string) (bool, error) {
 		return false, err
 	}
 
-	return len(nads) > 1, nil
+	return len(nads) > 0, nil
Member:

What is the consideration behind changing from 1 to 0 to confirm the duplicate NAD?

Contributor (Author):

Yes, as mentioned in this previous response: #155 (comment)

@rrajendran17 (Contributor, Author) commented Mar 4, 2025:

@starbops @w13915984028 I have removed this change, as existDuplicateNad() is called from removeLocalArea, which runs during both OnChange() and OnRemove() of the nad.

For the cluster network update case:
OnChange() needs existDuplicateNad() to check len(nads) > 0: if at least one NAD exists with the same VLAN ID and cluster network name, skip the deletion.

OnRemove() needs existDuplicateNad() to check len(nads) > 1: skip the deletion while other references to the same VLAN ID/cluster network are still present.

If we are going to block cluster network updates in the webhook, as Jian is doing in another PR, then the change I introduced in existDuplicateNad() is not required.
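
As a plain illustration of the distinction described above (a sketch only; the real existDuplicateNad takes the VLAN ID and cluster network and queries the NAD cache), the threshold differs between the update and remove paths:

    package main

    import "fmt"

    // hasOtherNadReference reports whether another NAD still uses the same
    // vlan id / cluster network, given the number of matching NADs in the cache.
    func hasOtherNadReference(matchingNads int, onRemove bool) bool {
        if onRemove {
            // The NAD being removed is still counted in the cache, so another
            // reference exists only when more than one NAD matches.
            return matchingNads > 1
        }
        // During an update the NAD has already been re-labelled (ensureLabels)
        // with the new cluster network, so any remaining match means the old
        // vlan/cluster network pair is still in use and must not be removed.
        return matchingNads > 0
    }

    func main() {
        fmt.Println(hasOtherNadReference(1, false)) // true: keep the bridge vlan
        fmt.Println(hasOtherNadReference(1, true))  // false: safe to remove the bridge vlan
    }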


@rrajendran17 (Contributor, Author) commented:

This change takes care of deleting the VLAN from the corresponding bridge/cluster network when the network type or VLAN ID is changed on the NAD.
