Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

networkmanager: fix primary bond option #21533

Merged
merged 3 commits into from
Feb 12, 2025

Conversation

champtar
Copy link
Contributor

@champtar champtar commented Jan 17, 2025

  • allow setting also for balance-tlb / balance-alb modes
  • read the current value
  • remove when not set (it can't be set to an empty string)
  • remove when switching to a non compatible mode

This fixes #21528

Patch tested on top of version 323 (~EL 9.5)

@martinpitt
Copy link
Member

Hello @champtar, thanks for fixing this! This needs to be covered by an integration test to reproduce/demonstrate the original bug and validate the fix. If you would like to give this a try, please have a look at https://github.com/cockpit-project/cockpit/blob/main/test/README.md

Otherwise someone from our team (@mvollmer or me) can help with that. However, we have a meeting week, so nothing will happen until the week afterwards. We are not ignoring you 😉

@champtar
Copy link
Contributor Author

Hello @martinpitt, not sure I will have time to look at the tests this week either :(

@Venefilyn
Copy link
Contributor

Venefilyn commented Jan 27, 2025

I wanted to learn a bit more about integration tests so tried creating a test for this. ~~~Couldn't get it passing locally though for some reason as it errors with the selection~~~

The patch works, tested it with a VM 🎉

It should create a new bond with bond0 name, one interface added that is made primary, then the bond edited to make sure it is selected as primary

git patch
diff --git a/test/verify/check-networkmanager-bond b/test/verify/check-networkmanager-bond
index 8ec09b950..1fdaa9fdc 100755
--- a/test/verify/check-networkmanager-bond
+++ b/test/verify/check-networkmanager-bond
@@ -223,6 +223,38 @@ class TestBonding(netlib.NetworkCase):
         b.wait_visible(f"#network-interface-members tr[data-interface='{iface}']")
         b.wait_in_text("#network-interface .pf-v5-c-card:contains('tbond')", ip)
 
+    def testPrimary(self):
+        b = self.browser
+
+        self.login_and_go("/network")
+
+        b.wait_visible("#networking")
+        iface = "cockpit"
+        self.add_veth(iface, dhcp_cidr="10.111.112.2/20")
+        self.nm_activate_eth(iface)
+        self.wait_for_iface(iface)
+
+        # Create a bond
+        b.click("button:contains('Add bond')")
+        b.wait_visible("#network-bond-settings-dialog")
+        b.set_input_text("#network-bond-settings-interface-name-input", "tbond")
+        # add interface to bond
+        b.set_checked(f"input[data-iface='{iface}']", val=True)
+        # select it as primary
+        b.select_from_dropdown("#network-bond-settings-primary-select", iface)
+        # finish creating the bond
+        b.click("#network-bond-settings-dialog button:contains('Add')")
+        b.wait_not_present("#network-bond-settings-dialog")
+
+        # Navigate to bond and edit it
+        b.click("#networking-interfaces tr[data-interface='tbond'] button")
+        b.wait_visible("#network-interface")
+        b.click("#network-interface #networking-edit-bond")
+
+        # edit bond dialog
+        b.wait_visible("#network-bond-settings-dialog")
+        b.wait_val("#network-bond-settings-primary-select", iface)
+
     def testAmbiguousMember(self):
         b = self.browser
         m = self.machine

@mvollmer
Copy link
Member

It should create a new bond with bond0 name, one interface added that is made primary, then the bond edited to make sure it is selected as primary

Thanks! From the description of the pull request, I think there are a few more cases that could be tested: removing the primary, and automatic removal when switching to a mode that doesn't have the concept of a primary interface.

But we can install the test as it is and run our CI, and see whether our coverage report discovers the missing cases as well. :-)

@champtar, do you want to add @Venefilyn's patch, or do you want one of the Cockpit maintainers to take over?

@champtar
Copy link
Contributor Author

It should create a new bond with bond0 name, one interface added that is made primary, then the bond edited to make sure it is selected as primary

Thanks! From the description of the pull request, I think there are a few more cases that could be tested: removing the primary, and automatic removal when switching to a mode that doesn't have the concept of a primary interface.

I can think of:

  1. When the page loads, the current primary value is correctly displayed (no primary or one of the interfaces)
  2. can switch from no primary to one of the interface of the bond
  3. can switch primary from one of the interface to another
  4. can change primary setting when current primary value is invalid (I managed to set primary=- at some point)
  5. can switch to a mode that doesn't support primary

But we can install the test as it is and run our CI, and see whether our coverage report discovers the missing cases as well. :-)

@champtar, do you want to add @Venefilyn's patch, or do you want one of the Cockpit maintainers to take over?

Please take over, maintainers should be able to push in this PR directly

@mvollmer
Copy link
Member

Nice, diff coverage is 100%: https://cockpit-logs.us-east-1.linodeobjects.com/pull-21533-5a3107ef-20250130-150909-fedora-41-devel/Coverage/lcov/github-pr.diff.gcov.html

@champtar
Copy link
Contributor Author

champtar commented Feb 3, 2025

Do I need to rebase or do anything here ? The failures seem unrelated

champtar and others added 2 commits February 6, 2025 14:04
- allow setting also for balance-tlb / balance-alb modes
- read the current value
- remove when not set (it can't be set to an empty string)
- remove when switching to a non compatible mode
mvollmer
mvollmer previously approved these changes Feb 12, 2025
Copy link
Member

@mvollmer mvollmer left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks. tests looking good!

@mvollmer
Copy link
Member

mvollmer commented Feb 12, 2025

Thanks. tests looking good!

(Except the failures of course, I'll figure them out.)

@mvollmer
Copy link
Member

The ubuntu failures are probably due to this journal entry:

Feb 10 08:43:17 ubuntu NetworkManager[9361]: /etc/netplan/90-NM-861a710b-9858-48f4-b5e5-a770ab67bd1f.yaml:13:18: Error in network definition: tbond: interface 'cockpit' is not defined
Feb 10 08:43:17 ubuntu NetworkManager[9361]:         primary: "cockpit"
Feb 10 08:43:17 ubuntu NetworkManager[9361]:                  ^

The message comes from "netplan apply":

# netplan apply
[...]
/etc/netplan/90-NM-4883fb46-4315-41f1-b945-bb096768da61.yaml:13:18: Error in network definition: tbond: interface 'cockpit' is not defined
        primary: "cockpit"
                 ^
# echo $?
78

NetworkManager seems to ignore the failure, and somehow the "tbond" interface is created anyway, but the "bond.options" setting of its connection doesn't include the "primary" attribute:

# nmcli con show tbond | grep ^bond
bond.options:                           mode=active-backup,downdelay=0,miimon=100,updelay=0

I have no idea what the relationship between netpan and NetworkManager is. I am guessing that NetworkManager is writing the yaml files in /etc/netplan/.

The problem seems to be that the definition of the "cockpit" interface uses a generated name:

network:
  version: 2
  ethernets:
    NM-3287d9ea-ffda-49df-9f21-27594d7cc1b8:
      renderer: NetworkManager
      match:
        name: "cockpit"
      wakeonlan: true
      networkmanager:
        uuid: "3287d9ea-ffda-49df-9f21-27594d7cc1b8"
        name: "cockpit"
        passthrough:
          connection.controller: "tbond"
          connection.master: "tbond"
          connection.port-type: "bond"
          connection.slave-type: "bond"
          ethernet._: ""

If I change this to

network:
  version: 2
  ethernets:
    cockpit:
      renderer: NetworkManager
      match:
        name: "cockpit"
[...]

then the error disappears and the correct primary is displayed in Cockpit.

I'll file this upstream and get it tracked for us with a "naughty".

@mvollmer
Copy link
Member

@mvollmer mvollmer added the no-test For doc/workflow changes, or experiments which don't need a full CI run, label Feb 12, 2025
@mvollmer mvollmer added blocked Don't land until something else happens first (see task list) and removed no-test For doc/workflow changes, or experiments which don't need a full CI run, labels Feb 12, 2025
@mvollmer
Copy link
Member

Blocked by cockpit-project/bots#7433

@mvollmer mvollmer removed the blocked Don't land until something else happens first (see task list) label Feb 12, 2025
@mvollmer
Copy link
Member

Green enough, thanks everyone!

@mvollmer mvollmer merged commit 66625ee into cockpit-project:main Feb 12, 2025
85 of 87 checks passed
@champtar champtar deleted the bond-primary branch February 12, 2025 16:29
@champtar
Copy link
Contributor Author

Is there any change this bugfix will go in EL 9.5 or 9.6 ?
(I can open a RedHat case if it helps)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Primary interface not displayed in Active Backup bond interface
4 participants