Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Feat(cv_deploy): Add verification of the duplicated devices #4889

Open
wants to merge 8 commits into
base: devel
Choose a base branch
from

Conversation

alexeygorbunov
Copy link
Contributor

@alexeygorbunov alexeygorbunov commented Jan 16, 2025

Change Summary

Add new functionality to validate if structured config of the hosts processed by cv_deploy role contains overlapping serial_number or metadata.system_mac_address values.

Related Issue(s)

N/A

Component(s) name

arista.avd.cv_deploy

Proposed changes

  • Add additional logic to existing deploy_to_cv workflow to analyze serial_number and metadata.system_mac_address values (based on their structured_config files generated by eos_designs) of all targeted hosts.
  • Introduce additional role's variable cv_tolerate_duplicated_devices to control desired behavior of the role if duplicated devices are discovered. Default cv_tolerate_duplicated_devices: true allows workflow to continue even if duplicated devices are discovered. Setting cv_tolerate_duplicated_devices to false instructs role to terminate role's execution when duplicated devices are discovered.

How to test

  • Run cv_deploy molecule test:
CVAAS_AAWG_CI='<CVaaS_AT>' molecule test -s cv_deploy
  • Analyze results of the 3 new molecule workflows:
    • no_duplicated_devices.yml tests regular cv_deploy with serial_number and metadata.system_mac_address values being uniquely set per targeted device

    • duplicated_devices_with_tolerance.yml tests cv_deploy with overlapping serial_number and metadata.system_mac_address values and default value of cv_tolerate_duplicated_devices (True)

      The following warnings are displayed during the execution:

      TASK [arista.avd.cv_deploy : Deploy device configurations and tags to CloudVision] ***
      [WARNING]: [pyavd] - verify_devices_on_cv: Devices with duplicated
      serial_number discovered in inventory (structured config):
      {'problematic_serial_number': '13c20f1edcced2d85f6db2fb9e3ac5b6',
      'overlapping_inventory_hostnames': ['avd-ci-leaf1', 'avd-ci-leaf2']}
      [WARNING]: [pyavd] - verify_devices_on_cv: Devices with duplicated
      system_mac_address discovered in inventory (structured config):
      {'problematic_system_mac_address': '500000d7ee0b',
      'overlapping_inventory_hostnames': ['avd-ci-spine1', 'avd-ci-spine2']}

      Same warnings are exposed in cv_deploy_results:

      "errors": [],
      "failed": false,
      "warnings": [
      "('Duplicated devices found in inventory', {'problematic_serial_number': '13c20f1edcced2d85f6db2fb9e3ac5b6', 
      'overlapping_inventory_hostnames': ['avd-ci-leaf1', 'avd-ci-leaf2']}, {'problematic_system_mac_address': 
      '500000d7ee0b', 'overlapping_inventory_hostnames': ['avd-ci-spine1', 'avd-ci-spine2']})",
      "('Missing devices on CloudVision', CVDevice(hostname='avd-ci-spine2', 
      serial_number='A7D46613D44DE45D68D5B5C5CBA06B0D', system_mac_address='50:00:00:d7:ee:0b', 
      _exists_on_cv=None))"
        ],
    • duplicated_devices_without_tolerance.yml tests cv_deploy with overlapping serial_number and metadata.system_mac_address values and non-default value of cv_tolerate_duplicated_devices (False)

      Exception is raised during the execution of the task arista.avd.cv_deploy : Deploy device configurations and tags to CloudVision.

      Execution is stopped and details of the errors are exposed in cv_deploy_results:

      "errors": [
                  "('Duplicated devices found in inventory', {'problematic_serial_number': 
      '13c20f1edcced2d85f6db2fb9e3ac5b6', 'overlapping_inventory_hostnames': ['avd-ci-leaf1', 'avd-ci-leaf2']}, 
      {'problematic_system_mac_address': '500000d7ee0b', 'overlapping_inventory_hostnames': ['avd-ci-spine1', 'avd-ci- 
      spine2']})"
        ],
      "failed": true,
      "warnings": []

Checklist

User Checklist

  • N/A

Repository Checklist

  • My code has been rebased from devel before I start
  • I have read the CONTRIBUTING document.
  • My change requires a change to the documentation and documentation have been updated accordingly.
  • I have updated molecule CI testing accordingly. (check the box if not applicable)

@github-actions github-actions bot added type: documentation Improvements or additions to documentation state: CI Updated CI scenario have been updated in the PR state: Documentation role Updated labels Jan 16, 2025
Copy link

Review docs on Read the Docs

To test this pull request:

# Create virtual environment for this testing below the current directory
python -m venv test-avd-pr-4889
# Activate the virtual environment
source test-avd-pr-4889/bin/activate
# Install all requirements including PyAVD
pip install "pyavd[ansible] @ git+https://github.com/alexeygorbunov/avd.git@feature/verify_devices_on_cv_ex_pr_3957#subdirectory=python-avd" --force
# Point Ansible collections path to the Python virtual environment
export ANSIBLE_COLLECTIONS_PATH=$VIRTUAL_ENV/ansible_collections
# Install Ansible collection
ansible-galaxy collection install git+https://github.com/alexeygorbunov/avd.git#/ansible_collections/arista/avd/,feature/verify_devices_on_cv_ex_pr_3957 --force
# Optional: Install AVD examples
cd test-avd-pr-4889
ansible-playbook arista.avd.install_examples

@alexeygorbunov alexeygorbunov marked this pull request as ready for review January 16, 2025 21:08
@alexeygorbunov alexeygorbunov requested review from a team as code owners January 16, 2025 21:08
@@ -87,6 +88,7 @@ The `arista.avd.cv_workflow` module is an Ansible Action Plugin providing the fo
device_list: "{{ ansible_play_hosts }}"
# strict_tags: false
# skip_missing_devices: false
# tolerate_duplicated_devices: true
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When would it be a good idea to tolerate duplicated devices? I know this would restrict existing deployments, but I see it more as a bug that we allow duplicate serials Today.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agree, can't think of any valid scenario where one would on purpose use the same serial/sys_mac on multiple devices.
Default True is more to have this change as a non-breaking change

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see it more as a fix, so I am not sure we need this knob. We are just catching invalid inputs earlier instead of waiting until the workspace is built or the devices are broken because we push config to the wrong device.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Example usecase where we may want to have this knob present and allow duplicated items by default:

  • multiple devices, all with both serial_number and system_mac_address set
  • all devices have unique serial_numbers but some have overlapping system_mac_address
  • this would work OK with AVD+ CV by default as serial_number is prioritized (if set) as an identity attribute (so duplicated system mac would just be ignored by AVD)
  • blocking execution in such case (due to the duplicated system mac) would break a valid deployment (although with partially invalid inputs)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK, let's change this PR then, to only check for duplicate serials and not have a toggle for it. This is really what is needed.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Duplicated system_mac_address may probably cause incorrect config to be pushed to the incorrect device as well. Potential scenario:

  • At least 2 devices with "serial_number" not set (in structured inputs) and with the same/duplicated "system_mac_address" set
  • Once we fetch found_devices (cv_client.get_inventory_devices) and start matching existing devices with found_devices - we'll update .serial_number attribute of both devices with real serial number (fetched from CV) matching that duplicated "system_mac_address" (address from the inputs)
  • At this point we'll have 2 devices with the same 'serial_number' and 'system_mac_address' but different hostnames and designed EOS config.
  • We can then end up with serial-based (for ID) containers/configlets on CV holding actual EOS configuration from one of the two raw EoS config files.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK, but then we should never accept duplicate system mac either. It does not make sense to ignore something that could explode easily later.

python-avd/pyavd/_cv/workflows/verify_devices_on_cv.py Outdated Show resolved Hide resolved
python-avd/pyavd/_cv/workflows/verify_devices_on_cv.py Outdated Show resolved Hide resolved
python-avd/pyavd/_cv/workflows/verify_devices_on_cv.py Outdated Show resolved Hide resolved
python-avd/pyavd/_cv/client/exceptions.py Outdated Show resolved Hide resolved
@github-actions github-actions bot added the state: conflict PR with conflict label Jan 24, 2025
Copy link

This pull request has conflicts, please resolve those before we can evaluate the pull request.

@ClausHolbechArista ClausHolbechArista added this to the v5.2.0 milestone Jan 27, 2025
Copy link

Conflicts have been resolved. A maintainer will review the pull request shortly.

@github-actions github-actions bot removed state: conflict PR with conflict type: documentation Improvements or additions to documentation labels Jan 28, 2025
Copy link
Contributor

@emilarista emilarista left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Tested, LGTM. Error messages appear as expected for duplicated devices.

Copy link

Quality Gate Failed Quality Gate failed

Failed conditions
6 Security Hotspots

See analysis details on SonarQube Cloud

@gmuloc gmuloc removed this from the v5.2.0 milestone Jan 30, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
state: CI Updated CI scenario have been updated in the PR state: Documentation role Updated
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants