Skip to content

CP-53446: Split plug and unplug atomics to enable live migration downtime reduction later #6362

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 3 commits into
base: master
Choose a base branch
from

Conversation

snwoods
Copy link
Contributor

@snwoods snwoods commented Mar 17, 2025

Split the plug and unplug atomics into separate attach/activate and deactivate/detach atomics. This is gated by two separate flags and both are switched off by default. When switched on, this should still be a no-op by calling them in serial where plug/unplug are currently used, but will allow us to later use these atomics separately elsewhere for optimisation.
Opening this as a draft PR as plug/unplug is fundamental so the logic needs to be watertight before we consider merging.

@psafont psafont changed the title Private/stevenwo/cp 53446 CP-53446: Split plug and unplug atomics into separate attach/activate and deactivate/detach Mar 17, 2025
@psafont
Copy link
Member

psafont commented Mar 17, 2025

It helps when reviewing to add the reason for the change into the PR:

VBD_{de,}activate are supposed to execute very quickly (<<100ms), while VBD_{de,}attach can take much longer (>100ms).
The different qualities of the calls can be used to optimise VM lifecycle operations like VM_migrate, VM_reboot etc:

  • VM_migrate: execute VBD_attach/detach outside the VM downtime section, and execute only VBD_activate/deactivate inside the VM downtime section.
  • VM_reboot (on the same host): only VBD_deactivate (to flush data to storage) and VBD_activate again, no need for VBD_detach + VBD_attach

This allows to reduce the amount of downtime in a live migration

@psafont psafont changed the title CP-53446: Split plug and unplug atomics into separate attach/activate and deactivate/detach CP-53446: Split plug and unplug atomics to reduce live migration downtime Mar 17, 2025
@snwoods snwoods changed the title CP-53446: Split plug and unplug atomics to reduce live migration downtime CP-53446: Split plug and unplug atomics to enable live migration downtime reduction later Mar 17, 2025
@snwoods snwoods force-pushed the private/stevenwo/CP-53446 branch 2 times, most recently from 65ec25e to 0926423 Compare March 17, 2025 16:20
@snwoods
Copy link
Contributor Author

snwoods commented Mar 17, 2025

I've added comments to make things clearer and cleaned some things up. I've also consolidated the two flags into one because I can't see a situation where we would need to split one and not the other.

@snwoods snwoods force-pushed the private/stevenwo/CP-53446 branch 5 times, most recently from 3a6d281 to 17bca64 Compare March 20, 2025 15:31
@snwoods snwoods force-pushed the private/stevenwo/CP-53446 branch from 17bca64 to faea8ea Compare March 21, 2025 11:22
Copy link
Member

@robhoes robhoes left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Much clearer now 👍

@snwoods snwoods force-pushed the private/stevenwo/CP-53446 branch from faea8ea to 8906fbb Compare March 26, 2025 12:44
@psafont
Copy link
Member

psafont commented Mar 26, 2025

The changes seem good, but the commit messages are a bit too bare, could you add more context to them?

@snwoods snwoods force-pushed the private/stevenwo/CP-53446 branch from 8906fbb to 71f220e Compare March 28, 2025 10:53
@snwoods snwoods marked this pull request as ready for review March 28, 2025 11:18
snwoods added 3 commits April 3, 2025 15:22
This consists of two parts, splitting the attach_and_activate into
functionally equivalent attach and activate functions, and splitting the
VBD_plug atomic itself's behaviour into two new atomics, VBD_attach and
VBD_activate.

If the new xenopsd_vbd_plug_unplug_legacy flag is true, the only
difference will be that VBD_plug calls attach and activate sequentially
instead of attach_and_activate.

If xenopsd_vbd_plug_unplug_legacy is false, the VBD_attach and
VBD_activate atomics will be used in place of VBD_plug in all places
that it is used. This should still be functionally equivalent.

The purpose of this change is to allow optimisations in this area as
there are some situations where they do not need to be called at the
same time. For example VBD_attach could be called outside of VM migrate
downtime to reduce the overall downtime.

Signed-off-by: Steven Woods <[email protected]>
This consists of two parts, splitting the unplug function into
functionally equivalent deactivate and detach functions, and splitting
the VBD_unplug atomic itself's behaviour into two new atomics:
VBD_deactivate and VBD_detach

If the xenopsd_vbd_plug_unplug_legacy flag is true, the only difference
will be that VBD_unplug calls deactivate and detach sequentially instead
of unplug

If xenopsd_vbd_plug_unplug_legacy is false, the VBD_deactivate and
VBD_detach atomics will be used in place of VBD_unplug in all places
that it is used. This should still be functionally equivalent.

The purpose of this change is to allow optimisations in this area as
there are some situations where they do not need to be called at the
same time. For example we could skip detaching on reboot and only
deactivate and activate again reducing reboot time.

Signed-off-by: Steven Woods <[email protected]>
@snwoods snwoods force-pushed the private/stevenwo/CP-53446 branch from 71f220e to 1a46f33 Compare April 4, 2025 09:48
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants