Commit 5b01f83

Wescoeur authored and MarkSymsCtx committed
fix(cleanup): ensure VDI is active before relink
A VDI can have the `JRN_RELINK` tag, which triggers a parent change. With the LVM driver, this can fail if the host has been rebooted, because in that situation the parent is not active.

Before this fix:
```
Jan 2 16:01:16 xcp-host-1 SMGC: [3802739] Coalescing *85ff84a8[VHD](65.000G///33.836G|n) -> *cf4a78da[VHD](65.000G///52.555G|n)
Jan 2 16:01:16 xcp-host-1 SMGC: [3802739] ==> Coalesce apparently already done: skipping
Jan 2 16:01:16 xcp-host-1 SM: [3802739] lock: tried lock /var/lock/sm/1f74d512-a410-be6f-c816-8a1e43ea1801/sr, acquired: True (exists: True)
Jan 2 16:01:16 xcp-host-1 SMGC: [3802739] Got sm-config for 568e74bb[VHD](65.000G///65.133G|n): {'relinking': 'True', 'import_task': 'OpaqueRef:28d03111-0fee-0a21-e659-a1e90a1f3a91', 'vdi_type': 'vhd', 'vhd-parent': '85ff84a8-153a-4fff-9c59-0fbab710d373'}
Jan 2 16:01:16 xcp-host-1 SMGC: [3802739] Set relinking = True for 568e74bb[VHD](65.000G///65.133G|n)
Jan 2 16:01:16 xcp-host-1 SMGC: [3802739] Got sm-config for 568e74bb[VHD](65.000G///65.133G|n): {'relinking': 'True', 'import_task': 'OpaqueRef:28d03111-0fee-0a21-e659-a1e90a1f3a91', 'vdi_type': 'vhd', 'vhd-parent': '85ff84a8-153a-4fff-9c59-0fbab710d373'}
Jan 2 16:01:16 xcp-host-1 SMGC: [3802739] Got sm-config for dd9c344a[VHD](65.000G///8.000M|n): {'relinking': 'True', 'import_task': 'OpaqueRef:28d03111-0fee-0a21-e659-a1e90a1f3a91', 'vdi_type': 'vhd', 'vhd-parent': '85ff84a8-153a-4fff-9c59-0fbab710d373'}
Jan 2 16:01:16 xcp-host-1 SMGC: [3802739] Set relinking = True for dd9c344a[VHD](65.000G///8.000M|n)
Jan 2 16:01:16 xcp-host-1 SMGC: [3802739] Got sm-config for dd9c344a[VHD](65.000G///8.000M|n): {'relinking': 'True', 'import_task': 'OpaqueRef:28d03111-0fee-0a21-e659-a1e90a1f3a91', 'vdi_type': 'vhd', 'vhd-parent': '85ff84a8-153a-4fff-9c59-0fbab710d373'}
Jan 2 16:01:16 xcp-host-1 SM: [3802739] LVMCache: refreshing
Jan 2 16:01:16 xcp-host-1 fairlock[3906]: /run/fairlock/devicemapper acquired
Jan 2 16:01:16 xcp-host-1 fairlock[3906]: /run/fairlock/devicemapper sent '3802739 - 20917096.949649855-'
Jan 2 16:01:16 xcp-host-1 SM: [3802739] ['/sbin/lvs', '--noheadings', '--units', 'b', '-o', '+lv_tags', '/dev/VG_XenStorage-1f74d512-a410-be6f-c816-8a1e43ea1801']
Jan 2 16:01:16 xcp-host-1 SM: [3802739] pread SUCCESS
Jan 2 16:01:16 xcp-host-1 fairlock[3906]: /run/fairlock/devicemapper released
Jan 2 16:01:16 xcp-host-1 SM: [3802739] ['/usr/bin/vhd-util', 'scan', '-f', '-m', 'VHD-*', '-l', 'VG_XenStorage-1f74d512-a410-be6f-c816-8a1e43ea1801']
Jan 2 16:01:18 xcp-host-1 SM: [3802739] pread SUCCESS
Jan 2 16:01:18 xcp-host-1 SMGC: [3802739] SR 1f74 ('sr001-clu001-tdeaz-az07-svc-data23') (538 VDIs in 68 VHD trees): no changes
Jan 2 16:01:18 xcp-host-1 SMGC: [3802739] Relinking 568e74bb[VHD](65.000G///65.133G|n) from *85ff84a8[VHD](65.000G///33.836G|n) to *cf4a78da[VHD](65.000G///52.555G|n)
Jan 2 16:01:18 xcp-host-1 SM: [3802739] lock: opening lock file /var/lock/sm/lvm-1f74d512-a410-be6f-c816-8a1e43ea1801/568e74bb-3eab-4e15-974a-a34a4bf7d779
Jan 2 16:01:18 xcp-host-1 SM: [3802739] lock: acquired /var/lock/sm/lvm-1f74d512-a410-be6f-c816-8a1e43ea1801/568e74bb-3eab-4e15-974a-a34a4bf7d779
Jan 2 16:01:18 xcp-host-1 SM: [3802739] Refcount for lvm-1f74d512-a410-be6f-c816-8a1e43ea1801:568e74bb-3eab-4e15-974a-a34a4bf7d779 (0, 0) + (1, 0) => (1, 0)
Jan 2 16:01:18 xcp-host-1 SM: [3802739] Refcount for lvm-1f74d512-a410-be6f-c816-8a1e43ea1801:568e74bb-3eab-4e15-974a-a34a4bf7d779 set => (1, 0b)
Jan 2 16:01:18 xcp-host-1 fairlock[3906]: /run/fairlock/devicemapper acquired
Jan 2 16:01:18 xcp-host-1 SM: [3802739] ['/sbin/lvchange', '-ay', '/dev/VG_XenStorage-1f74d512-a410-be6f-c816-8a1e43ea1801/VHD-568e74bb-3eab-4e15-974a-a34a4bf7d779']
Jan 2 16:01:18 xcp-host-1 fairlock[3906]: /run/fairlock/devicemapper sent '3802739 - 20917099.141269115-'
Jan 2 16:01:18 xcp-host-1 SM: [3802739] pread SUCCESS
Jan 2 16:01:18 xcp-host-1 fairlock[3906]: /run/fairlock/devicemapper released
Jan 2 16:01:18 xcp-host-1 SM: [3802739] lock: released /var/lock/sm/lvm-1f74d512-a410-be6f-c816-8a1e43ea1801/568e74bb-3eab-4e15-974a-a34a4bf7d779
Jan 2 16:01:18 xcp-host-1 SM: [3802739] ['/usr/bin/vhd-util', 'modify', '--debug', '-p', '/dev/VG_XenStorage-1f74d512-a410-be6f-c816-8a1e43ea1801/VHD-cf4a78da-316f-4fd9-a82d-07a47c142a39', '-n', '/dev/VG_XenStorage-1f74d512-a410-be6f-c816-8a1e43ea1801/VHD-568e74bb-3eab-4e15-974a-a34a4bf7d779']
Jan 2 16:01:18 xcp-host-1 SM: [3802739] FAILED in util.pread: (rc 2) stdout: 'failed to set parent to '/dev/VG_XenStorage-1f74d512-a410-be6f-c816-8a1e43ea1801/VHD-cf4a78da-316f-4fd9-a82d-07a47c142a39': -2
Jan 2 16:01:18 xcp-host-1 SM: [3802739] ', stderr: ''
Jan 2 16:01:18 xcp-host-1 SM: [3802739] lock: released /var/lock/sm/1f74d512-a410-be6f-c816-8a1e43ea1801/sr
Jan 2 16:01:18 xcp-host-1 SMGC: [3802739] *~*~*~*~*~*~*~*~*~*~*~*~*~*~*~*~*~*~*~*~*
Jan 2 16:01:18 xcp-host-1 SMGC: [3802739] ***********************
Jan 2 16:01:18 xcp-host-1 SMGC: [3802739] * E X C E P T I O N *
Jan 2 16:01:18 xcp-host-1 SMGC: [3802739] ***********************
Jan 2 16:01:18 xcp-host-1 SMGC: [3802739] coalesce: EXCEPTION <class 'util.CommandException'>, No such file or directory
Jan 2 16:01:18 xcp-host-1 SMGC: [3802739] File "/opt/xensource/sm/cleanup.py", line 2024, in coalesce
Jan 2 16:01:18 xcp-host-1 SMGC: [3802739] self._coalesce(vdi)
Jan 2 16:01:18 xcp-host-1 SMGC: [3802739] File "/opt/xensource/sm/cleanup.py", line 2228, in _coalesce
Jan 2 16:01:18 xcp-host-1 SMGC: [3802739] vdi._relinkSkip()
Jan 2 16:01:18 xcp-host-1 SMGC: [3802739] File "/opt/xensource/sm/cleanup.py", line 963, in _relinkSkip
Jan 2 16:01:18 xcp-host-1 SMGC: [3802739] child._setParent(self.parent)
Jan 2 16:01:18 xcp-host-1 SMGC: [3802739] File "/opt/xensource/sm/cleanup.py", line 1396, in _setParent
Jan 2 16:01:18 xcp-host-1 SMGC: [3802739] vhdutil.setParent(self.path, parent.path, parent.raw)
Jan 2 16:01:18 xcp-host-1 SMGC: [3802739] File "/opt/xensource/sm/vhdutil.py", line 215, in setParent
Jan 2 16:01:18 xcp-host-1 SMGC: [3802739] ioretry(cmd)
Jan 2 16:01:18 xcp-host-1 SMGC: [3802739] File "/opt/xensource/sm/vhdutil.py", line 94, in ioretry
Jan 2 16:01:18 xcp-host-1 SMGC: [3802739] errlist=[errno.EIO, errno.EAGAIN])
Jan 2 16:01:18 xcp-host-1 SMGC: [3802739] File "/opt/xensource/sm/util.py", line 347, in ioretry
Jan 2 16:01:18 xcp-host-1 SMGC: [3802739] return f()
Jan 2 16:01:18 xcp-host-1 SMGC: [3802739] File "/opt/xensource/sm/vhdutil.py", line 93, in <lambda>
Jan 2 16:01:18 xcp-host-1 SMGC: [3802739] return util.ioretry(lambda: util.pread2(cmd, text=text),
Jan 2 16:01:18 xcp-host-1 SMGC: [3802739] File "/opt/xensource/sm/util.py", line 255, in pread2
Jan 2 16:01:18 xcp-host-1 SMGC: [3802739] return pread(cmdlist, quiet=quiet, text=text)
Jan 2 16:01:18 xcp-host-1 SMGC: [3802739] File "/opt/xensource/sm/util.py", line 217, in pread
Jan 2 16:01:18 xcp-host-1 SMGC: [3802739] raise CommandException(rc, str(cmdlist), stderr.strip())
Jan 2 16:01:18 xcp-host-1 SMGC: [3802739]
Jan 2 16:01:18 xcp-host-1 SMGC: [3802739] *~*~*~*~*~*~*~*~*~*~*~*~*~*~*~*~*~*~*~*~*
Jan 2 16:01:18 xcp-host-1 SMGC: [3802739] Coalesce failed, skipping
```

Signed-off-by: Ronan Abhamon <ronan.abhamon@vates.tech>
1 parent ece3891 commit 5b01f83

1 file changed: +10 -0 lines changed
libs/sm/cleanup.py

Lines changed: 10 additions & 0 deletions
@@ -1007,6 +1007,9 @@ def _setParent(self, parent):
             Util.log("Failed to update %s with vhd-parent field %s" % \
                      (self.uuid, self.parentUuid))
 
+    def _ensureParentActiveForRelink(self):
+        pass
+
     def isHidden(self):
         if self._hidden is None:
             self._loadInfoHidden()
@@ -1405,6 +1408,9 @@ def _activateChain(self):
     def _deactivate(self):
         self.sr.lvActivator.deactivate(self.uuid, False)
 
+    def _ensureParentActiveForRelink(self):
+        self.parent._activate()
+
     def _increaseSizeVirt(self, size, atomic=True):
         "ensure the virtual size of 'self' is at least 'size'"
         self._activate()
@@ -1997,6 +2003,10 @@ def _coalesce(self, vdi):
             # this means we had done the actual coalescing already and just
             # need to finish relinking and/or refreshing the children
             Util.log("==> Coalesce apparently already done: skipping")
+
+            # The parent volume must be active for the parent change to occur.
+            # The parent volume may become inactive if the host is rebooted.
+            vdi._ensureParentActiveForRelink()
         else:
             # JRN_COALESCE is used to check which VDI is being coalesced in
             # order to decide whether to abort the coalesce. We remove the
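
The shape of this change can be illustrated with a small, self-contained sketch. It is not the code from `cleanup.py`: the classes `BaseVDI` and `LvmVDI`, the `active` flag, and the helpers `set_parent` and `resume_relink` are hypothetical stand-ins, and the failure of an inactive parent is only modelled as an `ENOENT` error, matching the `rc 2` / "No such file or directory" seen in the log. Only the hook name `_ensureParentActiveForRelink` and the call to `self.parent._activate()` come from the diff.

```python
import errno
import os


class BaseVDI:
    """Hypothetical stand-in for a VDI whose backend needs no activation."""

    def __init__(self, uuid, parent=None):
        self.uuid = uuid
        self.parent = parent

    def _ensureParentActiveForRelink(self):
        # Default hook: nothing to activate, mirroring the `pass` in the diff.
        pass


class LvmVDI(BaseVDI):
    """Hypothetical stand-in for an LVM-backed VDI."""

    def __init__(self, uuid, parent=None):
        super().__init__(uuid, parent)
        # Assumption: every volume starts inactive, as after a host reboot.
        self.active = False

    def _activate(self):
        self.active = True

    def _ensureParentActiveForRelink(self):
        # Same body as the override added by the commit.
        self.parent._activate()


def set_parent(child, new_parent):
    """Stand-in for the parent change; fails if the new parent is inactive."""
    if not getattr(new_parent, "active", True):
        raise FileNotFoundError(
            errno.ENOENT, os.strerror(errno.ENOENT), new_parent.uuid
        )
    child.parent = new_parent


def resume_relink(vdi, children):
    """Finish an interrupted relink: move `children` from `vdi` to its parent."""
    # The step added by the commit: activate the parent before relinking.
    vdi._ensureParentActiveForRelink()
    for child in children:
        set_parent(child, vdi.parent)


# Usage, reusing the short UUIDs from the log purely as labels.
grandparent = LvmVDI("cf4a78da")
coalesced = LvmVDI("85ff84a8", parent=grandparent)
child = LvmVDI("568e74bb", parent=coalesced)
resume_relink(coalesced, [child])
```

Placing a no-op hook on the base class and overriding it only for the LVM-backed class keeps the caller free of driver checks: the resume path calls the hook unconditionally, and backends that need no activation are unaffected.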