Commit 779f8f7

Sync master to feature/configure-ssh-phase2 (#6435)
merge master to feature branch
2 parents 34b308c + 12d5562

File tree

81 files changed: +3148, -1574 lines


Makefile

Lines changed: 1 addition & 1 deletion
```diff
@@ -155,7 +155,7 @@ DUNE_IU_PACKAGES1+=gzip http-lib pciutil sexpr stunnel uuid xml-light2 zstd xapi
 DUNE_IU_PACKAGES1+=message-switch message-switch-cli message-switch-core message-switch-lwt
 DUNE_IU_PACKAGES1+=message-switch-unix xapi-idl xapi-forkexecd xapi-storage xapi-storage-script xapi-storage-cli
 DUNE_IU_PACKAGES1+=xapi-nbd varstored-guard xapi-log xapi-open-uri xapi-tracing xapi-tracing-export xapi-expiry-alerts cohttp-posix
-DUNE_IU_PACKAGES1+=xapi-rrd xapi-inventory clock xapi-sdk
+DUNE_IU_PACKAGES1+=xapi-rrd xapi-inventory clock xapi-sdk tgroup
 DUNE_IU_PACKAGES1+=xapi-stdext-encodings xapi-stdext-pervasives xapi-stdext-std xapi-stdext-threads xapi-stdext-unix xapi-stdext-zerocheck xapi-tools
```

doc/content/xapi/storage/sxm.md

Lines changed: 222 additions & 10 deletions
@@ -2,9 +2,220 @@
Title: Storage migration
---

- [Overview](#overview)
- [SXM Multiplexing](#sxm-multiplexing)
  - [Motivation](#motivation)
    - [But we have storage\_mux.ml](#but-we-have-storage_muxml)
    - [Thought experiments on an alternative design](#thought-experiments-on-an-alternative-design)
  - [Design](#design)
- [SMAPIv1 migration](#smapiv1-migration)
- [SMAPIv3 migration](#smapiv3-migration)
- [Error Handling](#error-handling)
  - [Preparation (SMAPIv1 and SMAPIv3)](#preparation-smapiv1-and-smapiv3)
  - [Snapshot and mirror failure (SMAPIv1)](#snapshot-and-mirror-failure-smapiv1)
  - [Mirror failure (SMAPIv3)](#mirror-failure-smapiv3)
  - [Copy failure (SMAPIv1)](#copy-failure-smapiv1)
- [SMAPIv1 Migration implementation detail](#smapiv1-migration-implementation-detail)
  - [Receiving SXM](#receiving-sxm)
  - [Xapi code](#xapi-code)
  - [Storage code](#storage-code)
    - [Copying a VDI](#copying-a-vdi)
    - [Mirroring a VDI](#mirroring-a-vdi)
    - [Code walkthrough](#code-walkthrough)
      - [DATA.copy](#datacopy)
      - [DATA.copy\_into](#datacopy_into)
      - [DATA.MIRROR.start](#datamirrorstart)

## Overview

The core idea of storage migration is surprisingly simple: we have VDIs attached to a VM,
and we wish to migrate these VDIs from one SR to another. This necessarily requires
us to copy the data stored in these VDIs over to the new SR, which can be a long-running
process if they hold gigabytes or even terabytes of data. We wish to minimise the
downtime of this process to allow the VM to keep running as much as possible.

At a very high level, the SXM process consists of just two stages: preparation
and mirroring. The preparation is about getting the receiving host ready for the
mirroring operation, while the mirroring itself can be further divided into two
more operations: 1. sending new writes to both sides; 2. copying existing data from
source to destination. The exact details of how to set up a mirror differ significantly
between SMAPIv1 and SMAPIv3, but both of them have to perform these two
operations. Once the mirroring is established, it is a matter of checking the status
of the mirroring and carrying on with the following VM migration.

The reality is more complex than we would hope. For example, in SMAPIv1,
mirror establishment is quite an involved process and is itself divided into
several stages, which will be discussed in more detail later on.

## SXM Multiplexing

This section is about the design idea behind the additional layer of multiplexing specifically
for Storage Xen Motion (SXM) from SRs using SMAPIv3. It is recommended that you read the
[introduction doc](_index.md) for the storage layer first to understand how storage
multiplexing is done between SMAPIv2 and SMAPI{v1, v3} before reading this.

### Motivation

The existing SXM code was designed to work only with SMAPIv1 SRs, and therefore
does not take into account the dramatic difference in the way SXM is done between
SMAPIv1 and SMAPIv3. The exact difference will be covered later on in this doc; for this section
it is sufficient to know that they have two different ways of doing migration. Therefore,
we need different code paths for migration from SMAPIv1 and from SMAPIv3.

#### But we have storage_mux.ml

Indeed, storage_mux.ml is responsible for multiplexing and forwarding requests to
the correct storage backend, based on the SR type that the caller specifies. And
in fact, for inbound SXM to SMAPIv3 (i.e. migrating into a SMAPIv3 SR, GFS2 for example),
storage_mux does the heavy lifting of multiplexing between different storage
backends. Every time a `Remote.` call is invoked, it goes through the SMAPIv2
layer to the remote host and gets multiplexed on the destination host, based on
whether we are migrating into a SMAPIv1 or SMAPIv3 SR (see the diagram below).
Inbound SXM is enabled by implementing the existing SMAPIv2 -> SMAPIv3 calls
(see `import_activate` for example) which may not have been implemented before.

![mux for inbound](sxm_mux_inbound.svg)

While this works fine for inbound SXM, it does not work for outbound SXM. A typical SXM
involves the source SR type (v1/v3) and the destination SR type (v1/v3), and any
of the four combinations is possible. We have already covered the
destination multiplexing (v1/v3) by utilising storage_mux, and at this point we
have run out of multiplexers for multiplexing on the source. In other words, we
can only multiplex once for each SMAPIv2 call; we can use that chance for
either the source or the destination, and we have already used it for the latter.

#### Thought experiments on an alternative design

To make it more concrete, let us consider an example: the mirroring logic in
SXM differs based on the source SR type of the SXM call. You might imagine
defining a function like `MIRROR.start v3_sr v1_sr` that will be multiplexed
by the storage_mux based on the source SR type, and forwarded to storage_smapiv3_migrate,
or even just xapi-storage-script, which is indeed quite possible.
At this point we have already done the multiplexing, but we still wish to
multiplex operations on destination SRs; for example, we might want to attach a
VDI belonging to a SMAPIv1 SR on the remote host. But since we have already done the
multiplexing and are now inside xapi-storage-script, we have lost any chance of doing
any further multiplexing :(

### Design

The idea of this new design is to introduce an additional multiplexing layer that
is specific for multiplexing calls based on the source SR type. For example, in
the diagram below `send_start src_sr dest_sr` takes both the source SR and the
destination SR as parameters, and since the mirroring logic is different for different
types of source SRs (i.e. SMAPIv1 or SMAPIv3), the storage migration code must
choose the right code path based on the source SR type. This is
exactly what is done in this additional multiplexing layer. The respective logic
for doing {v1,v3}-specific mirroring, for example, will stay in storage_smapi{v1,v3}_migrate.ml.

![mux for outbound](sxm_mux_outbound.svg)

Note that later on storage_smapi{v1,v3}_migrate.ml will still have the flexibility
to call remote SMAPIv2 functions, such as `Remote.VDI.attach dest_sr vdi`, and
it will be handled just as before.
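
The dispatch described above can be sketched in OCaml as follows. This is purely illustrative: the `MIGRATE` signature, the module names and `backend_of_sr` are made up for the sketch; the real logic lives in the storage migration layer and storage_smapi{v1,v3}_migrate.ml.

```ocaml
(* Hypothetical sketch of the source-side multiplexing layer. *)
type backend = SMAPIv1 | SMAPIv3

module type MIGRATE = sig
  val send_start : src_sr:string -> dest_sr:string -> string
end

module Smapiv1_migrate : MIGRATE = struct
  let send_start ~src_sr ~dest_sr =
    Printf.sprintf "v1 mirror: snapshot+mirror %s -> %s" src_sr dest_sr
end

module Smapiv3_migrate : MIGRATE = struct
  let send_start ~src_sr ~dest_sr =
    Printf.sprintf "v3 mirror: %s -> %s" src_sr dest_sr
end

(* Placeholder: in xapi this would be a real lookup of the SR's backend. *)
let backend_of_sr = function "gfs2-sr" -> SMAPIv3 | _ -> SMAPIv1

(* The extra multiplexing: the *source* SR type picks the implementation. *)
let send_start ~src_sr ~dest_sr =
  let (module M : MIGRATE) =
    match backend_of_sr src_sr with
    | SMAPIv1 -> (module Smapiv1_migrate)
    | SMAPIv3 -> (module Smapiv3_migrate)
  in
  M.send_start ~src_sr ~dest_sr
```

The key point is that the source SR type selects the implementation module, while each implementation remains free to make `Remote.` SMAPIv2 calls for the destination side, which are multiplexed as before.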
## SMAPIv1 migration

At a high level, mirror establishment for SMAPIv1 works as follows:

1. Take a snapshot of a VDI that is attached to VM1. This gives us an immutable
copy of the current state of the VDI, with all the data up to the point we took
the snapshot. This is illustrated in the diagram as a VDI and its snapshot connecting
to a shared parent, which stores the shared content for the snapshot and the writable
VDI from which we took the snapshot (snapshot)
2. Mirror the writable VDI to the remote host: this means that all writes that go to the
client VDI will also be written to the mirrored VDI on the remote host (mirror)
3. Copy the immutable snapshot from our local host to the remote host (copy)
4. Compose the mirror and the snapshot to form a single VDI
5. Destroy the snapshot on the local host (cleanup)
138+
139+
more detail to come...
140+
141+
## SMAPIv3 migration
142+
143+
More detail to come...
144+
145+
## Error Handling

Storage migration is a long-running process and is prone to failures at each
step, hence it is important to specify what errors can be raised at each step
and their significance. This is beneficial both for the user and for triaging.

There are two general cleanup functions in SXM: `MIRROR.receive_cancel` and
`MIRROR.stop`. The former cleans up whatever has been created by `MIRROR.receive_start`
on the destination host (such as VDIs for receiving mirrored data). The latter is
a more comprehensive function that attempts to "undo" all the side effects that
were done during the SXM, and also calls `receive_cancel` as part of its operations.

Currently, error handling is done by building up a list of cleanup functions in
the `on_fail` list ref as the function executes. For example, once `receive_start`
has completed successfully, `receive_cancel` is added to the list of cleanup functions.
Whenever an exception is encountered, whatever has been added to the `on_fail`
list ref is executed. This is convenient, but it entangles all the error
handling logic with the core SXM logic itself, making the code rather hard
to understand and maintain.

The idea to fix this is to introduce explicit "stages" during the SXM and define
explicitly what error handling should be done if it fails at a certain stage. This
helps separate the error handling logic into the `with` part of a `try with` block,
which is where it is supposed to be. Since we need to accommodate the existing
SMAPIv1 migration (which has more stages than SMAPIv3), the following stages are
introduced: preparation (v1, v3), snapshot (v1), mirror (v1, v3), copy (v1). Note that
each stage also roughly corresponds to a helper function that is called within `MIRROR.start`,
which is the wrapper function that initiates storage migration. Each helper
function also has error handling logic within itself as needed
(e.g. see `Storage_smapiv1_migrate.receive_start`) to deal with exceptions
that happen within it.
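
A minimal sketch of the two styles, with made-up names and a logging list standing in for real side effects (the actual stages and cleanup functions live in storage_migrate.ml):

```ocaml
(* Style used today: accumulate undo actions in an [on_fail] list ref,
   interleaved with the main logic. *)
let with_on_fail_style () =
  let log = ref [] in
  let record x = log := x :: !log in
  let on_fail : (unit -> unit) list ref = ref [] in
  ( try
      record "receive_start" ;
      on_fail := (fun () -> record "receive_cancel") :: !on_fail ;
      record "mirror" ;
      failwith "mirror failed"
    with _ -> List.iter (fun undo -> undo ()) !on_fail
  ) ;
  List.rev !log

(* Proposed style: explicit stages, with the cleanup decided per stage in
   the [with] part of a [try ... with]. *)
type stage = Preparation | Snapshot | Mirror | Copy

exception Failed_at of stage

let staged_style () =
  let log = ref [] in
  let record x = log := x :: !log in
  ( try
      record "prepare" ;
      record "snapshot" ;
      record "mirror" ;
      raise (Failed_at Mirror)
    with
    | Failed_at Preparation -> ()                       (* nothing to undo *)
    | Failed_at (Snapshot | Mirror) -> record "receive_cancel"
    | Failed_at Copy -> record "MIRROR.stop"
  ) ;
  List.rev !log
```

In the second style, each exception arm states the cleanup for one stage, so the main sequence of steps stays free of error-handling clutter.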
### Preparation (SMAPIv1 and SMAPIv3)

The preparation stage generally corresponds to what is done in `receive_start`, and
this function itself handles exceptions when there are partial failures within
the function, such as an exception after the receiving VDI is created.
It uses the old-style `on_fail` function, but only with a limited scope.

There is nothing to be done at a higher level (i.e. within `MIRROR.start`, which
calls `receive_start`) if preparation has failed.

### Snapshot and mirror failure (SMAPIv1)

For SMAPIv1, the mirror is set up in a somewhat cumbersome way. The end goal is to establish
connections between two tapdisk processes on the source and destination hosts.
To achieve this goal, xapi does two main jobs: 1. create a connection between the two
hosts and pass the connection to tapdisk; 2. create a snapshot as a starting point
of the mirroring process.

Therefore the handling of failures at these two stages is similar: clean up what was
done in the preparation stage by calling `receive_cancel`, and that is almost it.
Again, we leave whatever is needed for partial failure handling to those
functions themselves and only clean up at stage level in `storage_migrate.ml`.

Note that `receive_cancel` is a multiplexed function for SMAPIv1 and SMAPIv3, which
means different cleanup logic will be executed depending on what type of SR we
are migrating from.

### Mirror failure (SMAPIv3)

To be filled...

### Copy failure (SMAPIv1)

The final step of storage migration for SMAPIv1 is to copy the snapshot from the
source to the destination. At this stage, most of the side-effectful work has been
done, so we do need to call `MIRROR.stop` to clean things up if we experience a
failure during copying.

## SMAPIv1 Migration implementation detail

```mermaid
sequenceDiagram
participant local_tapdisk as local tapdisk
participant local_smapiv2 as local SMAPIv2
@@ -129,7 +340,7 @@ opt post_detach_hook
end
Note over xapi: memory image migration by xenopsd
Note over xapi: destroy the VM record
```

### Receiving SXM

@@ -162,7 +373,7 @@ the receiving end of storage motion:

This is how xapi coordinates storage migration. We'll do it as a code walkthrough through the two layers: xapi and storage-in-xapi (SMAPIv2).

### Xapi code

The entry point is in [xapi_vm_migration.ml](https://github.com/xapi-project/xen-api/blob/f75d51e7a3eff89d952330ec1a739df85a2895e2/ocaml/xapi/xapi_vm_migrate.ml#L786)

@@ -1056,7 +1267,7 @@ We also try to remove the VM record from the destination if we managed to send i

Finally we check for mirror failure in the task - this is set by the events thread watching for events from the storage layer, in [storage_access.ml](https://github.com/xapi-project/xen-api/blob/f75d51e7a3eff89d952330ec1a739df85a2895e2/ocaml/xapi/storage_access.ml#L1169-L1207)

### Storage code

The part of the code that is conceptually in the storage layer, but physically in xapi, is located in
[storage_migrate.ml](https://github.com/xapi-project/xen-api/blob/f75d51e7a3eff89d952330ec1a739df85a2895e2/ocaml/xapi/storage_migrate.ml). There are logically a few separate parts to this file:
@@ -1069,7 +1280,7 @@ The part of the code that is conceptually in the storage layer, but physically i

Let's start by considering the way the storage APIs are intended to be used.

#### Copying a VDI

`DATA.copy` takes several parameters:

@@ -1119,7 +1330,7 @@ The implementation uses the `url` parameter to make SMAPIv2 calls to the destina

The implementation tries to minimize the amount of data copied by looking for related VDIs on the destination SR. See below for more details.

#### Mirroring a VDI

`DATA.MIRROR.start` takes a similar set of parameters to that of copy:

@@ -1156,11 +1367,11 @@ Note that state is a list since the initial phase of the operation requires both

Additionally the mirror can be cancelled using the `MIRROR.stop` API call.

#### Code walkthrough

Let's go through the implementation of `copy`:

##### DATA.copy

```ocaml
let copy ~task ~dbg ~sr ~vdi ~dp ~url ~dest =
@@ -1296,7 +1507,7 @@ Finally we snapshot the remote VDI to ensure we've got a VDI of type 'snapshot'

The exception handler does nothing - so we leak remote VDIs if the exception happens after we've done our cloning :-(

##### DATA.copy_into

Let's now look at the data-copying part. This is common code shared between `VDI.copy`, `VDI.copy_into` and `MIRROR.start` and hence has some duplication of the calls made above.

@@ -1467,7 +1678,7 @@ The last thing we do is to set the local and remote content_id. The local set_co

Here we perform the list of cleanup operations. Theoretically. It seems we don't ever actually set this to anything, so this is dead code.

##### DATA.MIRROR.start

```ocaml
let start' ~task ~dbg ~sr ~vdi ~dp ~url ~dest =
@@ -1765,3 +1976,4 @@ let pre_deactivate_hook ~dbg ~dp ~sr ~vdi =
    s.failed <- true
  )
```

doc/content/xapi/storage/sxm_mux_inbound.svg

Lines changed: 4 additions & 0 deletions

doc/content/xapi/storage/sxm_mux_outbound.svg

Lines changed: 4 additions & 0 deletions

dune-project

Lines changed: 13 additions & 0 deletions
```diff
@@ -146,6 +146,17 @@
 
 (package
  (name xapi-storage-cli)
+ (depends
+  cmdliner
+  re
+  rpclib
+  ppx_deriving_rpc
+  (xapi-client (= :version))
+  (xapi-idl (= :version))
+  (xapi-types (= :version))
+ )
+ (synopsis "A CLI for xapi storage services")
+ (description "The CLI allows you to directly manipulate virtual disk images, without them being attached to VMs.")
 )
 
 (package
@@ -711,12 +722,14 @@ This package provides an Lwt compatible interface to the library.")
  (synopsis "Xapi's standard library extension, Threads")
  (authors "Jonathan Ludlam")
  (depends
+  ambient-context
   base-threads
   base-unix
   (alcotest :with-test)
   (clock (= :version))
   (fmt :with-test)
   mtime
+  tgroup
   (xapi-log (= :version))
   (xapi-stdext-pervasives (= :version))
   (xapi-stdext-unix (= :version))
```

ocaml/gencert/gencert.ml

Lines changed: 14 additions & 16 deletions
```diff
@@ -47,22 +47,20 @@ let main ~dbg ~path ~cert_gid ~sni () =
   init_inventory () ;
   let generator path =
     match sni with
-    | SNI.Default ->
-        let name, ip =
-          match Networking_info.get_management_ip_addr ~dbg with
-          | None ->
-              D.error "gencert.ml: cannot get management ip address!" ;
-              exit 1
-          | Some x ->
-              x
-        in
-        let dns_names = Networking_info.dns_names () in
-        let ips = [ip] in
-        let (_ : X509.Certificate.t) =
-          Gencertlib.Selfcert.host ~name ~dns_names ~ips ~valid_for_days path
-            cert_gid
-        in
-        ()
+    | SNI.Default -> (
+      match Networking_info.get_host_certificate_subjects ~dbg with
+      | Error cause ->
+          let msg = Networking_info.management_ip_error_to_string cause in
+          D.error
+            "gencert.ml: failed to generate certificate subjects because %s" msg ;
+          exit 1
+      | Ok (name, dns_names, ips) ->
+          let _ : X509.Certificate.t =
+            Gencertlib.Selfcert.host ~name ~dns_names ~ips ~valid_for_days path
+              cert_gid
+          in
+          ()
+    )
     | SNI.Xapi_pool ->
         let uuid = Inventory.lookup Inventory._installation_uuid in
         let (_ : X509.Certificate.t) =
```
