
Commit 70ad6f7

authored
PS-4625 [DOCS] - Add Clone SST method to PXC 8.0 (#212)
new file: docs/clone-sst.md modified: mkdocs-base.yml new file: snippets/tech-preview.md
1 parent 43003ed commit 70ad6f7

7 files changed

+259
-39
lines changed

docs/_static/clone-sst-process.png

300 KB

docs/clone-sst.md

+183
@@ -0,0 +1,183 @@
1+
# State Snapshot Transfer (SST) Method using Clone plugin
2+
3+
--8<--- "tech-preview.md:4:4"
4+
5+
## SST Method: Clone
6+
7+
The Clone SST is a modern and efficient method that leverages MySQL's native cloning capabilities to transfer data from a donor node to a Joiner node. It is faster and more resource-efficient than traditional methods like xtrabackup or rsync.
8+
9+
## Limitations
10+
11+
Clone limitations are described in [Clone plugin limitations](https://docs.percona.com/percona-server/8.0/clone-plugin-limitations.html).
12+
13+
## Key features
14+
15+
| Feature | Description |
16+
|-------------------------|-------------------------------------------------------------|
17+
| Efficient data transfer | The clone plugin transfers data at the file level, reducing overhead. |
18+
| Consistency | Ensures data consistency between the donor and Joiner nodes. |
19+
| Minimal Downtime | Reduces the time required for node synchronization. |
20+
| Native Integration | Fully integrated into MySQL, eliminating the need for external tools. |
21+
22+
23+
## Prerequisites
24+
25+
The requirements for enabling SST transfers with the Clone plugin are as follows:
26+
27+
* Percona XtraDB Cluster (PXC) version 8.0.41 or later
28+
29+
* Sufficient disk space and network bandwidth for data transfer
30+
31+
* Properly configured PXC cluster with at least one donor node
32+
33+
* NetCat package installed
34+
35+
* Port 4444 open between nodes. The State Snapshot Transfer (SST) process uses this port by default for data transfer when you use Percona XtraBackup SST
36+
37+
## Best practices
38+
39+
The following best practices are for using SST with the Clone plugin:
40+
41+
| Recommendation | Description |
42+
|-------------------------------|--------------------------------------------------------------------------------------------------|
43+
| Choose a Suitable Donor | Select a Donor node with low load and sufficient resources to avoid performance degradation. |
44+
| Monitor Resources | Monitor CPU, memory, and disk usage during the SST process. |
45+
| Test in Staging | Test the SST process in a staging environment before deploying it in production. |
46+
47+
## Process outline
48+
49+
A high-level outline of the process:
50+
51+
<div style="text-align: center;">
52+
<img src="./_static/clone-sst-process.png" alt="Clone SST process" width="200" />
53+
</div>
54+
55+
## Enable the Clone SST Method
56+
57+
The Clone State Snapshot Transfer (SST) method in Percona XtraDB Cluster allows for efficient data synchronization between nodes. Proper configuration of this method ensures smooth and reliable cluster operations. This section explains how to enable the Clone SST method, including necessary variable settings and SSL configuration for secure data transfer.
58+
59+
### Donor and Joiner
60+
61+
To enable the `clone` SST method, ensure the [`wsrep_sst_allowed_methods`](wsrep-system-index.md#wsrep_sst_allowed_methods) variable in the configuration file (`my.cnf`) includes the `clone` method for both the Donor and Joiner servers. This setting is essential for a successful State Snapshot Transfer (SST).
62+
63+
Starting from Percona XtraDB Cluster 8.0.41, the default value of `wsrep_sst_allowed_methods` includes `clone`, which removes the need to configure this option manually in most cases.
64+
65+
```ini
66+
[mysqld]
67+
wsrep_sst_allowed_methods = xtrabackup-v2,clone
68+
```
69+
70+
### Joiner
71+
72+
On the Joiner server, set the [`wsrep_sst_method`](wsrep-system-index.md#wsrep_sst_method) variable to `clone` in the configuration file (`my.cnf`). This setting is the only accepted value for the Clone SST process.
73+
74+
```ini
75+
[mysqld]
76+
wsrep_sst_method = clone
77+
```
78+
79+
### Additional Information
80+
81+
The `wsrep_sst_allowed_methods` and `wsrep_sst_method` variables are read-only and cannot be modified at runtime. You must set them in the configuration file before starting the server. Attempting to change these variables while the server is running will result in errors and may cause inconsistencies during node synchronization operations.
82+
83+
For Percona XtraDB Cluster, any variables related to the SST mechanism, such as `wsrep_sst_allowed_methods` and `wsrep_sst_method`, must be defined before the server startup to ensure proper synchronization.
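
Because these variables cannot be changed at runtime, it can help to check the configuration file before starting the node. The following is a minimal sketch in Python and is not part of Percona XtraDB Cluster. The function name `check_clone_sst` and the sample configuration are hypothetical, and the sketch assumes a simple `my.cnf` without `!include` directives, with option names written with underscores.

```python
import configparser

# Hypothetical sample configuration, matching the settings described above.
SAMPLE_MY_CNF = """
[mysqld]
wsrep_sst_allowed_methods = xtrabackup-v2,clone
wsrep_sst_method = clone
"""


def check_clone_sst(cnf_text):
    """Return a list of problems found in the [mysqld] section (empty if none)."""
    parser = configparser.ConfigParser(
        allow_no_value=True,   # my.cnf may contain options without a value
        strict=False,          # tolerate repeated sections or options
        interpolation=None,    # treat '%' in values literally
    )
    parser.read_string(cnf_text)

    if not parser.has_section("mysqld"):
        return ["no [mysqld] section found"]

    mysqld = parser["mysqld"]
    problems = []

    # If the option is absent, the server default applies, so only an
    # explicit value that leaves out `clone` is reported.
    allowed = mysqld.get("wsrep_sst_allowed_methods")
    if allowed is not None:
        methods = [m.strip() for m in allowed.split(",")]
        if "clone" not in methods:
            problems.append("wsrep_sst_allowed_methods does not include clone")

    if mysqld.get("wsrep_sst_method") != "clone":
        problems.append("wsrep_sst_method is not set to clone")

    return problems


if __name__ == "__main__":
    print(check_clone_sst(SAMPLE_MY_CNF) or "configuration looks ready for Clone SST")
```

On a Donor that does not set `wsrep_sst_method = clone`, the second message is expected and can be ignored; only the Joiner needs that setting.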
84+
85+
## Enable SSL for Clone SST
86+
87+
To enable SSL for the Clone SST process, place the SSL certificates in a directory other than the data directory, as the clone process modifies this directory.
88+
89+
We recommend explicitly setting the SSL certificates in the `my.cnf` file as follows:
90+
91+
```ini
92+
[client]
93+
ssl-ca = /<path>/ca.pem
94+
ssl-cert = /<path>/client-cert.pem
95+
ssl-key = /<path>/client-key.pem
96+
97+
[mysqld]
98+
ssl-ca = /<path>/ca.pem
99+
ssl-cert = /<path>/server-cert.pem
100+
ssl-key = /<path>/server-key.pem
101+
```
102+
103+
Alternatively, you can configure the following SSL settings specifically for the Clone SST process on the Joiner:
104+
105+
```ini
106+
[mysqld]
107+
clone_ssl_ca = /path/to/ca.pem
108+
clone_ssl_cert = /path/to/client-cert.pem
109+
clone_ssl_key = /path/to/client-key.pem
110+
```
111+
112+
Ensure the `<path>` used is not the data directory to avoid conflicts during the Clone SST process.
113+
114+
## Variables
115+
116+
### SST variables
117+
118+
State Snapshot Transfer (SST) in Galera Cluster relies on specific variables that control its configuration and behavior. You must set these variables appropriately to ensure seamless synchronization between nodes during the SST process. The following table lists the most commonly used variables and their purposes.
119+
120+
| Variable | Description | Link |
121+
|---------------------------------|---------------------------------------------------------------------------------------------------------------|-------------------------------------------|
122+
| `sst_idle_timeout` | Sets the maximum time (in seconds) the SST process can remain idle before being considered failed. You must define this variable in the `[sst]` section of the `my.cnf` file. | [Learn more](wsrep-system-index.md#sst_idle_timeout) |
123+
| `wsrep_sst_donor` | Defines the preferred donor node for SST. If not specified, the cluster automatically selects a donor. | [Learn more](wsrep-system-index.md#wsrep_sst_donor) |
124+
| `wsrep_sst_method` | Specifies the method or script used for the State Snapshot Transfer (SST) process. Only one value can be selected. | [Learn more](wsrep-system-index.md#wsrep_sst_method) |
125+
| `wsrep_sst_receive_address` | Specifies the IP address and port on the Joiner node to receive SST data. | [Learn more](wsrep-system-index.md#wsrep_sst_receive_address) |
126+
127+
128+
### Timeout variables
129+
130+
During the Clone SST process, there are three key moments when the Joiner or Donor must wait for the other to complete a specific action. These moments are governed by the following configurable timeout variables:
131+
132+
| Variable | Description |
133+
|-----------------------------------|-------------------------------------------------------------------------------------------------------------------------------------------------------------------|
134+
| `joiner_timeout_wait_donor_message` | Determines how long (in seconds) the Joiner waits for the Donor to respond, indicating whether the action to perform is SST or IST. The default value is 60 seconds, which is usually sufficient. If the timeout is reached, the process aborts. |
135+
| `donor_timeout_wait_joiner` | Specifies how long (in seconds) the Donor waits for the Joiner to initialize the MySQL instance. The default value is 200 seconds. In slower systems, this value may need to be increased. The Donor log will provide a countdown and indicate if a timeout occurs. If the timeout is reached, the process aborts. |
136+
| `joiner_timeout_clone_instance` | Sets the time (in seconds) the Joiner waits to detect the MySQL instance where the Clone action will take place. The default value is 90 seconds. If the timeout is reached, the process aborts. |
137+
138+
These timeout variables can be configured in the `my.cnf` file as follows:
139+
140+
```ini
141+
[sst]
142+
joiner_timeout_wait_donor_message=60
143+
donor_timeout_wait_joiner=200
144+
joiner_timeout_clone_instance=90
145+
```
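
Each of these timeouts follows the same pattern: one side polls for a condition and aborts when the deadline passes. The sketch below illustrates that pattern in Python. It is not the code the SST script runs, and `wait_until` and its parameters are hypothetical names chosen for illustration.

```python
import time


def wait_until(condition, timeout_seconds, poll_seconds=1.0):
    """Poll condition() until it returns True or the timeout expires.

    Returns True if the condition was met in time, False otherwise
    (the point at which the real process would abort).
    """
    deadline = time.monotonic() + timeout_seconds
    while True:
        if condition():
            return True
        remaining = deadline - time.monotonic()
        if remaining <= 0:
            return False
        # Sleep for the poll interval, but never past the deadline.
        time.sleep(min(poll_seconds, remaining))


if __name__ == "__main__":
    # A condition that is never met times out after about 0.2 seconds.
    print(wait_until(lambda: False, timeout_seconds=0.2, poll_seconds=0.05))
```

Raising a timeout gives a slower node more polling time before the abort; it does not make the wait longer when the other side responds quickly.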
146+
147+
## Debug the process
148+
149+
To debug the process and get more information, enable the debug output in `my.cnf`:
150+
151+
```ini
152+
[sst]
153+
wsrep-debug=true
154+
```
155+
156+
The default port used by the SST process is 4444.
157+
158+
## Monitor the process
159+
160+
The Joiner log reports the progress of the clone process as the percentage of the data transfer completed.
161+
The query used is:
162+
163+
```{.bash data-prompt="mysql>"}
164+
mysql> SELECT FORMAT(((data/estimate)*100),2) 'completed%' FROM performance_schema.clone_progress WHERE stage LIKE 'FILE_COPY';
```
165+
166+
For more information on the progress, you can also use the query:
167+
168+
`SELECT STATE, ERROR_NO, ERROR_MESSAGE FROM performance_schema.clone_status;`
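
The percentage in the progress query is the `data` column divided by the `estimate` column, shown with two decimal places. The following Python sketch reproduces that arithmetic for values you have already read from `performance_schema.clone_progress`. The function name `completed_percent` is hypothetical, and the result only approximates the SQL output, because MySQL rounds the intermediate division differently from floating-point arithmetic.

```python
def completed_percent(data, estimate):
    """Return the completed percentage as a string with two decimals.

    Returns None when the estimate is zero or missing, mirroring the NULL
    that MySQL produces for a division by zero.
    """
    if not estimate:
        return None
    # The ',' adds thousands separators, as the SQL FORMAT() function does.
    return f"{(data / estimate) * 100:,.2f}"


if __name__ == "__main__":
    print(completed_percent(50, 200))
```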
169+
170+
## Troubleshoot
171+
172+
| Problem | Possible Cause | Solution |
173+
|---------|---------------|----------|
174+
| Clone operation fails | Network interruptions | Ensure stable network connection between nodes and sufficient bandwidth for data transfer |
175+
| Clone operation times out | Insufficient timeout values | Increase the timeout values in the `[sst]` section of my.cnf |
176+
| "No space left on device" error | Insufficient disk space | Verify that both donor and Joiner nodes have at least 1.5x the database size in free disk space |
177+
| Permission denied errors | Incorrect MySQL user privileges | Ensure the MySQL user has CLONE_ADMIN privileges on both nodes |
178+
| Connection refused on port 4444 | Firewall blocking traffic | Check firewall settings to allow traffic on port 4444 between cluster nodes |
179+
| Certificate validation failure | Incorrect SSL configuration | Verify SSL certificates are properly configured and accessible in non-data directories |
180+
| Clone plugin not found | Plugin not installed | Install the clone plugin using `INSTALL PLUGIN clone SONAME 'mysql_clone.so'` |
181+
| Data inconsistency after clone | Interrupted clone process | Check MySQL error logs and restart the clone process |
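
The disk space guideline in the table can be checked before starting a transfer. The sketch below is a minimal illustration in Python under stated assumptions: the helper names are hypothetical, the 1.5 factor is taken from the table above, and you supply the database size yourself (for example, from the size of the data directory).

```python
import shutil


def has_enough_space(free_bytes, database_bytes, factor=1.5):
    """Return True if free space is at least `factor` times the database size."""
    return free_bytes >= database_bytes * factor


def free_bytes_at(path):
    """Return the free bytes on the filesystem that contains `path`."""
    return shutil.disk_usage(path).free


if __name__ == "__main__":
    database_bytes = 100 * 1024**3  # assumed 100 GiB database, for illustration
    print(has_enough_space(free_bytes_at("."), database_bytes))
```

Run the check on the node whose data directory you pass to `free_bytes_at`, and use the path of that data directory rather than the current directory.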
182+
183+

docs/state-snapshot-transfer.md

+37-15
@@ -1,21 +1,41 @@
11
# State snapshot transfer
22

3-
State Snapshot Transfer (SST) is a full data copy from one node (donor)
4-
to the joining node (joiner).
5-
It’s used when a new node joins the cluster.
6-
In order to be synchronized with the cluster,
7-
the new node has to receive data from a node
8-
that is already part of the cluster.
3+
??? example "Key takeaways"
94

10-
Percona XtraDB Cluster enables via **xtrabackup**.
5+
Here are the top three takeaways from the "State Snapshot Transfer" documentation:
6+
7+
* **Purpose of SST**:
8+
State Snapshot Transfer (SST) is a critical process in Percona XtraDB Cluster that synchronizes a new or recovering node (joiner) with the cluster by transferring a consistent data snapshot from an existing node (donor).
9+
10+
* **Xtrabackup as the Recommended Method**:
11+
The default and most recommended SST method is `xtrabackup-v2`, which uses Percona XtraBackup. It is a non-blocking method that ensures the donor node remains operational during the transfer, leveraging backup locks for efficiency and consistency.
12+
13+
* **Configuration and Variables**:
14+
SST methods are configured using the `wsrep_sst_method` variable. Proper configuration of SST-related variables, such as `gcs.sync_donor`, is essential to avoid cluster-wide blocking and ensure smooth synchronization.
15+
16+
### Overview
1117

12-
Xtrabackup SST uses [backup locks](https://docs.percona.com/percona-server/8.0/backup-locks.html), which means the Galera provider is not paused at all as with earlier.
13-
The SST method can be configured using the [`wsrep_sst_method`](wsrep-system-index.md#wsrep_sst_method) variable.
18+
State Snapshot Transfer (SST) is essential in Percona XtraDB Cluster for synchronizing data between nodes when a new node joins the cluster or requires a full state transfer. The SST process ensures data consistency and cluster integrity during these operations.
19+
20+
The [**`xtrabackup` SST method**](xtrabackup-sst.md) is the recommended choice for most scenarios. It is a robust, reliable, and non-blocking method that allows the donor node to remain operational during the process. By creating consistent backups and efficiently transferring them to the joiner node, `xtrabackup` minimizes downtime and resource usage while maintaining data integrity.
21+
22+
The [**Clone SST method**](clone-sst.md) is another option for users seeking a modern and efficient approach. This method leverages MySQL’s native cloning capabilities to transfer data at the file level. While `xtrabackup` remains the primary choice, the Clone SST method can be particularly useful for scenarios where speed and simplicity are prioritized.
23+
24+
Choosing the appropriate SST method depends on your environment, requirements for performance, and resource considerations. Both options ensure consistency between nodes and support seamless cluster synchronization.
25+
26+
### Xtrabackup SST Method
27+
28+
Percona XtraDB Cluster supports the **xtrabackup** SST method, which is the recommended option for most use cases.
29+
30+
Xtrabackup SST uses [backup locks](https://docs.percona.com/percona-server/8.0/backup-locks.html), ensuring that the Galera provider remains operational without being paused, unlike earlier approaches. This method ensures a seamless data synchronization process during State Snapshot Transfers.
31+
32+
The SST method is configured using the [`wsrep_sst_method`](wsrep-system-index.md#wsrep_sst_method) variable.
33+
34+
!!! note
35+
36+
If the [`gcs.sync_donor`](wsrep-provider-index.md#gcs.sync_donor) variable is set to `Yes` (default is `No`), the entire cluster will be blocked if the donor node is blocked by SST.
1437

15-
!!! note
1638

17-
If the [`gcs.sync_donor`](wsrep-provider-index.md#gcs.sync_donor) variable is set to `Yes` (default is `No`), the whole cluster will get blocked if the donor is blocked by SST.
18-
1939
## Limitation
2040

2141
When configuring Percona XtraDB Cluster, your server must create a local socket. You can set up a socket by providing a path, or you can skip creating one explicitly. However, do not leave the <socket> variable in my.cnf empty, like this: `socket=`. If you do, the server won’t create a socket. State Snapshot Transfer (SST) requires the local socket for the following tasks:
@@ -64,8 +84,10 @@ the target directory does not exist, it will be created. If the target file
6484
already exists, an error will be returned, because XtraBackup cannot clear
6585
tablespaces not in the data directory.
6686

67-
## Other reading
87+
For more information, see:
88+
89+
* [Clone SST](clone-sst.md)
6890

69-
* [State Snapshot Transfer Methods for MySQL](https://galeracluster.com/library/documentation/sst.html)
91+
* [Xtrabackup SST configuration](xtrabackup-sst.md#xtrabackup-sst)
7092

71-
* [Xtrabackup SST configuration](xtrabackup-sst.md#xtrabackup-sst)
93+
* [State Snapshot Transfer Methods for MySQL](https://galeracluster.com/library/documentation/sst.html)

docs/wsrep-system-index.md

+12-14
@@ -1308,15 +1308,17 @@ Defines storage for streaming replication fragments. The available values are `t
13081308

13091309
| Option | Description |
13101310
| -------------- | ------------------ |
1311-
| Command Line: | ``--wsrep_sst_allowed_methods`` |
1312-
| Config File: | Yes |
1311+
| Command line: | ``--wsrep_sst_allowed_methods`` |
1312+
| Config file: | Yes |
13131313
| Scope: | Global |
13141314
| Dynamic: | No |
1315-
| Default Value: | ``xtrabackup-v2`` |
1315+
| Default value: | ``xtrabackup-v2, clone`` |
1316+
1317+
Percona XtraDB Cluster 8.0.41 adds `clone` to the default value. For older versions of Percona XtraDB Cluster, the default value is `xtrabackup-v2`.
13161318

13171319
Percona XtraDB Cluster 8.0.20-11.3 adds this variable.
13181320

1319-
This variable limits SST methods accepted by the server for [wsrep_sst_method](#wsrep_sst_method) variable. The default value is `xtrabackup-v2`.
1321+
This variable limits SST methods accepted by the server for [wsrep_sst_method](#wsrep_sst_method) variable.
13201322

13211323
### `wsrep_sst_donor`
13221324

@@ -1383,20 +1385,16 @@ Defines the method or script for [State Snapshot Transfer](state-snapshot-transf
13831385

13841386
Available values are:
13851387

1386-
* `xtrabackup-v2`: Uses *Percona XtraBackup* to perform SST. This value is the default.
1387-
Privileges and permissions for running *Percona XtraBackup*
1388-
can be found in [Percona XtraBackup documentation](https://docs.percona.com/percona-xtrabackup/8.0/privileges.html). For more information, see [Percona XtraBackup SST Configuration](xtrabackup-sst.md#xtrabackup-sst).
1388+
* `xtrabackup-v2`: Uses Percona XtraBackup to perform SST. This is the default value.
1389+
Privileges and permissions required to run Percona XtraBackup are detailed in [Percona XtraBackup documentation](https://docs.percona.com/percona-xtrabackup/8.0/privileges.html). For additional details, see [Percona XtraBackup SST Configuration](xtrabackup-sst.md#xtrabackup-sst).
1390+
The `xtrabackup-v2` method supports clusters with GTIDs and async replicas.
13891391

1390-
* `skip`: Use this to skip SST.
1391-
**Removed in Percona XtraDB Cluster 8.0.33-25.** This value can be used when initially starting the cluster
1392-
and manually restoring the same data to all nodes.
1393-
This value should not be used permanently because it could lead to data inconsistency across the nodes.
1392+
* `clone`: Uses the [`clone`](clone-sst.md) method for SST, available in Percona XtraDB Cluster 8.0.41 and later versions.
13941393

1395-
* `ist_only` : **Introduced in Percona XtraDB Cluster 8.0.33-25.** This value allows only Incremental State Transfer (IST). If a node cannot sync with the cluster with IST, abort that node's start. This action leaves the data directory unchanged. This value prevents starting a node, after a manual backup restoration, that does not have a `grastate.dat` file. This missing file could initiate a full-state transfer (SST) which can be a more time and resource-intensive operation.
1394+
* `skip`: This value, which allows skipping SST, has been **removed as of Percona XtraDB Cluster 8.0.33-25.** It was previously used for initially starting the cluster and manually restoring the same data to all nodes. However, it is not suitable for permanent use, as it could cause data inconsistency across nodes.
13961395

1397-
!!! note
1396+
* `ist_only` : **Introduced in Percona XtraDB Cluster 8.0.33-25.** This value allows only Incremental State Transfer (IST). If a node cannot sync with the cluster with IST, abort that node's start. This action leaves the data directory unchanged. This value prevents starting a node, after a manual backup restoration, that does not have a `grastate.dat` file. This missing file could initiate a full-state transfer (SST) which can be a more time and resource-intensive operation.
13981397

1399-
``xtrabackup-v2`` provides support for clusters with GTIDs and async replicas.
14001398

14011399
!!! admonition "See also"
14021400
