Commit f9e3dcb

DPE-2959 async replication (#352)
* poc k8s-k8s async repl
* isort had mismatched line_length config
* (WIP) improved support
* (WIP) support more cases
* scale up/down for repl clusters
* idle state for 2nds
* instance label when creating repl cluster
* more typing
* (wip) replica secondaries support
* allow replica secondaries to rejoin
* (WIP) sync cluster-set-name and better deal with repl 2ndaries
* (WIP) fix secondaries join and messaging
* fix: removed unneeded flag
* support for relation broken
* dissolved replica cluster stays blocked
* added the promote standby action
* support for cluster set name config
* minor refactors and typing
* async common methods, single module/file
* removed dup action and lint fixes
* check for user data on replica side
* moved async_replication to a library (owned by vm charm)
* sync with vm code
* lint fixes
* scale-in locks write to global primary
* address pr comments
* second batch of PR comments adressing
* fence/unfence actions
* fix for single unit replica cluster
* workaround: secrets in relation data
* partial unit test fixes
* normalized cluster name for dict reference
* remove test due to be refactored
* allows rejoin after unrelate
* fix lock instance reference
* allow unrelated cluster rejoin
* reset cluster-set name
* dealing with secret not found
* ensure secrets are shared with full uri
* fix recreation and unrelation after promotion
* ensure all online unis
* unset read only after unfence
* refactor remove instance to acommodate changing
* fix race condition
* automatic deal with clusters with the same name
* bump
* test for mysql version and cluster-set-name on rejoin
* avoid handling on unit removal
* sync local root
* use method from lib
* lint fixes
* addressing pr feedback
* using node mode instead of role
* covering edge cases for recovery/failover
* add rejoin invalidated cluster action
* the integration test
* workaround for issue #399
* bump libpatch
* add group marks
* missing default
* set flag to avoid lock release
* chore: fixes old issue on test
* fix markers
* fix marker import
* fix retry process on removal
* fix user creation after refactor `root@%` created only when required.
* lint/grammar fixes
* pr comment
1 parent 38ceda6 commit f9e3dcb

27 files changed, +2127 -394 lines changed

actions.yaml

+59 -2
@@ -2,7 +2,13 @@
 # See LICENSE file for licensing details.
 
 get-cluster-status:
-  description: Get cluster status information without topology
+  description: Get cluster status information
+  params:
+    cluster-set:
+      type: boolean
+      default: False
+      description: Whether to fetch the cluster or cluster-set status.
+        Possible values are False (default) or True.
 
 get-password:
   description: Fetch the system user's password, which is used by charm.
@@ -28,7 +34,7 @@ set-password:
 set-tls-private-key:
   description:
     Set the privates key, which will be used for certificate signing requests (CSR). Run
-    for each unit separately.
+    for each unit separately.
   params:
     internal-key:
       type: string
@@ -55,3 +61,54 @@ pre-upgrade-check:
 
 resume-upgrade:
   description: Resume a rolling upgrade after asserting successful upgrade of a new revision.
+
+promote-standby-cluster:
+  description: |
+    Promotes this cluster to become the leader in the cluster-set. Used for safe switchover or failover.
+    Must be run against the charm leader unit of a standby cluster.
+  params:
+    cluster-set-name:
+      type: string
+      description: |
+        The name of the cluster-set. Mandatory option, used for confirmation.
+    force:
+      type: boolean
+      default: False
+      description: |
+        Use force when previous primary is unreachable (failover). Will invalidate previous
+        primary.
+
+recreate-cluster:
+  description: |
+    Recreates cluster on one or more standalone units that were previously part of a standby cluster.
+
+    When a standby cluster is removed from an async replication relation, the cluster will be dissolved and
+    each unit will be kept in blocked status. Recreating the cluster allows to rejoin the async replication
+    relation, or usage as a standalone cluster.
+
+fence-writes:
+  description: |
+    Stops write traffic to a primary cluster of a ClusterSet.
+  params:
+    cluster-set-name:
+      type: string
+      description: |
+        The name of the cluster-set. Mandatory option, used for confirmation.
+
+unfence-writes:
+  description: |
+    Resumes write traffic to a primary cluster of a ClusterSet.
+  params:
+    cluster-set-name:
+      type: string
+      description: |
+        The name of the cluster-set. Mandatory option, used for confirmation.
+
+rejoin-cluster:
+  description: |
+    Rejoins an invalidated cluster to the cluster-set, after a previous failover or switchover.
+  params:
+    cluster-name:
+      type: string
+      description: |
+        The name of the cluster to be rejoined.
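
Several of the new actions (promote-standby-cluster, fence-writes, unfence-writes) describe `cluster-set-name` as a "Mandatory option, used for confirmation". The diff shown here does not include the handler code, so the following is only a minimal sketch of what such a confirmation check could look like. The helper name `check_cluster_set_confirmation` and its error strings are hypothetical and are not taken from the charm.

```python
from typing import Mapping, Optional


def check_cluster_set_confirmation(
    params: Mapping[str, object], actual_cluster_set_name: str
) -> Optional[str]:
    """Return an error message, or None when the confirmation matches.

    Hypothetical helper: the charm's real validation may differ.
    """
    supplied = params.get("cluster-set-name")
    if not supplied:
        # The action schema gives the parameter no default, so it can be absent.
        return "cluster-set-name is mandatory"
    if supplied != actual_cluster_set_name:
        # A mismatch suggests the operator is targeting the wrong cluster-set.
        return f"cluster-set-name mismatch: got {supplied!r}"
    return None
```

Requiring the operator to type the cluster-set name acts as a guard against running a disruptive action on the wrong deployment; a handler would presumably fail the action when the helper returns a message.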

config.yaml

+6 -1
@@ -3,7 +3,12 @@
 
 options:
   cluster-name:
-    description: "Optional - Name of the MySQL InnoDB cluster"
+    description: "Optional - Name of the MySQL InnoDB cluster, set once at deployment"
+    type: "string"
+  cluster-set-name:
+    description: |
+      Optional - Name for async replication cluster set, set once at deployment.
+      On `recreate-clster` action call, the cluster set name will be re-generated automatically.
     type: "string"
   profile:
     description: |

lib/charms/data_platform_libs/v0/data_secrets.py

+1 -1
@@ -97,7 +97,7 @@ def get_content(self) -> Dict[str, str]:
         """Getting cached secret content."""
         if not self._secret_content:
             if self.meta:
-                self._secret_content = self.meta.get_content()
+                self._secret_content = self.meta.get_content(refresh=True)
         return self._secret_content
 
     def set_content(self, content: Dict[str, str]) -> None:
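
The one-line change above passes `refresh=True` when the cache is first filled. A likely motivation is that a secret observer otherwise keeps reading the revision it already tracks, even after the owner has published a newer one. The sketch below illustrates that behaviour with a stand-in class; `FakeSecret` and `CachedSecret` are hypothetical names written for this example and are not the `ops` or `data_secrets` implementations.

```python
from typing import Dict, List, Optional


class FakeSecret:
    """Stand-in for a revisioned secret (hypothetical, not the ops API)."""

    def __init__(self, first: Dict[str, str]) -> None:
        self._revisions: List[Dict[str, str]] = [dict(first)]
        self._tracked = 0  # index of the revision this observer tracks

    def set_content(self, content: Dict[str, str]) -> None:
        # Publishing new content creates a new revision.
        self._revisions.append(dict(content))

    def get_content(self, refresh: bool = False) -> Dict[str, str]:
        if refresh:
            # Move the observer to the latest revision before reading.
            self._tracked = len(self._revisions) - 1
        return dict(self._revisions[self._tracked])


class CachedSecret:
    """Mirrors the caching shape of the patched get_content method."""

    def __init__(self, meta: Optional[FakeSecret]) -> None:
        self.meta = meta
        self._secret_content: Dict[str, str] = {}

    def get_content(self) -> Dict[str, str]:
        if not self._secret_content:
            if self.meta:
                self._secret_content = self.meta.get_content(refresh=True)
        return self._secret_content


secret = FakeSecret({"password": "old"})
secret.set_content({"password": "new"})
stale = secret.get_content()  # still the tracked (first) revision
fresh = CachedSecret(secret).get_content()  # refreshed to the latest revision
```

Note that the cache is only filled once per object, so in this sketch a later revision would not be seen until a new `CachedSecret` is created.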
