Commit 391b83d

Updates to recovery.md to address PE-39730 (#526)
* Updates to recovery.md to address PE-39730

  https://perforce.atlassian.net/browse/PE-39730 outlines some required changes to the procedure for replacing a missing or failed replica Puppet server. This draft aims to address all issues raised in the ticket.

* Post-review updates to recovery.md

  Updates to draft following engineer review
1 parent 6bac026 commit 391b83d

1 file changed: +46 -15 lines changed

documentation/recovery.md

Lines changed: 46 additions & 15 deletions
@@ -7,43 +7,72 @@ The new system needs to be provisioned with the same certificate name as the sys
 ## Recover from failed primary Puppet server

 1. Promote the replica ([official docs](https://puppet.com/docs/pe/2019.8/dr_configure.html#dr-promote-replica))
-2. Replace missing replica server (same as [Replace missing or failed replica Puppet server](#replace-missing-or-failed-replica-puppet-server) below)
+2. Purge the failed primary server
+
+puppet node purge <failed-primary-server-fqdn>
+
+
+3. Replace missing replica server (same as [Replace missing or failed replica Puppet server](#replace-missing-or-failed-replica-puppet-server) below)

 ## Replace missing or failed replica Puppet server

 This procedure uses the following placeholder references.

 * _\<primary-server-fqdn\>_ - The FQDN and certname of the primary Puppet server
-* _\<replacement-replica-fqdn\>_ - The FQDN and certname of the replacement replica Puppet server
-* _\<replacement-avail-group-letter\>_ - Either A or B; whichever of the two letter designations is appropriate for the server being replaced. It will be the opposite of the primary server.
+* _\<old-replica-fqdn\>_ - The FQDN and certname of the old replica Puppet server that has failed or is missing
+* _\<replacement-replica-fqdn\>_ - The FQDN and certname of the new replica Puppet server
+* _\<replacement-avail-group-letter\>_ - Either A or B; whichever of the two letter designations is appropriate for the replacement server. It will be the opposite of the primary server.

 1. Ensure the old replica server is forgotten.

-puppet infrastructure forget <replacement-replica-fqdn>
+puppet infrastructure forget <old-replica-fqdn>

-2. Install the Puppet agent on the replacement replica
+2. Install the Puppet agent on the replacement replica.

 curl -k https://<primary-server-fqdn>:8140/packages/current/install.bash \
 | bash -s -- \
 main:certname=<replacement-replica-fqdn> \
 extension_requests:1.3.6.1.4.1.34380.1.1.9812=puppet/server \
 extension_requests:1.3.6.1.4.1.34380.1.1.9813=<replacement-avail-group-letter>

+source /etc/profile.d/puppet-agent.sh
+
 puppet agent -t

-3. On the PE-PostgreSQL server in the _\<replacement-avail-group-letter\>_ group
+3. Sign the certificate on the primary server.
+
+puppetserver ca sign --certname <replacement-replica-fqdn>
+
+4. On the PE-PostgreSQL server in the _\<replacement-avail-group-letter\>_ group
 1. Stop puppet.service
-2. Add the following two lines to /opt/puppetlabs/server/data/postgresql/11/data/pg\_ident.conf
+
+puppet resource service puppet ensure=stopped
+
+3. Add the following two lines to /opt/puppetlabs/server/data/postgresql/_<postgres_version>_/data/pg_ident.conf
+
+where _<postgres_version>_ is the appropriate major version of PostgreSQL as detailed in [Component versions in recent PE releases](https://www.puppet.com/docs/pe/2023.8/component_versions_in_recent_pe_releases.html#pe-agent-server-components). For PE release 2023.8.0 the PostgreSQL version is 14.

 pe-puppetdb-pe-puppetdb-map <replacement-replica-fqdn> pe-puppetdb
 pe-puppetdb-pe-puppetdb-migrator-map <replacement-replica-fqdn> pe-puppetdb-migrator

-3. Restart pe-postgresql.service
-3. Provision the new system as a replica
+
+5. Restart pe-postgresql.service
+
+puppet resource service pe-postgresql ensure=stopped
+puppet resource service pe-postgresql ensure=running
+
+5. Run Puppet
+
+puppet agent -t
+
+5. Provision the new system as a replica

 puppet infrastructure provision replica <replacement-replica-fqdn> --topology mono-with-compile --skip-agent-config --enable

-4. On the PE-PostgreSQL server in the _\<replacement-avail-group-letter\>_ group, start puppet.service
+6. On the PE-PostgreSQL server in the _\<replacement-avail-group-letter\>_ group, start puppet.service
+
+puppet resource service puppet ensure=running
+

 ## Replace failed PE-PostgreSQL server (A or B side)
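The replica procedure in this hunk adds two map lines to pg_ident.conf by hand. The following is a minimal sketch, not part of the commit, of how those lines could be generated and appended only when they are not already present, so the step can be repeated safely. The function names `pg_ident_lines` and `append_missing_lines` are hypothetical; the map names and database users are copied from the hunk.

```shell
# Hypothetical helper: print the two pg_ident.conf map entries for a replica FQDN.
pg_ident_lines() {
  local fqdn="$1"
  printf '%s %s %s\n' pe-puppetdb-pe-puppetdb-map "$fqdn" pe-puppetdb
  printf '%s %s %s\n' pe-puppetdb-pe-puppetdb-migrator-map "$fqdn" pe-puppetdb-migrator
}

# Hypothetical helper: append each line read from stdin to the given file,
# skipping any line that already appears there verbatim.
append_missing_lines() {
  local file="$1" line
  while IFS= read -r line; do
    # -x matches whole lines only, -F treats the entry as a fixed string
    grep -qxF -- "$line" "$file" 2>/dev/null || printf '%s\n' "$line" >> "$file"
  done
}
```

A possible invocation is `pg_ident_lines <replacement-replica-fqdn> | append_missing_lines <path-to-pg_ident.conf>`, run on the PE-PostgreSQL server before restarting pe-postgresql.service.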

@@ -102,11 +131,13 @@ On _\<working-postgres-server-fqdn\>_:

 systemctl stop puppet

-2. Add this line to /opt/puppetlabs/server/data/postgresql/11/data/pg\_ident.conf
+2. Add this line to /opt/puppetlabs/server/data/postgresql/_<postgres_version>_/data/pg_ident.conf
+
+where _<postgres_version>_ is the appropriate major version of PostgreSQL as detailed in [Component versions in recent PE releases](https://www.puppet.com/docs/pe/2023.8/component_versions_in_recent_pe_releases.html#pe-agent-server-components). For PE release 2023.8.0 the PostgreSQL version is 14.

 replication-pe-ha-replication-map <replacement-postgres-server-fqdn> pe-ha-replication

-3. Add these lines to /opt/puppetlabs/server/data/postgresql/11/data/pg\_hba.conf
+3. Add these lines to /opt/puppetlabs/server/data/postgresql/_<postgres_version>_/data/pg\_hba.conf

 # REPLICATION RESTORE PERMISSIONS
 hostssl replication pe-ha-replication 0.0.0.0/0 cert map=replication-pe-ha-replication-map clientcert=1
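The commit replaces the hardcoded `11` with a version placeholder that the operator looks up. As a rough sketch, not part of the commit, the major version could instead be read from the directory layout on a working server. This assumes that each installed major version has its own purely numeric directory under the data path, as the `11` and `14` paths in this diff suggest; the component versions table linked above remains the authoritative source. The function name `pg_major_version` is hypothetical.

```shell
# Hypothetical helper: print the highest purely numeric directory name under
# the given base path. Returns non-zero when no such directory exists.
pg_major_version() {
  local base="$1" dir name best=""
  for dir in "$base"/*/; do
    [ -d "$dir" ] || continue          # also skips the unexpanded glob
    name="$(basename "$dir")"
    case "$name" in
      ''|*[!0-9]*) continue ;;         # ignore names that are not all digits
    esac
    if [ -z "$best" ] || [ "$name" -gt "$best" ]; then
      best="$name"
    fi
  done
  [ -n "$best" ] && printf '%s\n' "$best"
}
```

For example, `pg_major_version /opt/puppetlabs/server/data/postgresql` could supply the value for _<postgres_version>_. It has to be called before any step that empties that directory, because it only inspects what is currently on disk.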
@@ -118,18 +149,18 @@ On _\<working-postgres-server-fqdn\>_:

 On _\<replacement-postgres-server-fqdn\>_:

-Run the following commands.
+Run the following commands, replacing 14 with the PostgreSQL major version for your PE release.

 ```
 systemctl stop puppet.service pe-postgresql.service

-mv /opt/puppetlabs/server/data/postgresql/11/data/certs /opt/puppetlabs/server/data/pg_certs
+mv /opt/puppetlabs/server/data/postgresql/14/data/certs /opt/puppetlabs/server/data/pg_certs

 rm -rf /opt/puppetlabs/server/data/postgresql/*

 runuser -u pe-postgres -- \
 /opt/puppetlabs/server/bin/pg_basebackup \
-  -D /opt/puppetlabs/server/data/postgresql/11/data \
+  -D /opt/puppetlabs/server/data/postgresql/14/data \
 -d "host=<working-postgres-server-fqdn>
 user=pe-ha-replication
 sslmode=verify-full
