Commit ef85494

Updates to recovery.md to address PE-39730
https://perforce.atlassian.net/browse/PE-39730 outlines some required changes to the procedure for replacing a missing or failed replica Puppet server. This draft aims to address all issues raised in the ticket.
1 parent 6bac026 commit ef85494

1 file changed, +40 -14 lines changed

documentation/recovery.md

@@ -14,36 +14,62 @@ The new system needs to be provisioned with the same certificate name as the sys
 This procedure uses the following placeholder references.

 * _\<primary-server-fqdn\>_ - The FQDN and certname of the primary Puppet server
-* _\<replacement-replica-fqdn\>_ - The FQDN and certname of the replacement replica Puppet server
-* _\<replacement-avail-group-letter\>_ - Either A or B; whichever of the two letter designations is appropriate for the server being replaced. It will be the opposite of the primary server.
+* _\<old-replica-fqdn\>_ - The FQDN and certname of the old replica Puppet server that has failed or is missing
+* _\<replacement-replica-fqdn\>_ - The FQDN and certname of the new replica Puppet server
+* _\<failed-primary-server-fqdn\>_ - The FQDN and certname of the original primary server that the old replica had replaced
+* _\<replacement-avail-group-letter\>_ - Either A or B; whichever of the two letter designations is appropriate for the replacement server. It will be the opposite of the server that it is replacing.
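
The rule for _\<replacement-avail-group-letter\>_ in the placeholder list can be sketched as a small shell helper. This is an illustration only: `opposite_avail_group` is a hypothetical function name, not a Puppet Enterprise command, and it assumes the group letter of the server being replaced is already known.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of Puppet Enterprise): given the availability
# group letter of the server being replaced, print the letter to use for
# <replacement-avail-group-letter>.
opposite_avail_group() {
  case "$1" in
    A|a) echo "B" ;;
    B|b) echo "A" ;;
    *)   echo "expected A or B, got '$1'" >&2; return 1 ;;
  esac
}
```

For example, `opposite_avail_group A` prints `B`, and any input other than A or B returns a non-zero status.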

-1. Ensure the old replica server is forgotten.
+1. If applicable, purge the failed primary server. (You may need to do this, for example, if the original primary failed and the promoted replica that replaced it has also failed.)

-puppet infrastructure forget <replacement-replica-fqdn>
+puppet node purge <failed-primary-server-fqdn>

-2. Install the Puppet agent on the replacement replica
+2. Ensure the old replica server is forgotten.
+
+puppet infrastructure forget <old-replica-fqdn>
+
+3. Install the Puppet agent on the replacement replica.
+
+**Note**: When designating the availability group of the replacement, use the opposite group (A or B) of the server being replaced. This means that, if the old replica server replaced the original primary server, the new replica is assigned the same availability group as the original primary.

 curl -k https://<primary-server-fqdn>:8140/packages/current/install.bash \
 | bash -s -- \
 main:certname=<replacement-replica-fqdn> \
 extension_requests:1.3.6.1.4.1.34380.1.1.9812=puppet/server \
 extension_requests:1.3.6.1.4.1.34380.1.1.9813=<replacement-avail-group-letter>

+source /etc/profile.d/puppet-agent.sh
+
 puppet agent -t

-3. On the PE-PostgreSQL server in the _\<replacement-avail-group-letter\>_ group
+4. Sign the certificate on the new primary server.
+
+5. On the PE-PostgreSQL server in the _\<replacement-avail-group-letter\>_ group
 1. Stop puppet.service
-2. Add the following two lines to /opt/puppetlabs/server/data/postgresql/11/data/pg\_ident.conf
+
+puppet resource service puppet ensure=stopped
+
+3. Add the following two lines to /opt/puppetlabs/server/data/postgresql/14/data/pg\_ident.conf

 pe-puppetdb-pe-puppetdb-map <replacement-replica-fqdn> pe-puppetdb
 pe-puppetdb-pe-puppetdb-migrator-map <replacement-replica-fqdn> pe-puppetdb-migrator

-3. Restart pe-postgresql.service
-3. Provision the new system as a replica
+5. Restart pe-postgresql.service
+
+puppet resource service pe-postgresql ensure=stopped
+puppet resource service pe-postgresql ensure=running
+
+5. Run Puppet
+
+puppet agent -t
+
+6. Provision the new system as a replica

 puppet infrastructure provision replica <replacement-replica-fqdn> --topology mono-with-compile --skip-agent-config --enable

-4. On the PE-PostgreSQL server in the _\<replacement-avail-group-letter\>_ group, start puppet.service
+7. On the PE-PostgreSQL server in the _\<replacement-avail-group-letter\>_ group, start puppet.service
+
+puppet resource service puppet ensure=running
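
The `pg_ident.conf` step in the procedure could be scripted so that re-running it does not duplicate entries. The following is a minimal sketch, not part of the documented procedure: `add_puppetdb_ident_maps` is a hypothetical helper, and it assumes the file holds one map entry per line, separates fields with single spaces, and ends with a newline.

```shell
#!/usr/bin/env bash
# Hypothetical helper: append the two PuppetDB ident map entries for a replica
# certname to a pg_ident.conf file, skipping entries that are already present.
add_puppetdb_ident_maps() {
  local conf_file="$1"
  local replica_fqdn="$2"
  local line
  for line in \
    "pe-puppetdb-pe-puppetdb-map ${replica_fqdn} pe-puppetdb" \
    "pe-puppetdb-pe-puppetdb-migrator-map ${replica_fqdn} pe-puppetdb-migrator"
  do
    # -x matches the whole line, -F treats the entry as a fixed string.
    grep -qxF -- "$line" "$conf_file" 2>/dev/null \
      || printf '%s\n' "$line" >> "$conf_file"
  done
}
```

It might be called as `add_puppetdb_ident_maps /opt/puppetlabs/server/data/postgresql/14/data/pg_ident.conf <replacement-replica-fqdn>`, followed by the pe-postgresql restart described in the procedure. An entry that is already present with different spacing would not be detected by the exact-line match.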
+

 ## Replace failed PE-PostgreSQL server (A or B side)

@@ -102,11 +128,11 @@ On _\<working-postgres-server-fqdn\>_:

 systemctl stop puppet

-2. Add this line to /opt/puppetlabs/server/data/postgresql/11/data/pg\_ident.conf
+2. Add this line to /opt/puppetlabs/server/data/postgresql/14/data/pg\_ident.conf

 replication-pe-ha-replication-map <replacement-postgres-server-fqdn> pe-ha-replication

-3. Add these lines to /opt/puppetlabs/server/data/postgresql/11/data/pg\_hba.conf
+3. Add these lines to /opt/puppetlabs/server/data/postgresql/14/data/pg\_hba.conf

 # REPLICATION RESTORE PERMISSIONS
 hostssl replication pe-ha-replication 0.0.0.0/0 cert map=replication-pe-ha-replication-map clientcert=1
@@ -123,13 +149,13 @@ Run the following commands.
 ```
 systemctl stop puppet.service pe-postgresql.service

-mv /opt/puppetlabs/server/data/postgresql/11/data/certs /opt/puppetlabs/server/data/pg_certs
+mv /opt/puppetlabs/server/data/postgresql/14/data/certs /opt/puppetlabs/server/data/pg_certs

 rm -rf /opt/puppetlabs/server/data/postgresql/*

 runuser -u pe-postgres -- \
 /opt/puppetlabs/server/bin/pg_basebackup \
--D /opt/puppetlabs/server/data/postgresql/11/data \
+-D /opt/puppetlabs/server/data/postgresql/14/data \
 -d "host=<working-postgres-server-fqdn>
 user=pe-ha-replication
 sslmode=verify-full
