documentation/automated_recovery.md (+39 -3)
@@ -2,7 +2,29 @@
 
 These instructions provide automated procedures for recovering from select failures of PE components which are managed by PEADM.
 
-Additional manual procedures are documented in [recovery.md](recovery.md)
+Manual procedures are documented in [recovery.md](recovery.md)
+
+## Recover from failed primary Puppet server
+
+1. Promote the replica ([official docs](https://puppet.com/docs/pe/2019.8/dr_configure.html#dr-promote-replica))
+2. [Replace missing or failed replica Puppet server](#replace-missing-or-failed-replica-puppet-server)
+
+## Replace missing or failed replica Puppet server
+
+This procedure uses the following placeholder references.
+
+* _\<primary-server-fqdn\>_ - The FQDN and certname of the primary Puppet server
+* _\<replica-postgres-server-fqdn\>_ - The FQDN and certname of the PE-PostgreSQL server which resides in the same availability group as the replacement replica Puppet server
+* _\<replacement-replica-fqdn\>_ - The FQDN and certname of the replacement replica Puppet server
+
+1. Run `peadm::add_replica` plan to deploy replacement replica Puppet server
+    1. For Standard and Large deployments
+
+            bolt plan run peadm::add_replica primary_host=<primary-server-fqdn> replica_host=<replacement-replica-fqdn>
+
+    2. For Extra Large deployments
+
+            bolt plan run peadm::add_replica primary_host=<primary-server-fqdn> replica_host=<replacement-replica-fqdn> replica_postgresql_host=<replica-postgres-server-fqdn>
 
 ## Replace failed PE-PostgreSQL server (A or B side)
 
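The two `peadm::add_replica` invocations added in this hunk differ only in the extra `replica_postgresql_host` parameter used for Extra Large deployments. As a minimal sketch, that choice could be wrapped in a dry-run helper that prints the command instead of running it. The helper name `build_add_replica_cmd`, the architecture labels, and the example hostnames are assumptions made for illustration; only the plan name and its parameters come from the text above.

```shell
#!/usr/bin/env bash
# Dry-run sketch: print the peadm::add_replica command for a deployment size.
# build_add_replica_cmd is a hypothetical helper, not part of PEADM or Bolt.
build_add_replica_cmd() {
  local arch="$1" primary="$2" replica="$3" replica_pg="${4:-}"
  local cmd="bolt plan run peadm::add_replica primary_host=${primary} replica_host=${replica}"

  case "$arch" in
    standard|large)
      # Standard and Large: no separate replica PE-PostgreSQL server to name.
      ;;
    extra-large)
      # Extra Large: the replica's PE-PostgreSQL server must be passed as well.
      if [ -z "$replica_pg" ]; then
        echo "extra-large requires a replica PE-PostgreSQL host" >&2
        return 1
      fi
      cmd="${cmd} replica_postgresql_host=${replica_pg}"
      ;;
    *)
      echo "unknown architecture: ${arch}" >&2
      return 1
      ;;
  esac

  printf '%s\n' "$cmd"
}

# Example with made-up hostnames; the output is the command to review and run.
build_add_replica_cmd extra-large primary.example.com replica.example.com pg-b.example.com
```

Printing rather than executing keeps the sketch safe to try on a workstation that has no access to the PE infrastructure.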
@@ -22,7 +44,7 @@ Procedure:
 
 2. Temporarily set both primary and replica server nodes so that they use the remaining healthy PE-PostgreSQL server
 
-        bolt plan run peadm::util::update_db_setting --target <primary-server-fqdn>,<replica-server-fqdn> primary_postgresql_host=<working-postgres-server-fqdn> override=true
+        bolt plan run peadm::util::update_db_setting --target <primary-server-fqdn>,<replica-server-fqdn> postgresql_host=<working-postgres-server-fqdn> override=true
 
 3. Restart `pe-puppetdb.service` on Puppet server primary and replica
 
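This hunk renames the parameter passed to `peadm::util::update_db_setting` from `primary_postgresql_host` to `postgresql_host`. The step 2 command also passes the primary and the replica as one comma-separated `--target` value. A rough sketch of assembling that command follows; the helper name `build_update_db_setting_cmd` and the hostnames are hypothetical, and the function only prints the command.

```shell
#!/usr/bin/env bash
# Dry-run sketch: print the update_db_setting command from step 2.
# build_update_db_setting_cmd is a hypothetical helper, not part of PEADM.
# Usage: build_update_db_setting_cmd <working-postgres-fqdn> <target> [<target> ...]
build_update_db_setting_cmd() {
  local pg_host="$1"
  shift
  if [ "$#" -eq 0 ]; then
    echo "at least one target server is required" >&2
    return 1
  fi

  # Join the remaining arguments with commas to form the --target value.
  local IFS=,
  local targets="$*"

  printf '%s\n' "bolt plan run peadm::util::update_db_setting --target ${targets} postgresql_host=${pg_host} override=true"
}

# Example with made-up hostnames.
build_update_db_setting_cmd pg-a.example.com primary.example.com replica.example.com
```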
@@ -34,4 +56,18 @@ Procedure:
 
 5. Run `peadm::add_database` plan to deploy replacement PE-PostgreSQL server
 
-        bolt plan run peadm::add_database -t <replacement-postgres-server-fqdn> primary_host=<primary-server-fqdn>
+        bolt plan run peadm::add_database -t <replacement-postgres-server-fqdn> primary_host=<primary-server-fqdn>
+
+## Replace failed replica puppet server AND failed replica pe-postgresql server
+
+This procedure uses the following placeholder references.
+
+* _\<primary-server-fqdn\>_ - The FQDN and certname of the primary Puppet server
+* _\<failed-replica-fqdn\>_ - The FQDN and certname of the failed replica Puppet server
+
+1. Ensure the old replica server is forgotten.
+
+        bolt command run "/opt/puppetlabs/bin/puppet infrastructure forget <failed-replica-fqdn>" --targets <primary-server-fqdn>
+
+2. [Replace failed PE-PostgreSQL server (A or B side)](#replace-failed-pe-postgresql-server-a-or-b-side)
+3. [Replace missing or failed replica Puppet server](#replace-missing-or-failed-replica-puppet-server)
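Step 1 of the combined procedure runs `puppet infrastructure forget` on the primary through `bolt command run`, and the command is only meaningful once both placeholders are filled in. The sketch below prints that command and refuses values that are empty or still contain angle brackets. The helper names `require_filled` and `build_forget_cmd` and the hostnames are hypothetical.

```shell
#!/usr/bin/env bash
# Dry-run sketch: print the "forget the old replica" command from step 1.
# require_filled and build_forget_cmd are hypothetical helpers.

# Fail if a value is empty or still looks like an unfilled <placeholder>.
require_filled() {
  case "$1" in
    ''|*'<'*|*'>'*)
      echo "unfilled placeholder: '$1'" >&2
      return 1
      ;;
  esac
}

# Usage: build_forget_cmd <primary-server-fqdn> <failed-replica-fqdn>
build_forget_cmd() {
  local primary="${1:-}" failed_replica="${2:-}"
  require_filled "$primary" || return 1
  require_filled "$failed_replica" || return 1

  # The remote command is double-quoted so Bolt receives it as one argument.
  printf 'bolt command run "/opt/puppetlabs/bin/puppet infrastructure forget %s" --targets %s\n' \
    "$failed_replica" "$primary"
}

# Example with made-up hostnames.
build_forget_cmd primary.example.com old-replica.example.com
```

The remaining two steps reuse the PE-PostgreSQL and replica replacement procedures unchanged, so they are not repeated in the sketch.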