* feat(498): Add ownerReferences to managed entities
* empty owner reference for cross namespace secret and more tests
* update ownerReferences of existing resources
* removing ownerReference requires Update API call
* CR ownerReference on PVC blocks pvc retention policy of statefulset
* make ownerreferences optional and disabled by default
* update unit test to check len ownerReferences
* update codegen
* add owner references e2e test
* update unit test
* add block_owner_deletion field to test owner reference
* fix typos and update docs once more
* reflect code feedback
---------
Co-authored-by: Max Begenau <[email protected]>
docs/administrator.md (+62 -8)
@@ -223,9 +223,9 @@ configuration:
 
 Now, every cluster manifest must contain the configured annotation keys to
 trigger the delete process when running `kubectl delete pg`. Note, that the
-`Postgresql`resource would still get deleted as K8s' API server does not
-block it. Only the operator logs will tell, that the delete criteria wasn't
-met.
+`Postgresql` resource would still get deleted because the operator does not
+instruct the K8s API server to block it. Only the operator logs will tell
+that the delete criteria were not met.
 
 **cluster manifest**
 
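The manifest body itself lies outside the diff context above. As a sketch, assuming the operator was configured with `delete-date` and `delete-clustername` as the annotation keys (via `delete_annotation_date_key` / `delete_annotation_name_key`), such a cluster manifest could look like:

```yaml
apiVersion: "acid.zalan.do/v1"
kind: postgresql
metadata:
  name: acid-test-cluster
  annotations:
    # both keys and values must match the operator's configured
    # expectations before `kubectl delete pg` triggers the delete process
    delete-date: "2024-05-01"
    delete-clustername: "acid-test-cluster"
```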
@@ -243,11 +243,65 @@ spec:
 
 In case, the resource has been deleted accidentally or the annotations were
 simply forgotten, it's safe to recreate the cluster with `kubectl create`.
-Existing Postgres cluster are not replaced by the operator. But, as the
-original cluster still exists the status will show `CreateFailed` at first.
-On the next sync event it should change to `Running`. However, as it is in
-fact a new resource for K8s, the UID will differ which can trigger a rolling
-update of the pods because the UID is used as part of backup path to S3.
+Existing Postgres clusters are not replaced by the operator. But, when the
+original cluster still exists, the status will be `CreateFailed` at first. On
+the next sync event it should change to `Running`. However, because it is in
+fact a new resource for K8s, the UID, and therefore the backup path to S3,
+will differ and trigger a rolling update of the pods.
+
+## Owner References and Finalizers
+
+The Postgres Operator can set [owner references](https://kubernetes.io/docs/concepts/overview/working-with-objects/owners-dependents/) on most of a cluster's child
+resources to improve monitoring with GitOps tools and to enable cascading
+deletes. There are three exceptions:
+
+* Persistent Volume Claims, because they are handled by the [PV Reclaim Policy](https://kubernetes.io/docs/tasks/administer-cluster/change-pv-reclaim-policy/) of the Stateful Set
+* The config endpoint + headless service resource, because it is managed by Patroni
+* Cross-namespace secrets, because owner references are not allowed across namespaces by design
+
+The operator would clean these resources up with its regular delete loop,
+provided they got synced correctly. If for some reason the initial cluster
+sync fails, e.g. after a cluster creation or operator restart, deleting the
+cluster manifest would leave orphaned resources behind which the user has to
+clean up manually.
+
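For illustration, an owner reference set by the operator on a child resource such as the Stateful Set would look roughly like the sketch below; the cluster name and `uid` are hypothetical, and the exact flags may differ from what the operator actually sets:

```yaml
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: acid-test-cluster
  ownerReferences:
    - apiVersion: acid.zalan.do/v1
      kind: postgresql
      name: acid-test-cluster
      # hypothetical UID of the postgresql custom resource
      uid: 7d2f5c1e-9bd6-4a4c-9f4a-2a0c3e1b5d6f
      controller: true
      blockOwnerDeletion: true
```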
+Another option is to enable finalizers, which ensure the deletion of all
+child resources before the cluster manifest itself gets removed. There is a
+trade-off though: the deletion is only performed after the next two operator
+SYNC cycles, with the first one setting a `deletionTimestamp` and the second
+one reacting to it. The final removal of the custom resource will add a
+DELETE event to the worker queue, but the child resources are already gone
+at this point. If you do not desire this behavior, consider enabling owner
+references instead.
+
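To make the trade-off concrete: while the finalizer is present, a deleted custom resource lingers with a `deletionTimestamp` until the operator strips the finalizer entry. A sketch, with an illustrative finalizer name not taken from this diff:

```yaml
apiVersion: "acid.zalan.do/v1"
kind: postgresql
metadata:
  name: acid-test-cluster
  # set by the API server on `kubectl delete`; the resource stays
  # until the finalizers list below is emptied by the operator
  deletionTimestamp: "2024-05-01T12:00:00Z"
  finalizers:
    - postgres-operator.acid.zalan.do
```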
+**postgres-operator ConfigMap**
+
+```yaml
+apiVersion: v1
+kind: ConfigMap
+metadata:
+  name: postgres-operator
+data:
+  enable_finalizers: "false"
+  enable_owner_references: "true"
+```
+
+**OperatorConfiguration**
+
+```yaml
+apiVersion: "acid.zalan.do/v1"
+kind: OperatorConfiguration
+metadata:
+  name: postgresql-operator-configuration
+configuration:
+  kubernetes:
+    enable_finalizers: false
+    enable_owner_references: true
+```
+
+:warning: Please note that both options are disabled by default. When
+enabling owner references, the operator cannot block cascading deletes, even
+when the [delete protection annotations](administrator.md#delete-protection-via-annotations)
+are in place. You would need a K8s admission controller that blocks the
+actual `kubectl delete` API call, e.g. based on existing annotations.
e2e/tests/test_e2e.py

+        self.eventuallyEqual(lambda: k8s.get_operator_state(), {"0": "idle"}, "Operator does not get in sync")
+
+        time.sleep(5)  # wait for the operator to sync the cluster and update resources
+
+        # check if child resources were updated with owner references
+        self.assertTrue(self.check_cluster_child_resources_owner_references(cluster_name, self.test_namespace), "Owner references not set on all child resources of {}".format(cluster_name))
+        self.assertTrue(self.check_cluster_child_resources_owner_references(default_test_cluster), "Owner references not set on all child resources of {}".format(default_test_cluster))
+
+        # delete the new cluster to test owner references
+        # and also to make k8s_api.get_operator_state work better in subsequent tests
+        # ideally we should delete the 'test' namespace here but the pods
+        # inside the namespace get stuck in the Terminating state making the test time out

@@ ... @@

+        # statefulset, pod disruption budget and secrets should be deleted via owner reference
+        self.eventuallyEqual(lambda: k8s.count_pods_with_label(cluster_label), 0, "Pods not deleted")
+        self.eventuallyEqual(lambda: k8s.count_statefulsets_with_label(cluster_label), 0, "Statefulset not deleted")
+        self.eventuallyEqual(lambda: k8s.count_pdbs_with_label(cluster_label), 0, "Pod disruption budget not deleted")
+        self.eventuallyEqual(lambda: k8s.count_secrets_with_label(cluster_label), 0, "Secrets were not deleted")
+
+        time.sleep(5)  # wait for the operator to also delete the leftovers
+
+        # pvcs and Patroni config service/endpoint should not be affected by owner reference
+        # but deleted by the operator almost immediately
+        self.eventuallyEqual(lambda: k8s.count_pvcs_with_label(cluster_label), 0, "PVCs not deleted")
+        self.eventuallyEqual(lambda: k8s.count_services_with_label(cluster_label), 0, "Patroni config service not deleted")
+        self.eventuallyEqual(lambda: k8s.count_endpoints_with_label(cluster_label), 0, "Patroni config endpoint not deleted")
+
+        # disable owner references in config
+        disable_owner_refs = {
+            "data": {
+                "enable_owner_references": "false"
+            }
+        }
+        k8s.update_config(disable_owner_refs)
+        self.eventuallyEqual(lambda: k8s.get_operator_state(), {"0": "idle"}, "Operator does not get in sync")
+
+        time.sleep(5)  # wait for the operator to remove owner references
+
+        # check if child resources were updated without Postgresql owner references
+        self.assertTrue(self.check_cluster_child_resources_owner_references(default_test_cluster, "default", True), "Owner references still present on some child resources of {}".format(default_test_cluster))