
docs: add note about archive timeline reset on upgrade #1463


Closed

Conversation

bchrobot

For example, a Postgres 12 cluster with backups saved for 7 days may have timelines 5-8. Upon major version upgrade, the timeline is reset to 1. After a few days there may now be timelines 1-4 (PG 13) and 5-8 (PG 12) in cloud storage. A recovery attempt without a specific timeline set via recovery_target_timeline will fail, as PG 12's timeline 5 does not follow PG 13's timeline 4.
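For illustration, pinning recovery to an explicit timeline is a single setting (PostgreSQL 12+ reads it from postgresql.conf; the numeric value here is just a placeholder):

# postgresql.conf: pin recovery to a specific timeline
# instead of the default 'latest'
recovery_target_timeline = '4'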

@CyberDem0n (Contributor)

Upon major version upgrade, the timeline is reset to 1

Yes, this is absolutely standard behavior. A major upgrade with pg_upgrade involves initializing the new PGDATA with initdb, which starts at timeline 1.
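A quick way to confirm this, as a minimal sketch (the helper name is made up; it assumes pg_controldata is on PATH and parses its "Latest checkpoint's TimeLineID" line):

import subprocess

def current_timeline(pgdata: str) -> int:
    # Hypothetical helper: right after initdb (and therefore right after a
    # pg_upgrade), this returns 1.
    out = subprocess.run(["pg_controldata", pgdata],
                         capture_output=True, text=True, check=True).stdout
    for line in out.splitlines():
        if line.startswith("Latest checkpoint's TimeLineID"):
            return int(line.split(":", 1)[1])
    raise RuntimeError("TimeLineID not found in pg_controldata output")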

A recovery attempt without a specific timeline set via recovery_target_timeline will fail as the PG 12 timeline 5 does not follow the PG 13 timeline 4.

Sorry, but this is not true. Backups for different major versions are written to different places in the bucket:
https://github.com/zalando/spilo/blob/c91248e26e2ea910304d04a3acbeda1e965e2e42/postgres-appliance/scripts/configure_spilo.py#L763

bucket_path = '/spilo/{WAL_BUCKET_SCOPE_PREFIX}{SCOPE}{WAL_BUCKET_SCOPE_SUFFIX}/wal/{PGVERSION}'.format(**wale)

I.e., for version 12 it would be /spilo/very-long-uid/my-cluster-name/wal/12 and for version 13 it would be /spilo/very-long-uid/my-cluster-name/wal/13.
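For illustration, plugging placeholder values into that format string reproduces the layout (the uid and cluster name are made up):

# Sketch only: placeholder values for the template above.
wale = {
    'WAL_BUCKET_SCOPE_PREFIX': 'very-long-uid/',
    'SCOPE': 'my-cluster-name',
    'WAL_BUCKET_SCOPE_SUFFIX': '',
    'PGVERSION': '13',
}
bucket_path = '/spilo/{WAL_BUCKET_SCOPE_PREFIX}{SCOPE}{WAL_BUCKET_SCOPE_SUFFIX}/wal/{PGVERSION}'.format(**wale)
print(bucket_path)  # /spilo/very-long-uid/my-cluster-name/wal/13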

When restoring from a backup, you either have to specify the exact location to restore from, or the backup-restore script will try all possible locations until it finds something:

/spilo/very-long-uid/my-cluster-name/wal/13
/spilo/very-long-uid/my-cluster-name/wal/12
/spilo/very-long-uid/my-cluster-name/wal/11
/spilo/very-long-uid/my-cluster-name/wal/10
/spilo/very-long-uid/my-cluster-name/wal/9.6
/spilo/very-long-uid/my-cluster-name/wal/9.5
/spilo/very-long-uid/my-cluster-name/wal

And once a suitable backup is found, the script sticks to that location.

If it starts restoring a backup from version 13, there is no way it can jump back to 12.
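A sketch of that fallback order (not the actual backup-restore script; find_backup and has_backup are illustrative names):

# Try each versioned WAL location from newest to oldest, then the bare path.
BASE = '/spilo/very-long-uid/my-cluster-name/wal'
CANDIDATES = [f'{BASE}/{v}' for v in ('13', '12', '11', '10', '9.6', '9.5')] + [BASE]

def find_backup(has_backup):
    # has_backup(location) -> bool: does this location hold a usable backup?
    for location in CANDIDATES:
        if has_backup(location):
            return location  # stick to this location for the whole restore
    return None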

@bchrobot (Author) commented Apr 23, 2021

Ah, reviewing the physical backups documentation added in #1367, I see that our pod config env vars are likely to blame.

When we initially set up postgres-operator, the only fully worked example we could find and get working was this:
https://www.redpill-linpro.com/techblog/2019/09/28/postgres-in-kubernetes.html#backup-configuration

which shows defining WALE_*_PREFIX with the version and uid stripped.
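For illustration, the difference between the two styles of configuration (a sketch; the bucket and cluster names are placeholders):

# Hardcoding a full WALE_S3_PREFIX flattens every major version into one path:
env_hardcoded = {'WALE_S3_PREFIX': 's3://my-bucket/spilo/my-cluster-name/wal'}
# Letting Spilo derive the prefix from the bucket keeps majors separate,
# per the bucket_path template above: .../wal/12 vs .../wal/13.
env_derived = {'WAL_S3_BUCKET': 'my-bucket'}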

Thank you for the explanation, @CyberDem0n!

@bchrobot closed this Apr 23, 2021
@bchrobot deleted the docs-major-version-upgrade branch April 23, 2021 14:35