
Fix classification when adding some components #258


Merged: 13 commits merged into puppetlabs:main from use_failed_primary on Jun 15, 2022

Conversation

@ody ody (Member) commented May 6, 2022

Ensure classification is updated appropriately. Without classification updates, plans are only able to replace components that have the same name but are entirely unconfigured.
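
To illustrate the intent, here is a minimal sketch of the kind of classification update involved. The subplan name is taken from this PR, but the parameter names are assumptions for illustration only, not its real interface:

```puppet
# Illustrative sketch only. Parameter names are assumed and do not
# necessarily match peadm::util::update_classification's real interface.
plan example::add_replica_classification (
  TargetSpec $primary_host,
  TargetSpec $replica_host,
) {
  # Update PE classification so the new node is recognised as a replica,
  # instead of relying on it re-using the certname of an existing,
  # unconfigured component.
  run_plan('peadm::util::update_classification', {
    'targets'      => $primary_host,
    'replica_host' => $replica_host,
  })
}
```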

@ody ody force-pushed the use_failed_primary branch 14 times, most recently from a6806a7 to 9ea0cf5 on May 11, 2022 22:56
@ody ody changed the title from "(WIP) Deploy new replica hosts" to "Fix classification of some added components" on May 12, 2022
@ody ody added the bugfix label on May 12, 2022
@ody ody changed the title from "Fix classification of some added components" to "Fix classification when adding some components" on May 12, 2022
@ody ody force-pushed the use_failed_primary branch 13 times, most recently from 9c9945c to 9085054 on May 13, 2022 17:27
@ody ody force-pushed the use_failed_primary branch 6 times, most recently from f6f5b7e to 828050f on May 20, 2022 20:39
ody added 12 commits May 31, 2022 20:50
Fixes the lack of classification in the add_replica plan so that it
does not fail when adding a replica to a deployment which was not
previously configured with one.

Without this fix, the plan could only replace failed replicas of the
same name.
Changes to add_replica that fix classification invalidate existing tests;
this commit makes them valid again.
Previously, the utility plan update_classification made unnecessary
assumptions about primaries and replicas. This commit ensures those
assumptions are no longer made and that classification is based solely on
availability group letter.
The switch to availability-group-based classification necessitated
changes to add_database for it to continue working. Also does a little
cleanup of various cruft along the way.
It is not guaranteed to be in the PATH.
When reusing failed infrastructure components, they may be configured for
a different primary than the current one and may have an old certificate
revocation list. This commit ensures that agent configuration is updated for
the current primary and fetches the CRL from that primary (see the first
sketch after this commit list).

Includes a little cleanup lifted from the add_compiler plan.
When running peadm::subplans::modify_certificate, also get the status of the
certificate from the perspective of the primary to detect whether the
certificate has been revoked.

Introduces a new task, peadm::cert_valid_status, which checks for different
failure scenarios when validating certificates (see the second sketch after
this commit list).
The set of acceptable failures when running clean on a primary is expanded to
address scenarios where an infrastructure component has already been cleaned
by another process, e.g. puppet infrastructure forget.
Creates a utility plan used by the add_replica plan to source the
primary's global Hiera configuration and distribute it to the replica
target (see the third sketch after this commit list).

Without this, data in the console is not available when compiling
catalogs after the replica is promoted.
Adds the capability to set the PuppetDB database backend address to anything.
Previously, peadm::util::update_db_setting would always attempt to pair
configuration with the appropriate availability group letter, but in DR
scenarios this is not appropriate (see the last sketch after this commit list).
The addition of the peadm::cert_valid_status task triggered test suite
failures. This commit fixes them.
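
To illustrate the agent reconfiguration and CRL refresh described in the commit above, a minimal sketch, assuming $new_primary and $targets are already resolved in the plan (the actual peadm implementation may differ):

```puppet
# Point the reused agent at the current primary.
run_command("puppet config set server ${new_primary} --section agent", $targets)

# Refresh the CRL from that primary so a stale revocation list on the reused
# node does not interfere with its new certificate.
run_command(
  "curl --silent --cacert /etc/puppetlabs/puppet/ssl/certs/ca.pem -o /etc/puppetlabs/puppet/ssl/crl.pem https://${new_primary}:8140/puppet-ca/v1/certificate_revocation_list/ca",
  $targets,
)
```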
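Roughly the kind of certificate check involved, as a sketch only, not the actual peadm::cert_valid_status task; the file paths are standard puppetserver CA locations, and $certname and $primary_target are assumed plan variables:

```puppet
# Verify the component's certificate against the primary's CA cert and CRL so
# that a revoked certificate is caught before the component is reused.
$verify_cmd = "openssl verify -crl_check -CAfile /etc/puppetlabs/puppetserver/ca/ca_crt.pem -CRLfile /etc/puppetlabs/puppetserver/ca/ca_crl.pem /etc/puppetlabs/puppetserver/ca/signed/${certname}.pem"
$check = run_command($verify_cmd, $primary_target, { '_catch_errors' => true }).first

unless $check.ok {
  out::message("Certificate for ${certname} is not valid from the primary's perspective: ${check.value['stderr']}")
}
```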
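A minimal sketch of the Hiera distribution idea, assuming $primary_target and $replica_target are already resolved Targets (the actual utility plan may distribute the file differently):

```puppet
# Read the primary's global Hiera configuration...
$global_hiera = run_command(
  'cat /etc/puppetlabs/puppet/hiera.yaml',
  $primary_target,
).first.value['stdout']

# ...and lay it down on the replica so console data remains resolvable after
# the replica is promoted.
apply_prep($replica_target)
apply($replica_target) {
  file { '/etc/puppetlabs/puppet/hiera.yaml':
    ensure  => file,
    owner   => 'root',
    group   => 'root',
    mode    => '0644',
    content => $global_hiera,
  }
}
```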
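Finally, the PuppetDB setting in question is the JDBC subname. A sketch of pointing it at an arbitrary host, assuming $pg_host and $targets are plan parameters and the puppetlabs-inifile module is available (the real peadm::util::update_db_setting may manage this differently):

```puppet
apply($targets) {
  # Point PuppetDB's backend connection at an explicit PostgreSQL host rather
  # than one derived from an availability group letter.
  ini_setting { 'puppetdb database subname':
    ensure  => present,
    path    => '/etc/puppetlabs/puppetdb/conf.d/database.ini',
    section => 'database',
    setting => 'subname',
    value   => "//${pg_host}:5432/pe-puppetdb",
  }
}
```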
@ody ody force-pushed the use_failed_primary branch from 828050f to 8b678e4 on May 31, 2022 20:50
@ody ody (Member, Author) commented May 31, 2022

Ran this through the orchestrator. Bolt runs successfully until reaching the final step, where it reports that it failed to connect to the rbac-api:

Starting: task peadm::provision_replica on pe-server-2b9722-0.us-west1-a.c.slice-cody.internal
Finished: task peadm::provision_replica with 1 failure in 74.6 sec
Finished: plan peadm::add_replica in 2 min, 53 sec
Failed on pe-server-2b9722-0.us-west1-a.c.slice-cody.internal:
  Could not connect to server with https://pe-server-2b9722-0.us-west1-a.c.slice-cody.internal:4433/rbac-api/v2/auth/token/authenticate
Failed on 1 target: pe-server-2b9722-0.us-west1-a.c.slice-cody.internal
Ran on 1 target

The orchestrator does continue running the final task, though, and it is ultimately successful, resulting in a functional, fully provisioned replica. The test was completed on the CLI via Bolt, using a token. I presume the rbac-api connection failure is related to the puppet infrastructure enable replica command causing restarts of pe-puppetserver.

@ody ody (Member, Author) commented Jun 8, 2022

Noticed today while testing some workflows that the add_database code does not take PostgreSQL 14 into consideration.

@mcka1n mcka1n (Contributor) commented Jun 14, 2022

Hey @ody the logic looks good to me 👍

@ody ody marked this pull request as ready for review June 15, 2022 22:17
@ody ody requested a review from a team as a code owner June 15, 2022 22:17
@ody ody merged commit 33317df into puppetlabs:main Jun 15, 2022
@ody ody deleted the use_failed_primary branch June 15, 2022 23:43