Skip to content

feat: --append flag for easy CLI mTLS key rotation#36942

Open
BrageHK wants to merge 13 commits into
masterfrom
bragehk/rotate-mtls-cert
Open

feat: --append flag for easy CLI mTLS key rotation#36942
BrageHK wants to merge 13 commits into
masterfrom
bragehk/rotate-mtls-cert

Conversation

@BrageHK
Copy link
Copy Markdown
Contributor

@BrageHK BrageHK commented May 19, 2026

I confirm that this contribution is made under the terms of the license found in the root directory of this repository's source tree and that I have the authority necessary to make this contribution on behalf of its copyright owner.

This change makes it easier to rotate credentials with --append flag, as the user does not have to manually create a credential for it to work, instead they can use Vespa CLI. Will update the documentation as well.

The --append flag does overwrite the current private key, but this should not be an issue as the new certificate is used anyways.

Copy link
Copy Markdown
Contributor

@esolitos esolitos left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice rotation flow — append-on-rotation + prune-on-cleanup maps well onto how operators actually do this. 👍

Most comments below are about making the API a bit more honest (the appendKey param doesn't actually append) and not swallowing filesystem errors. Two worth a real round-trip: the WritePrivateKeyFile param naming, and the implicit clients.pem overwrite during --prune.

Friendly reminder from the PR description: docs update still pending. 🙂


AI-assisted review · Findings and comments are verified and approved by a human reviewer before submission

Comment thread client/go/internal/cli/cmd/cert.go Outdated
Comment thread client/go/internal/vespa/crypto.go Outdated
// WritePrivateKeyFile writes the private key contained in this key pair to privateKeyFile.
func (kp *PemKeyPair) WritePrivateKeyFile(privateKeyFile string, overwrite bool) error {
if ioutil.Exists(privateKeyFile) && !overwrite {
func (kp *PemKeyPair) WritePrivateKeyFile(privateKeyFile string, overwrite bool, appendKey bool) error {
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🤔 I find the appendKey parameter a bit confusing here. The function name and the symmetry with WriteCertificateFile(... appendCredentials) suggest it appends, but in practice it always overwrites — it only uses the flag to gate the "file must exist" precondition. The PR description acknowledges this too.

Two possible cleanups:

  • Rename the param so callers can't be misled, e.g. requireExisting bool, or
  • Drop the param entirely and have the caller in cert.go do the existence check + call WritePrivateKeyFile(path, true /*overwrite*/).

I'd lean toward option 2 — it keeps the crypto helper honest (it writes a private key, full stop) and pushes the rotation-specific logic up into doCert where the reader already has the context. WDYT?

Comment thread client/go/internal/vespa/crypto.go
Comment thread client/go/internal/vespa/crypto.go Outdated
Comment thread client/go/internal/vespa/crypto.go Outdated
}

// CleanCertificateFile keeps only the first (newest) PEM certificate in certificateFile, removing any older ones.
func CleanCertificateFile(certificateFile string) error {
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🙂 Tiny consistency thing: the user-facing flag is --prune, the cobra helper is doCleanCert, and the exported function is CleanCertificateFile. Could we settle on one term — I'd suggest Prune everywhere since that's what shows up in --help and what users will grep for? E.g. PruneCertificateFile, doPruneCert. Pure naming nit, no behavior change.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agree. This was silly of me. I first used the name --clean for the flag, but then realized that both -c and -C are already existing flags. Forgot to update when i refactored, but fixed it now 😅

Comment thread client/go/internal/cli/cmd/cert.go Outdated
Comment thread client/go/internal/cli/cmd/cert_test.go Outdated
Comment on lines +194 to +195
// Use a fresh CLI with the same homeDir to avoid -A flag state bleeding into --prune
cli2, stdout2, _ := newTestCLI(t, "VESPA_CLI_HOME="+cli.config.homeDir)
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This comment "avoid -A flag state bleeding into --prune" is interesting — would you mind expanding it just a little? My read is that cobra's BoolVarP binds to a closure-captured variable that persists across cli.Run calls within the same test process (so appendCertificate=true from the previous call is still set when --prune is parsed). Harmless in production (one process per CLI invocation), but a future maintainer is going to look at this and think "why the second CLI? this looks dumb, let me simplify" 😅 — a one-liner explaining why would save them.

Comment on lines +202 to 204
if !skipApplicationPackage {
return doCertAdd(cli, true, false, args)
}
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🤔 We confirm with the user before pruning their home-dir cert file, which is great — but then we also implicitly overwrite security/clients.pem in whatever app package the user happens to be in (via doCertAdd(cli, true, false, args)). That second write is silent.

Two options I can see:

  • Extend the warning text on the cli.printWarning line above to mention that this will also rewrite security/clients.pem in the current application package, so the confirmation covers both.
  • Or make the package update opt-in (let --no-add/-N do its usual job; pruning a local cert is independent from re-publishing the package).

Not blocking, but the current behavior surprised me on first read.

BrageHK and others added 9 commits May 19, 2026 15:52
Co-authored-by: Marlon (Esolitos) Saglia <marlon@vespa.ai>
Co-authored-by: Marlon (Esolitos) Saglia <marlon@vespa.ai>
Co-authored-by: Marlon (Esolitos) Saglia <marlon@vespa.ai>
Co-authored-by: Marlon (Esolitos) Saglia <marlon@vespa.ai>
Copy link
Copy Markdown
Contributor

@esolitos esolitos left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice round 2 — all prior comments addressed cleanly, and the MarkFlagsMutuallyExclusive move is honestly a better answer than what I'd suggested. 👍

One real concern left (the cert/key write ordering can leave a corrupted state if -A is used with a missing key file), plus a couple of tiny cleanups.


AI-assisted review · Findings and comments are verified and approved by a human reviewer before submission

Comment on lines +155 to 168
if err := keyPair.WriteCertificateFile(certificateFile.path, overwriteCertificate, appendCertificate); err != nil {
return fmt.Errorf("could not write certificate: %w", err)
}
if err := keyPair.WritePrivateKeyFile(privateKeyFile.path, overwriteCertificate); err != nil {
if appendCertificate {
if _, err := os.Stat(privateKeyFile.path); err != nil {
if os.IsNotExist(err) {
return fmt.Errorf("private key file does not exist: %s", privateKeyFile.path)
}
return fmt.Errorf("could not stat private key file %s: %w", privateKeyFile.path, err)
}
}
if err := keyPair.WritePrivateKeyFile(privateKeyFile.path, overwriteCertificate || appendCertificate); err != nil {
return fmt.Errorf("could not write private key: %w", err)
}
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🤔 I think there's a small partial-failure window here when --append is used:

  1. WriteCertificateFile runs first and (on -A) appends [newCert, oldCert] to the cert file.
  2. Then we os.Stat the key file.
  3. Then we write the private key.

If step 2 fails — e.g. the user has manually deleted data-plane-private-key.pem or hit a permission error — the cert file is already mutated, but the key file is untouched (so it still holds oldKey). tls.X509KeyPair will then fail on the next CLI call because newCert's public key won't match oldKey. Worse, re-running vespa auth cert -A to recover appends another cert on top, growing the file unboundedly.

The cert-existence precondition for -A is already checked up-front at line 146, so I think the cleanest fix is to move the key-existence check up there too and keep the writes back-to-back:

Suggested change
if err := keyPair.WriteCertificateFile(certificateFile.path, overwriteCertificate, appendCertificate); err != nil {
return fmt.Errorf("could not write certificate: %w", err)
}
if err := keyPair.WritePrivateKeyFile(privateKeyFile.path, overwriteCertificate); err != nil {
if appendCertificate {
if _, err := os.Stat(privateKeyFile.path); err != nil {
if os.IsNotExist(err) {
return fmt.Errorf("private key file does not exist: %s", privateKeyFile.path)
}
return fmt.Errorf("could not stat private key file %s: %w", privateKeyFile.path, err)
}
}
if err := keyPair.WritePrivateKeyFile(privateKeyFile.path, overwriteCertificate || appendCertificate); err != nil {
return fmt.Errorf("could not write private key: %w", err)
}
if appendCertificate && !ioutil.Exists(certificateFile.path) {
return errHint(fmt.Errorf("no certificate found at '%s'", color.CyanString(certificateFile.path)),
"Run 'vespa auth cert' first to create an initial certificate")
}
if appendCertificate {
if _, err := os.Stat(privateKeyFile.path); err != nil {
if os.IsNotExist(err) {
return fmt.Errorf("private key file does not exist: %s", privateKeyFile.path)
}
return fmt.Errorf("could not stat private key file %s: %w", privateKeyFile.path, err)
}
}
keyPair, err := vespa.CreateKeyPair()
if err != nil {
return err
}
if err := keyPair.WriteCertificateFile(certificateFile.path, overwriteCertificate, appendCertificate); err != nil {
return fmt.Errorf("could not write certificate: %w", err)
}
if err := keyPair.WritePrivateKeyFile(privateKeyFile.path, overwriteCertificate || appendCertificate); err != nil {
return fmt.Errorf("could not write private key: %w", err)
}

(suggestion is for the relative position — you'll want to delete the existing appendCertificate && !ioutil.Exists(...) block above to avoid the duplicate)

Comment on lines +80 to +81
cmd.MarkFlagsMutuallyExclusive("force", "append")
cmd.MarkFlagsMutuallyExclusive("prune", "force", "append")
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: the 3-way exclusion below already means "at most one of {prune, force, append}", so the 2-way ("force", "append") call is redundant. 🙂

Suggested change
cmd.MarkFlagsMutuallyExclusive("force", "append")
cmd.MarkFlagsMutuallyExclusive("prune", "force", "append")
cmd.MarkFlagsMutuallyExclusive("prune", "force", "append")

assert.Equal(t, 3, countPEMBlocks(t, certFile))
}

func TestCertCleanNoExisting(t *testing.T) {
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: small consistency thing — the rename from Clean to Prune didn't reach the test names. TestCertClean, TestCertCleanNoExisting, TestCertCleanInvalidFlagForce and TestCertCleanInvalidFlagAppend would all read more clearly as TestCertPrune*. Not blocking. 🙂

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants