-
Notifications
You must be signed in to change notification settings - Fork 251
OCPBUGS-57049: Update cert annotations #1971
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
OCPBUGS-57049: Update cert annotations #1971
Conversation
|
@vrutkovs: This pull request references Jira Issue OCPBUGS-57049, which is valid. 3 validation(s) were run on this bug
Requesting review from QA contact: The bug has been updated to refer to the pull request using the external bug tracker. DetailsIn response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository. |
|
/hold This makes instead of Also |
94c3a48 to
5337ee8
Compare
|
/hold cancel |
d3b66d0 to
b8e4d7e
Compare
|
/approve |
|
/lgtm |
b8e4d7e to
3f5ff62
Compare
3f5ff62 to
56a5a40
Compare
pkg/operator/certrotation/target.go
Outdated
|
|
||
| annotations.NotBefore = certKeyPair.Certs[0].NotBefore.Format(time.RFC3339) | ||
| annotations.NotAfter = certKeyPair.Certs[0].NotAfter.Format(time.RFC3339) | ||
| annotations.RefreshPeriod = durationRound(refresh) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why is it okay to store an inaccurate duration? Won’t this value be used for debugging purposes?
2537111 to
78719bd
Compare
| // NotAfter contains certificate the certificate validity date in RFC3339 format. | ||
| NotAfter string | ||
| // RefreshPeriod contains the interval at which the certificate should be refreshed. | ||
| RefreshPeriod string |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In what format will the date be stored? I think this might be useful for tools that process the raw TLS info.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Its using time.Duration format: 8h or 30d or 10y
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think that the duration will be changed by the durationRound function which formats a duration into a human-readable string so machines won't be able to process it (if needed).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
where will this field be used ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Its informational only. We don't limit or verify this that it indeed happens at specified time (however we do ensure that this certificate is being rotated at all with ShortCertRotation featuregate)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it would be better if all dates were in the same format for consistency (NotBefore, NotAfter).
This way, the processing tools (the TLS script or cluster-debug-tools) would know what to expect and could reformat them into a more human-readable form if needed during output.
How hard would it be to move durationToString to the TLS registry script?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This isn't a date, it's a duration. Its not the time when next refresh occurs, it would be notAfter+refreshTime.
We can store them as milliseconds, but it won't be human-friendly - and if there is a need for a tool which reads this annotation we might as well convert "5s" into milliseconds too. The format is no longer lossy anyway
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
or just store duration.String(), at least it would be accepted by https://pkg.go.dev/time#ParseDuration
it would be then possible to convert to a human-readable form on the receiving end, before displaying to end users.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
whereas 8y is not accepted by the time.PraseDuration - xref: https://go.dev/play/p/6qztJoDUlJJ
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ya, that would store 72h00m00s instead of 3d in annotations, this is why we need a custom Durations.ToBetterString()
| // NotAfter contains certificate the certificate validity date in RFC3339 format. | ||
| NotAfter string | ||
| // RefreshPeriod contains the interval at which the certificate should be refreshed. | ||
| RefreshPeriod string |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also atm we store an inaccurate value in that field. Is that OK?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
openshift/cluster-kube-apiserver-operator#1768 would set it for cluster-kube-apiserver-operator (see TLS artifacts in openshift/cluster-kube-apiserver-operator#1768):
{
"key": "certificates.openshift.io/refresh-period",
"value": "8y"
},
It also shows 6h refresh in 4.20 now, as we're using a faster refresh rotation in dev branch
|
See follow-up in #1981 |
LGTM let's rebase this pr and merge it, and then let's merge the follow-up pr |
208caca to
e6d67c1
Compare
|
lets /hold it for now, as it needs to be tested in CKAO PR first |
pkg/operator/certrotation/target.go
Outdated
| } | ||
|
|
||
| // setTargetCertKeyPairSecret creates a new cert/key pair and sets them in the secret. Only one of client, serving, or signer rotation may be specified. | ||
| // setTargetCertKeyPairSecretAndTLSAnnotations creates a new cert/key pair and sets them in the secret. Only one of client, serving, or signer rotation may be specified. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
let's open a new pr for refactoring the target. i can do that. just let me know.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure, please do. I'll focus on e2e tests
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ptal #1982
e6d67c1 to
8b6b76a
Compare
Move testcase name out of auto-regenerate-after-offline-expiry, add refresh-period
8b6b76a to
cc2229c
Compare
pkg/operator/certrotation/signer.go
Outdated
| } | ||
| c.EventRecorder.Eventf("SignerUpdateRequired", "%q in %q requires a new signing cert/key pair: %v", c.Name, c.Namespace, reason) | ||
| c.AdditionalAnnotations.RefreshPeriod = c.Refresh.String() | ||
| if err = setSigningCertKeyPairSecretAndTLSAnnotations(signingCertKeyPairSecret, c.Validity, c.AdditionalAnnotations); err != nil { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
let's pass c.Refresh to setSigningCertKeyPairSecretAndTLSAnnotations and then to setTLSAnnotationsOnSigningCertKeyPairSecret so that annotation assignments are in one place for the signer.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
actually, see #1971 (comment)
pkg/operator/certrotation/target.go
Outdated
| if reason := c.CertCreator.NeedNewTargetCertKeyPair(targetCertKeyPairSecret, signingCertKeyPair, caBundleCerts, c.Refresh, c.RefreshOnlyWhenExpired, creationRequired); len(reason) > 0 { | ||
| c.EventRecorder.Eventf("TargetUpdateRequired", "%q in %q requires a new target cert/key pair: %v", c.Name, c.Namespace, reason) | ||
| c.AdditionalAnnotations.RefreshPeriod = c.Refresh.String() | ||
| if err = setTargetCertKeyPairSecretAndTLSAnnotations(targetCertKeyPairSecret, c.Validity, signingCertKeyPair, c.CertCreator, c.AdditionalAnnotations); err != nil { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i would also pass c.Refresh to setTargetCertKeyPairSecretAndTLSAnnotations and then to setTLSAnnotationsOnTargetCertKeyPairSecret so that annotation assignments are in one place for the target.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
actually, see #1971 (comment)
| if len(a.RefreshPeriod) > 0 && meta.Annotations[CertificateRefreshPeriodAnnotation] != a.RefreshPeriod { | ||
| diff := cmp.Diff(meta.Annotations[CertificateRefreshPeriodAnnotation], a.RefreshPeriod) | ||
| klog.V(2).Infof("Updating %q annotation for %s/%s, diff: %s", CertificateRefreshPeriodAnnotation, meta.Name, meta.Namespace, diff) | ||
| meta.Annotations[CertificateRefreshPeriodAnnotation] = a.RefreshPeriod |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
do we need a unit test that checks assignment to the new annotation ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think we do - some certs are not being refreshed and if it's incorrectly set by the controller we'll see this in TLS registry violations
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok.
pkg/operator/certrotation/signer.go
Outdated
| reason = "secret doesn't exist" | ||
| } | ||
| c.EventRecorder.Eventf("SignerUpdateRequired", "%q in %q requires a new signing cert/key pair: %v", c.Name, c.Namespace, reason) | ||
| c.AdditionalAnnotations.RefreshPeriod = c.Refresh.String() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
should this annotation be also set in line 83 ?
if !c.RefreshOnlyWhenExpired {
needsMetadataUpdate := ensureOwnerRefAndTLSAnnotations(signingCertKeyPairSecret, c.Owner, c.AdditionalAnnotations)
needsTypeChange := ensureSecretTLSTypeSet(signingCertKeyPairSecret)
updateRequired = needsMetadataUpdate || needsTypeChange
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So far, I'd like to set refresh period only when contents are being refreshed. Later on we'll need a new PR which backfills this on existing certificates
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What is the reason we don’t want to set the refresh annotation for existing certificates? Wouldn’t it be easier to set it now?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we have at least three options:
- Update the callers to set the new annotation.
- Update the annotation at the beginning of the EnsureSigningCertKeyPair method.
- Modify the controller that calls EnsureSigningCertKeyPair to update the annotation.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think option 2 is the easiest.
Setting the annotation at the beginning of the function ensures it is applied to both new and existing secrets.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd prefer a separate PR because this was not tested on existing PRs. If TRT comes and reverts because it breaks upgrades I'd rather have it already applied for new clusters at least.
I agree that 2. seems like the easiest option so far, but I didn't dig deep into this yet
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ok
cc2229c to
248e356
Compare
|
/hold cancel |
|
/lgtm |
|
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: jsafrane, p0lyn0mial, sjenning, vrutkovs The full list of commands accepted by this bot can be found here. The pull request process is described here DetailsNeeds approval from an approver in each of these files:
Approvers can indicate their approval by writing |
|
/refresh |
|
@vrutkovs: Jira Issue OCPBUGS-57049: Some pull requests linked via external trackers have merged: The following pull requests linked via external trackers have not merged: These pull request must merge or be unlinked from the Jira bug in order for it to move to the next state. Once unlinked, request a bug refresh with Jira Issue OCPBUGS-57049 has not been moved to the MODIFIED state. DetailsIn response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository. |
Move testcase name out of auto-regenerate-after-offline-expiry, add
refresh-period.
Follow-up for openshift/origin#29327
Tested in openshift/cluster-kube-apiserver-operator#1768 and openshift/cluster-authentication-operator#742