📖 Add CRD Upgrade Safety documentation #1090

Merged (1 commit) on Aug 8, 2024

@trgeiger trgeiger commented Aug 1, 2024

Description

Closes #746

Reviewer Checklist

  • API Go Documentation
  • Tests: Unit Tests (and E2E Tests, if appropriate)
  • Comprehensive Commit Messages
  • Links to related GitHub Issue(s)

@trgeiger trgeiger requested a review from a team as a code owner August 1, 2024 16:04

netlify bot commented Aug 1, 2024

Deploy Preview for olmv1 ready!

| Name | Link |
|---|---|
| 🔨 Latest commit | bb019fd |
| 🔍 Latest deploy log | https://app.netlify.com/sites/olmv1/deploys/66b4d7c9aac58400080e7cf4 |
| 😎 Deploy Preview | https://deploy-preview-1090--olmv1.netlify.app |

codecov bot commented Aug 1, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 75.23%. Comparing base (2554b83) to head (bb019fd).
Report is 5 commits behind head on main.

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #1090      +/-   ##
==========================================
- Coverage   75.31%   75.23%   -0.09%     
==========================================
  Files          31       35       +4     
  Lines        1904     1914      +10     
==========================================
+ Hits         1434     1440       +6     
- Misses        327      331       +4     
  Partials      143      143              
| Flag | Coverage Δ | |
|---|---|---|
| e2e | 57.36% <ø> (-0.25%) | ⬇️ |
| unit | 50.73% <ø> (+1.04%) | ⬆️ |



The following changes to an existing CRD are safe for backwards compatibility and will
not cause the CRD Upgrade Safety preflight check to halt the upgrade:
- Adding new enum values to the list of allowed enum values in a field
Contributor

I think it is fine to leave in for now because it is not considered a breaking change in our validations, but something we identified is that Kubernetes API conventions state that adding new enum values is actually considered a breaking change.
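
For illustration, the enum rules discussed in this PR can be sketched in Go: adding values passes, while removing values or adding an enum restriction to a previously unrestricted field is flagged. `checkEnumChange` is a hypothetical helper written for this note, not the project's actual validator, and allowing a restriction to be dropped entirely is an assumption that the doc text does not cover.

```go
package main

import "fmt"

// checkEnumChange is a hypothetical helper, not the project's validator.
// oldEnum and newEnum are the allowed values of one field before and after
// the upgrade; an empty slice means the field has no enum restriction.
func checkEnumChange(oldEnum, newEnum []string) error {
	// Adding an enum restriction to a previously unrestricted field is unsafe.
	if len(oldEnum) == 0 && len(newEnum) > 0 {
		return fmt.Errorf("enum restriction added: %v", newEnum)
	}
	// Assumption: dropping the restriction entirely is allowed, because
	// every previously valid value stays valid. The doc does not cover this.
	if len(newEnum) == 0 {
		return nil
	}
	allowed := make(map[string]bool, len(newEnum))
	for _, v := range newEnum {
		allowed[v] = true
	}
	// Any old value missing from the new list has been removed: unsafe.
	var removed []string
	for _, v := range oldEnum {
		if !allowed[v] {
			removed = append(removed, v)
		}
	}
	if len(removed) > 0 {
		return fmt.Errorf("enum values removed: %v", removed)
	}
	// Only additions (or no change) remain, which the check lets through.
	return nil
}

func main() {
	fmt.Println(checkEnumChange([]string{"A", "B"}, []string{"A", "B", "C"}))
	fmt.Println(checkEnumChange([]string{"A", "B"}, []string{"A"}))
}
```

Note that this mirrors the current preflight behavior described in the comment above; under the Kubernetes API conventions mentioned there, the first call would also count as a breaking change.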

- An existing required field is changed to optional in an existing version
- The minimum value of an existing field is decreased in an existing version
- The maximum value of an existing field is increased in an existing version
- A new version of the CRD is added with no modifications to existing versions
Contributor

No change needed in this PR, but something that I think we may have overlooked when originally developing the preflight checks is that new versions of the CRD must be backwards compatible with the served versions OR provide conversion logic to prevent data loss and invalidation of resources stored at an older version.

@joelanford I think we may need to revisit the CRD Upgrade Safety preflight checks to ensure we capture behavior to check for breaking changes between all stored and served versions of the CRD when there is no conversion strategy specified.

Contributor Author

Given the long discussion this sparked in Slack, do we want to hold off on these docs until that work is done, or should we just update them once it's complete? I'm not sure what kind of timeline you were imagining for all this.

Contributor

We can let the doc through and update it as necessary

versions:
- name: v1alpha1
```

Contributor

Should we include what the error message looks like when a change like this is caught?

Contributor

+1

@michaelryanpeter michaelryanpeter left a comment

Great work with very clear writing. I left a few suggestions. I noticed that this new page is not showing up in the Netlify preview.

Comment on lines +14 to +24
- The scope changes from Cluster to Namespace or from Namespace to Cluster
- An existing stored version of the CRD is removed
- A new required field is added to an existing version of the CRD
- An existing field is removed from an existing version of the CRD
- An existing field type is changed in an existing version of the CRD
- A new default value is added to a field that did not previously have a default value
- The default value of a field is changed
- An existing default value of a field is removed
- New enum restrictions are added to an existing field which did not previously have enum restrictions
- Existing enum values from an existing field are removed
- The minimum value of an existing field is increased in an existing version
- The maximum value of an existing field is decreased in an existing version
- Minimum or maximum field constraints are added to a field that did not previously have constraints
Contributor

I wonder if in a future enhancement, it would be worth refactoring this content into a table?

Contributor Author

That could be nice. I spent a few minutes trying to think of how exactly that would be structured and couldn't really come up with a good idea. Maybe columns for safe/unsafe changes or something?

Contributor

I have played around with it too, but I haven't been able to beat your list. I also haven't been able to figure out a good information architecture.

I do feel like it would be nice to have this be a bit more scannable as a reference. Maybe we will come up with something when we downstream this doc?
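
As a possible aid to scanning the list, the minimum/maximum rules can be sketched as a small Go check: tightening a bound or adding one to an unconstrained field is reported, while loosening a bound is not. The `bounds` type and `checkBoundsChange` function are hypothetical names for illustration only; they are not taken from the operator-controller code. Removing an existing bound is not covered by the list, and this sketch reports nothing for it.

```go
package main

import "fmt"

// bounds is a hypothetical stand-in for the minimum/maximum constraints of
// one numeric field. A nil pointer means the constraint is not set.
type bounds struct {
	Min *float64
	Max *float64
}

// checkBoundsChange returns the changes that the list above calls unsafe.
// An empty result means the change only loosens (or keeps) the constraints.
func checkBoundsChange(old, updated bounds) []string {
	var problems []string

	switch {
	case old.Min == nil && updated.Min != nil:
		problems = append(problems, "minimum added")
	case old.Min != nil && updated.Min != nil && *updated.Min > *old.Min:
		problems = append(problems, "minimum increased")
	}

	switch {
	case old.Max == nil && updated.Max != nil:
		problems = append(problems, "maximum added")
	case old.Max != nil && updated.Max != nil && *updated.Max < *old.Max:
		problems = append(problems, "maximum decreased")
	}

	return problems
}

// f64 returns a pointer to v, which keeps the examples short.
func f64(v float64) *float64 { return &v }

func main() {
	old := bounds{Min: f64(1), Max: f64(10)}
	// Loosened bounds: minimum decreased, maximum increased.
	fmt.Println(checkBoundsChange(old, bounds{Min: f64(0), Max: f64(20)}))
	// Tightened bounds: minimum increased, maximum decreased.
	fmt.Println(checkBoundsChange(old, bounds{Min: f64(2), Max: f64(5)}))
}
```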

versions:
- name: v1alpha1
```

Contributor

+1

@trgeiger trgeiger force-pushed the crd-update-docs branch 3 times, most recently from 8416d49 to d732bdd Compare August 2, 2024 18:46

@michaelryanpeter michaelryanpeter left a comment

A couple of nits to fix the rendering on the unordered lists in the preview. Otherwise, LGTM. 🚀

@trgeiger trgeiger force-pushed the crd-update-docs branch 2 times, most recently from 4983271 to 0c07a19 Compare August 7, 2024 16:12

trgeiger commented Aug 7, 2024

The example error output doesn't wrap. Should I just manually add some line breaks so it isn't such a long horizontal scroll?


@michaelryanpeter michaelryanpeter left a comment

lgtm (though I am not an owner/maintainer on the project)


??? failure "Error output"
```
validating upgrade for CRD "test.example.com" failed: CustomResourceDefinition test.example.com failed upgrade safety validation. "NoScopeChange" validation failed: scope changed from "Namespaced" to "Cluster"
Contributor

I agree that the long line is a little hard to read, but I have always been told to keep example outputs as they are printed in the terminal.


### Removing a previous version

In this example, the only previous version, `v1alpha1`, has been replaced with the new version:
Contributor

Suggested change
- In this example, the only previous version, `v1alpha1`, has been replaced with the new version:
+ In this example, the stored version `v1alpha1`, has been removed before it is no longer a stored version:

Contributor Author

The second clause of your suggestion is a little confusing to me. Is the distinction here that it's okay to remove a stored version if it's not in use, and not okay to remove it if it's being used? In that case, would something like "has been removed while still in use" work?

Contributor Author

Or maybe simply "has been removed"?

Contributor

Yeah, I think that wording sounds good. You can't remove a stored version while there are still instances of that version stored in etcd.
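
To make that rule concrete, here is a minimal Go sketch that compares the versions a cluster still has stored (the CRD's `status.storedVersions` list) with the versions defined by the upgraded CRD. `removedStoredVersions` is a hypothetical helper for illustration, not the preflight check's real implementation, and `v1` is an assumed name for the new version.

```go
package main

import "fmt"

// removedStoredVersions returns every version in storedVersions that the
// upgraded CRD no longer defines. Hypothetical helper for illustration.
func removedStoredVersions(storedVersions, newVersions []string) []string {
	defined := make(map[string]bool, len(newVersions))
	for _, v := range newVersions {
		defined[v] = true
	}
	var removed []string
	for _, v := range storedVersions {
		if !defined[v] {
			removed = append(removed, v)
		}
	}
	return removed
}

func main() {
	// v1alpha1 is still stored, but the new CRD only defines v1 (assumed name).
	fmt.Println(removedStoredVersions([]string{"v1alpha1"}, []string{"v1"}))
	// v1alpha1 is kept alongside the new version, so nothing is removed.
	fmt.Println(removedStoredVersions([]string{"v1alpha1"}, []string{"v1alpha1", "v1"}))
}
```

A non-empty result corresponds to the "An existing stored version of the CRD is removed" case that halts the upgrade.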

@everettraven everettraven added this pull request to the merge queue Aug 8, 2024
Merged via the queue into operator-framework:main with commit 5f9b27c Aug 8, 2024
17 of 18 checks passed
perdasilva pushed a commit to LalatenduMohanty/operator-controller that referenced this pull request Aug 13, 2024
perdasilva pushed a commit to kevinrizza/operator-controller that referenced this pull request Aug 13, 2024
@skattoju skattoju mentioned this pull request Sep 25, 2024
4 tasks
Development

Successfully merging this pull request may close these issues.

Docs: CRD Upgrade Safety
3 participants