[KEP-5073] Update KEP with section of validation ratcheting by yongruilin · Pull Request #5292 · kubernetes/enhancements

yongruilin · 2025-05-08T17:01:02Z

One-line PR description: This PR updates the KEP about how we intend to handle validation ratcheting with Declarative Validation.
Issue link: Declarative Validation Of Kubernetes Native Types With validation-gen #5073

k8s-ci-robot · 2025-05-08T17:01:04Z

Skipping CI for Draft Pull Request.
If you want CI signal for your change, please convert it to an actual PR.
You can still manually trigger a test run with /test all

yongruilin · 2025-05-08T19:39:44Z

/assign @jpbetz @deads2k

deads2k

Question and a hopefully non-contentious request. lgtm otherwise

deads2k · 2025-05-21T19:36:46Z

keps/sig-api-machinery/5073-declarative-validation-with-validation-gen/README.md

+| Struct fields<br/>`structType=atomic`  | -                                                              | revalidate changed fields                                                      | [Issue #131566](https://github.com/kubernetes/kubernetes/issues/131566) (Alignment needed)                     |
+| Slices                              | all elements are equivalent and the is order unchanged         | revalidate the slice if any element changed or order changed                   | `listType=map`: no validate when re-order<br/>`listType=set`: re-validate when re-order<br/>`listType=atomic`: re-validate when re-order |
+| Slice values<br/>`listType=atomic`   | -                                                              | validate items which are not found (by value) in the old slice                 | Validate all elements (CRDs ratcheting may be expanded to match in-tree ratcheting)                                  |
+| Slice values<br/>`listType=map`      | -                                                              | (re)validate items which are not found (by key) in the old slice or are changed | same                                                                                                                |


if order of a slice that is a map changes, but all the keys and values are the same, do we skip validation? I would expect us to skip.

it would skip for the validation on the individual element. But should we still validate the listmap as a whole? If it is declared with "immutable", we consider the order as part of the API and validate the order?

it would skip for the validation on the individual element. But should we still validate the listmap as a whole? If it is declared with "immutable", we consider the order as part of the API and validate the order?

Do you have an example? I think for our validation tag purposes I would expect immutable on a list of type map to mean the key/value pairs cannot change, but I don't think I'd expect revalidation on ordering. That means that certain kinds of validation cannot be expressed, but it's not so clear to me that we'd want to support validation on lists of type map that were order dependent.

Container.Env
The order here matters as "Variables making use of others defined in the same context must come later in the list." (ref doc) This is maybe a mistake for marking it as listtype=map, but it is already in used widely.

pod.spec.container[*].env is also the case I wanted to make sure I fully considered as I was when reading this thread.

env is problematic because it represents variable assignments (order matters,repeated "keys" are allowed). It breaks listType=map in quite a few ways.

Despite all of those problems, I'm not aware of any validation rules that would need to be reevaluated if the order changes for env. And I think that's what we're looking for here. If we can't find any actual cases. I'm inclined to not reevaluated validation when only order changes like @deads2k suggests.

Self-followup:

I don't know of any places where the validation logic depends on the order of items in a list-map or list-set, and I am hard-pressed to synthesize a realistic one (I can dream up phony ones, though). env doesn't count here because it's REALLY NOT a map.

Likewise, I can't really dream up a case where a reorder-only update is a thing someone would want to do.

That makes this discussion somewhat academic, but it changes core principles so I think we need to pin it down.

For me it mostly comes down to consistency. If we write to storage and trigger admission and send WATCH events, then I think we should validate. You can argue that it's a map, so any validation that would fail on a reorder is broken, and I not even going to disagree.

The second concern is complexity. To skip validation in the event of a reorder-only update, we need to write more code and/or do slow things like maps and write more tests than if we just let the revalidation happen. I doubt very very much that reorder-only updates are a thing people do, so it's really moot.

Why write more code to skip a corner-case that doesn't really exist?

OK here's a semi-real case.

Service.Spec.ClusterIPs (it's not declared as a set but it probably should be - it has to be unique elements). There's a rule that says that .clusterIPs[0] must == .clusterIP. If you reorder that list, validation SHOULD fail.

OK here's a semi-real case.
Service.Spec.ClusterIPs (it's not declared as a set but it probably should be - it has to be unique elements). There's a rule that says that .clusterIPs[0] must == .clusterIP. If you reorder that list, validation SHOULD fail.

I think a distinguishing feature of a set or a map is that the order doesn't matter. Re-validating changes in those cases allows people to write not-map and not-set validation. For ClusterIPs, I counter-claim that someone reading the type, seeing set, treating it as a set, and then getting validation errors incompatible with a set is a bug in the type description and a kind of bug we can make impossible to write by not revalidating.

We've got a variety of semantics in the use cases. Regardless of what we pick for the default ratcheting behavior of listType=map, we're going to have to deal with the use cases that don't align.

I don't love that we've accumulated order dependent use cases on listType=map. But I can see how we got here:

We alphanumerically sort JSON maps when we serialize, but we preserve the order of listType=map.

When types were annotated as "maps" by SMD and SSA, no special considerations were made for lists where order matters.

If we continue to force uniform treatment of all listType=map entries have the same semantics, I don't see the situation getting a whole lot better.

Would it be reasonable to do something like this?

Clarify (presumably with more tags) all use cases of listType=maps where order matters.

Once we've divided the use cases up, provide sane default ratcheting for each category.

?

I assert that these are not "maps" (or sets), they are "kubernetes list-maps", so the semantics are really up to us to define.

We could define different attributes like "// +listType=set" vs "// +unique". Maps are more fiddly since the way we define keys is called listMapKey. I suppose we could do "orderedSet" and "orderedMap", but I think we would use the ordered variants for all existing-field conversions, just out of fear of breaking clients' assumptions.

If we can find better ways to express things like env which doesn't break clients but also doesn't carry untrue implications, great! But that's not what this is about. As it stands, env is a bad candidate for being converted to declarative :)

Let's assume we filter out all the things that are declared as maps/sets but really aren't. For the rest: would you be open to changes which provide "real" map semantics? IOW: don't write to disk (but do return 200), sort the elements by key (e.g. in JSON/YAML), don't trigger WATCH, don't call admission, etc? I suspect that will 100% break someone.

IMO consistency and simplicity are important. Pragmatically, less code has less bugs. I don't see the value in supporting "map" and "orderedMap" as different things.

deads2k · 2025-05-21T19:38:34Z

keps/sig-api-machinery/5073-declarative-validation-with-validation-gen/README.md

+A challenge arises if a cross-field validation rule (e.g. `X < Y`) is defined on a common ancestor struct, and an unrelated field (e.g. `Z`) within that same ancestor is modified. This change to `Z` makes the common ancestor “changed” overall, triggering re-validation of the `X < Y` rule. If this rule was recently evolved (e.g., made stricter), it might now fail even if `X` and `Y` themselves are not modified by the user’s update. This could violate the principle “Unchanged fields do not cause update rejections.”
+
+For the initial implementation, this behavior will be documented. Furthermore, cross-field validation rules themselves must incorporate internal ratcheting logic. For instance, generated code for dedicated cross-field tags (like `+k8s:unionMember`) will be designed to only act upon changes to the specific fields they govern and were designed to validate.


Please build us an example as soon as we have enough merged so we can see how painful (or not) it seems to be.

keps/sig-api-machinery/5073-declarative-validation-with-validation-gen/README.md

deads2k · 2025-06-09T13:53:06Z

We had a lengthy discussion Friday evening and that we will reject any new PRs that introduce declarative validation rules for list-type=map that are dependent on the order of elements. We cannot think of an automated test that can enforce this, so we will need to vigilant.

Once we get the KEP updated to indicate this, I'm happy with the changes.

…ve validation section

deads2k · 2025-06-09T17:39:56Z

Thanks for the updates. lgtm

/approve
/assign @thockin

I'll leave the lgtm with Tim so he also has a chance to see the updated verbiage.

thockin

Thanks!

/lgtm
/approve
/unhold

k8s-ci-robot · 2025-06-09T21:45:47Z

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: deads2k, thockin, yongruilin

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Details

Needs approval from an approver in each of these files:

~~keps/sig-api-machinery/OWNERS~~ [deads2k]

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

k8s-ci-robot requested review from apelisse and sttts May 8, 2025 17:01

k8s-ci-robot added the size/M Denotes a PR that changes 30-99 lines, ignoring generated files. label May 8, 2025

yongruilin force-pushed the ratcheting branch from 34832d2 to 711324f Compare May 8, 2025 17:45

yongruilin marked this pull request as ready for review May 8, 2025 17:45

k8s-ci-robot removed the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label May 8, 2025

k8s-ci-robot requested review from deads2k and jpbetz May 8, 2025 17:46

yongruilin force-pushed the ratcheting branch 2 times, most recently from 912ee4e to 34dae9a Compare May 8, 2025 18:26

yongruilin closed this May 8, 2025

yongruilin force-pushed the ratcheting branch from 34dae9a to 4ab2fca Compare May 8, 2025 18:32

yongruilin reopened this May 8, 2025

aaron-prindle mentioned this pull request May 8, 2025

KEP-5073: Declarative Validation: Explain and update document with cross-field validation information #5290

Merged

k8s-ci-robot assigned deads2k and jpbetz May 8, 2025

deads2k reviewed May 21, 2025

View reviewed changes

thockin reviewed Jun 5, 2025

View reviewed changes

keps/sig-api-machinery/5073-declarative-validation-with-validation-gen/README.md Outdated Show resolved Hide resolved

thockin self-assigned this Jun 5, 2025

[KEP-5073] Update KEP with section of validation ratcheting

01f1372

yongruilin force-pushed the ratcheting branch from 4ab2fca to add5abd Compare June 9, 2025 17:30

KEP-5073: Add policy for order-sensitive list validation in declarati…

04e333a

…ve validation section

yongruilin force-pushed the ratcheting branch from add5abd to 04e333a Compare June 9, 2025 17:38

k8s-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Jun 9, 2025

apply feedback

f9f82d5

thockin reviewed Jun 9, 2025

View reviewed changes

k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Jun 9, 2025

k8s-ci-robot merged commit fdce07a into kubernetes:master Jun 9, 2025
4 checks passed

k8s-ci-robot added this to the v1.34 milestone Jun 9, 2025

aaron-prindle mentioned this pull request Jun 11, 2025

KEP-5073: PRR update for v1.34 #5396

Merged

3 tasks

yongruilin mentioned this pull request Jun 11, 2025

feat(validation-gen): add default validation ratcheting support kubernetes/kubernetes#132236

Merged

yongruilin mentioned this pull request Jun 23, 2025

feat(validation-gen): Enhance validation with new rules and core refactoring kubernetes/kubernetes#132469

Merged

aaron-prindle mentioned this pull request Jul 17, 2025

feat: add CSR status.conditions approved+denied declarative validation tags to v1 and v1beta1 and add associated tests kubernetes/kubernetes#133013

Merged

		A challenge arises if a cross-field validation rule (e.g. `X < Y`) is defined on a common ancestor struct, and an unrelated field (e.g. `Z`) within that same ancestor is modified. This change to `Z` makes the common ancestor “changed” overall, triggering re-validation of the `X < Y` rule. If this rule was recently evolved (e.g., made stricter), it might now fail even if `X` and `Y` themselves are not modified by the user’s update. This could violate the principle “Unchanged fields do not cause update rejections.”

		For the initial implementation, this behavior will be documented. Furthermore, cross-field validation rules themselves must incorporate internal ratcheting logic. For instance, generated code for dedicated cross-field tags (like `+k8s:unionMember`) will be designed to only act upon changes to the specific fields they govern and were designed to validate.

Conversation

yongruilin commented May 8, 2025

Uh oh!

k8s-ci-robot commented May 8, 2025

Uh oh!

yongruilin commented May 8, 2025

Uh oh!

deads2k left a comment

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

jpbetz May 28, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Uh oh!

deads2k commented Jun 9, 2025

Uh oh!

deads2k commented Jun 9, 2025

Uh oh!

thockin left a comment

Choose a reason for hiding this comment

Uh oh!

k8s-ci-robot commented Jun 9, 2025

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

5 participants

jpbetz May 28, 2025 •

edited

Loading