-
Notifications
You must be signed in to change notification settings - Fork 1k
Fix the issue that attached resource changes were not synchronized to the cluster in dependencies distributor #6931
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Conversation
Summary of ChangesHello @XiShanYongYe-Chang, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed! This pull request addresses a critical bug in the Highlights
Using Gemini Code AssistThe full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips. Invoking Gemini You can request assistance from Gemini at any point by creating a comment using either
Customization To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a Limitations & Feedback Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for Github and other Google products, sign up here. You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension. Footnotes
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Code Review
This pull request correctly fixes a bug in the matchesWithBindingDependencies function within the dependencies distributor. Previously, the function would prematurely exit its loop upon encountering the first non-matching dependency, instead of iterating through the entire list of dependencies. The updated logic now correctly uses continue to iterate through all dependencies, ensuring a comprehensive match check. The change also improves error handling for label selector parsing by continuing on failure instead of aborting, and adds a necessary nil check for the selector to prevent potential panics. These changes are accurate, enhance the robustness of the dependency distribution logic, and the code quality is good. I have no further suggestions.
dfe4d85 to
2b78246
Compare
|
Codecov Report❌ Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## master #6931 +/- ##
==========================================
+ Coverage 46.56% 46.61% +0.04%
==========================================
Files 699 699
Lines 48130 48139 +9
==========================================
+ Hits 22411 22438 +27
+ Misses 24018 24004 -14
+ Partials 1701 1697 -4
Flags with carried forward coverage won't be shown. Click here to find out more. ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
| // If convert LabelSelector failed, retrying with an error return will not solve the problem. | ||
| // It will only increase the consumption by repeatedly listing the binding. | ||
| // Therefore, it is better to print this error and ignore it. | ||
| continue |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree. However, we should double-check: could a failure to convert LabelSelector to a selector cause intermittent test failures?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry, I didn't understand what you meant. Could you explain that in more detail?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I mean, if a failure to convert LabelSelector can be resolved by retrying, then we should return an error and trigger a retry. However, if the conversion failure is deterministic, then the current handling approach is perfectly reasonable.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@XiShanYongYe-Chang I want to know if there is any historical reason for ignoring this error? (I see the Unmarshal above also ignored the error.)
If this is a persistent error, ignoring it will hide the real bug, it's better to expose the issue as early as possible. We can predict that even if we retry this error, the cost is just multiple attempts.
Can we add some validation checks in the webhook? If it can pass the validation, then this is likely an intermittent error, and we can retry with more confidence.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I want to know if there is any historical reason for ignoring this error? (I see the Unmarshal above also ignored the error.)
There was no particular historical reason; the initial revision of the PR was: #3827. The reason for the modification is as stated in the comment.
If this is a persistent error, ignoring it will hide the real bug, it's better to expose the issue as early as possible. We can predict that even if we retry this error, the cost is just multiple attempts.
The ignore here is intended to avoid wasting subsequent useless reconciles, not to cover up any mistakes. Once an error occurs here, it indicates that the label selector was written incorrectly (for the unmarshal operation mentioned above, no error would occur unless the user modifies the annotationin the binding object). Retrying cannot resolve the issue and would waste the execution of the reconcile process.
Can we add some validation checks in the webhook?
I think it's feasible. We can modify VerifyDependencies to perform correctness checks on the configv1alpha1.DependentObjectReference returned by the resource interpreter. If possible, we can create a new PR to address this issue separately.
Do we also need to validate the content of the unmarshal operation mentioned above? For example, validating the update operation of the binding object in karmada-webhook. This is done to prevent users from mistakenly modifying the annotation dependencies annotation dependenciesAnnotationKey key on the binding object. I'm not sure how much value this adds.
If it can pass the validation, then this is likely an intermittent error, and we can retry with more confidence.
The error here is not an intermittent one; once it occurs, it will consistently be present. Therefore, I think that the current modification is more appropriate. Adding validation operations can further prevent the issue from arising.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
/cc @RainbowMango
Does this logic need further processing?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I will take a look tomorrow.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, I think so.
The responsibility of the function matchesWithBindingDependencies tells if given resource template is required by a specific ResourceBinding; in case of any error arising, it shouldn't swallow it, it should return the error instead.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I still maintain my viewpoint that not all errors necessarily require a rollback. If this is an occasional client access timeout error, then returning an error and retrying would be effective. However, since this error is bound to occur, retrying would be meaningless.
But in order to smoothly advance the current PR, let me make some adjustments to the PR.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@RainbowMango updated.
|
/retest |
RainbowMango
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
/assign
2b78246 to
af53465
Compare
|
[APPROVALNOTIFIER] This PR is NOT APPROVED This pull-request has been approved by: The full list of commands accepted by this bot can be found here.
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
|
/gemini review |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Code Review
This pull request effectively fixes a bug where changes in attached resources were not being synchronized. The core of the fix is to properly propagate errors during dependency matching, which ensures that reconciliation is retried. The logic for matching dependencies has also been improved to correctly iterate through all dependencies instead of exiting prematurely. Additionally, the test suite has been significantly enhanced with more comprehensive test cases, which is a great improvement. The changes are well-implemented and address the issue correctly. I have a few minor suggestions for code refinement.
Signed-off-by: changzhen <[email protected]>
af53465 to
0917f6d
Compare
What type of PR is this?
/kind bug
What this PR does / why we need it:
Which issue(s) this PR fixes:
Fixes #6924
Special notes for your reviewer:
Does this PR introduce a user-facing change?: