Skip to content

Conversation

@XiShanYongYe-Chang
Copy link
Member

What type of PR is this?

/kind bug

What this PR does / why we need it:

Which issue(s) this PR fixes:

Fixes #6924

Special notes for your reviewer:

Does this PR introduce a user-facing change?:

`karmada-controller-manager`: Fixed the issue that attached resource changes were not synchronized to the cluster in the dependencies distributor

@karmada-bot karmada-bot added the kind/bug Categorizes issue or PR as related to a bug. label Nov 17, 2025
@gemini-code-assist
Copy link

Summary of Changes

Hello @XiShanYongYe-Chang, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request addresses a critical bug in the dependencies distributor component of karmada-controller-manager where changes to attached resources were not being synchronized to the cluster. The fix involves refining the dependency matching logic to ensure that all defined dependencies are properly evaluated, preventing premature exits from the matching function and thereby guaranteeing accurate resource synchronization.

Highlights

  • Enhanced Dependency Matching: The matchesWithBindingDependencies function now thoroughly evaluates all defined dependencies, ensuring that attached resource changes are correctly synchronized.
  • Robust Error Handling: Improved error handling for label selectors within dependency definitions, preventing a single malformed selector from halting the entire matching process.
  • Comprehensive Evaluation: The logic now iterates through all dependencies, returning true if any dependency matches, rather than exiting prematurely.
Using Gemini Code Assist

The full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips.

Invoking Gemini

You can request assistance from Gemini at any point by creating a comment using either /gemini <command> or @gemini-code-assist <command>. Below is a summary of the supported commands on the current page.

Feature Command Description
Code Review /gemini review Performs a code review for the current pull request in its current state.
Pull Request Summary /gemini summary Provides a summary of the current pull request in its current state.
Comment @gemini-code-assist Responds in comments when explicitly tagged, both in pull request comments and review comments.
Help /gemini help Displays a list of available commands.

Customization

To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a .gemini/ folder in the base of the repository. Detailed instructions can be found here.

Limitations & Feedback

Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for Github and other Google products, sign up here.

You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension.

Footnotes

  1. Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution.

@karmada-bot karmada-bot added the size/S Denotes a PR that changes 10-29 lines, ignoring generated files. label Nov 17, 2025
Copy link

@gemini-code-assist gemini-code-assist bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request correctly fixes a bug in the matchesWithBindingDependencies function within the dependencies distributor. Previously, the function would prematurely exit its loop upon encountering the first non-matching dependency, instead of iterating through the entire list of dependencies. The updated logic now correctly uses continue to iterate through all dependencies, ensuring a comprehensive match check. The change also improves error handling for label selector parsing by continuing on failure instead of aborting, and adds a necessary nil check for the selector to prevent potential panics. These changes are accurate, enhance the robustness of the dependency distribution logic, and the code quality is good. I have no further suggestions.

@karmada-bot karmada-bot added size/L Denotes a PR that changes 100-499 lines, ignoring generated files. and removed size/S Denotes a PR that changes 10-29 lines, ignoring generated files. labels Nov 17, 2025
@codecov-commenter
Copy link

codecov-commenter commented Nov 17, 2025

⚠️ Please install the 'codecov app svg image' to ensure uploads and comments are reliably processed by Codecov.

Codecov Report

❌ Patch coverage is 84.21053% with 3 lines in your changes missing coverage. Please review.
✅ Project coverage is 46.61%. Comparing base (a14c8af) to head (0917f6d).
⚠️ Report is 5 commits behind head on master.

Files with missing lines Patch % Lines
...ependenciesdistributor/dependencies_distributor.go 84.21% 2 Missing and 1 partial ⚠️
❗ Your organization needs to install the Codecov GitHub app to enable full functionality.
Additional details and impacted files
@@            Coverage Diff             @@
##           master    #6931      +/-   ##
==========================================
+ Coverage   46.56%   46.61%   +0.04%     
==========================================
  Files         699      699              
  Lines       48130    48139       +9     
==========================================
+ Hits        22411    22438      +27     
+ Misses      24018    24004      -14     
+ Partials     1701     1697       -4     
Flag Coverage Δ
unittests 46.61% <84.21%> (+0.04%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

// If convert LabelSelector failed, retrying with an error return will not solve the problem.
// It will only increase the consumption by repeatedly listing the binding.
// Therefore, it is better to print this error and ignore it.
continue
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree. However, we should double-check: could a failure to convert LabelSelector to a selector cause intermittent test failures?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry, I didn't understand what you meant. Could you explain that in more detail?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I mean, if a failure to convert LabelSelector can be resolved by retrying, then we should return an error and trigger a retry. However, if the conversion failure is deterministic, then the current handling approach is perfectly reasonable.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@XiShanYongYe-Chang I want to know if there is any historical reason for ignoring this error? (I see the Unmarshal above also ignored the error.)

If this is a persistent error, ignoring it will hide the real bug, it's better to expose the issue as early as possible. We can predict that even if we retry this error, the cost is just multiple attempts.

Can we add some validation checks in the webhook? If it can pass the validation, then this is likely an intermittent error, and we can retry with more confidence.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I want to know if there is any historical reason for ignoring this error? (I see the Unmarshal above also ignored the error.)

There was no particular historical reason; the initial revision of the PR was: #3827. The reason for the modification is as stated in the comment.

If this is a persistent error, ignoring it will hide the real bug, it's better to expose the issue as early as possible. We can predict that even if we retry this error, the cost is just multiple attempts.

The ignore here is intended to avoid wasting subsequent useless reconciles, not to cover up any mistakes. Once an error occurs here, it indicates that the label selector was written incorrectly (for the unmarshal operation mentioned above, no error would occur unless the user modifies the annotationin the binding object). Retrying cannot resolve the issue and would waste the execution of the reconcile process.

Can we add some validation checks in the webhook?

I think it's feasible. We can modify VerifyDependencies to perform correctness checks on the configv1alpha1.DependentObjectReference returned by the resource interpreter. If possible, we can create a new PR to address this issue separately.

Do we also need to validate the content of the unmarshal operation mentioned above? For example, validating the update operation of the binding object in karmada-webhook. This is done to prevent users from mistakenly modifying the annotation dependencies annotation dependenciesAnnotationKey key on the binding object. I'm not sure how much value this adds.

If it can pass the validation, then this is likely an intermittent error, and we can retry with more confidence.

The error here is not an intermittent one; once it occurs, it will consistently be present. Therefore, I think that the current modification is more appropriate. Adding validation operations can further prevent the issue from arising.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

/cc @RainbowMango
Does this logic need further processing?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I will take a look tomorrow.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I think so.
The responsibility of the function matchesWithBindingDependencies tells if given resource template is required by a specific ResourceBinding; in case of any error arising, it shouldn't swallow it, it should return the error instead.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I still maintain my viewpoint that not all errors necessarily require a rollback. If this is an occasional client access timeout error, then returning an error and retrying would be effective. However, since this error is bound to occur, retrying would be meaningless.

But in order to smoothly advance the current PR, let me make some adjustments to the PR.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@RainbowMango updated.

@XiShanYongYe-Chang
Copy link
Member Author

/retest

Copy link
Member

@RainbowMango RainbowMango left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

/assign

@karmada-bot
Copy link
Collaborator

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by:
Once this PR has been reviewed and has the lgtm label, please ask for approval from rainbowmango. For more information see the Code Review Process.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@XiShanYongYe-Chang
Copy link
Member Author

/gemini review

Copy link

@gemini-code-assist gemini-code-assist bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request effectively fixes a bug where changes in attached resources were not being synchronized. The core of the fix is to properly propagate errors during dependency matching, which ensures that reconciliation is retried. The logic for matching dependencies has also been improved to correctly iterate through all dependencies instead of exiting prematurely. Additionally, the test suite has been significantly enhanced with more comprehensive test cases, which is a great improvement. The changes are well-implemented and address the issue correctly. I have a few minor suggestions for code refinement.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

kind/bug Categorizes issue or PR as related to a bug. size/L Denotes a PR that changes 100-499 lines, ignoring generated files.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Attached resource changes were not synchronized to the cluster in dependencies distributor

5 participants