Skip to content

Conversation

rohan-019
Copy link
Contributor

@rohan-019 rohan-019 commented Sep 25, 2025

Optimize binding controllers with custom event predicate

Summary

Introduces a custom predicate for binding controllers to filter events and reduce unnecessary reconciliations:

  • Ignores delete events (cleanup handled by finalizers)
  • Always processes create events
  • Processes updates only for meaningful changes (generation, DeletionTimestamp, key .spec fields)

Result: Better performance with no behavior changes.

Motivation

Current controllers watch resources with broad predicates, causing redundant reconciliations. Filtering irrelevant events reduces CPU/memory usage, API calls, and cache churn while preserving finalizer cleanup.

Changes

  • Added: ResourceBindingPredicate
  • Updated: Binding controllers to use the new predicate
  • Added: Unit tests

Predicate Logic

  • Create: Always processed
  • Delete: Ignored
  • ⚠️ Update: Only if generation changes, DeletionTimestamp is set, or critical .spec fields change (requiredBy, clusters, resource, gracefulEvictionTasks, suspension, preserveResourcesOnDeletion, conflictResolution)

Event Behavior

Event Type Before After Reason
Create ✅ Process ✅ Process Always needed
Delete ✅ Process ❌ Ignore Finalizers handle cleanup
Update (generation) ✅ Process ✅ Process Meaningful change
Update (status only) ✅ Process ❌ Ignore No reconciliation needed

Files Changed

Added:

  • pkg/controllers/binding/binding_predicate.go
  • pkg/controllers/binding/binding_predicate_test.go

Modified:

  • pkg/controllers/binding/binding_controller.go
  • pkg/controllers/binding/cluster_resource_binding_controller.go

Testing

  • Unit tests cover all event types
  • Delete events ignored as expected
  • Critical .spec changes trigger reconciliation
  • Finalizer cleanup verified via DeletionTimestamp

Backward Compatibility

  • No API changes
  • Finalizer cleanup preserved
  • Override policy watch behavior unchanged

Performance Impact

  • Fewer reconciliations
  • Reduced API calls and cache churn
  • Lower CPU/memory usage

Linked Issue

Part of: #6780

Checklist

  • Unit tests added/passing
  • No breaking changes
  • Predicate applied to both RB and CRB controllers
  • Finalizer handling verified via DeletionTimestamp path

@karmada-bot
Copy link
Collaborator

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by:
Once this PR has been reviewed and has the lgtm label, please assign zach593 for approval. For more information see the Kubernetes Code Review Process.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

Copy link

Summary of Changes

Hello @rohan-019, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request enhances the performance of ResourceBinding and ClusterResourceBinding controllers by implementing a custom event predicate. This predicate selectively processes events, preventing redundant reconciliation loops for changes that do not impact the core logic, thereby improving resource efficiency and responsiveness across the system.

Highlights

  • Custom Event Predicate: Introduces a new ResourceBindingPredicate to intelligently filter events for ResourceBinding and ClusterResourceBinding controllers.
  • Optimized Event Handling: Delete events are now ignored (cleanup handled by finalizers), create events are always processed, and update events are processed only for meaningful changes (generation, DeletionTimestamp, or critical .spec fields).
  • Performance Improvement: The fine-grained event filtering significantly reduces unnecessary reconciliations, leading to lower CPU/memory usage, fewer API calls, and reduced cache churn.
  • No Behavior Change: The optimization is achieved without altering the existing functional behavior or backward compatibility of the controllers.
Using Gemini Code Assist

The full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips.

Invoking Gemini

You can request assistance from Gemini at any point by creating a comment using either /gemini <command> or @gemini-code-assist <command>. Below is a summary of the supported commands on the current page.

Feature Command Description
Code Review /gemini review Performs a code review for the current pull request in its current state.
Pull Request Summary /gemini summary Provides a summary of the current pull request in its current state.
Comment @gemini-code-assist Responds in comments when explicitly tagged, both in pull request comments and review comments.
Help /gemini help Displays a list of available commands.

Customization

To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a .gemini/ folder in the base of the repository. Detailed instructions can be found here.

Limitations & Feedback

Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for Github and other Google products, sign up here.

You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension.

Footnotes

  1. Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution.

@karmada-bot karmada-bot added the size/L Denotes a PR that changes 100-499 lines, ignoring generated files. label Sep 25, 2025
Copy link

@gemini-code-assist gemini-code-assist bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request introduces a custom predicate ResourceBindingPredicate to optimize the binding controllers by filtering events, which is a great way to reduce unnecessary reconciliations and improve performance. The implementation is clear and covers both ResourceBinding and ClusterResourceBinding. The logic correctly handles create, delete, and update events based on meaningful changes.

I have a couple of suggestions to improve code maintainability and test coverage. One is to refactor a duplicated code block in binding_predicate.go, and the other is to enhance the unit tests for ClusterResourceBinding to be more comprehensive.

Overall, this is a solid improvement. Please see my detailed comments.

@codecov-commenter
Copy link

codecov-commenter commented Sep 25, 2025

⚠️ Please install the 'codecov app svg image' to ensure uploads and comments are reliably processed by Codecov.

Codecov Report

❌ Patch coverage is 83.33333% with 8 lines in your changes missing coverage. Please review.
✅ Project coverage is 45.87%. Comparing base (70411ab) to head (62be891).
⚠️ Report is 19 commits behind head on master.

Files with missing lines Patch % Lines
pkg/controllers/binding/binding_predicate.go 86.95% 6 Missing ⚠️
pkg/controllers/binding/binding_controller.go 0.00% 1 Missing ⚠️
...ers/binding/cluster_resource_binding_controller.go 0.00% 1 Missing ⚠️
❗ Your organization needs to install the Codecov GitHub app to enable full functionality.
Additional details and impacted files
@@            Coverage Diff             @@
##           master    #6784      +/-   ##
==========================================
+ Coverage   45.74%   45.87%   +0.13%     
==========================================
  Files         689      691       +2     
  Lines       57161    57346     +185     
==========================================
+ Hits        26149    26310     +161     
- Misses      29383    29406      +23     
- Partials     1629     1630       +1     
Flag Coverage Δ
unittests 45.87% <83.33%> (+0.13%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

@rohan-019
Copy link
Contributor Author

rohan-019 commented Sep 25, 2025

Hey, @XiShanYongYe-Chang
Please take a look at this!

@XiShanYongYe-Chang
Copy link
Member

Thanks a lot
@rohan-019

Copy link
Member

@XiShanYongYe-Chang XiShanYongYe-Chang left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks~
Can you help fix the lint error?

@rohan-019
Copy link
Contributor Author

rohan-019 commented Sep 26, 2025

Thanks~ Can you help fix the lint error?

I tried several times to fix lint errors using golangci-lint run --fix pkg/controllers/binding/binding_predicate.go pkg/controllers/binding/cluster_resource_binding_controller.go and also attempted a manual fix, but ended up with lint errors. Could you help me figure out what’s going wrong?

@XiShanYongYe-Chang
Copy link
Member

Hi @rohan-019, since this PR is the first one in our series, I have quite a few review comments. ^-^

@rohan-019
Copy link
Contributor Author

Hi @rohan-019, since this PR is the first one in our series, I have quite a few review comments. ^-^

Hi @XiShanYongYe-Chang ,
I really appreciate your feedback, especially since this is the first PR in the series. I’ll go through all your comments carefully and make the necessary updates.

Thanks again for taking the time to review.

@XiShanYongYe-Chang
Copy link
Member

Hi @rohan-019, the newly added ut has failed.

@rohan-019
Copy link
Contributor Author

rohan-019 commented Sep 29, 2025

the newly added ut has failed.

Hey @XiShanYongYe-Chang ,
I initially removed the generation check based on your comment that the global event filter already handles it. However, after testing We found that both the new unit test (TestResourceBindingPredicate_Update/generation_change) and the related e2e test failed once this check was removed.

The reason is that once we override UpdateFunc with custom logic, the default GenerationChangedPredicate is no longer applied automatically. This means spec-only changes are not triggering reconciliation unless we explicitly handle the generation change inside our predicate.

To keep the behavior correct and align with the current tests, I’ve restored the != generation check in UpdateFunc.

@XiShanYongYe-Chang
Copy link
Member

We found that both the new unit test (TestResourceBindingPredicate_Update/generation_change) and the related e2e test failed once this check was removed.

This test case is problematic because it is unlikely that only the generation will change while other fields remain unchanged. Therefore, it needs to be redesigned or removed.

@XiShanYongYe-Chang
Copy link
Member

To keep the behavior correct and align with the current tests, I’ve restored the != generation check in UpdateFunc.

There's no need to do this.

@rohan-019
Copy link
Contributor Author

rohan-019 commented Sep 29, 2025

We found that both the new unit test (TestResourceBindingPredicate_Update/generation_change) and the related e2e test failed once this check was removed.

This test case is problematic because it is unlikely that only the generation will change while other fields remain unchanged. Therefore, it needs to be redesigned or removed.

That makes sense, as the Generation changes are handled globally, so this predicate doesn't need to check for them.
But instead of removing the generation check, I redesigned so that this predicate now focuses on catching edge cases where:
-Generation hasn't changed BUT
-Specific critical spec fields have changed (like RequiredBy, Clusters, etc.)

@rohan-019 rohan-019 closed this Sep 29, 2025
@rohan-019 rohan-019 reopened this Sep 29, 2025
@rohan-019
Copy link
Contributor Author

Hey @XiShanYongYe-Chang,
Don't mind, I accidentally closed PR!!

@rohan-019
Copy link
Contributor Author

Hi @XiShanYongYe-Chang
I tried simplifying the deletion check from
e.ObjectOld.GetDeletionTimestamp() == nil && e.ObjectNew.GetDeletionTimestamp() != nil
to
e.ObjectOld.GetDeletionTimestamp() != e.ObjectNew.GetDeletionTimestamp()

as suggested, but this change caused failures in the e2e tests (e.g., migration_and_rollback_test.go timing out).

The reason is that the simplified form triggers reconciliation not only when deletion starts (nil → non-nil), but also if the deletion timestamp is updated while the object is already marked for deletion. That extra trigger led to unexpected reconcile behavior and race conditions, which is why the e2e tests failed.

So it seems we need to keep the original, more restrictive check to ensure reconciliation only fires once at the beginning of deletion.

@XiShanYongYe-Chang
Copy link
Member

The reason is that the simplified form triggers reconciliation not only when deletion starts (nil → non-nil), but also if the deletion timestamp is updated while the object is already marked for deletion. That extra trigger led to unexpected reconcile behavior and race conditions, which is why the e2e tests failed.

Once the timestamp is set, it will not be modified again.

The E2E error should be just an occasional issue and likely not introduced by the current changes. Let's ignore it for now.

Can you help fix the lint errors and compress the commits into one?

@rohan-019 rohan-019 force-pushed the optimization_binding-controller branch from 8ba789f to 6ef33ba Compare September 29, 2025 10:53
Copy link
Member

@XiShanYongYe-Chang XiShanYongYe-Chang left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There are still some minor issues, but everything else is basically fine. 👍

@rohan-019 rohan-019 force-pushed the optimization_binding-controller branch from 375bf2f to 127e00c Compare September 29, 2025 12:19
@rohan-019 rohan-019 force-pushed the optimization_binding-controller branch from d44400a to 7346443 Compare September 30, 2025 05:53
Signed-off-by: rohan-019 <[email protected]
Signed-off-by: rohan-019 <[email protected]>
@rohan-019 rohan-019 force-pushed the optimization_binding-controller branch from 7346443 to 62be891 Compare September 30, 2025 05:58
Copy link
Member

@XiShanYongYe-Chang XiShanYongYe-Chang left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks a lot~ 👍
/lgtm

/cc @RainbowMango @zach593 @zhzhuang-zju

@karmada-bot karmada-bot added the lgtm Indicates that a PR is ready to be merged. label Sep 30, 2025
@RainbowMango
Copy link
Member

I guess this PR is part of #6780, so I updated the Fixes to Part of in the PR description.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

lgtm Indicates that a PR is ready to be merged. size/L Denotes a PR that changes 100-499 lines, ignoring generated files.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants