Remove usage of cluster level setting for circuit breaker #2567

Open
wants to merge 6 commits into
base: main

Conversation

jmazanec15
Member

Description

This PR simplifies circuit breaker management. Previously, we periodically updated the circuit breaker cluster setting in order to maintain a global circuit breaker. However, this is inconvenient and not actually useful.

This change removes that logic and trips only on the node-level circuit breaker. For k-NN stats, the cluster-level circuit breaker state is fetched with a transport call that checks all of the nodes. This isn't very efficient, but it is only made for stats calls, so it's not on the critical path.
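
To illustrate the stats path described above, here is a minimal sketch of deriving a cluster-level value from per-node states. It assumes the cluster-level value is simply "any node has tripped", which the description implies but does not spell out. The class and method names are hypothetical and are not taken from the plugin.

```java
import java.util.Map;

// Hypothetical helper, not part of the k-NN plugin: derives a cluster-level
// view from the per-node circuit breaker states gathered by a stats call.
final class ClusterCircuitBreakerView {
    private ClusterCircuitBreakerView() {}

    // Assumption: the cluster-level value is true when at least one node
    // reports a tripped circuit breaker. Null entries count as "not tripped".
    static boolean isAnyNodeTripped(Map<String, Boolean> trippedByNodeId) {
        return trippedByNodeId.values().stream().anyMatch(Boolean.TRUE::equals);
    }
}
```

Because the aggregation only runs when stats are requested, nothing on the indexing or search path has to wait for it.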

Overall, this greatly simplifies the code. However, it does mean that the cluster setting "knn.circuit_breaker.triggered" will no longer have any effect. I think this is acceptable for 3.0. Users interested in this information can get it from the stats API.

Test fixes are still pending, but I wanted to get opinions on the change early before investing the time to fix them.

Related Issues

Resolves #1607

Check List

  • Commits are signed per the DCO using --signoff.

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.

jmazanec15 added the Refactoring (Improve the design, structure, and implementation while preserving its functionality) and v3.0.0 labels on Feb 27, 2025
Simplification of circuit breaker management. Before, we were
periodically updating the circuit breaker cluster setting to have a
global circuit breaker. However, this is inconvenient and not actually
useful.

This change removes that logic and only trips based on the node-level
circuit breaker. For k-NN stats, in order to fetch the cluster-level
circuit breaker, a transport call is made to check all of the nodes.
This isn't super efficient, but it's made for stats calls, so it's not
on the critical path.

Signed-off-by: John Mazanec <[email protected]>
}
}
};
-this.threadPool.scheduleWithFixedDelay(runnable, TimeValue.timeValueSeconds(CB_TIME_INTERVAL), ThreadPool.Names.GENERIC);
+threadPool.scheduleWithFixedDelay(runnable, TimeValue.timeValueSeconds(CB_TIME_INTERVAL), ThreadPool.Names.GENERIC);
Member Author


I could go a step further and move this to the native cache manager to simplify things more.

Member Author


I made this change. I think it's fairly straightforward, and it completely removes the potential circular dependency.

@Override
public Boolean get() {
    try {
        KNNCircuitBreakerTrippedResponse knnCircuitBreakerTrippedResponse = client.execute(
Member Author


TODO: This will have to be made async somehow, which will complicate things.

Member Author


I made this change. It's a bit messy, so I will clean it up.
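
For readers following the async discussion, the general shape of such a change can be sketched with plain `CompletableFuture`s. This is not the code in this PR: `AsyncCircuitBreakerCheck`, `isTripped`, and the `nodeProbe` function are hypothetical stand-ins for the transport call, and the "any node tripped" rule is an assumption.

```java
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.function.Function;
import java.util.stream.Collectors;

// Hypothetical sketch: the names here are illustrative, not the plugin's API.
final class AsyncCircuitBreakerCheck {
    private AsyncCircuitBreakerCheck() {}

    // Asks every node through nodeProbe without blocking the caller. The
    // returned future completes with true if any node reports a tripped
    // circuit breaker (assumed aggregation rule).
    static CompletableFuture<Boolean> isTripped(
        List<String> nodeIds,
        Function<String, CompletableFuture<Boolean>> nodeProbe
    ) {
        List<CompletableFuture<Boolean>> replies = nodeIds.stream()
            .map(nodeProbe)
            .collect(Collectors.toList());
        // allOf completes once every reply is in, so join() below never blocks.
        return CompletableFuture.allOf(replies.toArray(new CompletableFuture[0]))
            .thenApply(ignored -> replies.stream().anyMatch(CompletableFuture::join));
    }
}
```

The caller receives a future instead of a `Boolean`, so a blocking `get()` would need to become a callback or future-returning method, which is presumably where the extra complexity comes from.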

import java.util.Map;
import java.util.function.Function;

public class CircuitBreakerStat extends KNNStat<Boolean> {
Member Author


TODO: In mixed cluster state, this should fall back to reading from the (deprecated) index setting.
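
One possible shape for that fallback, as a sketch only: pick the source of truth based on whether every node can answer the new transport request. The class name, the `allNodesSupportTransportAction` flag, and both suppliers are hypothetical, and how node versions are actually detected is left open.

```java
import java.util.function.BooleanSupplier;

// Hypothetical sketch of the fallback in the TODO above; none of these names
// come from the plugin.
final class CircuitBreakerStatFallback {
    private CircuitBreakerStatFallback() {}

    static boolean isTripped(
        boolean allNodesSupportTransportAction,
        BooleanSupplier transportBasedCheck,
        BooleanSupplier legacySettingCheck
    ) {
        // Assumption: in a mixed-version cluster, older nodes cannot answer the
        // new transport request, so the deprecated setting is read instead.
        BooleanSupplier source = allNodesSupportTransportAction ? transportBasedCheck : legacySettingCheck;
        return source.getAsBoolean();
    }
}
```

Only the selected supplier is evaluated, so the transport call is never issued while older nodes are still present.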

Signed-off-by: John Mazanec <[email protected]>
Signed-off-by: John Mazanec <[email protected]>
Signed-off-by: John Mazanec <[email protected]>
Development

Successfully merging this pull request may close these issues.

Remove cluster setting for native memory circuit breaker
1 participant