-
Notifications
You must be signed in to change notification settings - Fork 143
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Remove usage of cluster level setting for circuit breaker #2567
base: main
Are you sure you want to change the base?
Conversation
2318611
to
3eaf8b3
Compare
Simplification of circuit breaker management. Before, we were periodically updating the circuit breaker cluster setting to have a global circuit breaker. However, this is inconvenient and not actually useful. This change removes that logic and only trips based on node level circuit breaker. For knn stats, in order to fetch the cluster level circuit breaker, a transport call is made to check all of the nodes. This isnt super efficient but its made for stats calls so its not on the critical path. Signed-off-by: John Mazanec <[email protected]>
3eaf8b3
to
6365da6
Compare
} | ||
} | ||
}; | ||
this.threadPool.scheduleWithFixedDelay(runnable, TimeValue.timeValueSeconds(CB_TIME_INTERVAL), ThreadPool.Names.GENERIC); | ||
threadPool.scheduleWithFixedDelay(runnable, TimeValue.timeValueSeconds(CB_TIME_INTERVAL), ThreadPool.Names.GENERIC); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I could go even a step further an move this to the native cache manager to simplfiy things more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I made this change - I think its fairly straightforward and completely removes the potential circular dependency.
@Override | ||
public Boolean get() { | ||
try { | ||
KNNCircuitBreakerTrippedResponse knnCircuitBreakerTrippedResponse = client.execute( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
TODO: This is going to have to somehow be made to be async, which will complicate things.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I made this change. Its a bit messy - will clean up a bit.
Signed-off-by: John Mazanec <[email protected]>
Signed-off-by: John Mazanec <[email protected]>
import java.util.Map; | ||
import java.util.function.Function; | ||
|
||
public class CircuitBreakerStat extends KNNStat<Boolean> { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
TODO: In mixed cluster state, this should fall back to reading from the (deprecated) index setting.
Signed-off-by: John Mazanec <[email protected]>
Signed-off-by: John Mazanec <[email protected]>
c91a084
to
019429a
Compare
Signed-off-by: John Mazanec <[email protected]>
b910f38
to
df687f7
Compare
Description
Simplification of circuit breaker management. Before, we were periodically updating the circuit breaker cluster setting to have a global circuit breaker. However, this is inconvenient and not actually useful.
This change removes that logic and only trips based on node level circuit breaker. For knn stats, in order to fetch the cluster level circuit breaker, a transport call is made to check all of the nodes. This isnt super efficient but its made for stats calls so its not on the critical path.
Overall, this greatly simplfiies the code. However, it does mean that the cluster setting "knn.circuit_breaker.triggered" will no longer have an effect. But, I think for 3.0 this is okay. For users interested in this info, they can get it from the stats API.
Still fixing up tests pending, but I wanted to get opinions early on the change before investing that time to fix.
Related Issues
Resolves #1607
Check List
--signoff
.By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.