-
Notifications
You must be signed in to change notification settings - Fork 817
Add querier.ingester-query-max-attempts to retry on partial data. #6714
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Add querier.ingester-query-max-attempts to retry on partial data. #6714
Conversation
Signed-off-by: Justin Jung <[email protected]>
Signed-off-by: Justin Jung <[email protected]>
Signed-off-by: Justin Jung <[email protected]>
Signed-off-by: Justin Jung <[email protected]>
Signed-off-by: Justin Jung <[email protected]>
Signed-off-by: Justin Jung <[email protected]>
Signed-off-by: Justin Jung <[email protected]>
85948b6
to
761741f
Compare
Signed-off-by: Justin Jung <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
LGTM! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe we should document clearly and provide some suggestions on what's the recommended set up for this retry value.
To me this flag is kind of overlapped with partial data and I am not sure how much the retry helps for most of the usecase.
We know it is unlikely to miss series so we return partial data with 4XX. The retry may succeed but given we only wait short period time and Ingester queries are usually within ms, I am unsure if it is worth it to retry more espeically if Ingesters are high load or ongoing deployment
@@ -192,6 +202,33 @@ func (q *distributorQuerier) streamingSelect(ctx context.Context, sortSeries, pa | |||
return seriesSet | |||
} | |||
|
|||
func (q *distributorQuerier) queryWithRetry(ctx context.Context, queryFunc func() (*client.QueryStreamResponse, error)) (*client.QueryStreamResponse, error) { | |||
if q.ingesterQueryMaxAttempts == 1 { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We need to handle case of ingesterQueryMaxAttempts set to 0 as it retries forever IIUC
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I added the validation in config.Validate
:
if cfg.IngesterQueryMaxAttempts < 1 {
return errInvalidIngesterQueryMaxAttempts
}
But would you still want the condition to be changed to if q.ingesterQueryMaxAttempts <= 1 {
?
"github.com/cortexproject/cortex/pkg/util/chunkcompat" | ||
"github.com/cortexproject/cortex/pkg/util/spanlogger" | ||
) | ||
|
||
const retryMinBackoff = time.Second | ||
const retryMaxBackoff = 5 * time.Second |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ingester queries usually finish in ms. I am not sure if it is worth it to wait for 1s ~ 5s backoff retry as it may cause more issues like increased inflight queries on Ingesters.
What this PR does:
In #6526, new configs
query_partial_data
andrules_partial_data
were added which allows tenants to receive 2xx with a warning message when the data accuracy is relatively high in zone-aware setting.This PR adds retry logic in querier getting data from ingesters, retrying the requests if the response is partial data. The new configuration,
querier.ingester-query-max-attempts
, allows ingester queries to be retried. Default is set to 1.Which issue(s) this PR fixes:
n/a
Checklist
CHANGELOG.md
updated - the order of entries should be[CHANGE]
,[FEATURE]
,[ENHANCEMENT]
,[BUGFIX]