-
Notifications
You must be signed in to change notification settings - Fork 6
fix: add exponential retry logic to stackit embedder #75
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: deps-main
Are you sure you want to change the base?
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull Request Overview
This pull request enhances the StackitEmbedder
with robust retry logic using exponential backoff to handle transient embedding API failures. The changes improve reliability and observability while making retry behavior configurable.
Key changes:
- Added exponential backoff retry logic to the
embed_documents
method with configurable parameters - Introduced new settings for retry behavior: max retries, base delay, and maximum delay
- Enhanced error handling and logging for better observability of embedding failures
Reviewed Changes
Copilot reviewed 3 out of 3 changed files in this pull request and generated 2 comments.
File | Description |
---|---|
stackit_embedder_settings.py | Added three new configuration fields for retry behavior with defaults |
stackit_embedder.py | Implemented exponential backoff retry logic with comprehensive error handling and logging |
values.yaml | Added environment variable configuration for the new retry parameters |
Tip: Customize your code reviews with copilot-instructions.md. Create the file or learn how to get started.
You can also share your feedback on Copilot code review for a chance to win a $100 gift card. Take the survey.
libs/rag-core-api/src/rag_core_api/impl/embeddings/stackit_embedder.py
Outdated
Show resolved
Hide resolved
) | ||
return [data.embedding for data in responses.data] | ||
|
||
except Exception as e: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Catching all exceptions with a bare Exception
is too broad. Consider catching specific exceptions like openai.APIError
, openai.RateLimitError
, or requests.RequestException
to avoid retrying on non-recoverable errors like authentication failures.
except Exception as e: | |
except (APIError, RateLimitError, Timeout, APIConnectionError) as e: |
Copilot uses AI. Check for mistakes.
…edder.py Co-authored-by: Copilot <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good with one suggestion on handling rate limited requests.
|
||
# Calculate exponential backoff delay | ||
# Calculate exponential backoff delay (cap exponent to prevent overflow) | ||
delay = min(self._settings.retry_base_delay * (2**min(attempt, 10)), self._settings.retry_max_delay) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
suggestion(rate-limits): you could investigate the response code and use the x-ratelimit-reset-requests
and x-ratelimit-reset-tokens
header on 429
responses to set the delay. This would reduce client errors and handle rate limits as expected.
To get the headers, you could do something like:
raw_response = openai_client.embeddings.with_raw_response.create(..)
response_headers = raw_response.headers
This pull request introduces robust retry logic for embedding requests in the
StackitEmbedder
and refactors configuration management for external services in the RAG infrastructure. The main changes are the addition of exponential backoff for embedding failures, new retry-related settings, and a more organized approach to service configuration invalues.yaml
.Embedding robustness and configuration enhancements:
Retry logic improvements:
embed_documents
method inStackitEmbedder
, allowing up to a configurable number of retries with delays on transient failures. This improves reliability when the embedding API is temporarily unavailable.stackit_embedder.py
for better observability.Embedder settings:
StackitEmbedderSettings
formax_retries
,retry_base_delay
, andretry_max_delay
, allowing fine-grained control of retry behavior. [1] [2]Infrastructure/service configuration refactor:
infrastructure/rag/values.yaml
to move service deployment and connection details (PostgreSQL, Redis, ClickHouse, S3/MinIO) under dedicated configuration sections, clarifying which services are external and improving maintainability. [1] [2]langfuse
configuration block with explicit image, web, worker, authentication, and environment variable settings for more granular control over deployment.Embedder deployment configuration:
STACKIT_EMBEDDER_MAX_RETRIES
,STACKIT_EMBEDDER_RETRY_BASE_DELAY
,STACKIT_EMBEDDER_RETRY_MAX_DELAY
) to thestackitEmbedder
section invalues.yaml
to enable configuration via environment variables.