Fix multiple issues (limit inconsistency, invisible documents) #2260
Fixes #2238 (limit inconsistency) and #2222 (invisible documents) + "Documents disappear on delete but after a delay appear again"
This pull request primarily increases the maximum `limit` parameter for various API endpoints from 100 to 1000, allowing clients to request larger result sets. It also improves how collection IDs are handled during document ingestion and updates the database transaction isolation for document upserts to increase reliability. Below are the most important changes grouped by theme:

API Parameter Updates
- Increased the maximum `limit` parameter from 100 to 1000 across multiple endpoints in `documents_router.py`, `collections_router.py`, `users_router.py`, and `chunks_router.py`, as well as in the corresponding OpenAPI documentation (`llms.txt`). This change enables clients to retrieve up to 1000 objects per request instead of 100.

Collection ID Handling in Document Ingestion
- Switched to `document_info.collection_ids` for assigning and propagating collection IDs, ensuring documents and chunks are correctly associated with collections. This includes changes in `ingestion_service.py`, `documents_router.py`, and orchestration workflows.

Database Reliability Improvements
- Changed the transaction isolation level to `serializable` in `upsert_documents_overview` to reduce race conditions and added handling for `SerializationFailureError` to improve retry logic during concurrent document upserts.

Document Status Update Safeguards
- Added a check in `ingestion_service.py` to ensure a document still exists before updating its status, preventing accidental recreation of deleted documents during ingestion.

Minor Query Construction Fix
- Fixed query construction in `get_documents_overview` to ensure conditions are properly combined.
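The retry handling around serializable upserts described above could look roughly like the sketch below. It is not the code from this PR: `run_with_retry` and `make_flaky_upsert` are hypothetical helpers, and the local `SerializationFailureError` class is only a stand-in for whatever exception the database driver raises when a serializable transaction cannot commit.

```python
import asyncio
import random


class SerializationFailureError(Exception):
    """Stand-in (assumption) for the driver's serialization-failure exception."""


async def run_with_retry(operation, max_attempts=5, base_delay=0.01):
    """Await `operation()` and retry it when a serialization failure is raised."""
    for attempt in range(1, max_attempts + 1):
        try:
            return await operation()
        except SerializationFailureError:
            if attempt == max_attempts:
                # Out of attempts: surface the failure to the caller.
                raise
            # Exponential backoff with jitter so competing writers
            # do not all retry at the same moment.
            delay = base_delay * (2 ** (attempt - 1))
            await asyncio.sleep(delay + random.uniform(0, delay))


def make_flaky_upsert(failures):
    """Build a fake upsert that fails `failures` times, then succeeds."""
    state = {"calls": 0}

    async def upsert():
        state["calls"] += 1
        if state["calls"] <= failures:
            raise SerializationFailureError("could not serialize access")
        # Return the number of calls it took to succeed.
        return state["calls"]

    return upsert
```

With a real database the whole transaction has to be re-run on each attempt, not just the failing statement, since a serializable transaction that aborts leaves no partial work behind.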
Increased API `limit` parameter to 1000, improved collection ID handling, and enhanced database reliability and document status updates.
- Increased the `limit` parameter from 100 to 1000 in `documents_router.py`, `collections_router.py`, and `users_router.py` to allow larger result sets.
- Updated `ingestion_service.py` and `documents_router.py` to use `document_info.collection_ids` for consistent collection ID assignment.
- Changed the transaction isolation level to `serializable` in `upsert_documents_overview` to reduce race conditions.
- Added `SerializationFailureError` handling for retries during document upserts.
- Added a check in `ingestion_service.py` to ensure document existence before status updates, preventing accidental recreation of deleted documents.

This description was created automatically for 3f97b08.
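The status-update safeguard mentioned above (check that the document still exists before writing its status) can be illustrated with a small in-memory sketch. `InMemoryDocumentStore` and `update_status_if_exists` are hypothetical names used only for illustration; they are not the functions in `ingestion_service.py`.

```python
from typing import Dict, Optional


class InMemoryDocumentStore:
    """Hypothetical stand-in for the documents table."""

    def __init__(self) -> None:
        self._status: Dict[str, str] = {}

    def upsert(self, document_id: str, status: str) -> None:
        # An unconditional upsert creates the row if it is missing,
        # which is how a deleted document can reappear.
        self._status[document_id] = status

    def delete(self, document_id: str) -> None:
        self._status.pop(document_id, None)

    def get_status(self, document_id: str) -> Optional[str]:
        return self._status.get(document_id)


def update_status_if_exists(
    store: InMemoryDocumentStore, document_id: str, status: str
) -> bool:
    """Update the status only when the document is still present."""
    if store.get_status(document_id) is None:
        # Deleted while ingestion was running: skip the write.
        return False
    store.upsert(document_id, status)
    return True
```

Against a real database the check and the write would presumably need to happen in the same transaction (or as a single conditional `UPDATE`), otherwise a delete can still slip in between the two steps.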