Skip to content

Conversation

@rexjohannes
Copy link

@rexjohannes rexjohannes commented Sep 17, 2025

Closes #2257

This pull request updates the document retrieval logic in the documents_in_collection method to more accurately reflect the collections a document belongs to. Instead of returning only the queried collection ID, it now returns all collection IDs associated with each document.

Improvements to document collection handling:

  • Updated the SQL query in collections.py to select the collection_ids field directly from the documents table, ensuring all relevant collection IDs are retrieved for each document.
  • Modified the construction of DocumentResponse objects to use the actual collection_ids from the database row, rather than just the queried collection ID.

Important

Updates documents_in_collection in collections.py to return all collection IDs for each document, improving document retrieval logic.

  • Behavior:
    • Updates documents_in_collection in collections.py to return all collection IDs for each document, not just the queried one.
    • Modifies SQL query to select collection_ids from documents table.
    • Adjusts DocumentResponse construction to use collection_ids from database row.

This description was created by Ellipsis for 5f630cc. You can customize this summary. It will automatically update as commits are pushed.

Copy link
Contributor

@ellipsis-dev ellipsis-dev bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Important

Looks good to me! 👍

Reviewed everything up to 5f630cc in 45 seconds. Click for details.
  • Reviewed 21 lines of code in 1 files
  • Skipped 0 files when reviewing.
  • Skipped posting 2 draft comments. View those below.
  • Modify your settings and rules to customize what types of comments Ellipsis leaves. And don't forget to react with 👍 or 👎 to teach Ellipsis.
1. py/core/providers/database/collections.py:312
  • Draft comment:
    Good: the SQL query now explicitly selects d.collection_ids. Ensure that the column always contains the full list of collection IDs as expected.
  • Reason this comment was not posted:
    Confidence changes required: 0% <= threshold 50% None
2. py/core/providers/database/collections.py:327
  • Draft comment:
    Updating DocumentResponse to use row['collection_ids'] correctly returns the full list of collection IDs. Confirm that the DB returns a non-null list.
  • Reason this comment was not posted:
    Confidence changes required: 0% <= threshold 50% None

Workflow ID: wflow_h1e5rxGSiaV2jw1E

You can customize Ellipsis by changing your verbosity settings, reacting with 👍 or 👎, replying to comments, or adding code review rules.

@rexjohannes rexjohannes changed the title Fix https://github.com/SciPhi-AI/R2R/issues/2257 Fix #2257 Sep 17, 2025
@rexjohannes
Copy link
Author

@NolanTrem can you merge this and #2260 ?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

List documents in collection returns documents containing only the requested collectionId

1 participant