
FEATURE REQUEST: Split completion configuration for chat and RAG queries vs ingestion and KG #2253

@ga-it

Description

Is your feature request related to a problem? Please describe.
I have put R2R behind Nextcloud as the RAG engine ( https://github.com/ga-it/context_chat_backend ). This results in massive indexing runs (hundreds of thousands of documents; the backlog will take months to ingest). These runs lock up the completion endpoints, because ingestion and interactive queries share the same concurrency settings.

Describe the solution you'd like
Split the completion (concurrency) configuration per LLM definition (e.g. fast_llm, quality_llm) and per task (e.g. database.graph_creation_settings, database.graph_entity_deduplication_settings), and extend the configuration settings for RAG queries. A sketch of what this could look like follows below.
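
A hypothetical sketch of how such a split might look in an r2r.toml-style file. The `concurrent_request_limit` keys under each LLM definition and task section are illustrative assumptions, not existing R2R options; the point is that ingestion/KG work and interactive queries each get their own budget:

```toml
# Hypothetical layout: separate concurrency budgets per LLM definition
# and per task, so bulk ingestion/KG work cannot starve interactive queries.

[completion.fast_llm]
model = "openai/gpt-4o-mini"
concurrent_request_limit = 64      # illustrative: cap for this LLM definition

[completion.quality_llm]
model = "openai/gpt-4o"
concurrent_request_limit = 8

[database.graph_creation_settings]
llm = "fast_llm"
concurrent_request_limit = 4       # KG extraction throttled independently

[database.graph_entity_deduplication_settings]
llm = "quality_llm"
concurrent_request_limit = 2

[rag]
llm = "quality_llm"
concurrent_request_limit = 32      # RAG queries keep their own budget
```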

Describe alternatives you've considered
As far as I can tell, there is no way to manage this at present. The shared concurrency settings choke interactive queries as well.
