The RAG server exposes an OpenAI-compatible API, using which developers can provide custom conversation history. For full details, see APIs for RAG Server.
Use the /generate
endpoint in the RAG server of a RAG pipeline to generate responses to prompts using custom conversation history.
To support multi-turn conversations, include the following parameters in the request body.
Parameter | Description | Type |
---|---|---|
messages | A sequence of messages that form a conversation history. Each message contains a role field, which can be user , assistant , or system , and a content field that contains the message text. |
Array |
use_knowledge_base | true to use a knowledge base; otherwise false . |
Boolean |
The following example payload includes a messages
parameter that passes a custom conversation history to /generate
endpoint for better contextual answers. You can include or change the following parameters in the request body while trying out the generate API using this notebook.
{
"messages": [
{
"role": "system",
"content": "You are an assistant that provides information about FastAPI."
},
{
"role": "user",
"content": "What is FastAPI?"
},
{
"role": "assistant",
"content": "FastAPI is a modern, fast (high-performance), web framework for building APIs with Python 3.6+ based on standard Python type hints."
},
{
"role": "user",
"content": "What are the key features of FastAPI?"
}
],
"use_knowledge_base": true
}