Make ChatClient and Advisor APIs more robust, consistent, and flexible #2655
- Introduce “ChatClientRequest” and “ChatClientResponse” for propagating requests/responses in a ChatClient advisor chain.
- Structure a Prompt at the beginning of the chain, to ensure a consistent view across execution chain and observations. Any template is rendered at the beginning so that every advisor doesn’t have to do it again.
- Improve observations to include the complete view of the prompt messages, instead of only considering userText and systemText.
- Remove legacy “around” advisor type concept.
- Keep backward compatibility for AdvisedRequest, AdvisedResponse, and legacy Advisor APIs.
Relates to gh-2655
Signed-off-by: Thomas Vitale
The first issue I encountered with the current/old Advisor API was that context information (such as RAG document names) was not available to the user, so a custom or extra advisor was needed. Users need more control and less encapsulation with advisors.
@ThomasVitale @tzolov What do you guys think? Can we achieve full control over [...] A use case for this could be giving the UI/frontend hints about what the AI chatbot/agent is doing. Another use case is to send out the filenames of documents from retrieved RAG chunks.
- Introduce new TemplateRenderer API providing the logic for rendering an input template.
- Update the PromptTemplate API to accept a TemplateRenderer object at construction time.
- Move ST logic to StTemplateRenderer implementation, used by default in PromptTemplate.
- Extend ChatClient API to support passing a custom TemplateRenderer.
Relates to gh-2655
Signed-off-by: Thomas Vitale
- Introduce new TemplateRenderer API providing the logic for rendering an input template.
- Update the PromptTemplate API to accept a TemplateRenderer object at construction time.
- Move ST logic to StTemplateRenderer implementation, used by default in PromptTemplate.
- Extend ChatClient API to support passing a custom TemplateRenderer.
- Add integration tests showing how to customize prompts in QuestionAnswerAdvisor and RetrievalAugmentationAdvisor.
- Support PromptTemplate instead of String in QuestionAnswerAdvisor.
Relates to gh-2655
Signed-off-by: Thomas Vitale
- Introduce new TemplateRenderer API providing the logic for rendering an input template.
- Update the PromptTemplate API to accept a TemplateRenderer object at construction time.
- Move ST logic to StTemplateRenderer implementation, used by default in PromptTemplate.
Additionally, make start and end delimiter character configurable.
Relates to gh-2655
Signed-off-by: Thomas Vitale
Advisor API
The Advisor API went through some changes over time. Initially, there was a concept of "advisor type", with separate interfaces for "before", "after", and "around" advisors. Currently, there is only one type of advisor (around). Still, the naming of many APIs and the observation context include the legacy "around" concept. We should probably get rid of that to avoid confusion.
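To make the point about a single advisor type concrete, here is a minimal sketch in plain Java. It does not use the Spring AI API: SketchAdvisor, SketchChain, and AdvisorChainSketch are hypothetical names invented for illustration, and the "model" is just a function that echoes its input. The idea it shows is that one interface is enough, because "before" logic is whatever runs ahead of the call to the chain and "after" logic is whatever runs once it returns.

```java
import java.util.List;
import java.util.function.Function;

// Hypothetical, simplified types for illustration only; not the Spring AI API.
interface SketchAdvisor {
    // One method covers both roles: code before chain.next(...) acts on the
    // request, code after it acts on the response.
    String advise(String request, SketchChain chain);
}

final class SketchChain {
    private final List<SketchAdvisor> advisors;
    private final Function<String, String> model;
    private final int index;

    SketchChain(List<SketchAdvisor> advisors, Function<String, String> model) {
        this(advisors, model, 0);
    }

    private SketchChain(List<SketchAdvisor> advisors, Function<String, String> model, int index) {
        this.advisors = List.copyOf(advisors);
        this.model = model;
        this.index = index;
    }

    // Invokes the next advisor, or the model once every advisor has run.
    String next(String request) {
        if (index == advisors.size()) {
            return model.apply(request);
        }
        return advisors.get(index).advise(request, new SketchChain(advisors, model, index + 1));
    }
}

final class AdvisorChainSketch {
    static String run(String userText) {
        // Acts only on the request ("before" behaviour).
        SketchAdvisor trimming = (request, chain) -> chain.next(request.trim());
        // Acts only on the response ("after" behaviour).
        SketchAdvisor tagging = (request, chain) -> chain.next(request) + " [checked]";
        // Stand-in for a chat model call.
        Function<String, String> model = prompt -> "echo:" + prompt;
        return new SketchChain(List.of(trimming, tagging), model).next(userText);
    }

    public static void main(String[] args) {
        System.out.println(run("  hello  ")); // prints "echo:hello [checked]"
    }
}
```

With this shape, nothing in the naming needs to mention "around": every advisor is simply an advisor.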
AdvisedRequest
The AdvisedRequest API contains the same information as a Prompt, but in a de-structured format. Furthermore, there are multiple ways to store the same information. This causes a lot of trouble for all consumers of an AdvisedRequest object (e.g. advisors and observations), in particular:
- Consumers cannot tell whether the userParams have already been used to render a version of the userText. In that case, the workaround is for each consumer to render the userText, even if not needed. Some issues registered about this: Parameters not saved in ChatMemory when using PromptUserSpec #2618
- User and system messages can be passed either via dedicated fields (userText, systemText) or in the more generic messages. In that case, the consumer must come up with a strategy to infer the actual user message and the actual system message, since there can be multiple ones. Furthermore, metadata for user and system messages are lost if passed via messages, because they are then converted internally to userText and systemText. Some issues registered about this: MessageChatMemoryAdvisor results in an error if prompt is initialized with List<Message> instead of userText #2339, Metadata Loss in AdvisedRequest.toPrompt() (Spring AI v1.0.0-M6) #2355, AdvisedRequest.userText is not populated under some ChatClient Configurations. #2408, Metadata Loss in DefaultChatClient.DefaultStreamResponseSpec When Converting UserMessage to AdvisedRequest #2612, spring.ai.chat.client.*.text span tags are not captured when using messages() #2631, MessageChatMemoryAdvisor and PromptChatMemoryAdvisor lack a way to pass custom Message metadata #2437, Relax text validation to support empty strings in multimodal scenarios #2284, UserMessage lost userParams after being processed by MessageChatMemoryAdvisor #2701
- Tools can be passed either via dedicated fields (toolNames) or in the more generic chatOptions. When both places contain toolNames, how should possible conflicts be handled? In some cases, the tools passed via chatOptions are completely overwritten by the ones passed via the dedicated APIs. Some issues registered about this: What's the difference between passing tools to the Prompt vs the ChatClient? #2530
- When both a systemText and a SystemMessage in messages are provided, it's not clear which one is used in the call to the chat model. Some issues registered about this: [ChatClient] Inconsistent handling of system messages #873, The Default System Prompt is not the first element in the messages array #2216, MistralAiChatModel returns an error if MessageChatMemoryAdvisor is used #2380
Furthermore, an AdvisedRequest contains a mutable ChatModel instance, which is very confusing. Consumers of an AdvisedRequest have the power to replace the ChatModel instance, but that operation would have no effect. Worse, it could lead to wrong information being provided downstream: if a subsequent consumer relies on that field to know which model is used by the chain of advisors, it could receive wrong information. It should probably be removed.
Similarly, an AdvisedRequest contains a mutable List<Advisor> which includes the list of advisors in the chain. Consumers of an AdvisedRequest have the power to replace or manipulate the List<Advisor> instance, but that operation would have no effect. Worse, it could lead to wrong information being provided downstream: if a subsequent consumer relies on that field to know which advisors are being executed in the chain, it could receive wrong information. It should probably be removed.
Finally, an AdvisedRequest contains both advisorParams and adviseContext fields. However, the first one is only used at the beginning of an advisor chain to populate the second field, and is never used again. It should probably be removed to avoid confusion and unpredictable behaviour.
AdvisedResponse
The AdvisedResponse API carries both an optional ChatResponse object returned from a chat model call and the context used throughout the chat client execution, including the advisor chain. This context includes useful information. For example, when performing a RAG operation, it contains the contextual documents retrieved by the vector store and used by the model for answering the user question. This context is currently hidden within the advisor chain, causing issues for users who would like to use that contextual information for several purposes, including evaluation and validation. There have been some suggestions to expand the ChatResponse to include this extra context, but that solution doesn't really scale and pollutes the lower-level Chat Model API. Instead, a better solution would be to return the context to the caller in a clean way.
Some issues registered about this: #1747
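As a rough sketch of what "returning the context to the caller in a clean way" could look like, the example below uses plain Java records. All type names (SketchMessage, SketchRequest, SketchResponse, ContextPropagationSketch) and the "retrieved_documents" key are assumptions made for illustration; they are not the Spring AI types, and the retrieval step is a hard-coded stand-in. The request carries one structured list of messages plus an immutable context, and the response hands that context back next to the answer.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical types for illustration only; not the Spring AI API.
record SketchMessage(String role, String text) {}

// The request holds a single structured prompt (a list of messages) and a
// context map. There is no separate userText/userParams to keep in sync.
record SketchRequest(List<SketchMessage> messages, Map<String, Object> context) {
    SketchRequest {
        messages = List.copyOf(messages); // defensive, immutable copies
        context = Map.copyOf(context);
    }

    // Returns a new request with one more context entry; the original is unchanged.
    SketchRequest withContext(String key, Object value) {
        Map<String, Object> copy = new HashMap<>(context);
        copy.put(key, value);
        return new SketchRequest(messages, copy);
    }
}

// The response exposes the answer together with the context built up in the chain.
record SketchResponse(String answer, Map<String, Object> context) {
    SketchResponse {
        context = Map.copyOf(context);
    }
}

final class ContextPropagationSketch {
    // Stand-in for a retrieval advisor: records which documents were used.
    static SketchRequest retrieve(SketchRequest request) {
        return request.withContext("retrieved_documents", List.of("handbook.pdf", "faq.md"));
    }

    // Stand-in for the whole chain: advise the request, "call the model",
    // and return the context alongside the answer.
    static SketchResponse call(SketchRequest request) {
        SketchRequest advised = retrieve(request);
        String answer = "answered " + advised.messages().size() + " message(s)";
        return new SketchResponse(answer, advised.context());
    }

    public static void main(String[] args) {
        SketchRequest request = new SketchRequest(
                List.of(new SketchMessage("system", "Be brief."),
                        new SketchMessage("user", "What is the leave policy?")),
                Map.of());
        SketchResponse response = call(request);
        System.out.println(response.answer());
        System.out.println(response.context().get("retrieved_documents"));
    }
}
```

The caller can then read the retrieved document names for evaluation or for UI hints without the lower-level chat response having to carry them.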
Observation
The observation context for the ChatClient operations is affected by the problems mentioned above regarding the AdvisedRequest. In particular, the information included in the observation context might be wrong or incomplete. For example, user messages, system messages, and tools are likely to be incomplete since we only consider one of the possible locations in the AdvisedRequest where they could be stored.
Also, the ChatClientObservationContext is strictly dependent on the DefaultChatClientRequestSpec, meaning that it only works with the default implementation of the ChatClient API.
Prompt Templating
The ChatClient API supports passing userParams and systemParams, which can be used to render the templates passed as userText and systemText. At a minimum, the templates are rendered right before calling the chat model, using a default PromptTemplate object. When using advisors, the rendering might be needed earlier. For example, it had to be included in QuestionAnswerAdvisor and RetrievalAugmentationAdvisor. That's part of the issues mentioned earlier in the context of AdvisedRequest.
The templating strategy is based on StringTemplate and cannot be changed. This creates problems in two main cases:
The ChatClient API should provide the possibility to customise the prompt template rendering strategy.
Some issues registered about this: #355, #1687, #2448, #2456, #1849, #1428, #2468, #2020
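To illustrate what a customisable rendering strategy could look like, here is a small sketch in plain Java. SketchTemplateRenderer, DelimiterRenderer, and TemplateRendererSketch are hypothetical names, and the renderer is a naive string substitution, not StringTemplate and not the Spring AI implementation. It assumes the default placeholders use curly braces, and shows how choosing different delimiters lets a prompt contain literal JSON braces.

```java
import java.util.Map;

// Hypothetical strategy interface for illustration only; not the Spring AI API.
interface SketchTemplateRenderer {
    String render(String template, Map<String, Object> variables);
}

// Naive renderer with configurable start and end delimiters.
// Limitation of this sketch: values are substituted one after another, so a
// value that itself contains a placeholder could be substituted again.
final class DelimiterRenderer implements SketchTemplateRenderer {
    private final String start;
    private final String end;

    DelimiterRenderer(String start, String end) {
        this.start = start;
        this.end = end;
    }

    @Override
    public String render(String template, Map<String, Object> variables) {
        String result = template;
        for (Map.Entry<String, Object> entry : variables.entrySet()) {
            String placeholder = start + entry.getKey() + end;
            result = result.replace(placeholder, String.valueOf(entry.getValue()));
        }
        return result;
    }
}

final class TemplateRendererSketch {
    static String demo() {
        // Angle-bracket delimiters leave the JSON braces in the prompt untouched.
        SketchTemplateRenderer renderer = new DelimiterRenderer("<", ">");
        return renderer.render(
                "Reply as JSON like {\"city\": \"...\"} for <place>.",
                Map.of("place", "Rome"));
    }

    public static void main(String[] args) {
        System.out.println(demo()); // prints: Reply as JSON like {"city": "..."} for Rome.
    }
}
```

Because the renderer is passed in as a strategy, a caller could swap in a different templating engine without the chat client needing to know about it.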