Commit 9ada3aa

westey-m, SergeyMenshykh, and rogerbarreto authored
.Net: Add AIContextProvider support to Semantic Kernel (#11689)
### Motivation and Context

Adding support for AI context behaviors to Semantic Kernel. These allow creating plugins that are able to listen to messages being added to the chat history and contribute to the AI context on each invocation. This PR also integrates support for these with Agents and AgentThreads.

#10100 #10712

---------

Co-authored-by: SergeyMenshykh <[email protected]>
Co-authored-by: Roger Barreto <[email protected]>
1 parent 7243188 commit 9ada3aa

68 files changed, +4821 -55 lines changed

Lines changed: 209 additions & 0 deletions
@@ -0,0 +1,209 @@
1+
---
2+
# These are optional elements. Feel free to remove any of them.
3+
status: accepted
4+
contact: westey-m
5+
date: 2025-04-17
6+
deciders: westey-m, markwallace-microsoft, alliscode, TaoChenOSU, moonbox3, crickman
7+
consulted: westey-m, markwallace-microsoft, alliscode, TaoChenOSU, moonbox3, crickman
8+
informed: westey-m, markwallace-microsoft, alliscode, TaoChenOSU, moonbox3, crickman
9+
---
10+
11+
# Agents with Memory
12+
13+
## What do we mean by Memory?
14+
15+
By memory we mean the capability to remember information and skills that are learned during
16+
a conversation and re-use those later in the same conversation or later in a subsequent conversation.
17+
18+
## Context and Problem Statement
19+
20+
Today we support multiple agent types with different characteristics:
21+
22+
1. In process vs remote.
23+
2. Remote agents that store and maintain conversation state in the service vs those that require the caller to provide conversation state on each invocation.
24+
25+
We need to support advanced memory capabilities across this range of agent types.
26+
27+
### Memory Scope
28+
29+
Another aspect of memory that is important to consider is the scope of different memory types.
30+
Most agent implementations have instructions and skills but the agent is not tied to a single conversation.
31+
On each invocation of the agent, the agent is told which conversation to participate in, during that invocation.
32+
33+
Memories about a user or about a conversation with a user are therefore extracted from one of these conversations and recalled
34+
during the same or another conversation with the same user.
35+
These memories will typically contain information that the user would not like to share with other users of the system.
36+
37+
Other types of memories also exist which are not tied to a specific user or conversation.
38+
E.g. an Agent may learn how to do something and be able to do that in many conversations with different users.
39+
With these types of memories there is, of course, a risk of leaking personal information between different users, which is important to guard against.
40+
41+
### Packaging memory capabilities
42+
43+
All of the above memory types can be supported for any agent by attaching software components to conversation threads.
44+
This is achieved via a simple mechanism of:
45+
46+
1. Inspecting and using messages as they are passed to and from the agent.
47+
2. Passing additional context to the agent per invocation.
48+
49+
With our current `AgentThread` implementation, when an agent is invoked, all input and output messages are already passed to the `AgentThread`
50+
and can be made available to any components attached to the `AgentThread`.
51+
Where agents are remote/external and manage conversation state in the service, passing the messages to the `AgentThread` may not have any
52+
effect on the thread in the service. This is OK, since the service will have already updated the thread during the remote invocation.
53+
It does, however, still allow us to subscribe to messages in any attached components.
54+
55+
For the second requirement of getting additional context per invocation, the agent may ask the thread passed to it, to in turn ask
56+
each of the components attached to it, to provide context to pass to the Agent.
57+
This enables the component to provide memories that it contains to the Agent as needed.
58+
59+
Different memory capabilities can be built using separate components. Each component would have the following characteristics:
60+
61+
1. May store some context that can be provided to the agent per invocation.
62+
2. May inspect messages from the conversation to learn from the conversation and build its context.
63+
3. May register plugins to allow the agent to directly store, retrieve, update or clear memories.
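
The three characteristics above can be sketched in a small, self-contained example. This is an illustration only: `SimpleMessage`, `SketchContextBehavior` and `UserNameMemory` are hypothetical stand-ins invented here so that the block compiles without any Semantic Kernel packages, and the name extraction is deliberately naive.

```csharp
using System;
using System.Collections.Generic;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical stand-in for a chat message; not the real Semantic Kernel type.
public sealed record SimpleMessage(string Role, string Text);

// Hypothetical, simplified stand-in for the component base class discussed in this document.
public abstract class SketchContextBehavior
{
    // Characteristic 2: inspect messages as they are added to the conversation.
    public virtual Task OnNewMessageAsync(string? threadId, SimpleMessage message, CancellationToken cancellationToken = default)
        => Task.CompletedTask;

    // Characteristic 1: provide context to the agent for the current invocation.
    public abstract Task<string> OnModelInvokeAsync(ICollection<SimpleMessage> newMessages, CancellationToken cancellationToken = default);
}

// Hypothetical component that remembers the user's name from the conversation.
public sealed class UserNameMemory : SketchContextBehavior
{
    private const string Marker = "my name is ";
    private string? _userName;

    public override Task OnNewMessageAsync(string? threadId, SimpleMessage message, CancellationToken cancellationToken = default)
    {
        if (message.Role == "user")
        {
            int index = message.Text.IndexOf(Marker, StringComparison.OrdinalIgnoreCase);
            if (index >= 0)
            {
                // Keep everything after the marker, minus trailing punctuation.
                _userName = message.Text[(index + Marker.Length)..].Trim(' ', '.', '!', '?');
            }
        }

        return Task.CompletedTask;
    }

    public override Task<string> OnModelInvokeAsync(ICollection<SimpleMessage> newMessages, CancellationToken cancellationToken = default)
        => Task.FromResult(_userName is null ? string.Empty : $"The user's name is {_userName}.");

    // Characteristic 3: a method that could be registered as a plugin function
    // so that the agent can clear the memory directly.
    public void ClearMemory() => _userName = null;
}
```

Because the same `UserNameMemory` instance can be attached to more than one thread, a name learned in one conversation would be offered as context in another, which is the behaviour the usage examples later in this document rely on.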
64+
65+
### Suspend / Resume
66+
67+
Building a service to host an agent comes with challenges.
68+
It's hard to build a stateful service, but service consumers expect an experience that looks stateful from the outside.
69+
E.g. on each invocation, the user expects that the service can continue a conversation they are having.
70+
71+
This means that where the service is exposing a local agent with local conversation state management (e.g. via `ChatHistory`)
72+
that conversation state needs to be loaded and persisted for each invocation of the service.
73+
74+
It also means that any memory components that may have some in-memory state will need to be loaded and persisted too.
75+
76+
For cases like this, the `OnSuspendAsync` and `OnResumeAsync` methods allow notification of the components that they need to save or reload their state.
77+
It is up to each of these components to decide how and where to save state to or load state from.
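
A minimal sketch of how a component might handle suspend and resume is shown below. It assumes a hypothetical `PersistableFactsMemory` class, and it uses a plain dictionary as a stand-in for whatever durable store (database, blob storage, file) a real component would choose. Only the method names `OnSuspendAsync` and `OnResumeAsync` come from the interface proposed in this document.

```csharp
using System;
using System.Collections.Generic;
using System.Text.Json;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical component that keeps facts in memory and persists them on suspend.
// The storage dictionary stands in for a real durable store.
public sealed class PersistableFactsMemory
{
    private readonly IDictionary<string, string> _storage;
    private List<string> _facts = new();

    public PersistableFactsMemory(IDictionary<string, string> storage) => _storage = storage;

    public IReadOnlyList<string> Facts => _facts;

    public void AddFact(string fact) => _facts.Add(fact);

    // Called before the hosting service lets go of the thread: save in-memory state.
    public Task OnSuspendAsync(string? threadId, CancellationToken cancellationToken = default)
    {
        _storage[KeyFor(threadId)] = JsonSerializer.Serialize(_facts);
        return Task.CompletedTask;
    }

    // Called when the hosting service picks the thread up again: reload saved state, if any.
    public Task OnResumeAsync(string? threadId, CancellationToken cancellationToken = default)
    {
        if (_storage.TryGetValue(KeyFor(threadId), out string? json))
        {
            _facts = JsonSerializer.Deserialize<List<string>>(json) ?? new List<string>();
        }

        return Task.CompletedTask;
    }

    // Storage key derived from the thread id (hypothetical naming scheme).
    private static string KeyFor(string? threadId) => "facts/" + (threadId ?? "no-thread");
}
```

In a hosted service, a request handler would create the component, call `OnResumeAsync` with the thread id, invoke the agent, and then call `OnSuspendAsync` before returning, so that no state needs to live in the service process between requests.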
78+
79+
## Proposed interface for Memory Components
80+
81+
The types of events that Memory Components require are not unique to memory, and can be used to package up other capabilities too.
82+
The suggestion is therefore to create a more generally named type that can be used for other scenarios as well and can even
83+
be used for non-agent scenarios too.
84+
85+
This type should live in the `Microsoft.SemanticKernel.Abstractions` nuget, since these components can be used by systems other than just agents.
86+
87+
```csharp
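using System;
using System.Collections.Generic;
using System.Threading;
using System.Threading.Tasks;
using Microsoft.Extensions.AI; // Assumed source of AIFunction and ChatMessage.

namespace Microsoft.SemanticKernel;

public abstract class AIContextBehavior
{
    // Functions that this behavior makes available to the model; none by default.
    public virtual IReadOnlyCollection<AIFunction> AIFunctions => Array.Empty<AIFunction>();

    // Thread lifecycle notifications; the default implementations do nothing.
    public virtual Task OnThreadCreatedAsync(string? threadId, CancellationToken cancellationToken = default) => Task.CompletedTask;
    public virtual Task OnThreadDeleteAsync(string? threadId, CancellationToken cancellationToken = default) => Task.CompletedTask;

    // OnThreadCheckpointAsync not included in initial release, maybe in future.
    public virtual Task OnThreadCheckpointAsync(string? threadId, CancellationToken cancellationToken = default) => Task.CompletedTask;

    // Called for each message added to the conversation.
    public virtual Task OnNewMessageAsync(string? threadId, ChatMessage newMessage, CancellationToken cancellationToken = default) => Task.CompletedTask;
    // Called before the model is invoked; returns the context to pass to it.
    public abstract Task<string> OnModelInvokeAsync(ICollection<ChatMessage> newMessages, CancellationToken cancellationToken = default);

    // Called when the component should save or reload its state.
    public virtual Task OnSuspendAsync(string? threadId, CancellationToken cancellationToken = default) => Task.CompletedTask;
    public virtual Task OnResumeAsync(string? threadId, CancellationToken cancellationToken = default) => Task.CompletedTask;
}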
```
107+
108+
## Managing multiple components
109+
110+
To manage multiple components I propose that we have an `AIContextBehaviorManager`.
111+
This class allows registering components and delegating new message notifications, AI invocation calls, etc. to the contained components.
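
The delegation described here could look roughly like the following sketch. All type names (`ISketchContextComponent`, `SketchContextManager`, `FixedContextComponent`) are hypothetical and simplified, with messages reduced to plain strings. Joining the per-component context with new lines is an assumption made for illustration, not a statement about how the real manager combines results.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical, simplified component contract (messages reduced to strings).
public interface ISketchContextComponent
{
    Task OnNewMessageAsync(string? threadId, string message, CancellationToken cancellationToken = default);
    Task<string> OnModelInvokeAsync(IReadOnlyCollection<string> newMessages, CancellationToken cancellationToken = default);
}

// Hypothetical manager that registers components and delegates calls to them.
public sealed class SketchContextManager
{
    private readonly List<ISketchContextComponent> _components = new();

    public void Add(ISketchContextComponent component) => _components.Add(component);

    // Fan a new-message notification out to every registered component.
    public Task OnNewMessageAsync(string? threadId, string message, CancellationToken cancellationToken = default)
        => Task.WhenAll(_components.Select(c => c.OnNewMessageAsync(threadId, message, cancellationToken)));

    // Ask each component for context and join the non-empty results in registration order.
    public async Task<string> OnModelInvokeAsync(IReadOnlyCollection<string> newMessages, CancellationToken cancellationToken = default)
    {
        string[] parts = await Task.WhenAll(_components.Select(c => c.OnModelInvokeAsync(newMessages, cancellationToken)));
        return string.Join(Environment.NewLine, parts.Where(p => !string.IsNullOrWhiteSpace(p)));
    }
}

// Trivial component that always returns the same context; used only to exercise the manager.
public sealed class FixedContextComponent : ISketchContextComponent
{
    private readonly string _context;

    public FixedContextComponent(string context) => _context = context;

    public Task OnNewMessageAsync(string? threadId, string message, CancellationToken cancellationToken = default)
        => Task.CompletedTask;

    public Task<string> OnModelInvokeAsync(IReadOnlyCollection<string> newMessages, CancellationToken cancellationToken = default)
        => Task.FromResult(_context);
}
```

With this shape, the caller only ever talks to the manager, so adding or removing a component does not change the calling code.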
112+
113+
## Integrating with agents
114+
115+
I propose to add an `AIContextBehaviorManager` to the `AgentThread` class, allowing us to attach components to any `AgentThread`.
116+
117+
When an `Agent` is invoked, we will call `OnModelInvokeAsync` on each component via the `AIContextBehaviorManager` to get
118+
a combined set of context to pass to the agent for this invocation. This will be internal to the `Agent` class and transparent to the user.
119+
120+
```csharp
121+
var additionalInstructions = await currentAgentThread.OnModelInvokeAsync(messages, cancellationToken).ConfigureAwait(false);
122+
```
123+
124+
## Usage examples
125+
126+
### Multiple threads using the same memory component
127+
128+
```csharp
129+
// Create a vector store for storing memories.
130+
var vectorStore = new InMemoryVectorStore();
131+
// Create a memory store that is tied to a "Memories" collection in the vector store and stores memories under the "user/12345" namespace.
132+
using var textMemoryStore = new VectorDataTextMemoryStore<string>(vectorStore, textEmbeddingService, "Memories", "user/12345", 1536);
133+
134+
// Create a memory component that will pull user facts from the conversation, store them in the vector store
135+
// and pass them to the agent as additional instructions.
136+
var userFacts = new UserFactsMemoryComponent(this.Fixture.Agent.Kernel, textMemoryStore);
137+
138+
// Create a thread and attach a Memory Component.
139+
var agentThread1 = new ChatHistoryAgentThread();
140+
agentThread1.ThreadExtensionsManager.Add(userFacts);
141+
var asyncResults1 = agent.InvokeAsync("Hello, my name is Caoimhe.", agentThread1);
142+
143+
// Create a second thread and attach a Memory Component.
144+
var agentThread2 = new ChatHistoryAgentThread();
145+
agentThread2.ThreadExtensionsManager.Add(userFacts);
146+
var asyncResults2 = agent.InvokeAsync("What is my name?", agentThread2);
147+
// Expected response contains Caoimhe.
148+
```
149+
150+
### Using a RAG component
151+
152+
```csharp
153+
// Create Vector Store and Rag Store/Component
154+
var vectorStore = new InMemoryVectorStore();
155+
using var ragStore = new TextRagStore<string>(vectorStore, textEmbeddingService, "Memories", 1536, "group/g2");
156+
var ragComponent = new TextRagComponent(ragStore, new TextRagComponentOptions());
157+
158+
// Upsert docs into vector store.
159+
await ragStore.UpsertDocumentsAsync(
160+
[
161+
new TextRagDocument("The financial results of Contoso Corp for 2023 is as follows:\nIncome EUR 174 000 000\nExpenses EUR 152 000 000")
162+
{
163+
SourceName = "Contoso 2023 Financial Report",
164+
SourceReference = "https://www.consoso.com/reports/2023.pdf",
165+
Namespaces = ["group/g2"]
166+
}
167+
]);
168+
169+
// Create a new agent thread and register the Rag component
170+
var agentThread = new ChatHistoryAgentThread();
171+
agentThread.ThreadExtensionsManager.RegisterThreadExtension(ragComponent);
172+
173+
// Invoke the agent.
174+
var asyncResults1 = agent.InvokeAsync("What was the income of Contoso for 2023", agentThread);
175+
// Expected response contains the 174M income from the document.
176+
```
177+
178+
## Decisions to make
179+
180+
### Extension base class name
181+
182+
1. ConversationStateExtension
183+
184+
1.1. Long
185+
186+
2. MemoryComponent
187+
188+
2.1. Too specific
189+
190+
3. AIContextBehavior
191+
192+
Decided 3. AIContextBehavior.
193+
194+
### Location for abstractions
195+
196+
1. Microsoft.SemanticKernel.<baseclass>
197+
2. Microsoft.SemanticKernel.Memory.<baseclass>
198+
3. Microsoft.SemanticKernel.Memory.<baseclass> (in separate nuget)
199+
200+
Decided: 1. Microsoft.SemanticKernel.<baseclass>.
201+
202+
### Location for memory components
203+
204+
1. A nuget for each component
205+
2. Microsoft.SemanticKernel.Core nuget
206+
3. Microsoft.SemanticKernel.Memory nuget
207+
4. Microsoft.SemanticKernel.ConversationStateExtensions nuget
208+
209+
Decided: 2. Microsoft.SemanticKernel.Core nuget

dotnet/docs/EXPERIMENTS.md

Lines changed: 1 addition & 0 deletions
@@ -25,6 +25,7 @@ You can use the following diagnostic IDs to ignore warnings or errors for a part
 | SKEXP0100 | Advanced Semantic Kernel features |
 | SKEXP0110 | Semantic Kernel Agents |
 | SKEXP0120 | Native-AOT |
+| SKEXP0130 | AI Context Providers |
 | MEVD9000 | Microsoft.Extensions.VectorData experimental user-facing APIs |
 | MEVD9001 | Microsoft.Extensions.VectorData experimental connector-facing APIs |

Lines changed: 119 additions & 0 deletions
@@ -0,0 +1,119 @@
1+
// Copyright (c) Microsoft. All rights reserved.
2+
3+
using System.Net.Http.Headers;
4+
using Microsoft.SemanticKernel;
5+
using Microsoft.SemanticKernel.Agents;
6+
using Microsoft.SemanticKernel.Memory;
7+
8+
namespace Agents;
9+
10+
#pragma warning disable SKEXP0130 // Type is for evaluation purposes only and is subject to change or removal in future updates. Suppress this diagnostic to proceed.
11+
12+
/// <summary>
13+
/// Demonstrate creation of <see cref="ChatCompletionAgent"/> and
14+
/// adding memory capabilities to it using https://mem0.ai/.
15+
/// </summary>
16+
public class ChatCompletion_Mem0(ITestOutputHelper output) : BaseTest(output)
17+
{
18+
private const string AgentName = "FriendlyAssistant";
19+
private const string AgentInstructions = "You are a friendly assistant";
20+
21+
/// <summary>
22+
/// Shows how to allow an agent to remember user preferences across multiple threads.
23+
/// </summary>
24+
[Fact]
25+
private async Task UseMemoryAsync()
26+
{
27+
// Create a new HttpClient with the base address of the mem0 service and a token for authentication.
28+
using var httpClient = new HttpClient()
29+
{
30+
BaseAddress = new Uri(TestConfiguration.Mem0.BaseAddress ?? "https://api.mem0.ai")
31+
};
32+
httpClient.DefaultRequestHeaders.Authorization = new AuthenticationHeaderValue("Token", TestConfiguration.Mem0.ApiKey);
33+
34+
// Create a mem0 component with the current user's id, so that it stores memories for that user.
35+
var mem0Provider = new Mem0Provider(httpClient, new()
36+
{
37+
UserId = "U1"
38+
});
39+
40+
// Clear out any memories from previous runs, if any, to demonstrate a first run experience.
41+
await mem0Provider.ClearStoredMemoriesAsync();
42+
43+
// Create our agent and add our finance plugin with auto function invocation.
44+
Kernel kernel = this.CreateKernelWithChatCompletion();
45+
kernel.Plugins.AddFromType<FinancePlugin>();
46+
ChatCompletionAgent agent =
47+
new()
48+
{
49+
Name = AgentName,
50+
Instructions = AgentInstructions,
51+
Kernel = kernel,
52+
Arguments = new KernelArguments(new PromptExecutionSettings { FunctionChoiceBehavior = FunctionChoiceBehavior.Auto() })
53+
};
54+
55+
Console.WriteLine("----- First Conversation -----");
56+
57+
// Create a thread for the agent and add the mem0 component to it.
58+
ChatHistoryAgentThread agentThread = new();
59+
agentThread.AIContextProviders.Add(mem0Provider);
60+
61+
// First ask the agent to retrieve a company report with no previous context.
62+
// The agent will not be able to invoke the plugin, since it doesn't know
63+
// the company code or the report format, so it should ask for clarification.
64+
string userMessage = "Please retrieve my company report";
65+
Console.WriteLine($"User: {userMessage}");
66+
67+
ChatMessageContent message = await agent.InvokeAsync(userMessage, agentThread).FirstAsync();
68+
Console.WriteLine($"Assistant:\n{message.Content}");
69+
70+
// Now tell the agent the company code and the report format that you want to use
71+
// and it should be able to invoke the plugin and return the report.
72+
userMessage = "I always work with CNTS and I always want a detailed report format";
73+
Console.WriteLine($"User: {userMessage}");
74+
75+
message = await agent.InvokeAsync(userMessage, agentThread).FirstAsync();
76+
Console.WriteLine($"Assistant:\n{message.Content}");
77+
78+
Console.WriteLine("----- Second Conversation -----");
79+
80+
// Create a new thread for the agent and add our mem0 component to it again
81+
// The new thread has no context of the previous conversation.
82+
agentThread = new();
83+
agentThread.AIContextProviders.Add(mem0Provider);
84+
85+
// Since we have the mem0 component in the thread, the agent should be able to
86+
// retrieve the company report without asking for clarification, as it will
87+
// be able to remember the user preferences from the last thread.
88+
userMessage = "Please retrieve my company report";
89+
Console.WriteLine($"User: {userMessage}");
90+
91+
message = await agent.InvokeAsync(userMessage, agentThread).FirstAsync();
92+
Console.WriteLine($"Assistant:\n{message.Content}");
93+
}
94+
95+
private sealed class FinancePlugin
96+
{
97+
[KernelFunction]
98+
public string RetrieveCompanyReport(string companyCode, ReportFormat reportFormat)
99+
{
100+
if (companyCode != "CNTS")
101+
{
102+
throw new ArgumentException("Company code not found");
103+
}
104+
105+
return reportFormat switch
106+
{
107+
ReportFormat.Brief => "CNTS is a company that specializes in technology.",
108+
ReportFormat.Detailed => "CNTS is a company that specializes in technology. It had a revenue of $10 million in 2022. It has 100 employees.",
109+
_ => throw new ArgumentException("Report format not found")
110+
};
111+
}
112+
}
113+
114+
private enum ReportFormat
115+
{
116+
Brief,
117+
Detailed
118+
}
119+
}
