| 1 | +--- |
| 2 | +# These are optional elements. Feel free to remove any of them. |
| 3 | +status: accepted |
| 4 | +contact: westey-m |
| 5 | +date: 2025-04-17 |
| 6 | +deciders: westey-m, markwallace-microsoft, alliscode, TaoChenOSU, moonbox3, crickman |
| 7 | +consulted: westey-m, markwallace-microsoft, alliscode, TaoChenOSU, moonbox3, crickman |
| 8 | +informed: westey-m, markwallace-microsoft, alliscode, TaoChenOSU, moonbox3, crickman |
| 9 | +--- |
| 10 | + |
| 11 | +# Agents with Memory |
| 12 | + |
| 13 | +## What do we mean by Memory? |
| 14 | + |
| 15 | +By memory we mean the capability to remember information and skills that are learned during |
| 16 | +a conversation and reuse them later in the same conversation or in a subsequent conversation. |
| 17 | + |
| 18 | +## Context and Problem Statement |
| 19 | + |
| 20 | +Today we support multiple agent types with different characteristics: |
| 21 | + |
| 22 | +1. In-process vs. remote. |
| 23 | +2. Remote agents that store and maintain conversation state in the service vs those that require the caller to provide conversation state on each invocation. |
| 24 | + |
| 25 | +We need to support advanced memory capabilities across this range of agent types. |
| 26 | + |
| 27 | +### Memory Scope |
| 28 | + |
| 29 | +Another aspect of memory that is important to consider is the scope of different memory types. |
| 30 | +Most agent implementations have instructions and skills but the agent is not tied to a single conversation. |
| 31 | +On each invocation, the agent is told which conversation to participate in. |
| 32 | + |
| 33 | +Memories about a user or about a conversation with a user are therefore extracted from one of these conversations and recalled |
| 34 | +during the same or another conversation with the same user. |
| 35 | +These memories will typically contain information that the user would not like to share with other users of the system. |
| 36 | + |
| 37 | +Other types of memories also exist which are not tied to a specific user or conversation. |
| 38 | +E.g. an Agent may learn how to do something and be able to do that in many conversations with different users. |
| 39 | +With these types of memories there is of course a risk of leaking personal information between different users, which is important to guard against. |
| 40 | + |
| 41 | +### Packaging memory capabilities |
| 42 | + |
| 43 | +All of the above memory types can be supported for any agent by attaching software components to conversation threads. |
| 44 | +This is achieved via a simple mechanism of: |
| 45 | + |
| 46 | +1. Inspecting and using messages as they are passed to and from the agent. |
| 47 | +2. Passing additional context to the agent per invocation. |
| 48 | + |
| 49 | +With our current `AgentThread` implementation, when an agent is invoked, all input and output messages are already passed to the `AgentThread` |
| 50 | +and can be made available to any components attached to the `AgentThread`. |
| 51 | +Where agents are remote/external and manage conversation state in the service, passing the messages to the `AgentThread` may not have any |
| 52 | +effect on the thread in the service. This is OK, since the service will have already updated the thread during the remote invocation. |
| 53 | +It does, however, still allow us to subscribe to messages in any attached components. |
| 54 | + |
| 55 | +For the second requirement of getting additional context per invocation, the agent may ask the thread passed to it to, in turn, ask |
| 56 | +each of the components attached to it to provide context to pass to the agent. |
| 57 | +This enables the component to provide memories that it contains to the Agent as needed. |
| 58 | + |
| 59 | +Different memory capabilities can be built using separate components. Each component would have the following characteristics: |
| 60 | + |
| 61 | +1. May store some context that can be provided to the agent per invocation. |
| 62 | +2. May inspect messages from the conversation to learn from the conversation and build its context. |
| 63 | +3. May register plugins to allow the agent to directly store, retrieve, update or clear memories. |
| 64 | + |
| 65 | +### Suspend / Resume |
| 66 | + |
| 67 | +Building a service to host an agent comes with challenges. |
| 68 | +It's hard to build a stateful service, but service consumers expect an experience that looks stateful from the outside. |
| 69 | +E.g. on each invocation, the user expects that the service can continue a conversation they are having. |
| 70 | + |
| 71 | +This means that where the service is exposing a local agent with local conversation state management (e.g. via `ChatHistory`) |
| 72 | +that conversation state needs to be loaded and persisted for each invocation of the service. |
| 73 | + |
| 74 | +It also means that any memory components that may have some in-memory state will need to be loaded and persisted too. |
| 75 | + |
| 76 | +For cases like this, the `OnSuspendAsync` and `OnResumeAsync` methods notify the components that they need to save or reload their state. |
| 77 | +It is up to each of these components to decide how and where to save or load its state. |
| 78 | + |
| 79 | +## Proposed interface for Memory Components |
| 80 | + |
| 81 | +The types of events that Memory Components require are not unique to memory, and can be used to package up other capabilities too. |
| 82 | +The suggestion is therefore to create a more generally named type that can be used for other scenarios as well, |
| 83 | +including non-agent scenarios. |
| 84 | + |
| 85 | +This type should live in the `Microsoft.SemanticKernel.Abstractions` nuget, since these components can be used by systems other than just agents. |
| 86 | + |
| 87 | +```csharp |
| 88 | +namespace Microsoft.SemanticKernel; |
| 89 | + |
| 90 | +public abstract class AIContextBehavior |
| 91 | +{ |
| 92 | + public virtual IReadOnlyCollection<AIFunction> AIFunctions => Array.Empty<AIFunction>(); |
| 93 | + |
| 94 | + public virtual Task OnThreadCreatedAsync(string? threadId, CancellationToken cancellationToken = default) => Task.CompletedTask; |
| 95 | + public virtual Task OnThreadDeleteAsync(string? threadId, CancellationToken cancellationToken = default) => Task.CompletedTask; |
| 96 | + |
| 97 | + // OnThreadCheckpointAsync not included in initial release, maybe in future. |
| 98 | + public virtual Task OnThreadCheckpointAsync(string? threadId, CancellationToken cancellationToken = default) => Task.CompletedTask; |
| 99 | + |
| 100 | + public virtual Task OnNewMessageAsync(string? threadId, ChatMessage newMessage, CancellationToken cancellationToken = default) => Task.CompletedTask; |
| 101 | + public abstract Task<string> OnModelInvokeAsync(ICollection<ChatMessage> newMessages, CancellationToken cancellationToken = default); |
| 102 | + |
| 103 | + public virtual Task OnSuspendAsync(string? threadId, CancellationToken cancellationToken = default) => Task.CompletedTask; |
| 104 | + public virtual Task OnResumeAsync(string? threadId, CancellationToken cancellationToken = default) => Task.CompletedTask; |
| 105 | +} |
| 106 | +``` |
| 107 | + |
| 108 | +## Managing multiple components |
| 109 | + |
| 110 | +To manage multiple components I propose that we have an `AIContextBehaviorManager`. |
| 111 | +This class allows registering components and delegating new message notifications, AI invocation calls, etc. to the contained components. |
| 112 | + |
| 113 | +## Integrating with agents |
| 114 | + |
| 115 | +I propose to add an `AIContextBehaviorManager` to the `AgentThread` class, allowing us to attach components to any `AgentThread`. |
| 116 | + |
| 117 | +When an `Agent` is invoked, we will call `OnModelInvokeAsync` on each component via the `AIContextBehaviorManager` to get |
| 118 | +a combined set of context to pass to the agent for this invocation. This will be internal to the `Agent` class and transparent to the user. |
| 119 | + |
| 120 | +```csharp |
| 121 | +var additionalInstructions = await currentAgentThread.OnModelInvokeAsync(messages, cancellationToken).ConfigureAwait(false); |
| 122 | +``` |
| 123 | + |
| 124 | +## Usage examples |
| 125 | + |
| 126 | +### Multiple threads using the same memory component |
| 127 | + |
| 128 | +```csharp |
| 129 | +// Create a vector store for storing memories. |
| 130 | +var vectorStore = new InMemoryVectorStore(); |
| 131 | +// Create a memory store that is tied to a "Memories" collection in the vector store and stores memories under the "user/12345" namespace. |
| 132 | +using var textMemoryStore = new VectorDataTextMemoryStore<string>(vectorStore, textEmbeddingService, "Memories", "user/12345", 1536); |
| 133 | + |
| 134 | +// Create a memory component that will pull user facts from the conversation, store them in the vector store |
| 135 | +// and pass them to the agent as additional instructions. |
| 136 | +var userFacts = new UserFactsMemoryComponent(this.Fixture.Agent.Kernel, textMemoryStore); |
| 137 | + |
| 138 | +// Create a thread and attach a Memory Component. |
| 139 | +var agentThread1 = new ChatHistoryAgentThread(); |
| 140 | +agentThread1.ThreadExtensionsManager.Add(userFacts); |
| 141 | +var asyncResults1 = agent.InvokeAsync("Hello, my name is Caoimhe.", agentThread1); |
| 142 | + |
| 143 | +// Create a second thread and attach a Memory Component. |
| 144 | +var agentThread2 = new ChatHistoryAgentThread(); |
| 145 | +agentThread2.ThreadExtensionsManager.Add(userFacts); |
| 146 | +var asyncResults2 = agent.InvokeAsync("What is my name?", agentThread2); |
| 147 | +// Expected response contains Caoimhe. |
| 148 | +``` |
| 149 | + |
| 150 | +### Using a RAG component |
| 151 | + |
| 152 | +```csharp |
| 153 | +// Create Vector Store and Rag Store/Component |
| 154 | +var vectorStore = new InMemoryVectorStore(); |
| 155 | +using var ragStore = new TextRagStore<string>(vectorStore, textEmbeddingService, "Memories", 1536, "group/g2"); |
| 156 | +var ragComponent = new TextRagComponent(ragStore, new TextRagComponentOptions()); |
| 157 | + |
| 158 | +// Upsert docs into vector store. |
| 159 | +await ragStore.UpsertDocumentsAsync( |
| 160 | +[ |
| 161 | + new TextRagDocument("The financial results of Contoso Corp for 2023 is as follows:\nIncome EUR 174 000 000\nExpenses EUR 152 000 000") |
| 162 | + { |
| 163 | + SourceName = "Contoso 2023 Financial Report", |
| 164 | + SourceReference = "https://www.consoso.com/reports/2023.pdf", |
| 165 | + Namespaces = ["group/g2"] |
| 166 | + } |
| 167 | +]); |
| 168 | + |
| 169 | +// Create a new agent thread and register the Rag component |
| 170 | +var agentThread = new ChatHistoryAgentThread(); |
| 171 | +agentThread.ThreadExtensionsManager.RegisterThreadExtension(ragComponent); |
| 172 | + |
| 173 | +// Invoke the agent. |
| 174 | +var asyncResults1 = agent.InvokeAsync("What was the income of Contoso for 2023", agentThread); |
| 175 | +// Expected response contains the 174M income from the document. |
| 176 | +``` |
| 177 | + |
| 178 | +## Decisions to make |
| 179 | + |
| 180 | +### Extension base class name |
| 181 | + |
| 182 | +1. ConversationStateExtension |
| 183 | + |
| 184 | + 1.1. Long |
| 185 | + |
| 186 | +2. MemoryComponent |
| 187 | + |
| 188 | + 2.1. Too specific |
| 189 | + |
| 190 | +3. AIContextBehavior |
| 191 | + |
| 192 | +Decided: 3. AIContextBehavior. |
| 193 | + |
| 194 | +### Location for abstractions |
| 195 | + |
| 196 | +1. Microsoft.SemanticKernel.<baseclass> |
| 197 | +2. Microsoft.SemanticKernel.Memory.<baseclass> |
| 198 | +3. Microsoft.SemanticKernel.Memory.<baseclass> (in separate nuget) |
| 199 | + |
| 200 | +Decided: 1. Microsoft.SemanticKernel.<baseclass>. |
| 201 | + |
| 202 | +### Location for memory components |
| 203 | + |
| 204 | +1. A nuget for each component |
| 205 | +2. Microsoft.SemanticKernel.Core nuget |
| 206 | +3. Microsoft.SemanticKernel.Memory nuget |
| 207 | +4. Microsoft.SemanticKernel.ConversationStateExtensions nuget |
| 208 | + |
| 209 | +Decided: 2. Microsoft.SemanticKernel.Core nuget |