@@ -37,12 +41,12 @@ Advisors also participate in the Observability stack, so you can view metrics an
37
41
38
42
== Core Components
39
43
40
-
The API consists of `CallAroundAdvisor` and `CallAroundAdvisorChain` for non-streaming scenarios, and `StreamAroundAdvisor` and `StreamAroundAdvisorChain` for streaming scenarios.
41
-
It also includes `AdvisedRequest` to represent the unsealed Prompt request, `AdvisedResponse` for the Chat Completion response. Both hold an `advise-context` to share state across the advisor chain.
44
+
The API consists of `CallAdvisor` and `CallAdvisorChain` for non-streaming scenarios, and `StreamAdvisor` and `StreamAdvisorChain` for streaming scenarios.
45
+
It also includes `ChatClientRequest` to represent the unsealed Prompt request and `ChatClientResponse` to represent the Chat Completion response. Both hold an `advise-context` to share state across the advisor chain.
42
46
43
47
image::advisors-api-classes.jpg[Advisors API Classes, width=600, align="center"]
44
48
45
-
The `nextAroundCall()` and the `nextAroundStream()` are the key advisor methods, typically performing actions such as examining the unsealed Prompt data, customizing and augmenting the Prompt data, invoking the next entity in the advisor chain, optionally blocking the request, examining the chat completion response, and throwing exceptions to indicate processing errors.
49
+
The `adviseCall()` and `adviseStream()` methods are the key advisor methods. They typically perform actions such as examining the unsealed Prompt data, customizing and augmenting the Prompt data, invoking the next entity in the advisor chain, optionally blocking the request, examining the chat completion response, and throwing exceptions to indicate processing errors.
46
50
47
51
In addition, the `getOrder()` method determines the advisor's order in the chain, while `getName()` provides a unique advisor name.
48
52
@@ -52,14 +56,14 @@ The last advisor, added automatically, sends the request to the LLM.
52
56
53
57
The following flow diagram illustrates the interaction between the advisor chain and the Chat Model:
54
58
55
-
image::advisors-flow.jpg[Advisors API Flow, width=400, align="left"]
59
+
image::advisors-flow.jpg[Advisors API Flow, width=400, align="center"]
56
60
57
-
. The Spring AI framework creates an `AdvisedRequest` from user's `Prompt` along with an empty `AdvisorContext` object.
61
+
. The Spring AI framework creates a `ChatClientRequest` from the user's `Prompt` along with an empty advisor `context` object.
58
62
. Each advisor in the chain processes the request, potentially modifying it. Alternatively, it can choose to block the request by not making the call to invoke the next entity. In the latter case, the advisor is responsible for filling out the response.
59
63
. The final advisor, provided by the framework, sends the request to the `Chat Model`.
60
-
. The Chat Model's response is then passed back through the advisor chain and converted into `AdvisedResponse`. Later includes the shared `AdvisorContext` instance.
64
+
. The Chat Model's response is then passed back through the advisor chain and converted into a `ChatClientResponse`. The latter includes the shared advisor `context` instance.
61
65
. Each advisor can process or modify the response.
62
-
. The final `AdvisedResponse` is returned to the client by extracting the `ChatCompletion`.
66
+
. The final `ChatClientResponse` is returned to the client by extracting the `ChatCompletion`.
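
The flow above can be sketched in plain Java. This is a minimal illustration of the chain mechanics only, not the Spring AI implementation: `Request`, `Response`, `Advisor`, and `Chain` are hypothetical stand-ins for `ChatClientRequest`, `ChatClientResponse`, and the advisor and chain interfaces, and the "model" advisor simply echoes the prompt instead of calling a Chat Model.

```java
import java.util.Comparator;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.BiFunction;

public class AdvisorChainSketch {

    // Hypothetical stand-ins for ChatClientRequest and ChatClientResponse.
    record Request(String prompt, Map<String, Object> context) {}
    record Response(String text, Map<String, Object> context) {}

    // Hypothetical advisor: a name, an order, and the advice to apply.
    record Advisor(String name, int order, BiFunction<Request, Chain, Response> advice) {}

    static final class Chain {
        private final List<Advisor> advisors;
        private final int index;

        // Sorts once by order so that lower values run first.
        Chain(List<Advisor> advisors) {
            this(advisors.stream().sorted(Comparator.comparingInt(Advisor::order)).toList(), 0);
        }

        private Chain(List<Advisor> sorted, int index) {
            this.advisors = sorted;
            this.index = index;
        }

        // Hands the request to the next advisor, giving it the rest of the chain.
        Response next(Request request) {
            if (index >= advisors.size()) {
                throw new IllegalStateException("No advisors left in the chain");
            }
            return advisors.get(index).advice().apply(request, new Chain(advisors, index + 1));
        }
    }

    static Response demo(String prompt) {
        // Blocks the request by not calling chain.next(); it must fill in the response itself.
        Advisor guard = new Advisor("guard", -1, (req, chain) ->
                req.prompt().contains("forbidden")
                        ? new Response("blocked", req.context())
                        : chain.next(req));

        // Augments the prompt and records a flag in the shared context.
        Advisor augment = new Advisor("augment", 0, (req, chain) -> {
            req.context().put("augmented", true);
            return chain.next(new Request(req.prompt() + " Answer briefly.", req.context()));
        });

        // Stand-in for the framework-provided final advisor that calls the Chat Model.
        Advisor model = new Advisor("model", Integer.MAX_VALUE, (req, chain) ->
                new Response("echo: " + req.prompt(), req.context()));

        // Registration order does not matter; the chain sorts by order.
        Chain chain = new Chain(List.of(model, augment, guard));
        return chain.next(new Request(prompt, new HashMap<>()));
    }

    public static void main(String[] args) {
        System.out.println(demo("Hi.").text());
        System.out.println(demo("forbidden topic").text());
    }
}
```

In this sketch the guard runs first because it has the lowest order value, and when it blocks a request neither the augmenting advisor nor the model stand-in is invoked.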
63
67
64
68
=== Advisor Order
65
69
The execution order of advisors in the chain is determined by the `getOrder()` method. Key points to understand:
@@ -142,76 +146,86 @@ public interface Advisor extends Ordered {
142
146
The two sub-interfaces for synchronous and reactive Advisors are:
143
147
144
148
```java
145
-
public interface CallAroundAdvisor extends Advisor {
149
+
public interface CallAdvisor extends Advisor {
146
150
147
-
/**
148
-
* Around advice that wraps the ChatModel#call(Prompt) method.
* Returns the list of all the {@link StreamAdvisor} instances included in this chain
204
+
* at the time of its creation.
205
+
*/
206
+
List<StreamAdvisor> getStreamAdvisors();
192
207
193
208
}
194
209
```
195
210
196
211
197
-
198
212
== Implementing an Advisor
199
213
200
-
To create an advisor, implement either `CallAroundAdvisor` or `StreamAroundAdvisor` (or both). The key method to implement is `nextAroundCall()` for non-streaming or `nextAroundStream()` for streaming advisors.
214
+
To create an advisor, implement either `CallAdvisor` or `StreamAdvisor` (or both). The key method to implement is `adviseCall()` for non-streaming or `adviseStream()` for streaming advisors.
201
215
202
216
=== Examples
203
217
204
218
We provide a few hands-on examples to illustrate how to implement advisors for observing and augmenting use cases.
205
219
206
220
==== Logging Advisor
207
221
208
-
We can implement a simple logging advisor that logs the `AdvisedRequest` before and the `AdvisedResponse` after the call to the next advisor in the chain.
222
+
We can implement a simple logging advisor that logs the `ChatClientRequest` before and the `ChatClientResponse` after the call to the next advisor in the chain.
209
223
Note that the advisor only observes the request and response and does not modify them.
210
224
This implementation supports both non-streaming and streaming scenarios.
211
225
212
226
[source,java]
213
227
----
214
-
public class SimpleLoggerAdvisor implements CallAroundAdvisor, StreamAroundAdvisor {
228
+
public class SimpleLoggerAdvisor implements CallAdvisor, StreamAdvisor {
215
229
216
230
private static final Logger logger = LoggerFactory.getLogger(SimpleLoggerAdvisor.class);
217
231
@@ -225,33 +239,41 @@ public class SimpleLoggerAdvisor implements CallAroundAdvisor, StreamAroundAdvis
225
239
return 0;
226
240
}
227
241
228
-
@Override
229
-
public AdvisedResponse aroundCall(AdvisedRequest advisedRequest, CallAroundAdvisorChain chain) {
230
242
231
-
logger.debug("BEFORE: {}", advisedRequest);
243
+
@Override
244
+
public ChatClientResponse adviseCall(ChatClientRequest chatClientRequest, CallAdvisorChain callAdvisorChain) {
public ChatClientResponse after(ChatClientResponse chatClientResponse, AdvisorChain advisorChain) {
329
+
return chatClientResponse;
297
330
}
298
331
299
332
@Override
300
-
public int getOrder() { // <4>
301
-
return 0;
333
+
public int getOrder() { // <2>
334
+
return this.order;
302
335
}
303
336
304
-
@Override
305
-
public String getName() { // <5>
306
-
return this.getClass().getSimpleName();
337
+
public ReReadingAdvisor withOrder(int order) {
338
+
this.order = order;
339
+
return this;
307
340
}
341
+
308
342
}
309
343
----
310
344
<1> The `before` method augments the user's input query applying the Re-Reading technique.
311
-
<2> The `aroundCall` method intercepts the non-streaming request and applies the Re-Reading technique.
312
-
<3> The `aroundStream` method intercepts the streaming request and applies the Re-Reading technique.
313
-
<4> You can control the order of execution by setting the order value. Lower values execute first.
314
-
<5> Provides a unique name for the advisor.
345
+
<2> You can control the order of execution by setting the order value. Lower values execute first.
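
To make the Re-Reading augmentation concrete, here is a minimal sketch of the prompt transformation on its own, outside any advisor. The class and method names are hypothetical, and the exact wording of the template is an assumption based on the RE2 technique (repeat the query after a "Read the question again:" cue); the advisor's actual template may differ.

```java
public class ReReadingSketch {

    // Assumed RE2-style template: the query, then a cue, then the query again.
    static String reRead(String query) {
        return query + "\nRead the question again: " + query;
    }

    public static void main(String[] args) {
        System.out.println(reRead("What is the capital of France?"));
    }
}
```

An advisor applying this technique would run the transformation on the user text of the request before passing the request to the next entity in the chain.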
346
+
315
347
316
348
==== Spring AI Built-in Advisors
317
349
@@ -335,7 +367,19 @@ Retrieves memory from a VectorStore and adds it into the prompt's system text. T
335
367
===== Question Answering Advisor
336
368
* `QuestionAnswerAdvisor`
337
369
+
338
-
This advisor uses a vector store to provide question-answering capabilities, implementing the RAG (Retrieval-Augmented Generation) pattern.
370
+
This advisor uses a vector store to provide question-answering capabilities, implementing the Naive RAG (Retrieval-Augmented Generation) pattern.
371
+
372
+
* `RetrievalAugmentationAdvisor`
373
+
+
374
+
Advisor that implements common Retrieval Augmented Generation (RAG) flows using the building blocks defined in the `org.springframework.ai.rag` package and following the Modular RAG Architecture.
375
+
376
+
377
+
===== Reasoning Advisor
378
+
* `ReReadingAdvisor`
379
+
+
380
+
Implements a re-reading strategy for LLM reasoning, dubbed RE2, to enhance understanding in the input phase.
381
+
Based on the article: https://arxiv.org/pdf/2309.06275[Re-Reading Improves Reasoning in LLMs].
382
+
339
383
340
384
===== Content Safety Advisor
341
385
* `SafeGuardAdvisor`
@@ -345,7 +389,7 @@ A simple advisor designed to prevent the model from generating harmful or inappr
345
389
346
390
=== Streaming vs Non-Streaming
347
391
348
-
image::advisors-non-stream-vs-stream.jpg[Advisors Streaming vs Non-Streaming Flow, width=800, align="left"]
392
+
image::advisors-non-stream-vs-stream.jpg[Advisors Streaming vs Non-Streaming Flow, width=800, align="center"]
349
393
350
394
* Non-streaming advisors work with complete requests and responses.
351
395
* Streaming advisors handle requests and responses as continuous streams, using reactive programming concepts (e.g., Flux for responses).
@@ -356,15 +400,15 @@ image::advisors-non-stream-vs-stream.jpg[Advisors Streaming vs Non-Streaming Flo
356
400
[source,java]
357
401
----
358
402
@Override
359
-
public Flux<AdvisedResponse> aroundStream(AdvisedRequest advisedRequest, StreamAroundAdvisorChain chain) {
403
+
public Flux<ChatClientResponse> adviseStream(ChatClientRequest chatClientRequest, StreamAdvisorChain chain) {
360
404
361
-
return Mono.just(advisedRequest)
405
+
return Mono.just(chatClientRequest)
362
406
.publishOn(Schedulers.boundedElastic())
363
407
.map(request -> {
364
408
// This can be executed by blocking and non-blocking Threads.