Commit f9d5fd4

AssistantMessage map, structured and docs
1 parent f637db8 commit f9d5fd4

8 files changed: +327 -7 lines changed

Diff for: docs/providers/anthropic.md

+98 -2
Original file line numberDiff line numberDiff line change
@@ -49,21 +49,25 @@ If you prefer, you can use the `AnthropicCacheType` Enum like so:
4949
use EchoLabs\Prism\Enums\Provider;
5050
use EchoLabs\Prism\Providers\Anthropic\Enums\AnthropicCacheType;
5151
use EchoLabs\Prism\ValueObjects\Messages\UserMessage;
52+
use EchoLabs\Prism\ValueObjects\Messages\Support\Document;
5253

5354
(new UserMessage('I am a long re-usable user message.'))->withProviderMeta(Provider::Anthropic, ['cacheType' => AnthropicCacheType::ephemeral])
5455
```
5556
Note that you must use the `withMessages()` method in order to enable prompt caching, rather than `withPrompt()` or `withSystemPrompt()`.
5657

5758
Please ensure you read Anthropic's [prompt caching documentation](https://docs.anthropic.com/en/docs/build-with-claude/prompt-caching), which covers some important information on e.g. minimum cacheable tokens and message order consistency.
5859

59-
### PDF Support
60+
## Documents
61+
62+
### PDF Documents
6063

6164
Prism supports Anthropic PDF processing on UserMessages via the `$additionalContent` parameter:
6265

6366
```php
6467
use EchoLabs\Prism\Enums\Provider;
6568
use EchoLabs\Prism\Prism;
6669
use EchoLabs\Prism\ValueObjects\Messages\UserMessage;
70+
use EchoLabs\Prism\ValueObjects\Messages\Support\Document;
6771

6872
Prism::text()
6973
->using(Provider::Anthropic, 'claude-3-5-sonnet-20241022')
@@ -80,14 +84,15 @@ Prism::text()
8084
```
8185
Anthropic use vision to process PDFs, and consequently there are some limitations detailed in their [feature documentation](https://docs.anthropic.com/en/docs/build-with-claude/pdf-support).
8286

83-
### Txt and md Document Support
87+
### Txt and md Documents
8488

8589
Prism supports txt/md documents on UserMessages via the `$additionalContent` parameter:
8690

8791
```php
8892
use EchoLabs\Prism\Enums\Provider;
8993
use EchoLabs\Prism\Prism;
9094
use EchoLabs\Prism\ValueObjects\Messages\UserMessage;
95+
use EchoLabs\Prism\ValueObjects\Messages\Support\Document;
9196

9297
Prism::text()
9398
->using(Provider::Anthropic, 'claude-3-5-sonnet-20241022')
@@ -103,6 +108,97 @@ Prism::text()
103108

104109
```
105110

111+
### Custom content documents
112+
113+
Prism supports Anthropic's "custom content documents", which are primarily intended for use with citations (see below) when you need citations to reference your own chunking strategy.
114+
115+
```php
116+
use EchoLabs\Prism\Enums\Provider;
117+
use EchoLabs\Prism\Prism;
118+
use EchoLabs\Prism\ValueObjects\Messages\UserMessage;
119+
use EchoLabs\Prism\ValueObjects\Messages\Support\Document;
120+
121+
Prism::text()
122+
->using(Provider::Anthropic, 'claude-3-5-sonnet-20241022')
123+
->withMessages([
124+
new UserMessage(
125+
content: "Is the grass green and the sky blue?",
126+
additionalContent: [
127+
Document::fromChunks(["The grass is green.", "Flamingos are pink.", "The sky is blue."])
128+
]
129+
)
130+
])
131+
->generate();
132+
```
133+
134+
## Citations
135+
136+
Prism supports [Anthropic's citations feature](https://docs.anthropic.com/en/docs/build-with-claude/citations) for both text and structured output.
137+
138+
Please note, however, that because Anthropic does not support "native" structured output and Prism relies on a workaround, structured output can be unreliable. You should therefore implement proper error handling for the case where Anthropic does not return valid, decodable JSON that matches your schema.
139+
140+
### Enabling citations
141+
142+
Anthropic require citations to be enabled on all documents in a request. To enable them, use the `withProviderMeta()` method when building your request:
143+
144+
```php
145+
use EchoLabs\Prism\Enums\Provider;
146+
use EchoLabs\Prism\Prism;
147+
use EchoLabs\Prism\ValueObjects\Messages\UserMessage;
148+
use EchoLabs\Prism\ValueObjects\Messages\Support\Document;
149+
150+
$response = Prism::text()
151+
->using(Provider::Anthropic, 'claude-3-5-sonnet-20241022')
152+
->withMessages([
153+
new UserMessage(
154+
content: "Is the grass green and the sky blue?",
155+
additionalContent: [
156+
Document::fromChunks(["The grass is green.", "Flamingos are pink.", "The sky is blue."])
157+
]
158+
)
159+
])
160+
->withProviderMeta(Provider::Anthropic, ['citations' => true])
161+
->generate();
162+
```
163+
164+
### Accessing citations
165+
166+
You can access the chunked output with its citations via the `additionalContent` property on a response: its `messagePartsWithCitations` key holds an array of `Providers\Anthropic\ValueObjects\MessagePartWithCitations` objects.
167+
168+
As a rough worked example, let's assume you want to implement footnotes. You'll need to loop through those chunks and (1) re-construct the message with links to the footnotes; and (2) build an array of footnotes to loop through in your frontend.
169+
170+
```php
171+
use EchoLabs\Prism\Providers\Anthropic\ValueObjects\MessagePartWithCitations;
172+
use EchoLabs\Prism\Providers\Anthropic\ValueObjects\Citation;
173+
174+
$messageChunks = $response->additionalContent['messagePartsWithCitations'];
175+
176+
$text = '';
177+
$footnotes = [];
178+
179+
$footnoteId = 1;
180+
181+
/** @var MessagePartWithCitations $messageChunk */
182+
foreach ($messageChunks as $messageChunk) {
183+
$text .= $messageChunk->text;
184+
185+
/** @var Citation $citation */
186+
foreach ($messageChunk->citations as $citation) {
187+
$footnotes[] = [
188+
'id' => $footnoteId,
189+
'document_title' => $citation->documentTitle,
190+
'reference_start' => $citation->startIndex,
191+
'reference_end' => $citation->endIndex
192+
];
193+
194+
$text .= '<sup><a href="#footnote-'.$footnoteId.'">'.$footnoteId.'</a></sup>';
195+
196+
$footnoteId++;
197+
}
198+
}
199+
```
200+
201+
106202
## Considerations
107203
### Message Order
108204

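The footnote loop documented above can be tried without Prism. The following is a minimal sketch, assuming plain arrays stand in for `MessagePartWithCitations` and `Citation` objects; the function name `buildFootnotes` and the array keys are hypothetical and chosen only for illustration. It also escapes the model text before mixing it with markup, which the documentation example does not do.

```php
<?php

/**
 * Build footnoted HTML from message parts.
 *
 * Each part is a plain array standing in for a MessagePartWithCitations
 * (hypothetical shape): ['text' => string, 'citations' => list of arrays].
 *
 * @param  list<array<string, mixed>>  $parts
 * @return array{text: string, footnotes: list<array<string, mixed>>}
 */
function buildFootnotes(array $parts): array
{
    $text = '';
    $footnotes = []; // an array, because entries are appended with []
    $footnoteId = 1;

    foreach ($parts as $part) {
        // Escape the model output before adding our own markup around it.
        $text .= htmlspecialchars($part['text']);

        foreach ($part['citations'] ?? [] as $citation) {
            $footnotes[] = [
                'id' => $footnoteId,
                'document_title' => $citation['documentTitle'] ?? null,
                'cited_text' => $citation['citedText'] ?? '',
            ];

            // Link the cited span to its footnote.
            $text .= '<sup><a href="#footnote-'.$footnoteId.'">'.$footnoteId.'</a></sup>';
            $footnoteId++;
        }
    }

    return ['text' => $text, 'footnotes' => $footnotes];
}

$result = buildFootnotes([
    ['text' => 'According to the text, '],
    ['text' => 'the grass is green', 'citations' => [
        ['documentTitle' => null, 'citedText' => 'The grass is green.'],
    ]],
    ['text' => ' and '],
    ['text' => 'the sky is blue', 'citations' => [
        ['documentTitle' => null, 'citedText' => 'The sky is blue.'],
    ]],
    ['text' => '.'],
]);

echo $result['text'], PHP_EOL;
```

Returning both the rebuilt text and the footnote list from one pass keeps the footnote ids in step with the links embedded in the text.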
Diff for: src/Providers/Anthropic/Handlers/Structured.php

+5 -3
@@ -87,9 +87,11 @@ protected function buildProviderResponse(): ProviderResponse
8787
protected function appendMessageForJsonMode(): PrismRequest
8888
{
8989
return $this->request->addMessage(new UserMessage(sprintf(
90-
"Respond with ONLY JSON that matches the following schema: \n %s",
91-
json_encode($this->request->schema->toArray(), JSON_PRETTY_PRINT)
90+
"Respond with ONLY JSON that matches the following schema: \n %s %s",
91+
json_encode($this->request->schema->toArray(), JSON_PRETTY_PRINT),
92+
($this->request->providerMeta(Provider::Anthropic)['citations'] ?? false)
93+
? "\n\n Return the JSON as a single text block with a single set of citations."
94+
: ''
9295
)));
93-
9496
}
9597
}
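
The effect of the extra `%s` placeholder in the hunk above can be seen in isolation. This is a rough sketch, assuming a free function in place of the handler method; `buildJsonModePrompt` and its boolean parameter are hypothetical stand-ins for the request and its provider meta.

```php
<?php

/**
 * Hypothetical stand-alone version of the JSON-mode prompt builder.
 *
 * @param array<string, mixed> $schema
 */
function buildJsonModePrompt(array $schema, bool $citationsEnabled): string
{
    return sprintf(
        "Respond with ONLY JSON that matches the following schema: \n %s %s",
        json_encode($schema, JSON_PRETTY_PRINT),
        // The extra instruction is only appended when citations are on.
        $citationsEnabled
            ? "\n\n Return the JSON as a single text block with a single set of citations."
            : ''
    );
}

$schema = ['type' => 'object', 'properties' => ['answer' => ['type' => 'boolean']]];

$plain = buildJsonModePrompt($schema, false);
$withCitations = buildJsonModePrompt($schema, true);

echo $withCitations, PHP_EOL;
```

Because the format string always contains the space between the two placeholders, the prompt built without citations ends with a single trailing space after the schema.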

Diff for: src/Providers/Anthropic/Maps/MessageMap.php

+10 -2
@@ -120,10 +120,18 @@ protected static function mapUserMessage(UserMessage $message, array $requestPro
120120
*/
121121
protected static function mapAssistantMessage(AssistantMessage $message): array
122122
{
123+
$cacheType = data_get($message->providerMeta(Provider::Anthropic), 'cacheType', null);
124+
123125
$content = [];
124126

125-
if ($message->content !== '' && $message->content !== '0') {
126-
$cacheType = data_get($message->providerMeta(Provider::Anthropic), 'cacheType', null);
127+
if (isset($message->additionalContent['messagePartsWithCitations'])) {
128+
foreach ($message->additionalContent['messagePartsWithCitations'] as $part) {
129+
$content[] = array_filter([
130+
...$part->toContentBlock(),
131+
'cache_control' => $cacheType ? ['type' => $cacheType instanceof BackedEnum ? $cacheType->value : $cacheType] : null,
132+
]);
133+
}
134+
} elseif ($message->content !== '' && $message->content !== '0') {
127135

128136
$content[] = array_filter([
129137
'type' => 'text',

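The combination of array spreading and `array_filter()` in the new branch above is what lets uncited parts round-trip cleanly. Below is a minimal sketch, assuming plain content-block arrays in place of `toContentBlock()` results; `mapBlocks` and the `CacheType` enum are hypothetical stand-ins, not Prism classes.

```php
<?php

// Hypothetical stand-in for a backed enum such as AnthropicCacheType.
enum CacheType: string
{
    case Ephemeral = 'ephemeral';
}

/**
 * @param  list<array<string, mixed>>  $blocks  content blocks as plain arrays
 * @return list<array<string, mixed>>
 */
function mapBlocks(array $blocks, CacheType|string|null $cacheType): array
{
    $content = [];

    foreach ($blocks as $block) {
        // array_filter() without a callback drops falsy values, so an empty
        // 'citations' list and a null 'cache_control' vanish from the payload.
        $content[] = array_filter([
            ...$block,
            'cache_control' => $cacheType
                ? ['type' => $cacheType instanceof BackedEnum ? $cacheType->value : $cacheType]
                : null,
        ]);
    }

    return $content;
}

$blocks = [
    ['type' => 'text', 'text' => 'the grass is green', 'citations' => [
        ['type' => 'char_location', 'cited_text' => 'The grass is green. '],
    ]],
    ['type' => 'text', 'text' => '.', 'citations' => []],
];

$uncached = mapBlocks($blocks, null);
$cached = mapBlocks($blocks, CacheType::Ephemeral);

print_r($uncached[1]);
```

One caveat of this approach, under the same assumptions: any other falsy top-level value is dropped as well, so a part whose text is the string `'0'` would lose its `text` key.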
Diff for: src/Providers/Anthropic/ValueObjects/MessagePartWithCitations.php

+28
@@ -40,4 +40,32 @@ public static function fromContentBlock(array $data): self
4040
}, $data['citations'] ?? [])
4141
);
4242
}
43+
44+
/**
45+
* @return array<string,mixed>
46+
*/
47+
public function toContentBlock(): array
48+
{
49+
return [
50+
'type' => 'text',
51+
'text' => $this->text,
52+
'citations' => array_map(function (Citation $citation): array {
53+
$indexPropertyCommonPart = match ($citation->type) {
54+
'page_location' => 'page_number',
55+
'char_location' => 'char_index',
56+
'content_block_location' => 'block_index',
57+
default => throw new \InvalidArgumentException("Unknown citation type: {$citation->type}"),
58+
};
59+
60+
return [
61+
'type' => $citation->type,
62+
'cited_text' => $citation->citedText,
63+
'document_index' => $citation->documentIndex,
64+
'document_title' => $citation->documentTitle,
65+
"start_$indexPropertyCommonPart" => $citation->startIndex,
66+
"end_$indexPropertyCommonPart" => $citation->endIndex,
67+
];
68+
}, $this->citations),
69+
];
70+
}
4371
}
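
The `match` expression in `toContentBlock()` derives the start and end key names from the citation type. A small sketch of just that mapping, assuming a free function (`citationIndexKeys` is a hypothetical name used only here):

```php
<?php

/**
 * Return the start and end index key names for a citation type.
 *
 * @return array{0: string, 1: string}
 */
function citationIndexKeys(string $type): array
{
    $common = match ($type) {
        'page_location' => 'page_number',
        'char_location' => 'char_index',
        'content_block_location' => 'block_index',
        default => throw new InvalidArgumentException("Unknown citation type: {$type}"),
    };

    return ["start_{$common}", "end_{$common}"];
}

[$startKey, $endKey] = citationIndexKeys('char_location');

echo $startKey, ' / ', $endKey, PHP_EOL;

// An unrecognised type fails loudly instead of producing a malformed block.
try {
    citationIndexKeys('unknown_location');
    $threw = false;
} catch (InvalidArgumentException $e) {
    $threw = true;
}
```

Throwing on an unknown type means a citation format that this mapping does not yet know about surfaces as an error rather than as a silently incomplete request.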
Original file line numberDiff line numberDiff line change
@@ -0,0 +1 @@
1+
{"id":"msg_011pEb55M2m5Htxj5jhCWW3e","type":"message","role":"assistant","model":"claude-3-5-sonnet-20241022","content":[{"type":"text","text":"{\"answer\": true}","citations":[{"type":"content_block_location","cited_text":"The grass is green.","document_index":0,"document_title":null,"start_block_index":0,"end_block_index":1},{"type":"content_block_location","cited_text":"The sky is blue.","document_index":0,"document_title":null,"start_block_index":2,"end_block_index":3}]}],"stop_reason":"end_turn","stop_sequence":null,"usage":{"input_tokens":693,"cache_creation_input_tokens":0,"cache_read_input_tokens":0,"output_tokens":28}}
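
In the fixture above, the structured answer arrives as a JSON string inside a text block. Since the documentation warns that this output may not always be decodable, here is a minimal sketch of defensive decoding, assuming you already hold the text as a string; `decodeStructuredText` is a hypothetical helper, not part of Prism.

```php
<?php

/**
 * Decode the model's text as JSON, returning null when it is not a
 * decodable JSON object or array.
 *
 * @return array<string, mixed>|null
 */
function decodeStructuredText(string $text): ?array
{
    try {
        $decoded = json_decode($text, true, 512, JSON_THROW_ON_ERROR);
    } catch (JsonException) {
        return null;
    }

    // Scalars such as "true" or "42" decode fine but are not a schema object.
    return is_array($decoded) ? $decoded : null;
}

$ok = decodeStructuredText('{"answer": true}');
$bad = decodeStructuredText('Sure! Here is the JSON: {"answer": true}');

var_dump($ok, $bad);
```

A caller can then treat `null` as the failure case, for example by retrying the request or reporting an error, whichever suits the application.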

Diff for: tests/Providers/Anthropic/AnthropicTextTest.php

+6
@@ -271,6 +271,8 @@
271271
->withProviderMeta(Provider::Anthropic, ['citations' => true])
272272
->generate();
273273

274+
expect($response->text)->toEqual('According to the text, the grass is green and the sky is blue.');
275+
274276
expect($response->additionalContent['messagePartsWithCitations'])->toHaveCount(5);
275277
expect($response->additionalContent['messagePartsWithCitations'][0])->toBeInstanceOf(MessagePartWithCitations::class);
276278

@@ -309,6 +311,8 @@
309311
->withProviderMeta(Provider::Anthropic, ['citations' => true])
310312
->generate();
311313

314+
expect($response->text)->toBe("According to the documents:\nThe grass is green and the sky is blue.");
315+
312316
expect($response->additionalContent['messagePartsWithCitations'])->toHaveCount(5);
313317
expect($response->additionalContent['messagePartsWithCitations'][0])->toBeInstanceOf(MessagePartWithCitations::class);
314318

@@ -347,6 +351,8 @@
347351
->withProviderMeta(Provider::Anthropic, ['citations' => true])
348352
->generate();
349353

354+
expect($response->text)->toBe('According to the documents, the grass is green and the sky is blue.');
355+
350356
expect($response->additionalContent['messagePartsWithCitations'])->toHaveCount(5);
351357
expect($response->additionalContent['messagePartsWithCitations'][0])->toBeInstanceOf(MessagePartWithCitations::class);
352358

Diff for: tests/Providers/Anthropic/MessageMapTest.php

+138
@@ -7,6 +7,7 @@
77
use EchoLabs\Prism\Enums\Provider;
88
use EchoLabs\Prism\Providers\Anthropic\Enums\AnthropicCacheType;
99
use EchoLabs\Prism\Providers\Anthropic\Maps\MessageMap;
10+
use EchoLabs\Prism\Providers\Anthropic\ValueObjects\MessagePartWithCitations;
1011
use EchoLabs\Prism\ValueObjects\Messages\AssistantMessage;
1112
use EchoLabs\Prism\ValueObjects\Messages\Support\Document;
1213
use EchoLabs\Prism\ValueObjects\Messages\Support\Image;
@@ -496,3 +497,140 @@
496497
],
497498
]]);
498499
});
500+
501+
it('maps an assistant message with PDF citations back to its original format', function (): void {
502+
$block_one = [
503+
'type' => 'text',
504+
'text' => '.',
505+
];
506+
507+
$block_two = [
508+
'type' => 'text',
509+
'text' => 'the grass is green',
510+
'citations' => [
511+
[
512+
'type' => 'page_location',
513+
'cited_text' => 'The grass is green. ',
514+
'document_index' => 0,
515+
'document_title' => 'All about the grass and the sky',
516+
'start_page_number' => 1,
517+
'end_page_number' => 2,
518+
],
519+
],
520+
];
521+
522+
$block_three = [
523+
'type' => 'text',
524+
'text' => ' and ',
525+
];
526+
527+
$block_four = [
528+
'type' => 'text',
529+
'text' => 'the sky is blue',
530+
'citations' => [
531+
[
532+
'type' => 'page_location',
533+
'cited_text' => 'The sky is blue.',
534+
'document_index' => 0,
535+
'document_title' => 'All about the grass and the sky',
536+
'start_page_number' => 1,
537+
'end_page_number' => 2,
538+
],
539+
],
540+
];
541+
542+
$block_five = [
543+
'type' => 'text',
544+
'text' => '.',
545+
];
546+
547+
expect(MessageMap::map([
548+
new AssistantMessage(
549+
content: 'According to the text, the grass is green and the sky is blue.',
550+
additionalContent: [
551+
'messagePartsWithCitations' => [
552+
MessagePartWithCitations::fromContentBlock($block_one),
553+
MessagePartWithCitations::fromContentBlock($block_two),
554+
MessagePartWithCitations::fromContentBlock($block_three),
555+
MessagePartWithCitations::fromContentBlock($block_four),
556+
MessagePartWithCitations::fromContentBlock($block_five),
557+
],
558+
]
559+
),
560+
]))->toBe([[
561+
'role' => 'assistant',
562+
'content' => [
563+
$block_one,
564+
$block_two,
565+
$block_three,
566+
$block_four,
567+
$block_five,
568+
],
569+
]]);
570+
});
571+
572+
it('maps an assistant message with text document citations back to its original format', function (): void {
573+
$block = [
574+
'type' => 'text',
575+
'text' => 'the grass is green',
576+
'citations' => [
577+
[
578+
'type' => 'char_location',
579+
'cited_text' => 'The grass is green. ',
580+
'document_index' => 0,
581+
'document_title' => 'All about the grass and the sky',
582+
'start_char_index' => 1,
583+
'end_char_index' => 20,
584+
],
585+
],
586+
];
587+
588+
expect(MessageMap::map([
589+
new AssistantMessage(
590+
content: 'According to the text, the grass is green and the sky is blue.',
591+
additionalContent: [
592+
'messagePartsWithCitations' => [
593+
MessagePartWithCitations::fromContentBlock($block),
594+
],
595+
]
596+
),
597+
]))->toBe([[
598+
'role' => 'assistant',
599+
'content' => [
600+
$block,
601+
],
602+
]]);
603+
});
604+
605+
it('maps an assistant message with custom content document citations back to its original format', function (): void {
606+
$block = [
607+
'type' => 'text',
608+
'text' => 'the grass is green',
609+
'citations' => [
610+
[
611+
'type' => 'content_block_location',
612+
'cited_text' => 'The grass is green. ',
613+
'document_index' => 0,
614+
'document_title' => 'All about the grass and the sky',
615+
'start_block_index' => 0,
616+
'end_block_index' => 1,
617+
],
618+
],
619+
];
620+
621+
expect(MessageMap::map([
622+
new AssistantMessage(
623+
content: 'According to the text, the grass is green and the sky is blue.',
624+
additionalContent: [
625+
'messagePartsWithCitations' => [
626+
MessagePartWithCitations::fromContentBlock($block),
627+
],
628+
]
629+
),
630+
]))->toBe([[
631+
'role' => 'assistant',
632+
'content' => [
633+
$block,
634+
],
635+
]]);
636+
});
