Anthropic PDF Handling #142

Merged
merged 4 commits into from
Jan 20, 2025
30 changes: 28 additions & 2 deletions docs/providers/anthropic.md
@@ -14,10 +14,11 @@ Anthropic's prompt caching feature allows you to drastically reduce latency and
We support Anthropic prompt caching on:

- System Messages (text only)
- User Messages (Text and Image)
- User Messages (text, image and document; PDF only)
- Assistant Messages (text only)
- Tools

The API for enable prompt caching is the same for all, enabled via the `withProviderMeta()` method. Where a UserMessage contains both text and an image, both will be cached.
The API for enabling prompt caching is the same for all of these: call the `withProviderMeta()` method. Where a UserMessage contains both text and an image or document, every part will be cached.

```php
use EchoLabs\Prism\Enums\Provider;
@@ -55,6 +56,31 @@ Note that you must use the `withMessages()` method in order to enable prompt cac

Please ensure you read Anthropic's [prompt caching documentation](https://docs.anthropic.com/en/docs/build-with-claude/prompt-caching), which covers some important information on e.g. minimum cacheable tokens and message order consistency.

### PDF Support

Prism supports Anthropic PDF processing on UserMessages via the `$additionalContent` parameter:

```php
use EchoLabs\Prism\Enums\Provider;
use EchoLabs\Prism\Prism;
use EchoLabs\Prism\ValueObjects\Messages\Support\Document;
use EchoLabs\Prism\ValueObjects\Messages\UserMessage;

Prism::text()
    ->using(Provider::Anthropic, 'claude-3-5-sonnet-20241022')
    ->withMessages([
        // fromBase64(): the caller supplies the encoded string and the MIME type.
        new UserMessage('Here is the document from base64', [
            Document::fromBase64(base64_encode(file_get_contents('tests/Fixtures/test-pdf.pdf')), 'application/pdf'),
        ]),
        // fromPath(): takes only the path; the file is read and its MIME type detected.
        new UserMessage('Here is the document from a local path', [
            Document::fromPath('tests/Fixtures/test-pdf.pdf'),
        ]),
    ])
    ->generate();
```
Anthropic uses vision to process PDFs, so some limitations apply; they are detailed in its [feature documentation](https://docs.anthropic.com/en/docs/build-with-claude/pdf-support).


## Considerations
### Message Order

22 changes: 21 additions & 1 deletion src/Providers/Anthropic/Maps/MessageMap.php
@@ -8,6 +8,7 @@
use EchoLabs\Prism\Contracts\Message;
use EchoLabs\Prism\Enums\Provider;
use EchoLabs\Prism\ValueObjects\Messages\AssistantMessage;
use EchoLabs\Prism\ValueObjects\Messages\Support\Document;
use EchoLabs\Prism\ValueObjects\Messages\Support\Image;
use EchoLabs\Prism\ValueObjects\Messages\SystemMessage;
use EchoLabs\Prism\ValueObjects\Messages\ToolResultMessage;
@@ -106,6 +107,7 @@ protected static function mapUserMessage(UserMessage $message): array
'cache_control' => $cache_control,
]),
...self::mapImageParts($message->images(), $cache_control),
...self::mapDocumentParts($message->documents(), $cache_control),
],
];
}
@@ -145,7 +147,7 @@ protected static function mapAssistantMessage(AssistantMessage $message): array
/**
* @param Image[] $parts
* @param array<string, mixed>|null $cache_control
* @return array<string, mixed>
* @return array<int, mixed>
*/
protected static function mapImageParts(array $parts, ?array $cache_control = null): array
{
@@ -165,4 +167,22 @@ protected static function mapImageParts(array $parts, ?array $cache_control = nu
]);
}, $parts);
}

    /**
     * @param Document[] $parts
     * @param array<string, mixed>|null $cache_control
     * @return array<int, mixed>
     */
    protected static function mapDocumentParts(array $parts, ?array $cache_control = null): array
    {
        return array_map(fn (Document $document): array => array_filter([
            'type' => 'document',
            'source' => [
                'type' => 'base64',
                'media_type' => $document->mimeType,
                'data' => $document->document,
            ],
            'cache_control' => $cache_control,
        ]), $parts);
    }
}
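
The `array_filter()` wrapper in `mapDocumentParts` is what keeps a null `cache_control` out of the request body. The standalone sketch below reproduces that array shape in plain PHP, without Prism or Laravel, to make the behaviour visible. The function name `buildDocumentBlock` is hypothetical and not part of this PR; it assumes only the layout shown in the diff above.

```php
<?php

/**
 * Hypothetical helper (not part of Prism): builds one "document" content part
 * in the same array layout that mapDocumentParts() produces.
 *
 * @param array<string, mixed>|null $cacheControl
 * @return array<string, mixed>
 */
function buildDocumentBlock(string $base64Data, string $mimeType, ?array $cacheControl = null): array
{
    // array_filter() without a callback drops entries whose value is falsy
    // (null, '', [], 0, '0', false), so a null cache_control is omitted entirely.
    return array_filter([
        'type' => 'document',
        'source' => [
            'type' => 'base64',
            'media_type' => $mimeType,
            'data' => $base64Data,
        ],
        'cache_control' => $cacheControl,
    ]);
}

// Placeholder bytes standing in for a real PDF.
$data = base64_encode('%PDF-1.4 example bytes');

// Without cache metadata the key is absent; with it, the key is passed through.
$plain = buildDocumentBlock($data, 'application/pdf');
$cached = buildDocumentBlock($data, 'application/pdf', ['type' => 'ephemeral']);
```

Here `$plain` has only the `type` and `source` keys, while `$cached` also carries `cache_control`, matching the expectation in the cache-type test for documents.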
51 changes: 51 additions & 0 deletions src/ValueObjects/Messages/Support/Document.php
@@ -0,0 +1,51 @@
<?php

declare(strict_types=1);

namespace EchoLabs\Prism\ValueObjects\Messages\Support;

use Illuminate\Support\Facades\File;
use InvalidArgumentException;

/**
 * Note: Prism currently only supports Documents with Anthropic.
 */
class Document
{
    public function __construct(
        public readonly string $document,
        public readonly string $mimeType
    ) {}

    public static function fromPath(string $path): self
    {
        if (! is_file($path)) {
            throw new InvalidArgumentException("{$path} is not a file");
        }

        $content = file_get_contents($path);

        if ($content === '' || $content === '0' || $content === false) {
            throw new InvalidArgumentException("{$path} is empty");
        }

        $mimeType = File::mimeType($path);

        if ($mimeType === false) {
            throw new InvalidArgumentException("Could not determine mime type for {$path}");
        }

        return new self(
            base64_encode($content),
            $mimeType,
        );
    }

    public static function fromBase64(string $document, string $mimeType): self
    {
        return new self(
            $document,
            $mimeType
        );
    }
}
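
One detail of `fromPath` worth noting: the combined check reports a failed read (`false`) and a file whose whole content is the single character `0` as "is empty". The sketch below is a plain-PHP stand-in for the read-and-encode step that separates those cases. It is an assumption about the intended behaviour, not the merged code, and `readAndEncode` is a hypothetical name.

```php
<?php

/**
 * Hypothetical stand-in for the read-and-encode step of Document::fromPath().
 * Returns the file content as a base64 string.
 */
function readAndEncode(string $path): string
{
    if (! is_file($path)) {
        throw new InvalidArgumentException("{$path} is not a file");
    }

    $content = file_get_contents($path);

    // A failed read and an empty file are reported separately.
    if ($content === false) {
        throw new InvalidArgumentException("{$path} could not be read");
    }

    if ($content === '') {
        throw new InvalidArgumentException("{$path} is empty");
    }

    return base64_encode($content);
}

// Write a small throwaway file so the sketch runs without the test fixture.
$path = tempnam(sys_get_temp_dir(), 'doc');
file_put_contents($path, '%PDF-1.4 example bytes');

$encoded = readAndEncode($path);

// Strict decoding returns the original bytes unchanged.
$roundTrip = base64_decode($encoded, true);

unlink($path);
```

The encoded string is what `fromBase64` expects to receive directly, so a caller who already holds base64 data should not encode it a second time.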
15 changes: 14 additions & 1 deletion src/ValueObjects/Messages/UserMessage.php
@@ -6,6 +6,7 @@

use EchoLabs\Prism\Concerns\HasProviderMeta;
use EchoLabs\Prism\Contracts\Message;
use EchoLabs\Prism\ValueObjects\Messages\Support\Document;
use EchoLabs\Prism\ValueObjects\Messages\Support\Image;
use EchoLabs\Prism\ValueObjects\Messages\Support\Text;

@@ -14,7 +15,7 @@ class UserMessage implements Message
use HasProviderMeta;

/**
* @param array<int, Text|Image> $additionalContent
* @param array<int, Text|Image|Document> $additionalContent
*/
public function __construct(
protected readonly string $content,
@@ -43,4 +44,16 @@ public function images(): array
->where(fn ($part): bool => $part instanceof Image)
->toArray();
}

    /**
     * Note: Prism currently only supports Documents with Anthropic.
     *
     * @return Document[]
     */
    public function documents(): array
    {
        return collect($this->additionalContent)
            ->where(fn ($part): bool => $part instanceof Document)
            ->toArray();
    }
}
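
`documents()` selects parts by type with `instanceof`, the same pattern `images()` uses. A rough plain-PHP equivalent is sketched below. The classes `SketchImage` and `SketchDocument` and the function `onlyDocuments` are hypothetical stand-ins, not Prism types. Filtering keeps the original array keys, so the sketch reindexes with `array_values()`; whether the collection-based version needs that is left open here, since `MessageMap` spreads the result into a new list anyway.

```php
<?php

// Hypothetical minimal stand-ins for Prism's Image and Document value objects.
final class SketchImage
{
    public function __construct(public readonly string $data) {}
}

final class SketchDocument
{
    public function __construct(public readonly string $data) {}
}

/**
 * Returns only the document parts, reindexed from 0.
 *
 * @param array<int, object> $parts
 * @return array<int, SketchDocument>
 */
function onlyDocuments(array $parts): array
{
    // array_filter() preserves keys; array_values() turns the result back into a list.
    return array_values(array_filter(
        $parts,
        fn (object $part): bool => $part instanceof SketchDocument
    ));
}

$parts = [
    new SketchImage('image-bytes'),
    new SketchDocument('pdf-bytes'),
];

$documents = onlyDocuments($parts);
```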
Binary file added tests/Fixtures/test-pdf.pdf
Binary file not shown.
62 changes: 62 additions & 0 deletions tests/Providers/Anthropic/MessageMapTest.php
@@ -8,6 +8,7 @@
use EchoLabs\Prism\Providers\Anthropic\Enums\AnthropicCacheType;
use EchoLabs\Prism\Providers\Anthropic\Maps\MessageMap;
use EchoLabs\Prism\ValueObjects\Messages\AssistantMessage;
use EchoLabs\Prism\ValueObjects\Messages\Support\Document;
use EchoLabs\Prism\ValueObjects\Messages\Support\Image;
use EchoLabs\Prism\ValueObjects\Messages\SystemMessage;
use EchoLabs\Prism\ValueObjects\Messages\ToolResultMessage;
@@ -73,6 +74,40 @@
->toBe('image/png');
});

it('maps user messages with documents from path', function (): void {
    $mappedMessage = MessageMap::map([
        new UserMessage('Here is the document', [
            Document::fromPath('tests/Fixtures/test-pdf.pdf'),
        ]),
    ]);

    expect(data_get($mappedMessage, '0.content.1.type'))
        ->toBe('document');
    expect(data_get($mappedMessage, '0.content.1.source.type'))
        ->toBe('base64');
    expect(data_get($mappedMessage, '0.content.1.source.data'))
        ->toContain(base64_encode(file_get_contents('tests/Fixtures/test-pdf.pdf')));
    expect(data_get($mappedMessage, '0.content.1.source.media_type'))
        ->toBe('application/pdf');
});

it('maps user messages with documents from base64', function (): void {
    $mappedMessage = MessageMap::map([
        new UserMessage('Here is the document', [
            Document::fromBase64(base64_encode(file_get_contents('tests/Fixtures/test-pdf.pdf')), 'application/pdf'),
        ]),
    ]);

    expect(data_get($mappedMessage, '0.content.1.type'))
        ->toBe('document');
    expect(data_get($mappedMessage, '0.content.1.source.type'))
        ->toBe('base64');
    expect(data_get($mappedMessage, '0.content.1.source.data'))
        ->toContain(base64_encode(file_get_contents('tests/Fixtures/test-pdf.pdf')));
    expect(data_get($mappedMessage, '0.content.1.source.media_type'))
        ->toBe('application/pdf');
});

it('does not map user messages with images from url', function (): void {
$this->expectException(InvalidArgumentException::class);
MessageMap::map([
@@ -215,6 +250,33 @@
]]);
});

it('sets the cache type on a UserMessage document if cacheType providerMeta is set on message', function (): void {
    expect(MessageMap::map([
        (new UserMessage(
            content: 'Who are you?',
            additionalContent: [Document::fromPath('tests/Fixtures/test-pdf.pdf')]
        ))->withProviderMeta(Provider::Anthropic, ['cacheType' => 'ephemeral']),
    ]))->toBe([[
        'role' => 'user',
        'content' => [
            [
                'type' => 'text',
                'text' => 'Who are you?',
                'cache_control' => ['type' => 'ephemeral'],
            ],
            [
                'type' => 'document',
                'source' => [
                    'type' => 'base64',
                    'media_type' => 'application/pdf',
                    'data' => base64_encode(file_get_contents('tests/Fixtures/test-pdf.pdf')),
                ],
                'cache_control' => ['type' => 'ephemeral'],
            ],
        ],
    ]]);
});

it('sets the cache type on an AssistantMessage if cacheType providerMeta is set on message', function (mixed $cacheType): void {
expect(MessageMap::map([
(new AssistantMessage(content: 'Who are you?'))->withProviderMeta(Provider::Anthropic, ['cacheType' => $cacheType]),