Allow for arbitrary examples containing DSPy.Images #1801
examples/vlm/mmmu.ipynb
"class ColorSignature(dspy.Signature):\n",
"    \"\"\"Output the color of the designated image.\"\"\"\n",
"    image_1: dspy.Image = dspy.InputField(desc=\"An image\")\n",
"    image_2: dspy.Image = dspy.InputField(desc=\"An image\")\n",
"    images: List[dspy.Image] = dspy.InputField(desc=\"An image\")\n",
"    images: List[dspy.Image] = dspy.InputField(desc=\"An image\")\n",
"    images: List[dspy.Image] = dspy.InputField(desc=\"A list of images\")\n",
Thanks for the PR @isaacbmiller. I tried this branch with the following signature and am sharing the relevant logs here:
class CreateTitleAndDescription(dspy.Signature):
    """Taking in a list of images and a category tree to output title, description and attributes."""
    images: List[dspy.Image] = dspy.InputField(desc="list of images about the item")
    categories: str = dspy.InputField(desc="category tree of the item")
    title: str = dspy.OutputField(desc=title_desc)
    description: str = dspy.OutputField(desc=description_desc)
{
    'role': 'user',
    'content': '[[ ## images ## ]]\n["https://static-sd.mercdn.net/photos/m34859626959_1.jpg", "https://static-sd.mercdn.net/photos/m34859626959_2.jpg", "https://static-sd.mercdn.net/photos/m34859626959_3.jpg"]\n\n[[ ## categories ## ]]\nゲーム・おもちゃ・グッズ > トレーディングカード\n\nRespond with the corresponding output fields, starting with the field `[[ ## title ## ]]`, then `[[ ## description ## ]]`, and then ending with the marker for `[[ ## completed ## ]]`.'
}
6a8376a to acf10ee
acf10ee to 66d991d
Tagging @chenmoneygithub to make sure that none of the serialization behavior changes here interfere with your plans
NOTE: JSONAdapter is still not supported and will be handled in a separate PR
Hey @isaacbmiller! Do you have a sense for when I should expect this PR to get merged?
Inspired by #1763 and to close #1767.
Supports using, saving, and loading arbitrary Python objects containing images inside examples/signatures.
Prior to this you could only use a single dspy.Image in a signature.
The way it does this is by changing how dspy.Images are serialized. An image inside an example will turn into
<DSPY_IMAGE_START>data:...<DSPY_IMAGE_END>
These tags were chosen arbitrarily. A tag-based implementation, as opposed to just converting the image into { "type": "image_url"...}, was chosen because of the way that images are processed inside of Adapters.
For text-only messages, we can use a messages dictionary with the keys role: str and content: str. When you add more than just text, the dictionary becomes role: str, content: List[dict], where content contains a list of dicts, each with a type key and a corresponding data key ("text" or "image_url").
We need to detect the cases where we have more than just text (done by regex-searching the content) and break the content up into the second format when an image is detected.
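To make the detection step concrete, here is a minimal sketch of the idea, not the code in this PR. The tag names come from the description above; the function name `split_content`, the regex, and the exact shape of the image part (the OpenAI-style `{"type": "image_url", "image_url": {"url": ...}}`) are assumptions for illustration.

```python
import re

# Tag names are taken from the PR description; the pattern itself is illustrative.
IMAGE_TAG_PATTERN = re.compile(r"<DSPY_IMAGE_START>(.*?)<DSPY_IMAGE_END>", re.DOTALL)


def split_content(content: str):
    """Hypothetical helper: return `content` unchanged when it is text only,
    otherwise break it into a list of typed parts."""
    if not IMAGE_TAG_PATTERN.search(content):
        return content  # first format: content stays a plain string

    parts = []
    cursor = 0
    for match in IMAGE_TAG_PATTERN.finditer(content):
        text = content[cursor:match.start()]
        if text.strip():
            parts.append({"type": "text", "text": text})
        # Assumed OpenAI-style image part; the payload is whatever sat inside the tags.
        parts.append({"type": "image_url", "image_url": {"url": match.group(1)}})
        cursor = match.end()

    tail = content[cursor:]
    if tail.strip():
        parts.append({"type": "text", "text": tail})
    return parts  # second format: content is a list of dicts


message = "Describe: <DSPY_IMAGE_START>data:image/png;base64,AAAA<DSPY_IMAGE_END> please"
parts = split_content(message)
```

In this sketch a message without tags passes through untouched, so text-only prompts keep the simpler `content: str` shape.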
A similar strategy is used for serialization/deserialization, where we just save the base64 inside of the tags as a string.
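The save/load round trip can be sketched the same way. This is an assumption-laden illustration rather than the PR's implementation: `FakeImage` is a hypothetical stand-in for dspy.Image, and `encode`/`decode` are made-up helper names. Only the tag strings come from the description.

```python
import json
import re
from dataclasses import dataclass


@dataclass
class FakeImage:
    """Hypothetical stand-in for dspy.Image; holds only a data URI or URL."""
    url: str


# A string that consists of exactly one tagged image payload.
TAGGED = re.compile(r"^<DSPY_IMAGE_START>(.*)<DSPY_IMAGE_END>$", re.DOTALL)


def encode(obj):
    """Recursively replace images with tagged strings so the result is JSON-safe."""
    if isinstance(obj, FakeImage):
        return f"<DSPY_IMAGE_START>{obj.url}<DSPY_IMAGE_END>"
    if isinstance(obj, dict):
        return {key: encode(value) for key, value in obj.items()}
    if isinstance(obj, (list, tuple)):
        return [encode(value) for value in obj]
    return obj


def decode(obj):
    """Recursively turn tagged strings back into image objects."""
    if isinstance(obj, str):
        match = TAGGED.match(obj)
        return FakeImage(match.group(1)) if match else obj
    if isinstance(obj, dict):
        return {key: decode(value) for key, value in obj.items()}
    if isinstance(obj, list):
        return [decode(value) for value in obj]
    return obj


example = {
    "images": [FakeImage("data:image/png;base64,AAAA"), FakeImage("data:image/png;base64,BBBB")],
    "categories": "cards",
}
saved = json.dumps(encode(example))   # what would be written to disk
loaded = decode(json.loads(saved))    # what would be read back
```

Because the image is stored as an ordinary string, any container that JSON can already represent (lists, dicts, nested mixes) can carry images without a custom encoder.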