fix: enable handling video_url in litellm and chat completions models #2614
seratch merged 7 commits into openai:main
Conversation
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: a9dee6b7f6
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: fbc09d1a86
seratch
left a comment
I took another pass on the latest head (4fef52e4) and re-checked it with a live runtime verification.
The previous tool-output regression looks fixed in this revision, but I still see one remaining blocker:
- `src/agents/models/chatcmpl_converter.py` now accepts `video_url` in the user-message path, and `OpenAIChatCompletionsModel` always sends that converted payload through the default Chat Completions path (`src/agents/models/openai_chatcompletions.py`). On official OpenAI, that still does not work.
- I verified this through the SDK/runtime path, not just by reading the converter: a normal image input succeeds, but a `video_url` input returns a backend `400 invalid_value` from Chat Completions. The error says the supported content-part types are `text`, `image_url`, `input_audio`, `refusal`, `audio`, and `file`.
- That means this PR currently broadens the default official-OpenAI behavior and turns what used to be a local validation failure into a backend failure.
Given that, I do not think this is ready as an unconditional change for the default OpenAI Chat Completions path. If the goal is to support non-OpenAI, OpenAI-compatible providers, this needs provider/base-URL gating so the official OpenAI path still rejects `video_url` locally. Alternatively, if the LiteLLM adapter layer (or LiteLLM itself) is the right place to handle this, that would be a more natural fit.
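The provider/base-URL gating suggested above could look roughly like this. This is only an illustrative sketch, not SDK code: the helper name `allows_video_url` and the `OFFICIAL_OPENAI_HOSTS` set are hypothetical.

```python
from __future__ import annotations

from urllib.parse import urlparse

# Hypothetical allow-list of official OpenAI API hosts; not part of the SDK.
OFFICIAL_OPENAI_HOSTS = {"api.openai.com"}

def allows_video_url(base_url: str | None) -> bool:
    """Return True only for non-official, OpenAI-compatible endpoints,
    so the default official OpenAI path keeps rejecting video_url locally."""
    if base_url is None:
        # No base_url override means the SDK default (official OpenAI).
        return False
    host = urlparse(base_url).hostname or ""
    return host not in OFFICIAL_OPENAI_HOSTS
```

With gating like this, a client pointed at OpenRouter would pass `video_url` parts through, while the default client would still fail fast with a local validation error instead of a backend 400.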
@codex review again
Codex Review: Didn't find any major issues. Already looking forward to the next diff.
seratch
left a comment
Thanks for working on this. Now I see the value for developers who want to use video-input-capable models on OpenRouter. Having some unit tests would be ideal, but it's okay to skip them this time.
Summary
Adds `video_url` support to the chat-completions converter in `openai/openai-agents-python`.
The SDK currently handles text, image, audio, and file content parts in the chat-completions converter, but it rejects OpenRouter-style video inputs shaped like:
```
{
  "type": "video_url",
  "video_url": {
    "url": data_url
  }
}
```
Change
Updated `src/agents/models/chatcmpl_converter.py` so that:
- `Converter.extract_all_content()` recognizes `video_url`
- user messages pass `video_url` content through instead of rejecting it
Verification
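A minimal standalone sketch of the converter behavior described above. The real implementation lives in `src/agents/models/chatcmpl_converter.py`; this stand-in only mimics the content-part dispatch, and its names are illustrative, not the SDK's.

```python
# Simplified stand-in for Converter.extract_all_content's dispatch:
# it now treats "video_url" as a known part type and passes it through
# unchanged rather than raising on an unknown content type.
KNOWN_PART_TYPES = {"input_text", "input_image", "input_audio", "input_file", "video_url"}

def extract_all_content(parts: list[dict]) -> list[dict]:
    out = []
    for part in parts:
        ptype = part.get("type")
        if ptype == "input_text":
            out.append({"type": "text", "text": part["text"]})
        elif ptype == "video_url":
            # OpenRouter-style video part: forward as-is.
            out.append({"type": "video_url", "video_url": part["video_url"]})
        elif ptype in KNOWN_PART_TYPES:
            # Other known types: pass through (heavily simplified here).
            out.append(part)
        else:
            raise ValueError(f"Unknown content part type: {ptype}")
    return out
```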
```
uv run python -c "from agents.models.chatcmpl_converter import Converter; print(Converter.extract_all_content([{'type': 'input_text', 'text': 'x'}, {'type': 'video_url', 'video_url': {'url': 'data:video/mp4;base64,abc'}}]))"
```