[gpt-oss] Small bug fixes for frontend #22512
Conversation
👋 Hi! Thank you for contributing to the vLLM project. 💬 Join our developer Slack at https://slack.vllm.ai to discuss your PR in #pr-reviews, coordinate on features in #feat- channels, or join special interest groups in #sig- channels. Just a reminder: PRs do not trigger a full CI run by default; only a small subset of tests runs automatically. Once the PR is approved and ready to go, your PR reviewer(s) can run CI to test the changes comprehensively before merging. To run CI, PR reviewers can either: Add 🚀 …
This pull request has merge conflicts that must be resolved before it can be merged.
Code Review
This pull request introduces several bug fixes for the gpt-oss response API and adds support for Triton kernels for MoE layers with mxfp4 quantization. The changes are extensive, touching API serving logic, context management, and model execution layers. My review focuses on the correctness and robustness of these changes. I've identified a critical issue in the streaming response generation logic where item and content indices are not being updated, and a high-severity issue in tool call argument parsing that could lead to unhandled exceptions.
current_content_index = 0  # FIXME: this number is never changed
current_output_index = 0
current_item_id = ""  # FIXME: this number is never changed
The variables current_content_index and current_item_id are initialized to static values and never updated within the responses_stream_generator loop, as the FIXME comments indicate. This is a critical issue: every streamed output item will carry an empty or incorrect ID and a content index of 0, which violates the API contract and will likely cause issues for clients. These variables should be updated with appropriate values for each new output item, likely when ctx.is_expecting_start() is true.
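One possible shape of the fix, as a minimal sketch: ctx.is_expecting_start() comes from the review comment above, while the helper function, the reset logic, and the ID format are illustrative assumptions, not the actual vLLM implementation.

import uuid

# Hypothetical sketch: advance the stream indices whenever a new output
# item begins. Names and the ID format are illustrative assumptions.
current_content_index = -1
current_output_index = -1
current_item_id = ""

def advance_stream_state(ctx) -> None:
    global current_content_index, current_output_index, current_item_id
    if ctx.is_expecting_start():
        # New output item: bump the output index, reset the content
        # index, and mint a fresh ID for the streamed events.
        current_output_index += 1
        current_content_index = 0
        current_item_id = f"msg_{uuid.uuid4().hex}"

Minting the ID and resetting the content index per item, rather than once at initialization, is what keeps each streamed event consistent with the final response object.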
vllm/entrypoints/harmony_utils.py (outdated)
action = ActionFind(pattern=browser_call["pattern"],
                    url=f"cursor:{browser_call.get('url', '')}",
                    type="find")
The direct access browser_call["pattern"] will raise a KeyError if the pattern key is missing from the browser_call dictionary, which can occur if the model output is malformed. This would cause an unhandled exception and crash the request. It's safer to use the .get() method and handle the case where the key is missing, similar to how query and url are handled in the surrounding code.
pattern = browser_call.get("pattern")
if pattern is None:
    raise ValueError("Missing 'pattern' in browser.find call")
action = ActionFind(pattern=pattern,
                    url=f"cursor:{browser_call.get('url', '')}",
                    type="find")
Signed-off-by: Chen Zhang <[email protected]>

Force-pushed from 855cff8 to 4c58ba0
Essential Elements of an Effective PR Description Checklist
(Optional) Update supported_models.md and examples for a new model.

Purpose
Fix several bugs in the gpt-oss response API.
Test Plan
Environment: triton 3.4.0+git663e04e8, torch 2.9.0.dev20250723+cu128, H100

EXA_API_KEY=**** VLLM_ENABLE_RESPONSES_API_STORE=1 pytest -vs test_response_api_with_harmony.py
"--tool-server", "demo"
to"--tool-server", "localhost:port1,localhost:port2"
(where port1 and port2 are MCP clients for builtin tools)Test Result
test_streaming fails as expected (streaming + MCP + tool calls are not supported yet).

(Optional) Documentation Update
NOTES
Should be merged after #22431 and #22421.
Requires triton 3.4.0+git663e04e8 and torch 2.9.0.dev20250723+cu128.