[Model][QwenVL] Simplify cos/sin rotary embedding indexing #28962

lgeiger · 2025-11-18T20:50:51Z

Purpose

This is a small follow-up to #28798 that simplifies the indexing logic. /cc @gcanlin @Isotr0py

For Qwen3VL, #28798 slightly changed behaviour: get_cos_sin() now returns a GPU tensor. Indexing it with pos_ids, which still lives on the CPU, introduces a synchronous CPU-to-GPU copy of the indices. cc1f0c5 fixes this by moving the indices onto the GPU in a non-blocking way.
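The pattern described above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the code from this PR: `index_cos_sin` is a hypothetical helper, and the tensor shapes (a `(max_pos, dim)` table indexed by `(num_tokens, 2)` position ids) are assumed for the example.

```python
import torch


def index_cos_sin(cos, sin, pos_ids):
    """Gather rotary cos/sin rows for 2-D (h, w) position ids.

    Hypothetical helper for illustration only.
    cos, sin: (max_pos, dim) tables, assumed to live on the compute device.
    pos_ids:  (num_tokens, 2) integer tensor, typically built on the CPU.
    """
    # Move the indices to the table's device before indexing. Indexing a GPU
    # tensor with CPU indices makes PyTorch copy them implicitly, which
    # blocks. With pinned host memory, non_blocking=True lets the copy be
    # queued asynchronously instead.
    if cos.is_cuda:
        pos_ids = pos_ids.pin_memory()
    pos_ids = pos_ids.to(cos.device, non_blocking=True)

    # A single advanced-indexing call gathers both axes at once:
    # (num_tokens, 2, dim), flattened to (num_tokens, 2 * dim).
    return cos[pos_ids].flatten(1), sin[pos_ids].flatten(1)
```

The single `cos[pos_ids].flatten(1)` call gives the same result as indexing the h and w columns separately and concatenating them along the last dimension. The sketch also runs unchanged on a CPU-only machine, where the device move is a no-op.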

Test Plan

VLLM_WORKER_MULTIPROC_METHOD=spawn lm_eval --model vllm-vlm --model_args "pretrained=Qwen/Qwen3-VL-30B-A3B-Instruct-FP8,max_model_len=10000" --tasks chartqa --batch_size auto --apply_chat_template

Test Result

Before:

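| Tasks | Version | Filter | Metric | | Value | | Stderr |
|---|---|---|---|---|---|---|---|
| chartqa | 0 | none | anywhere_accuracy | ↑ | 0.8752 | ± | 0.0066 |
| | | none | exact_match | ↑ | 0.6448 | ± | 0.0096 |
| | | none | relaxed_accuracy | ↑ | 0.8656 | ± | 0.0068 |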

After:

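| Tasks | Version | Filter | Metric | | Value | | Stderr |
|---|---|---|---|---|---|---|---|
| chartqa | 0 | none | anywhere_accuracy | ↑ | 0.8760 | ± | 0.0066 |
| | | none | exact_match | ↑ | 0.6388 | ± | 0.0096 |
| | | none | relaxed_accuracy | ↑ | 0.8656 | ± | 0.0068 |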

Signed-off-by: Lukas Geiger <[email protected]>

gemini-code-assist

Code Review

This pull request simplifies the rotary embedding indexing logic across several models, which improves code readability and maintainability. It also addresses a performance issue in qwen3_vl.py by moving pos_ids to the GPU asynchronously, preventing synchronous CPU-to-GPU copies.

However, the same performance optimization is missing in other models modified in this PR (glm4_1v.py, qwen2_5_vl.py, qwen2_vl.py, and qwen3_omni_moe_thinker.py), where pos_ids is still created on the CPU, leading to synchronous copies during indexing. I've added specific comments to apply the same fix to these models for consistency and performance improvement.

vllm/model_executor/models/glm4_1v.py

vllm/model_executor/models/qwen2_5_vl.py

vllm/model_executor/models/qwen2_vl.py

vllm/model_executor/models/qwen3_omni_moe_thinker.py

lgeiger added 2 commits November 18, 2025 15:50

[Model][QwenVL] Simplify cos/sin rotary embedding indexing

c4b221c

Signed-off-by: Lukas Geiger <[email protected]>

[Model][Qwen3VL] Prevent synchronous CPU-GPU copy

cc1f0c5

Signed-off-by: Lukas Geiger <[email protected]>

lgeiger requested a review from sighingnow as a code owner November 18, 2025 20:50

mergify bot added the qwen label (Related to Qwen models) Nov 18, 2025

gemini-code-assist bot reviewed Nov 18, 2025

vllm/model_executor/models/glm4_1v.py (resolved)

vllm/model_executor/models/qwen2_5_vl.py (resolved)

vllm/model_executor/models/qwen2_vl.py (resolved)

vllm/model_executor/models/qwen3_omni_moe_thinker.py (resolved)

Merge branch 'main' into qwen-cos-sin-indexing

f3cd40f

Isotr0py approved these changes Nov 19, 2025

Isotr0py enabled auto-merge (squash) November 19, 2025 03:04

gcanlin approved these changes Nov 19, 2025

github-actions bot added the ready label (ONLY add when PR is ready to merge/full CI is needed) Nov 19, 2025

Isotr0py merged commit 3d4e7d3 into vllm-project:main Nov 19, 2025
51 checks passed

lgeiger deleted the qwen-cos-sin-indexing branch November 19, 2025 09:52
