[Bugfix] Fix weight transpose in RL scenarios #5567

panchao-hub · 2026-01-04T02:20:51Z

What this PR does / why we need it?

When switching between training and inference, the model weights do not need to be resumed while the KV cache is being resumed; resuming them at that point leads to a format mismatch.

Does this PR introduce any user-facing change?

No

How was this patch tested?

vLLM version: v0.13.0
vLLM main: vllm-project/vllm@7157596

gemini-code-assist

Code Review

This pull request aims to fix an issue with weight transposition in Reinforcement Learning scenarios by making the weight resumption conditional. The change introduces a check to ensure that weight transposition only occurs when the 'weights' tag is specified during the wake_up process. However, there is a critical bug in the implementation of this check. The condition ["weights"] in tags is used, which will not work as intended for a list of strings. It should be 'weights' in tags. This error will prevent the weights from being correctly transposed, defeating the purpose of the fix. I've provided a suggestion to correct this.

gemini-code-assist · 2026-01-04T02:22:14Z

vllm_ascend/worker/worker.py

-                w13_data = param.transpose(1, 2)
-                w13_data = torch.nn.Parameter(w13_data, requires_grad=False)
-                setattr(parent_module, param_name, w13_data)
+        if tags is None or ["weights"] in tags:


The condition ["weights"] in tags is incorrect. Since tags is a list[str], this check evaluates to False for every such list: it asks whether the list ["weights"] is itself an element of tags, and a list never compares equal to a string. The intention seems to be to check for the presence of the "weights" tag, which should be written as 'weights' in tags. This is a critical issue because it could prevent the model weights from being correctly restored, leading to format mismatches and incorrect behavior in RL scenarios.

Suggested change

-        if tags is None or ["weights"] in tags:
+        if tags is None or "weights" in tags:
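To make the difference between the two conditions concrete, here is a minimal standalone sketch. It is not the code in vllm_ascend/worker/worker.py: the helper names are hypothetical, and the "kv_cache" tag is an assumption used only to stand in for a wake-up call that does not request weights.

```python
from typing import Optional


def should_restore_weights_buggy(tags: Optional[list[str]]) -> bool:
    # Hypothetical helper mirroring the original condition.
    # Asks whether the *list* ["weights"] is itself an element of tags.
    return tags is None or ["weights"] in tags


def should_restore_weights_fixed(tags: Optional[list[str]]) -> bool:
    # Hypothetical helper mirroring the suggested condition.
    # Asks whether the *string* "weights" is an element of tags.
    return tags is None or "weights" in tags


if __name__ == "__main__":
    # "kv_cache" is an assumed tag name, used here for illustration only.
    for tags in (None, ["weights"], ["kv_cache"], ["weights", "kv_cache"]):
        print(
            tags,
            should_restore_weights_buggy(tags),
            should_restore_weights_fixed(tags),
        )
```

With the original condition, the weights branch would only run when tags is None; any explicit list of string tags, including one that contains "weights", skips it.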

github-actions · 2026-01-04T03:22:02Z

👋 Hi! Thank you for contributing to the vLLM Ascend project. The following points will speed up your PR merge:

A PR should do only one thing; smaller PRs enable faster reviews.
Every PR should include unit tests and end-to-end tests to ensure it works and is not broken by future PRs.
Write the commit message by filling in the PR description to help reviewers and future developers understand the change.

If CI fails, you can run the linting and testing checks locally according to Contributing and Testing.

Signed-off-by: p00465316 <[email protected]>

gemini-code-assist bot reviewed Jan 4, 2026


panchao-hub force-pushed the fix_transpose branch from b7fd92b to 1c06b48 · January 4, 2026 06:12

[Bugfix] Fix weight transpose in RL scenarios

1afd760

Signed-off-by: p00465316 <[email protected]>

panchao-hub force-pushed the fix_transpose branch from 1c06b48 to 1afd760 · January 4, 2026 07:03

wangxiyuan approved these changes Jan 4, 2026


wangxiyuan added the ready and ready-for-test labels Jan 4, 2026

wangxiyuan merged commit 42774df into vllm-project:main Jan 5, 2026
55 checks passed
