[Refactor] Add expert processed token count output for DispatchFFNCombine/DispatchFFNCombineBF16 #6402
base: main
Conversation
👋 Hi! Thank you for contributing to the vLLM Ascend project. The following points will speed up your PR merge:
If CI fails, you can run linting and testing checks locally according to Contributing and Testing.
Code Review
This pull request introduces a new output, expert_token_nums, to the DispatchFFNCombine operator and its BF16 variant. The changes correctly propagate this new output through the operator's definition, host-side functions, and kernel implementations. However, there are a few issues to address. Firstly, two header files contain redundant inclusions, which should be cleaned up for better code quality. More importantly, adding a new required output is a breaking API change. It appears that downstream consumers, such as the Python bindings and tests, have not been updated to reflect this change, which will likely cause them to fail.
csrc/dispatch_ffn_combine_bf16/op_host/dispatch_ffn_combine_bf16_def.cpp
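To illustrate the consumer-side impact of the breaking change noted above, here is a minimal sketch of how a Python call site could be adapted to unpack the new required output. The wrapper name, the binding handle `op`, and its argument list are hypothetical and are not the actual vLLM Ascend bindings.

```python
import torch

def call_dispatch_ffn_combine(op, *args, **kwargs):
    """Unpack both outputs of the refactored operator.

    `op` stands in for the real binding (e.g. a registered custom op);
    its name and argument list are assumptions made for this sketch.
    """
    # The operator now returns the FFN result plus the new required
    # expert_token_nums output, so single-value unpacking breaks.
    output, expert_token_nums = op(*args, **kwargs)
    # One int32 count per local expert, per the PR description.
    assert expert_token_nums.dtype == torch.int32
    return output, expert_token_nums
```

Tests that currently check only the FFN output could additionally compare expert_token_nums against a host-side reference count.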
Signed-off-by: guanguan0308 <[email protected]>
This pull request has conflicts; please resolve them before we can evaluate the pull request.
What this PR does / why we need it?
Add New Output for Expert Token Count
An additional output tensor, expert_token_nums, is added to both operators to track how tokens are distributed among experts (a host-side reference check is sketched after the field list below):
Tensor Name: expert_token_nums
Dimension: 1D tensor
Shape: (local_expert_num,)
Data Type: int32
Semantics: Represents the number of tokens actually received by each expert on the current card.
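As a hedged illustration of the documented semantics (not the operator's kernel logic), the expected per-expert counts can be reproduced on the host from the token-to-expert assignment. The input names and the local-expert index range below are assumptions for this sketch.

```python
import torch

def reference_expert_token_nums(expert_ids: torch.Tensor,
                                local_expert_start: int,
                                local_expert_num: int) -> torch.Tensor:
    """Count how many dispatched tokens land on each local expert.

    expert_ids: integer tensor holding the expert index chosen for each
                token routed to this card (name and shape are assumptions
                for this sketch, not the operator's real inputs).
    Returns an int32 tensor of shape (local_expert_num,), matching the
    documented shape and dtype of expert_token_nums.
    """
    local_ids = expert_ids - local_expert_start
    mask = (local_ids >= 0) & (local_ids < local_expert_num)
    counts = torch.bincount(local_ids[mask], minlength=local_expert_num)
    return counts.to(torch.int32)

# Example: 4 local experts starting at index 0, tokens routed to 0, 1, 1, 3, 2, 1
print(reference_expert_token_nums(torch.tensor([0, 1, 1, 3, 2, 1]), 0, 4))
# -> tensor([1, 3, 1, 1], dtype=torch.int32)
```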
Does this PR introduce any user-facing change?
How was this patch tested?