[pref] calculate local_total_toks in build meata #5543

ader47 · 2025-12-31T01:56:28Z

What this PR does / why we need it?

Does this PR introduce any user-facing change?

How was this patch tested?

vLLM version: v0.13.0
vLLM main: vllm-project/vllm@7157596

Signed-off-by: F.Liu <[email protected]>

gemini-code-assist

Code Review

This pull request introduces a performance optimization by pre-calculating local_total_toks during the metadata build process, which avoids redundant calculations in _compute_prefill_context. The changes are logical and correctly implemented. I've added suggestions for a further minor optimization to avoid a redundant sum() operation during the initial calculation of local_total_toks.

gemini-code-assist · 2025-12-31T01:58:13Z

vllm_ascend/attention/attention_cp.py

                                                                         dcp_rank]
                actual_seq_lengths_kv = torch.cumsum(
                    local_chunked_kv_lens_rank, dim=0).tolist()
+                local_total_toks = local_chunked_kv_lens_rank.sum()


To avoid a redundant sum() operation, you can obtain local_total_toks from actual_seq_lengths_kv, which is calculated just before. The total sum is the last element of the cumulative sum list. This change will make local_total_toks an integer. Note that a corresponding change is needed at the call site to remove the .item() call.

Suggested change

- local_total_toks = local_chunked_kv_lens_rank.sum()
+ local_total_toks = actual_seq_lengths_kv[-1] if actual_seq_lengths_kv else 0
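
The equivalence this suggestion relies on (the last element of a cumulative sum equals the full sum) can be checked with a minimal sketch. It uses `itertools.accumulate` as a pure-Python stand-in for `torch.cumsum(...).tolist()`; the helper name `cumulative_and_total` and the sample lengths are hypothetical and not part of the PR.

```python
from itertools import accumulate


def cumulative_and_total(kv_lens):
    """Return (cumulative lengths, total) from one pass over kv_lens.

    Hypothetical helper for illustration only; it is not in attention_cp.py.
    """
    # Pure-Python stand-in for torch.cumsum(kv_lens, dim=0).tolist()
    cumulative = list(accumulate(kv_lens))
    # The last prefix sum equals the full sum; an empty input yields 0.
    total = cumulative[-1] if cumulative else 0
    return cumulative, total


# Sample per-request KV lengths (made-up values).
actual_seq_lengths_kv, local_total_toks = cumulative_and_total([3, 5, 2])
print(actual_seq_lengths_kv, local_total_toks)
```

The empty-list guard mirrors the `if actual_seq_lengths_kv else 0` clause in the suggestion, since indexing `[-1]` on an empty list would raise an `IndexError`.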

gemini-code-assist · 2025-12-31T01:58:13Z

vllm_ascend/attention/attention_cp.py

                        batch_chunk_seq_mask=batch_chunk_seq_mask,
-                        chunk_seq_mask_filtered_indices=chunk_seq_mask_filtered_indices
+                        chunk_seq_mask_filtered_indices=chunk_seq_mask_filtered_indices,
+                        local_total_toks=local_total_toks.item()


Following the suggestion on line 155, local_total_toks is now a plain Python integer, so the .item() call is no longer needed; leaving it in would raise an AttributeError, because int has no .item() method.

Suggested change

- local_total_toks=local_total_toks.item()
+ local_total_toks=local_total_toks
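
As a rough illustration of why the two suggestions have to land together, the sketch below shows that a plain `int` has no `.item()` method, while a tensor-like scalar does. `FakeScalar` and `as_python_int` are hypothetical names used only here; `FakeScalar` merely imitates the `.item()` method of a zero-dimensional tensor and is not a real torch type.

```python
class FakeScalar:
    """Hypothetical stand-in for a 0-dim tensor: exposes only .item()."""

    def __init__(self, value):
        self._value = value

    def item(self):
        return self._value


def as_python_int(value):
    """Return a plain int from either an int or an object with .item().

    Hypothetical helper, shown only to contrast the two cases.
    """
    if isinstance(value, int):
        # Already a Python int (e.g. the last element of a .tolist() result).
        return value
    # Tensor-like scalar: unwrap it explicitly.
    return int(value.item())


# A plain int does not provide .item(), so calling it would fail.
print(hasattr(10, "item"))
print(as_python_int(10), as_python_int(FakeScalar(10)))
```

In the PR itself no such helper is needed: once local_total_toks is taken from the list, it can be passed through unchanged.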

github-actions · 2025-12-31T02:55:36Z

👋 Hi! Thank you for contributing to the vLLM Ascend project. The following points will speed up your PR merge:

A PR should do only one thing; smaller PRs enable faster reviews.
Every PR should include unit tests and end-to-end tests to ensure it works and is not broken by future PRs.
Write the commit message by filling in the PR description to help reviewers and future developers understand the change.

If CI fails, you can run linting and testing checks locally according to Contributing and Testing.

github-actions · 2026-01-04T08:43:29Z

This pull request has conflicts, please resolve those before we can evaluate the pull request.

F.Liu and others added 2 commits December 25, 2025 14:55

[Bugfix] Fix Qwen accuracy error

c78d083

Signed-off-by: F.Liu <[email protected]>

Merge branch 'vllm-project:main' into main

f9dad48

gemini-code-assist bot reviewed Dec 31, 2025

[pref] calculate local_total_toks in build meata

b1ea888

Signed-off-by: F.Liu <[email protected]>

ader47 force-pushed the total_toks branch from f274cfb to b1ea888 Compare December 31, 2025 02:47

github-actions bot added the merge-conflicts label Jan 4, 2026
