
Conversation


@LICO1314 LICO1314 commented Jan 30, 2026

What this PR does / why we need it?

  • Align actual_seq_lengths_q with padded query token count when running full-graph FIA updates to satisfy TND layout requirements.
  • Move the alignment helper into vllm_ascend/attention/utils.py so future FIA-based update paths can reuse it.
  • This prevents queryT != actualSequenceLengthQ[-1] mismatches that lead to aclnnFusedInferAttentionScoreV3 tiling failures under full-graph mode.
  • Fixes # (optional: link CI failure or issue if you have one)
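To make the alignment concrete, here is an illustrative sketch of what a helper like `align_actual_seq_lengths_q` needs to guarantee: the last cumulative entry of `actual_seq_lengths_q` must equal the padded query token count, since FIA with TND layout requires `actualSequenceLengthQ[-1] == queryT`. This is a hypothetical minimal version for explanation only; the real helper in `vllm_ascend/attention/utils.py` may differ in signature and details.

```python
def align_actual_seq_lengths_q(actual_seq_lengths_q: list[int],
                               padded_num_tokens: int) -> list[int]:
    """Align the cumulative query lengths with the padded token count.

    TND-layout FIA requires actual_seq_lengths_q[-1] == queryT, so when the
    query tensor is padded for full-graph capture, the final entry must be
    bumped up to the padded token count or tiling fails.
    """
    if not actual_seq_lengths_q:
        # Empty batch: the whole padded range belongs to padding.
        return [padded_num_tokens]
    aligned = list(actual_seq_lengths_q)
    if aligned[-1] < padded_num_tokens:
        # The padded tail is attributed to the last request's range.
        aligned[-1] = padded_num_tokens
    return aligned

print(align_actual_seq_lengths_q([4, 8], 16))  # [4, 16]
```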

Does this PR introduce any user-facing change?

No. This is a bug fix in full-graph parameter update alignment and does not change public APIs.

How was this patch tested?

Not run locally. The failure was reproduced and observed in CI (full-graph mode with speculative decode); the fix is a targeted alignment change.

@github-actions

👋 Hi! Thank you for contributing to the vLLM Ascend project. The following points will speed up your PR merge:

  • A PR should do only one thing; smaller PRs enable faster reviews.
  • Every PR should include unit tests and end-to-end tests to ensure it works and is not broken by future PRs.
  • Write the commit message by filling out the PR description to help reviewers and future developers understand the change.

If CI fails, you can run the linting and testing checks locally according to Contributing and Testing.


@gemini-code-assist gemini-code-assist bot left a comment


Code Review

This pull request addresses a bug in the full-graph FIA path by ensuring actual_seq_lengths_q is correctly aligned with the padded query token count. The introduction of the align_actual_seq_lengths_q helper function is a good, reusable solution. My review focuses on the correctness and safety of the changes. I've identified a critical issue in full_graph_fia where an IndexError could occur with empty request batches, and I have provided a recommendation to fix it.

Comment on lines 535 to 536

```python
if query.shape[0] > num_tokens:
    num_tokens = int(query.shape[0])
```

critical

The logic to determine num_tokens across lines 534-536 is not safe: attn_metadata.actual_seq_lengths_q can be an empty list when there are no requests in the batch, so the [-1] access on line 534 will raise an IndexError.

This logic can be rewritten to be both safe and more concise. Consider replacing lines 534-536 with the following:

```python
unpadded_num_tokens = attn_metadata.actual_seq_lengths_q[-1] if attn_metadata.actual_seq_lengths_q else 0
num_tokens = max(unpadded_num_tokens, int(query.shape[0]))
```
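The suggested guard can be exercised standalone to see both the empty-batch fallback and the padding-dominates case. This sketch uses a `SimpleNamespace` as a stand-in for the query tensor (only `.shape` is needed), so it does not depend on the actual attention metadata class.

```python
from types import SimpleNamespace

def safe_num_tokens(actual_seq_lengths_q, query):
    # With an empty list, a bare `[-1]` access would raise IndexError;
    # the conditional falls back to 0 instead.
    unpadded_num_tokens = actual_seq_lengths_q[-1] if actual_seq_lengths_q else 0
    # The padded query length wins whenever it exceeds the unpadded count.
    return max(unpadded_num_tokens, int(query.shape[0]))

query = SimpleNamespace(shape=(8, 4))  # stand-in for a padded query tensor
print(safe_num_tokens([], query))       # 8: empty batch no longer crashes
print(safe_num_tokens([3, 6], query))   # 8: padding dominates
print(safe_num_tokens([3, 12], query))  # 12: unpadded count dominates
```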

Align actual_seq_lengths_q with padded query token count in full-graph FIA updates and share the helper in attention utils to avoid TND layout mismatches.

Signed-off-by: lico67373 <[email protected]>
@LICO1314 LICO1314 force-pushed the fix-align-fia-seqlen branch from db8c6f3 to ec7e834 Compare January 30, 2026 02:38
@jianzs jianzs added ready read for review ready-for-test start test by label for PR labels Jan 30, 2026