[Kernel] Make rotary_embedding ops more flexible with input shape #12777

Isotr0py · 2025-02-05T11:14:36Z

Make the rotary_embedding ops work when query and key have shape [seq_len, num_heads, head_dim], in addition to the flattened layout.
Clean up the deepseek-v2/v3 and MLA attention implementations.
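
As a rough illustration of why both layouts can share one kernel: for a contiguous tensor, element (token, head, dim) of a [seq_len, num_heads, head_dim] tensor sits at the same memory offset as the matching element of the flattened [seq_len, num_heads * head_dim] tensor. The sketch below checks this in plain Python. The helper names are hypothetical and not part of vLLM, and non-contiguous (strided) inputs are not covered.

```python
def flat_offset_2d(token, head, dim, num_heads, head_dim):
    """Offset into a contiguous [seq_len, num_heads * head_dim] buffer."""
    row_width = num_heads * head_dim
    column = head * head_dim + dim
    return token * row_width + column


def flat_offset_3d(token, head, dim, num_heads, head_dim):
    """Offset into a contiguous [seq_len, num_heads, head_dim] buffer."""
    return (token * num_heads + head) * head_dim + dim


def layouts_agree(seq_len, num_heads, head_dim):
    """Return True if every element has the same offset in both layouts."""
    return all(
        flat_offset_2d(t, h, d, num_heads, head_dim)
        == flat_offset_3d(t, h, d, num_heads, head_dim)
        for t in range(seq_len)
        for h in range(num_heads)
        for d in range(head_dim)
    )
```

Under this assumption, only the way num_heads is derived from the input shape has to differ between the two layouts.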

Signed-off-by: Isotr0py <[email protected]>

github-actions · 2025-02-05T11:14:49Z

👋 Hi! Thank you for contributing to the vLLM project.

💬 Join our developer Slack at https://slack.vllm.ai to discuss your PR in #pr-reviews, coordinate on features in #feat- channels, or join special interest groups in #sig- channels.

Just a reminder: PRs do not trigger a full CI run by default. Instead, only the fastcheck CI runs, which covers a small, essential subset of CI tests to quickly catch errors. You can run other CI tests on top of those by going to your fastcheck build in the Buildkite UI (linked in the PR checks section) and unblocking them. If you do not have permission to unblock, ping simon-mo or khluu to add you to our Buildkite org.

Once the PR is approved and ready to go, your PR reviewer(s) can run CI to test the changes comprehensively before merging.

To run CI, PR reviewers can either add the ready label to the PR or enable auto-merge.

🚀

tlrmchlsmth

Thanks for the fix! I think it would be best to add some TORCH_CHECKs so we don't hit another illegal memory access like the one you hit when running DeepSeek-VL2, but otherwise this looks good.
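
A minimal sketch of the kind of shape check suggested here, written in plain Python rather than as TORCH_CHECKs in the CUDA source. It assumes the token count is taken from the positions tensor and that num_heads is derived from the remaining elements of query or key. `infer_num_heads` is a hypothetical helper, not the code in csrc/pos_encoding_kernels.cu.

```python
from math import prod


def infer_num_heads(positions_shape, qk_shape, head_size):
    """Derive num_heads from shapes, rejecting inconsistent inputs.

    Accepts both flattened [..., num_heads * head_size] and
    unflattened [..., num_heads, head_size] query/key shapes.
    """
    if head_size <= 0:
        raise ValueError("head_size must be positive")

    num_tokens = prod(positions_shape)
    if num_tokens == 0:
        raise ValueError("positions must contain at least one token")

    total = prod(qk_shape)
    if total % num_tokens != 0:
        raise ValueError("query/key size is not a multiple of num_tokens")

    hidden_size = total // num_tokens
    if hidden_size % head_size != 0:
        raise ValueError("per-token hidden size is not a multiple of head_size")

    return hidden_size // head_size
```

Failing early with a clear error in this way is meant to replace an out-of-bounds read inside the kernel with a readable message at the call site.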

csrc/pos_encoding_kernels.cu

tests/kernels/test_pos_encoding.py

Isotr0py added 3 commits February 5, 2025 12:00

fix pe ops num_heads calculation

8e90319

Signed-off-by: Isotr0py <[email protected]>

add kernel tests

babe8a1

Signed-off-by: Isotr0py <[email protected]>

fix seq_len dim idx

81e3b1d

Signed-off-by: Isotr0py <[email protected]>

Isotr0py added 4 commits February 5, 2025 19:29

add batched test

1e8acce

Signed-off-by: Isotr0py <[email protected]>

Merge branch 'vllm-project:main' into fix-rope-shape

817d642

clean up deepseek

e3f9b42

Signed-off-by: Isotr0py <[email protected]>

make mypy happy

684881e

Signed-off-by: Isotr0py <[email protected]>

Isotr0py marked this pull request as ready for review February 5, 2025 14:09

Isotr0py requested review from tlrmchlsmth and WoosukKwon as code owners February 5, 2025 14:09

tlrmchlsmth approved these changes Feb 5, 2025

csrc/pos_encoding_kernels.cu (resolved)

csrc/pos_encoding_kernels.cu (resolved)

tests/kernels/test_pos_encoding.py (outdated, resolved)

add shape check

2ab9b2c

Signed-off-by: Isotr0py <[email protected]>

Isotr0py added the ready label (ONLY add when PR is ready to merge/full CI is needed) Feb 5, 2025
