Pull requests: vllm-project/vllm
[Model] Remove redundant None check in DeepSeekOCR image input processing
deepseek
Related to DeepSeek models
#32016
opened Jan 9, 2026 by
maang-h
[Fix] Qwen3-VL-MoE bitsandbytes 4 bit quant
qwen
Related to Qwen models
#32013
opened Jan 9, 2026 by
Datta0
1 of 5 tasks
[MISC] Add strict contiguity check for FlashInfer attention tensors
nvidia
v1
#32008
opened Jan 9, 2026 by
vadiklyutiy
Reduce the kernel overhead when num of active loras is smaller than max loras. Multiple cuda graphs are captured for each num of active-loras.
nvidia
v1
#32005
opened Jan 9, 2026 by
yugong333
5 tasks
fused_moe_kernel - cast accumulator after applying router weights
#32002
opened Jan 9, 2026 by
gnovack
[ROCm][CI][V1] Fix nixl_connector test failure and achieve CUDA parity in test_async_scheduling
rocm
Related to AMD ROCm
speculative-decoding
v1
kv-connector
nvidia
ready
ONLY add when PR is ready to merge/full CI is needed
#32000
opened Jan 9, 2026 by
AndreasKaratzas
Fix type error
fb-exported
frontend
meta-exported
ready
ONLY add when PR is ready to merge/full CI is needed
#31999
opened Jan 8, 2026 by
Adolfo-Karim
[Misc] Enable async scheduling by default with spec decoding
ready
ONLY add when PR is ready to merge/full CI is needed
#31998
opened Jan 8, 2026 by
njhill
[ROCM] Add ROCm image build to release pipeline
ci/build
rocm
Related to AMD ROCm
#31995
opened Jan 8, 2026 by
dllehr-amd
5 tasks
fix lora moe sharding when rank < max_lora_rank
gpt-oss
Related to GPT-OSS models
ready
ONLY add when PR is ready to merge/full CI is needed
#31994
opened Jan 8, 2026 by
gnovack
[Misc][PD] Fix get_attn_backend usage in transfer connectors
kv-connector
#31988
opened Jan 8, 2026 by
NickLucche
[Feature][#29390]: Add timeout support to MultiprocExecutor.collective_rpc and FutureWrapper
v1
#31986
opened Jan 8, 2026 by
SandishKumarHN
5 tasks
[Kernel] Optimize Sliding Window Attention in 3D Triton Kernel
#31984
opened Jan 8, 2026 by
jvlunteren
[Bugfix] Fix Fp8 Triton for non-gated MoE (Nemotron)
#31983
opened Jan 8, 2026 by
danisereb
5 tasks
[Misc] Clean up world_size > avail_gpu warning for ray
v1
#31981
opened Jan 8, 2026 by
ruisearch42
5 tasks
[Model] Reorganize pooling layers
ci/build
documentation
Improvements or additions to documentation
qwen
Related to Qwen models
ready
ONLY add when PR is ready to merge/full CI is needed
v1
#31973
opened Jan 8, 2026 by
DarkLight1337
5 tasks
[CPU] Add head sizes 80 and 112 with vec16 fallback
cpu
Related to CPU backends
v1
#31968
opened Jan 8, 2026 by
R3hankhan123
5 tasks
[KVConnector] Support worker -> scheduler metadata
kv-connector
v1
#31964
opened Jan 8, 2026 by
orozery