Pull requests: vllm-project/vllm
Pull requests list
Fix: Populate prompt_tokens_details.cached_tokens in Rust frontend Usage
rust
#44833
opened Jun 8, 2026 by
charukeas-t
Bump the minor-update group across 1 directory with 148 updates
ci/build
dependencies
Pull requests that update a dependency file
nvidia
rocm
Related to AMD ROCm
#44832
opened Jun 8, 2026 by
dependabot
Bot
P2P NCCL connector: add hybrid SSM/Mamba model support
kv-connector
#44831
opened Jun 8, 2026 by
KonnyakuMatcha
3 of 4 tasks
[Kernel][Perf] Tune fused_moe FP8 config for Qwen3-Next-80B tp=4 on H100 (+25% at batch 96-512)
qwen
Related to Qwen models
#44830
opened Jun 8, 2026 by
qyYue1389
Contributor
Bump actions/checkout from 6.0.1 to 6.0.3
ci/build
dependencies
Pull requests that update a dependency file
github_actions
Pull requests that update GitHub Actions code
#44829
opened Jun 8, 2026 by
dependabot
Bot
[BugFix] Use served model name in gemma4 audio-tower error message
bug
Something isn't working
multi-modality
Related to multi-modality (#4194)
ready
ONLY add when PR is ready to merge/full CI is needed
#44828
opened Jun 8, 2026 by
llsj14
Contributor
4 tasks
[Platform] Replace torch.cuda.mem_get_info with torch.accelerator.mem_get_info
intel-gpu
Related to Intel GPU
kv-connector
nvidia
v1
#44825
opened Jun 8, 2026 by
jikunshang
Member
•
Draft
4 tasks
[ROCm][CI] Defer AITER sampler import and isolate server test PYTHONPATH
rocm
Related to AMD ROCm
v1
#44823
opened Jun 8, 2026 by
AndreasKaratzas
Member
Feature/cache accounting OpenAI anthropic api
frontend
#44822
opened Jun 8, 2026 by
ajayr4j
fix: prefix DeepSeek V4 MTP projections
deepseek
Related to DeepSeek models
#44821
opened Jun 8, 2026 by
he-yufeng
Contributor
[Bugfix][CI] Retry cached HF tokenizer load after transport failures
bug
Something isn't working
#44820
opened Jun 8, 2026 by
AndreasKaratzas
Member
[CI] Consolidate multimodal entrypoint tests.
ci/build
#44819
opened Jun 8, 2026 by
noooop
Collaborator
4 tasks
[Perf] Add H200 BF16 fused MoE configs for Gemma4 (E=128,N=704)
#44818
opened Jun 8, 2026 by
LucasWilkinson
Collaborator
[Model Runner V2][Spec Decode]Support peagle spec decode
needs-rebase
v1
#44816
opened Jun 8, 2026 by
wxsIcey
Contributor
4 tasks
[Bugfix] Fix layerwise reload dropping params after a composed weight loader
bug
Something isn't working
#44814
opened Jun 7, 2026 by
hallerite
Contributor
[Bugfix] Replace sequential port scan with atomic port=0 in get_open_ports_list
bug
Something isn't working
#44813
opened Jun 7, 2026 by
rguiu
3 of 4 tasks
fix: stop thinking budget at implicit tool calls
qwen
Related to Qwen models
v1
#44812
opened Jun 7, 2026 by
he-yufeng
Contributor
fix: strip Gemma4 delimiters from dict keys
tool-calling
#44811
opened Jun 7, 2026 by
he-yufeng
Contributor
[ROCm][CI] Re-route NixlConnector jobs
ci/build
kv-connector
ready
ONLY add when PR is ready to merge/full CI is needed
rocm
Related to AMD ROCm
v1
#44809
opened Jun 7, 2026 by
AndreasKaratzas
Member
[ROCm][gpt-oss] Hybrid CDNA4 swizzle gate for A8W4 MoE
gpt-oss
Related to GPT-OSS models
rocm
Related to AMD ROCm
#44804
opened Jun 7, 2026 by
xiaohuguo2023
[Core][Spec Decode] SSD: async verify/draft overlap via OutcomePredictor (closes #36037)
speculative-decoding
v1
#44802
opened Jun 7, 2026 by
buddywhitman
3 of 4 tasks
[Rust Frontend]: Add /get_world_size route with static parallel size
rust
v1
#44801
opened Jun 7, 2026 by
coder3101
4 tasks