Pull requests: vllm-project/vllm

Pull requests list

[CPU][Zen] Route Int8 MoE inference through zentorch on AMD cpu Related to CPU backends gpt-oss Related to GPT-OSS models rocm Related to AMD ROCm v1
#44834 opened Jun 8, 2026 by ganeshr10 Contributor Draft
4 tasks
Bump the minor-update group across 1 directory with 148 updates ci/build dependencies Pull requests that update a dependency file nvidia rocm Related to AMD ROCm
#44832 opened Jun 8, 2026 by dependabot Bot
[Kernel][Perf] Tune fused_moe FP8 config for Qwen3-Next-80B tp=4 on H100 (+25% at batch 96-512) qwen Related to Qwen models
#44830 opened Jun 8, 2026 by qyYue1389 Contributor
Bump actions/checkout from 6.0.1 to 6.0.3 ci/build dependencies Pull requests that update a dependency file github_actions Pull requests that update GitHub Actions code
#44829 opened Jun 8, 2026 by dependabot Bot
[BugFix] Use served model name in gemma4 audio-tower error message bug Something isn't working multi-modality Related to multi-modality (#4194) ready ONLY add when PR is ready to merge/full CI is needed
#44828 opened Jun 8, 2026 by llsj14 Contributor
4 tasks
[ROCm][CI] Defer AITER sampler import and isolate server test PYTHONPATH rocm Related to AMD ROCm v1
#44823 opened Jun 8, 2026 by AndreasKaratzas Member
fix: prefix DeepSeek V4 MTP projections deepseek Related to DeepSeek models
#44821 opened Jun 8, 2026 by he-yufeng Contributor
[Bugfix][CI] Retry cached HF tokenizer load after transport failures bug Something isn't working
#44820 opened Jun 8, 2026 by AndreasKaratzas Member
[CI] Consolidate multimodal entrypoint tests. ci/build
#44819 opened Jun 8, 2026 by noooop Collaborator
4 tasks
[Perf] Add H200 BF16 fused MoE configs for Gemma4 (E=128,N=704)
#44818 opened Jun 8, 2026 by LucasWilkinson Collaborator
[Model Runner V2][Spec Decode]Support peagle spec decode needs-rebase v1
#44816 opened Jun 8, 2026 by wxsIcey Contributor
4 tasks
[Bugfix] Fix layerwise reload dropping params after a composed weight loader bug Something isn't working
#44814 opened Jun 7, 2026 by hallerite Contributor
[Bugfix] Replace sequential port scan with atomic port=0 in get_open_ports_list bug Something isn't working
#44813 opened Jun 7, 2026 by rguiu
3 of 4 tasks
fix: stop thinking budget at implicit tool calls qwen Related to Qwen models v1
#44812 opened Jun 7, 2026 by he-yufeng Contributor
fix: strip Gemma4 delimiters from dict keys tool-calling
#44811 opened Jun 7, 2026 by he-yufeng Contributor
[ROCm][CI] Re-route NixlConnector jobs ci/build kv-connector ready ONLY add when PR is ready to merge/full CI is needed rocm Related to AMD ROCm v1
#44809 opened Jun 7, 2026 by AndreasKaratzas Member
[ROCm][gpt-oss] Hybrid CDNA4 swizzle gate for A8W4 MoE gpt-oss Related to GPT-OSS models rocm Related to AMD ROCm
#44804 opened Jun 7, 2026 by xiaohuguo2023