Pull requests: waybarrios/vllm-mlx

Pull requests list

Add TurboQuant KV cache compression for prefix cache (4.6x)
#233 opened Mar 29, 2026 by arozanov
9 tasks done
feat: add MiniMax tool call parsing support
#231 opened Mar 29, 2026 by sjswerdloff
fix: add missing return in load_model_with_fallback
#230 opened Mar 29, 2026 by sjswerdloff
fix: populate tokens field in BatchedEngine.generate()
#229 opened Mar 28, 2026 by mmcaulif
3 tasks done
feat: MTP per-request routing in BatchedEngine
#223 opened Mar 24, 2026 by Thump604
2 of 3 tasks
cli: expose harmony and gpt-oss tool parsers
#216 opened Mar 24, 2026 by krystophny
tokenizer: return successful mlx-lm load result
#215 opened Mar 24, 2026 by krystophny
server: add OpenAI-compatible /v1/responses endpoint
#214 opened Mar 24, 2026 by krystophny