waybarrios / vllm-mlx Public

Fork 174
Star 699

Pull requests: waybarrios/vllm-mlx

Labels 12 Milestones 0

53 Open 106 Closed

Pull requests list

fix: report prompt_tokens correctly for LLM models in SimpleEngine

#236 opened Mar 30, 2026 by sjswerdloff

3 tasks

perf(reasoning): O(1) state-machine streaming parser (13-19x faster at 2k+ tokens)

#234 opened Mar 29, 2026 by penumbraforge

Add TurboQuant KV cache compression for prefix cache (4.6x)

#233 opened Mar 29, 2026 by arozanov

9 tasks done

fix: suppress tool call XML from streaming text content (#129)

#232 opened Mar 29, 2026 by sjswerdloff

feat: add MiniMax tool call parsing support

#231 opened Mar 29, 2026 by sjswerdloff

fix: add missing return in load_model_with_fallback

#230 opened Mar 29, 2026 by sjswerdloff

fix: populate tokens field in BatchedEngine.generate()

#229 opened Mar 28, 2026 by mmcaulif

3 tasks done

fix: bump mlx-lm minimum to 0.31.0 for hybrid model batching

#227 opened Mar 25, 2026 by krystophny

test: make Python 3.13 async suite pass and cover it in CI

#226 opened Mar 25, 2026 by krystophny

fix: MLLM hybrid batching + message normalization

#224 opened Mar 25, 2026 by Thump604

feat: MTP per-request routing in BatchedEngine

#223 opened Mar 24, 2026 by Thump604

2 of 3 tasks

simple-engine: keep tool chat on the streaming execution path

#222 opened Mar 24, 2026 by krystophny

scheduler: preserve prompt checkpoints in chunked prefill resume path

#221 opened Mar 24, 2026 by krystophny

engine: keep SimpleEngine serialized across cancellation

#220 opened Mar 24, 2026 by krystophny

chat: forward chat_template_kwargs on simple-engine paths

#218 opened Mar 24, 2026 by krystophny

prefix_cache: preserve hybrid recurrent state across blocks

#217 opened Mar 24, 2026 by krystophny

cli: expose harmony and gpt-oss tool parsers

#216 opened Mar 24, 2026 by krystophny

tokenizer: return successful mlx-lm load result

#215 opened Mar 24, 2026 by krystophny

server: add OpenAI-compatible /v1/responses endpoint

#214 opened Mar 24, 2026 by krystophny

feat: full sampling parameter support (top_k, min_p, presence_penalty, repetition_penalty)

#213 opened Mar 23, 2026 by Thump604

5 tasks done

fix: respect tool_choice="none" by excluding tools from template

#210 opened Mar 23, 2026 by awanawana

fix: Don’t truncate base64 images before hashing

#206 opened Mar 22, 2026 by BelieveDiffusion

feat: add lifecycle-managed residency for the default server model

#205 opened Mar 22, 2026 by lyonsno

fix: skip RNN snapshots in MTP optimistic mode to prevent memory leak

#196 opened Mar 21, 2026 by Thump604

4 tasks

fix: streaming detokenizer for UTF-8-safe incremental decode

#195 opened Mar 21, 2026 by Thump604

5 tasks
