
Pull requests: huggingface/trl


Pull requests list

Add second version of Qwen 3.5 chat template to chat_template_utils
#5405 opened Mar 30, 2026 by apardyl
3 of 8 tasks
Remove xfail for Qwen3VL CI tests
#5402 opened Mar 30, 2026 by albertvillanova
Add per-sample tool filtering to GRPOTrainer via tools column
#5398 opened Mar 27, 2026 by lailanelkoussy
3 tasks done
Better test consistency RLOO vs GRPO
#5396 opened Mar 27, 2026 by qgallouedec
Add tool calling support to RLOOTrainer
#5395 opened Mar 27, 2026 by qgallouedec
Update "What's New": TRL v1 blog post
#5385 opened Mar 27, 2026 by qgallouedec
Remove xfail for ZeRO 2 and 3 + SFT + PEFT test
#5383 opened Mar 27, 2026 by qgallouedec
Fix DAPO token-level loss to use prompt-level aggregation
#5381 opened Mar 26, 2026 by matdou
2 of 5 tasks
Remove truncation_mode from DPO
#5372 opened Mar 25, 2026 by albertvillanova
Add more generic device support for CI tests
#5357 opened Mar 24, 2026 by kaixuanliu
Enable Tensor Parallelism in SFT script
#5331 opened Mar 21, 2026 by songhappy
(5/5) async grpo metrics
#5322 opened Mar 20, 2026 by AmineDiro
(3/5) Cancel Stale inflight tasks
#5320 opened Mar 20, 2026 by AmineDiro