generated from fastai/nbdev_template
Pull requests: huggingface/trl
Pull requests list
Add second version of Qwen 3.5 chat template to chat_template_utils
#5405
opened Mar 30, 2026 by
apardyl
3 of 8 tasks
Mark as strict the xfail tests with zero3 for RLOO and GRPO
#5404
opened Mar 30, 2026 by
albertvillanova
Fix flaky CI test_rloo[fsdp2]: Replace non-deterministic xfail with skipif for transformers 5.4.0
#5403
opened Mar 30, 2026 by
albertvillanova
Fix CI slow-tests cannot remove: No such file or directory
#5401
opened Mar 30, 2026 by
albertvillanova
Add per-sample tool filtering to GRPOTrainer via tools column
#5398
opened Mar 27, 2026 by
lailanelkoussy
3 tasks done
Add HF_TOKEN environment variable to workflow files
#5397
opened Mar 27, 2026 by
qgallouedec
feat(grpo): add stop_tool_names for immediate agent loop termination
#5390
opened Mar 27, 2026 by
lailanelkoussy
Fix DAPO token-level loss to use prompt-level aggregation
#5381
opened Mar 26, 2026 by
matdou
2 of 5 tasks
[vllm-serve] Add extra_llm_kwargs for passing additional arguments to vllm.LLM()
#5367
opened Mar 25, 2026 by
jonahsamost
1 of 5 tasks
Add chunked LM head for memory-efficient log-prob computation for AsyncGRPOTrainer
#5349
opened Mar 23, 2026 by
AmineDiro
[Test] Fix *test_training_vlm_multi_image* by skipping vision params in assertion
#5341
opened Mar 22, 2026 by
YangKai0616
Fix Liger kernel crash with device_map="auto" on multi-GPU in GRPOTrainer
#5340
opened Mar 22, 2026 by
YangKai0616
Support multimodal tool responses in environment_factory for VLM training
#5323
opened Mar 20, 2026 by
sergiopaniego
5 tasks
(4/5) async grpo break out of generation loop (is_done)
#5321
opened Mar 20, 2026 by
AmineDiro
(1/5) Add callback to sync weights before training begins
#5319
opened Mar 20, 2026 by
AmineDiro