huggingface / trl Public

generated from fastai/nbdev_template


Pull requests: huggingface/trl

75 Open 1,397 Closed

Pull requests list

🏃 Faster CI

#3160 opened Mar 25, 2025 by qgallouedec

5 tasks

👨‍🍳 vLLM serve: destroy process group on exit and pass worker_cls as string

#3159 opened Mar 25, 2025 by qgallouedec

💰 Richer rich table - log all the rewards

#3156 opened Mar 24, 2025 by qgallouedec

5 tasks

fix: handle None inputs when resuming GRPO Trainer from checkpoint

#3148 opened Mar 24, 2025 by PenutChen

4 tasks done

Fix: Compatibility for formatting_func returning a list

#3147 opened Mar 24, 2025 by YeFD

4 of 5 tasks

Fix length bias for Dr GRPO

#3138 opened Mar 23, 2025 by idoru

5 tasks

Extend BCO Trainer dataset format support

#3134 opened Mar 22, 2025 by reihig-ut

1 of 5 tasks

Add GRPO/ Online DPO support for quantitative models when use vllm as infer backbone.

#3133 opened Mar 22, 2025 by maoulee

💎 Gemma 3 VLM SFT example script for single-image and multi-image

#3131 opened Mar 21, 2025 by sergiopaniego

5 tasks

improvement(utils.py): simplify repeating completion string

#3122 opened Mar 20, 2025 by tpoisonooo

feat: Add Interleaved Trainer implementation

#3107 opened Mar 18, 2025 by ucalyptus2

3 tasks done

Co-Locating vLLM Instances with Training Processes Via External Launcher

#3105 opened Mar 18, 2025 by toslali-ibm

2 of 5 tasks

Update sft trainer to include better packing

#3100 opened Mar 17, 2025 by Ishan-Kumar2

4 tasks done

add cli dict parsing for grpo_config

#3082 opened Mar 14, 2025 by Tavish9 • Draft

2 of 5 tasks

[GRPO] add vlm training capabilities to the trainer

#3072 opened Mar 13, 2025 by CompN3rd

3 of 5 tasks

[WIP] PEFT 🤝 Liger DPO

#3065 opened Mar 12, 2025 by SalmanMohammadi • Draft

5 tasks

Static cache GRPO

#3023 opened Mar 7, 2025 by qgallouedec • Draft

5 tasks

[WIP] Iterative training scripts for SPIN and SPPO

#3011 opened Mar 5, 2025 by jkx19 • Draft

3 of 5 tasks

Fixing GRPO reward_func being a model with DeepSpeed ZeRO-3

#2984 opened Feb 28, 2025 by jamesbraza

Feature: Add SGLang as inference backend for generation in GRPO

#2981 opened Feb 28, 2025 by jhinpan

5 tasks done

Support ReMax Algorithm

#2955 opened Feb 25, 2025 by liziniu

3 tasks done

[Models] Activation checkpointing from TorchTune

#2954 opened Feb 25, 2025 by kashif

Agents

#2936 opened Feb 23, 2025 by August-murr

Add the metrics completion_length_max and completion_length_min

#2930 opened Feb 22, 2025 by dignfei

4 tasks

Liger GRPO support

#2926 opened Feb 21, 2025 by SalmanMohammadi • Draft

4 tasks
