
AutoTriton: Automatic Triton Programming with Reinforcement Learning in LLMs

🤗 Model (HuggingFace) 📑 Paper (arXiv:2507.05687)

Overview


In this work, we introduce AutoTriton, the first model dedicated to Triton programming powered by reinforcement learning (RL). AutoTriton first performs supervised fine-tuning (SFT) on data collected through a high-quality gathering pipeline to acquire essential Triton programming expertise, and then conducts RL with the Group Relative Policy Optimization (GRPO) algorithm, combining a rule-based reward and an execution-based reward to further improve its Triton programming ability. Experiments across five evaluation channels of TritonBench and KernelBench show that our 8B model AutoTriton achieves performance comparable to mainstream large models, including Claude-4-Sonnet and DeepSeek-R1-0528.
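To make the reward design concrete, here is a minimal sketch of how a rule-based signal (does the completion contain a well-formed `@triton.jit` kernel?) and an execution-based signal (does the kernel compile and match a reference?) could be combined into a single scalar reward for GRPO. The function names, thresholds, and weights are illustrative assumptions, not the paper's exact implementation.

```python
import re

def rule_based_reward(completion: str) -> float:
    """Illustrative rule-based check: reward the completion only if it
    contains a @triton.jit-decorated function definition."""
    has_jit = "@triton.jit" in completion
    has_def = re.search(r"@triton\.jit\s*\ndef\s+\w+", completion) is not None
    return 1.0 if (has_jit and has_def) else 0.0

def execution_based_reward(compiled_ok: bool, outputs_match: bool) -> float:
    """Illustrative execution-based check: full credit only when the
    generated kernel compiles AND its outputs match a reference kernel;
    partial credit for compiling alone."""
    if not compiled_ok:
        return 0.0
    return 1.0 if outputs_match else 0.1

def combined_reward(completion: str, compiled_ok: bool, outputs_match: bool,
                    w_rule: float = 0.3, w_exec: float = 0.7) -> float:
    # Weighted sum of the two signals; the weights here are placeholders.
    return (w_rule * rule_based_reward(completion)
            + w_exec * execution_based_reward(compiled_ok, outputs_match))
```

In an RL loop, this scalar would be computed per sampled completion and fed to the GRPO update, which normalizes rewards within each group of samples for the same prompt.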

Model Use

We release the model weights of AutoTriton, which is trained from Seed-Coder-8B-Reasoning.
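Since the checkpoint is a standard causal language model, it should load through the usual Hugging Face `transformers` API. The sketch below is an assumption about usage, not an official recipe: the model identifier is a placeholder (take the real one from the model card linked above), and the prompt template is illustrative.

```python
MODEL_ID = "<huggingface-model-id>"  # placeholder -- see the model card for the actual ID

def build_prompt(task: str) -> str:
    # Illustrative prompt template, not the official one.
    return (
        "Write a Triton kernel for the following task.\n"
        f"Task: {task}\n"
        "Return complete, runnable Python code using @triton.jit."
    )

def generate_kernel(task: str, max_new_tokens: int = 1024) -> str:
    # Imports deferred so the prompt helper above is usable standalone.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(MODEL_ID, device_map="auto")
    inputs = tokenizer(build_prompt(task), return_tensors="pt").to(model.device)
    out = model.generate(**inputs, max_new_tokens=max_new_tokens)
    # Decode only the newly generated tokens, dropping the prompt.
    return tokenizer.decode(out[0][inputs["input_ids"].shape[1]:],
                            skip_special_tokens=True)
```

The generated text can then be scored or executed directly, e.g. against the TritonBench or KernelBench harnesses.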

Contact

For any questions, you can contact qshi9510@gmail.com.

Citation

If you find this work useful, consider giving this repository a star ⭐️ and citing 📝 our paper as follows:

@article{li2025autotriton,
  title={AutoTriton: Automatic Triton Programming with Reinforcement Learning in LLMs},
  author={Li, Shangzhan and Wang, Zefan and He, Ye and Li, Yuxuan and Shi, Qi and Li, Jianling and Hu, Yonggang and Che, Wanxiang and Han, Xu and Liu, Zhiyuan and others},
  journal={arXiv preprint arXiv:2507.05687},
  year={2025}
}

Acknowledgement

This work was initiated and supported by the AI9Stars Team. We are grateful for the support of the OpenBMB and InfiniteTensor teams.