
Commit 78b0d60

update README (#1121)
This PR updates `README.md` to reflect recent changes in contents and directions. Hope to land before ICLR starts (in a week).
1 parent 5ae50d0 · commit 78b0d60

File tree

3 files changed: +38 -20 lines


README.md

+37 -19
@@ -2,32 +2,47 @@

# torchtitan

- #### A PyTorch native library for large-scale model training
+ #### A PyTorch native platform for training generative AI models

[![integration tests](https://github.com/pytorch/torchtitan/actions/workflows/integration_test_8gpu.yaml/badge.svg?branch=main)](https://github.com/pytorch/torchtitan/actions/workflows/integration_test_8gpu.yaml?query=branch%3Amain)
[![arXiv](https://img.shields.io/badge/arXiv-2410.06511-b31b1b.svg)](https://arxiv.org/abs/2410.06511)
- [![docs](https://img.shields.io/badge/docs-latest-blue.svg)](docs/)
+ [![ICLR](https://img.shields.io/badge/ICLR-2025-blue.svg)](https://iclr.cc/virtual/2025/poster/29620)
[![forum](https://img.shields.io/badge/pytorch-forum-DE3412.svg)](https://discuss.pytorch.org/c/distributed/torchtitan/44)
[![license](https://img.shields.io/badge/license-BSD_3--Clause-lightgrey.svg)](./LICENSE)

</div>

- `torchtitan` is currently in a pre-release state and under extensive development. Currently we showcase pre-training **Llama 3.1** LLMs of various sizes from scratch. To use the latest features of `torchtitan`, we recommend using the most recent PyTorch nightly.
+ `torchtitan` is currently in a pre-release state and under extensive development. We showcase training Llama 3.1 LLMs at scale, and are working on other types of generative AI models, including LLMs with MoE architectures, multimodal LLMs, and diffusion models, in the [`experiments`](torchtitan/experiments) folder.
+ To use the latest features of `torchtitan`, we recommend using the most recent PyTorch nightly.
+
+ ## Latest News
+ - [2025/04] Our paper has been accepted by [ICLR 2025](https://iclr.cc/virtual/2025/poster/29620). The poster will be presented on Friday, April 25th.
+ - [2025/04] [Llama 4](torchtitan/experiments/llama4/) initial support is available as an experiment.
+ - [2025/04] Training the diffusion model [FLUX](torchtitan/experiments/flux/) with FSDP/HSDP is available as an experiment.
+ - [2025/04] The frontend implementation of [SimpleFSDP](torchtitan/experiments/simple_fsdp/), a compiler-based FSDP framework, is available as an experiment.
+ - [2024/12] GPU MODE [lecture](https://www.youtube.com/watch?v=VYWRjcUqW6w) on torchtitan.
+ - [2024/11] [Presentation](https://www.alluxio.io/videos/ai-ml-infra-meetup-torchtitan-one-stop-pytorch-native-solution-for-production-ready-llm-pre-training) at an AI/ML Infra Meetup.
+ - [2024/07] [Presentation](https://pytorch2024.sched.com/event/1fHn3) at PyTorch Conference 2024.
+ - [2024/04] [Intro video](https://youtu.be/ee5DOEqD35I?si=_B94PbVv0V5ZnNKE) - learn more about `torchtitan` in under 4 minutes.

## Overview

- `torchtitan` is a proof-of-concept for large-scale LLM training using native PyTorch. It is (and will continue to be) a repo to showcase PyTorch's latest distributed training features in a clean, minimal codebase. `torchtitan` is complementary to and not a replacement for any of the great large-scale LLM training codebases such as Megatron, MegaBlocks, LLM Foundry, DeepSpeed, etc. Instead, we hope that the features showcased in `torchtitan` will be adopted by these codebases quickly. `torchtitan` is unlikely to ever grow a large community around it.
+ `torchtitan` is a PyTorch native platform designed for **rapid experimentation and large-scale training** of generative AI models. As a minimal clean-room implementation of PyTorch native scaling techniques, `torchtitan` provides a flexible foundation for developers to build upon. With `torchtitan` [extension points](docs/extension.md), one can easily create custom extensions tailored to specific needs.

- Our guiding principles when building `torchtitan`:
+ Our mission is to accelerate innovation in the field of generative AI by empowering researchers and developers to explore new modeling architectures and infrastructure techniques.

+ The guiding principles when building `torchtitan`:
* Designed to be easy to understand, use and extend for different training purposes.
* Minimal changes to the model code when applying multi-dimensional parallelism.
- * Modular components instead of a monolithic codebase.
- * Get started in minutes, not hours!
+ * Bias towards a clean, minimal codebase while providing basic reusable / swappable components.
+
+ `torchtitan` has been showcasing PyTorch's latest distributed training features, via pretraining Llama 3.1 LLMs of various sizes.
+ To accelerate contributions to and innovations around torchtitan, we are hosting a new [`experiments`](torchtitan/experiments) folder. We look forward to your contributions!

- ### Intro video - learn more about `torchtitan` in under 4 mins

- [![Welcome to torchtitan!](assets/images/titan_play_video.png)](https://youtu.be/ee5DOEqD35I?si=_B94PbVv0V5ZnNKE "Welcome to torchtitan!")
+ ## Llama 3.1 pretraining

### Key features available

@@ -37,7 +52,7 @@ Our guiding principles when building `torchtitan`:
   - [Pipeline Parallel](https://discuss.pytorch.org/t/distributed-w-torchtitan-training-with-zero-bubble-pipeline-parallelism/214420)
   - [Context Parallel](https://discuss.pytorch.org/t/distributed-w-torchtitan-breaking-barriers-training-long-context-llms-with-1m-sequence-length-in-pytorch-using-context-parallel/215082)
2. [Meta device](https://pytorch.org/docs/stable/meta.html) initialization
- 3. Selective (layer or operator) activation checkpointing
+ 3. Selective (layer or operator) and full activation checkpointing
4. [Distributed checkpointing](https://discuss.pytorch.org/t/distributed-w-torchtitan-optimizing-checkpointing-efficiency-with-pytorch-dcp/211250) (including async checkpointing)
   - [Interoperable checkpoints](docs/checkpoint.md) which can be loaded directly into [`torchtune`](https://github.com/pytorch/torchtune) for fine-tuning
5. `torch.compile` support
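
For readers new to items 2, 3, and 5 in the list above, the underlying PyTorch mechanics look roughly like the following minimal sketch. This is plain PyTorch on a made-up toy model (`ToyModel` and its sizes are invented for illustration), not torchtitan's actual training code:

```python
import torch
import torch.nn as nn
from torch.utils.checkpoint import checkpoint

# Toy two-layer model; names and dimensions are made up for illustration.
class ToyModel(nn.Module):
    def __init__(self, dim: int = 1024):
        super().__init__()
        self.layer1 = nn.Linear(dim, dim)
        self.layer2 = nn.Linear(dim, dim)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Selective (layer-level) activation checkpointing: layer1's
        # activations are recomputed during backward instead of stored.
        x = checkpoint(self.layer1, x, use_reentrant=False)
        return self.layer2(x)

# Meta-device initialization: build the module without allocating real
# storage, then materialize parameters on the target device.
with torch.device("meta"):
    model = ToyModel()
model = model.to_empty(device="cpu")   # use "cuda" on a GPU machine
for p in model.parameters():           # re-initialize materialized params
    nn.init.normal_(p, std=0.02)

# torch.compile support: JIT-compile the module for faster training steps.
compiled = torch.compile(model)
out = compiled(torch.randn(8, 1024))
out.sum().backward()
```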
@@ -115,21 +130,24 @@ srun torchrun --nnodes 2

If your GPU count per node is not 8, adjust `--nproc_per_node` in the torchrun command and `#SBATCH --gpus-per-task` in the SBATCH command section.

## Citation

- We provide a detailed look into the parallelisms and optimizations available in `torchtitan`, along with summary advice on when to use various techniques: [TorchTitan: One-stop PyTorch native solution for production ready LLM pre-training](https://arxiv.org/abs/2410.06511).
+ We provide a detailed look into the parallelisms and optimizations available in `torchtitan`, along with summary advice on when to use various techniques.
+
+ [TorchTitan: One-stop PyTorch native solution for production ready LLM pre-training](https://openreview.net/forum?id=SFN6Wm7YBI)
```
- @misc{torchtitan,
-   title={TorchTitan: One-stop PyTorch native solution for production ready LLM pre-training},
-   author={Wanchao Liang and Tianyu Liu and Less Wright and Will Constable and Andrew Gu and Chien-Chin Huang and Iris Zhang and Wei Feng and Howard Huang and Junjie Wang and Sanket Purandare and Gokul Nadathur and Stratos Idreos},
-   year={2024},
-   eprint={2410.06511},
-   archivePrefix={arXiv},
-   primaryClass={cs.CL},
-   url={https://arxiv.org/abs/2410.06511},
+ @inproceedings{
+   liang2025torchtitan,
+   title={TorchTitan: One-stop PyTorch native solution for production ready {LLM} pretraining},
+   author={Wanchao Liang and Tianyu Liu and Less Wright and Will Constable and Andrew Gu and Chien-Chin Huang and Iris Zhang and Wei Feng and Howard Huang and Junjie Wang and Sanket Purandare and Gokul Nadathur and Stratos Idreos},
+   booktitle={The Thirteenth International Conference on Learning Representations},
+   year={2025},
+   url={https://openreview.net/forum?id=SFN6Wm7YBI}
}
```

## License

Source code is made available under a [BSD 3 license](./LICENSE), however you may have other legal obligations that govern your use of other content linked in this repository, such as the license or terms of service for third-party data and models.

assets/images/titan_play_video.png

-993 KB (binary file not shown)

docs/extension.md

+1 -1
@@ -1,4 +1,4 @@
- To support quick experimentation with torchtitan, we provide several extension points. The principle for adding these extension points is to support various use cases with flexible component swapping and reuse, while trying to keep the code clean and minimal.
+ To support rapid experimentation with torchtitan, we provide several extension points. The principle for adding these extension points is to support various use cases with flexible component swapping and reuse, while trying to keep the code clean and minimal.

The extension points and protocols mentioned in this note are subject to change.
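
The "flexible component swapping and reuse" idea can be illustrated with a small registry pattern in plain Python. Note this is a hypothetical sketch: names such as `register_optimizer` and `build_optimizer` are invented for illustration and are not torchtitan's actual extension API (see the repo's docs/extension.md for the real protocols):

```python
from typing import Callable, Dict

import torch
import torch.nn as nn

# Hypothetical registry of swappable optimizer builders; torchtitan's
# real extension points differ (see docs/extension.md).
_OPTIMIZER_BUILDERS: Dict[str, Callable[..., torch.optim.Optimizer]] = {}

def register_optimizer(name: str):
    """Decorator that registers an optimizer builder under `name`."""
    def decorator(builder: Callable[..., torch.optim.Optimizer]):
        _OPTIMIZER_BUILDERS[name] = builder
        return builder
    return decorator

@register_optimizer("adamw")
def build_adamw(model: nn.Module, lr: float) -> torch.optim.Optimizer:
    return torch.optim.AdamW(model.parameters(), lr=lr)

# A user extension registers a custom component without touching the
# core training loop:
@register_optimizer("sgd")
def build_sgd(model: nn.Module, lr: float) -> torch.optim.Optimizer:
    return torch.optim.SGD(model.parameters(), lr=lr, momentum=0.9)

def build_optimizer(name: str, model: nn.Module, lr: float) -> torch.optim.Optimizer:
    # The training loop only looks components up by name, so swapping
    # one for another is a config change, not a code change.
    return _OPTIMIZER_BUILDERS[name](model, lr)

optimizer = build_optimizer("sgd", nn.Linear(8, 8), lr=1e-3)
```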
