Commit 2937167
Add support for DDP and experimental CompiledAutograd
Summary:
Address the comments in #319 and resubmit the PR to fit the current code base.
Test Plan:
```
CONFIG_FILE=./train_configs/debug_model.toml ./run_llama_train.sh --comm.train_timeout_seconds=3600 --training.tensor_parallel_degree=1 --training.data_parallel_degree=8 --experimental.data_parallel_type=ddp --training.steps=1000 --metrics.log_freq=10 --profiling.profile_freq=1000
```
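As a reading aid, the dotted overrides in the test plan can be viewed as section/key pairs layered over the TOML config. The following is a minimal Python sketch under stated assumptions, not the project's actual config parser: `parse_overrides` and `required_world_size` are hypothetical helpers, and the rule that the run needs data-parallel degree times tensor-parallel degree ranks is an assumption that ignores any other parallel dimension.

```python
import shlex

# Command-line overrides copied from the test plan above.
TEST_PLAN_FLAGS = (
    "--comm.train_timeout_seconds=3600 --training.tensor_parallel_degree=1 "
    "--training.data_parallel_degree=8 --experimental.data_parallel_type=ddp "
    "--training.steps=1000 --metrics.log_freq=10 --profiling.profile_freq=1000"
)


def parse_overrides(flags: str) -> dict:
    """Hypothetical helper: turn `--section.key=value` flags into a nested dict."""
    config: dict = {}
    for token in shlex.split(flags):
        if not token.startswith("--") or "=" not in token:
            raise ValueError(f"unexpected token: {token!r}")
        dotted, value = token[2:].split("=", 1)
        section, key = dotted.split(".", 1)
        # Keep plain integers as int and everything else as a string.
        parsed = int(value) if value.isdigit() else value
        config.setdefault(section, {})[key] = parsed
    return config


def required_world_size(config: dict) -> int:
    """Assumption: ranks needed = data-parallel degree * tensor-parallel degree."""
    training = config["training"]
    return training["data_parallel_degree"] * training["tensor_parallel_degree"]


overrides = parse_overrides(TEST_PLAN_FLAGS)
print(overrides["experimental"]["data_parallel_type"])
print(required_world_size(overrides))
```

Under these assumptions, the test plan selects `ddp` as the data-parallel type and implies an 8-rank job (8 data-parallel replicas, no tensor parallelism).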
ghstack-source-id: 81dc85d
Pull Request resolved: #432
1 parent 4a2de42 commit 2937167
File tree
6 files changed: +79 -8 lines changed
- torchtitan
  - parallelisms
| File (in diff order) | Added | Removed | Changed lines (new numbering) |
|---|---|---|---|
| 1 | +1 | -0 | 74 |
| 2 | +9 | -0 | 307-315 |
| 3 | +11 | -0 | 315-325 |
| 4 | +3 | -0 | 31, 34, 47 |
| 5 | +33 | -3 | 19-20, 458, 464-466, 499-521, 546-549 |
| 6 | +22 | -5 | 141-156, 179, 214-216, 384, 401 |