Commit e76db70
add axiswise scaling to Float8Linear (#920)
Summary:
This PR: support axiswise scaling for all arguments of all gemms,
and ensure that training with axiswise scaling works e2e.
Future PR: support more granular configurability, optimize
performance, and add docs.
Feel free to ignore the UX introduced in this PR, it's just an intermediate step. See next PR for the real UX.
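For readers unfamiliar with the terminology, here is a minimal sketch of the difference between tensorwise and axiswise scaling. It is not the code from this PR: it uses plain Python lists instead of tensors, and the helper names `tensorwise_scale` and `axiswise_scales`, the `EPS` guard, and the `fp8_max / amax` scale convention are assumptions made for illustration. The only hard fact it relies on is that 448 is the largest finite value of float8 e4m3fn.

```python
E4M3_MAX = 448.0  # largest finite value representable in float8 e4m3fn
EPS = 1e-12       # assumed guard against division by zero for all-zero slices


def tensorwise_scale(x):
    """Return a single scale for the whole 2D matrix (a list of rows)."""
    amax = max(abs(v) for row in x for v in row)
    return E4M3_MAX / max(amax, EPS)


def axiswise_scales(x, dim):
    """Return one scale per slice, taking the abs-max along `dim`.

    dim=1 (or -1) reduces along each row, giving one scale per row;
    dim=0 reduces along each column, giving one scale per column.
    """
    if dim in (1, -1):
        slices = x
    elif dim == 0:
        slices = list(zip(*x))  # transpose so each slice is a column
    else:
        raise ValueError("dim must be 0, 1, or -1 for a 2D matrix")
    return [E4M3_MAX / max(max(abs(v) for v in s), EPS) for s in slices]


x = [
    [0.5, -2.0, 1.0],
    [100.0, 3.0, -7.0],
]

t_scale = tensorwise_scale(x)          # dominated by the outlier 100.0
row_scales = axiswise_scales(x, dim=-1)
col_scales = axiswise_scales(x, dim=0)

print(t_scale)     # 4.48
print(row_scales)  # [224.0, 4.48]
```

With a single tensorwise scale, the outlier `100.0` forces the small-magnitude first row to use the same scale of `4.48`. With axiswise scaling that row gets its own scale of `224.0`, so it uses much more of the float8 range, at the cost of computing and storing one scale per slice.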
Test Plan:
```
// tests pass
./test/float8/test_everything.sh
// sanity check on torchtitan with LLaMa 3 8B on 4 H100s with float8:
// 1. verify performance does not regress with tensorwise scaling
// 2. smoke test that axiswise scaling works and numerics are sane, performance isn't there though
// logs: https://gist.github.com/vkuzo/70fa5eb3c23375f307d11e7bae48682f
```
Reviewers:
Subscribers:
Tasks:
Tags:
1 parent f81fe11 commit e76db70
9 files changed (+462, -55 lines) under:
- benchmarks/float8
- test/float8
- torchao/float8