Support power of 2 scaling factors in float8 training and use e4m3 everywhere #7876

test (CUDA 2.5.1, linux.g5.12xlarge.nvidia.gpu, torch==2.5.1 --index-url https://download.pytorch...  /  linux-job

succeeded Feb 5, 2025 in 1h 12m 46s