This repository was archived by the owner on Aug 7, 2024. It is now read-only.

Commit df940ae

this is failing in backward for some reason with device context failure
1 parent: 4526eb8

2 files changed (+3, -1 lines)

float8_experimental/config.py

Lines changed: 3 additions & 0 deletions
@@ -20,4 +20,7 @@
 # dynamic_use_activation_hooks = True
 # dynamic_use_activation_hooks = False

+# This is a global flag that controls whether the fused_cast kernels are used.
+# This can offer greater performance in eager, but it is still recommended
+# that you set this to False if you are using torch.compile.
 use_fused_cast = True
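
A minimal sketch of how this flag might be toggled at runtime, assuming float8_experimental.config is imported as a plain Python module (the flag name matches this diff; everything else is illustrative):

import float8_experimental.config

# Per the added comment: keep the default (True) for eager execution, but set
# the flag to False before compiling the model with torch.compile.
float8_experimental.config.use_fused_cast = False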

float8_experimental/float8_utils.py

Lines changed: 0 additions & 1 deletion
@@ -72,7 +72,6 @@ def amax_history_to_scale_stack(
 def tensor_to_amax(x, distributed_reduction=False):
     if float8_experimental.config.use_fused_cast and x.is_cuda:
         from float8_experimental.fused_kernels.fused_casting_kernels import abs_max
-
         amax = abs_max(x)
     else:
         amax = x.abs().max()
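
A minimal usage sketch of the two code paths above, assuming tensor_to_amax returns the computed amax and that a CUDA device is available (both are assumptions, not stated in this diff):

import torch
import float8_experimental.config
from float8_experimental.float8_utils import tensor_to_amax

x = torch.randn(64, 64, device="cuda")

float8_experimental.config.use_fused_cast = True   # fused abs_max kernel path
amax_fused = tensor_to_amax(x)

float8_experimental.config.use_fused_cast = False  # eager fallback: x.abs().max()
amax_eager = tensor_to_amax(x)

# The two paths should agree on the amax value.
torch.testing.assert_close(amax_fused, amax_eager)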
