
Commit 67bebe7

add a deprecation warning for float8 delayed and static scaling
Summary: As titled. The complexity tax for these features is high and there are no known real use cases, as the community is overwhelmingly using dynamic scaling. So, IMO we should deprecate them.

Test Plan: CI

Reviewers:

Subscribers:

Tasks:

Tags:

ghstack-source-id: 2fc91db
ghstack-comment-id: 2641358141
Pull Request resolved: #1681
1 parent 8afd10e commit 67bebe7

2 files changed: +12 −0 lines changed

torchao/float8/README.md

Lines changed: 2 additions & 0 deletions
```diff
@@ -65,6 +65,8 @@ for _ in range(10):
 
 ## float8 linear with delayed scaling
 
+:warning: <em>We plan to deprecate delayed scaling in a future release, see https://github.com/pytorch/ao/issues/1680 for more details.</em>
+
 This is theoretically the most performant recipe as it minimizes memory reads.
 
 ```python
```
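For context on what is being deprecated: the scaling modes differ in where the float8 cast's scale factor comes from. Dynamic scaling recomputes the scale from the current tensor's amax on every cast; delayed scaling reuses a scale derived from amax values observed in earlier iterations, which saves a read of the current tensor but requires extra bookkeeping (the "complexity tax" the commit message refers to). A minimal sketch of the distinction, using plain Python lists rather than torchao's tensor machinery (the function names `dynamic_scale` and `delayed_scale` are illustrative, not torchao APIs):

```python
# Illustrative sketch only -- NOT torchao's implementation. It mirrors the
# idea that dynamic scaling derives the scale from the current tensor,
# while delayed scaling derives it from previously observed amax values.

FP8_E4M3_MAX = 448.0  # largest representable magnitude in float8 e4m3fn


def dynamic_scale(values):
    """Recompute the scale from the current tensor on every cast."""
    amax = max(abs(v) for v in values)
    return FP8_E4M3_MAX / amax


def delayed_scale(amax_history):
    """Reuse a scale derived from amax values of earlier iterations.

    Avoids an extra read of the current tensor, at the cost of
    maintaining the amax history across steps.
    """
    amax = max(amax_history)
    return FP8_E4M3_MAX / amax


current = [0.5, -2.0, 1.25]     # current tensor values
history = [1.0, 4.0, 2.0]       # amax values from previous steps
print(dynamic_scale(current))   # 224.0 (448.0 / 2.0)
print(delayed_scale(history))   # 112.0 (448.0 / 4.0)
```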

torchao/float8/config.py

Lines changed: 10 additions & 0 deletions
```diff
@@ -304,6 +304,16 @@ def __post_init__(self):
                 "When using FSDP, it's recommended to enable config.force_recompute_fp8_weight_in_bwd."
             )
 
+        # Future deprecation warning for delayed scaling
+        if (
+            self.cast_config_input.scaling_type != ScalingType.DYNAMIC
+            or self.cast_config_weight.scaling_type != ScalingType.DYNAMIC
+            or self.cast_config_grad_output.scaling_type != ScalingType.DYNAMIC
+        ):
+            logger.warning(
+                "Note: delayed and static scaling will be deprecated in a future release of torchao. Please see https://github.com/pytorch/ao/issues/1680 for more details."
+            )
+
 
 # Pre-made recipes for common configurations
 # TODO(future PR): go through a round of design on this, and eventually expose
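The check in config.py fires once, at config construction time, whenever any of the three casts (input, weight, or grad_output) is configured with a non-dynamic scaling type. A self-contained sketch of the same pattern, using simplified stand-in classes rather than torchao's actual `Float8LinearConfig` and `CastConfig` (the class and field names here only mirror the diff; they are not the real torchao definitions):

```python
# Stand-in sketch of the deprecation-warning pattern from the diff above.
# These classes are simplified stand-ins, not torchao's real config types.
import logging
from dataclasses import dataclass, field
from enum import Enum

logger = logging.getLogger(__name__)


class ScalingType(Enum):
    DYNAMIC = "dynamic"
    DELAYED = "delayed"
    STATIC = "static"


@dataclass
class CastConfig:
    scaling_type: ScalingType = ScalingType.DYNAMIC


@dataclass
class Float8LinearConfigSketch:
    # Field names mirror the three casts checked in the diff.
    cast_config_input: CastConfig = field(default_factory=CastConfig)
    cast_config_weight: CastConfig = field(default_factory=CastConfig)
    cast_config_grad_output: CastConfig = field(default_factory=CastConfig)

    def __post_init__(self):
        # Warn once at construction time if any cast is non-dynamic.
        if (
            self.cast_config_input.scaling_type != ScalingType.DYNAMIC
            or self.cast_config_weight.scaling_type != ScalingType.DYNAMIC
            or self.cast_config_grad_output.scaling_type != ScalingType.DYNAMIC
        ):
            logger.warning(
                "Note: delayed and static scaling will be deprecated in a "
                "future release of torchao. Please see "
                "https://github.com/pytorch/ao/issues/1680 for more details."
            )


# All-dynamic config: no warning. Delayed weight scaling: warning is logged.
Float8LinearConfigSketch()
Float8LinearConfigSketch(cast_config_weight=CastConfig(ScalingType.DELAYED))
```

Putting the check in `__post_init__` means users see the warning when they build the config, before any training step runs, rather than deep inside the float8 cast path.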

0 commit comments
