
integrate new float8 quantization primitives into AQT #1598


Closed

Conversation

danielvegamyhre
Contributor

@danielvegamyhre danielvegamyhre commented Jan 22, 2025

Context

Currently, AQT has the method from_hp_to_floatx for float8 quantization, and from_hp_to_fpx for low-precision floating point data types like fp6 (technically it can support fp1 through fp7).

from_hp_to_floatx reuses from_hp_to_intx, which in turn uses these generic quantization primitives.

Overall, the current float8 path is confusing for developers, both because of the naming ("floatx") and because the generic functions take many parameters that are unrelated to float8 quantization.

Summary of changes

The goal of this PR stack is to refactor this into a clean separation of concerns, with simpler internal API surfaces for the code used in float8 quantization for inference.

Specifically:

  • Separate quantization primitives for float8
  • Integrate those new quant primitives into AQT <------------------- (this PR)

Note: I will add float8 static quantization in a separate set of PRs.


pytorch-bot bot commented Jan 22, 2025

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/ao/1598

Note: Links to docs will display an error until the docs builds have been completed.

✅ No Failures

As of commit 2cac42e with merge base 860da26:
💚 Looks good so far! There are no failures yet. 💚

This comment was automatically generated by Dr. CI and updates every 15 minutes.

@facebook-github-bot facebook-github-bot added the CLA Signed This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed. label Jan 22, 2025
danielvegamyhre added a commit that referenced this pull request Jan 22, 2025
ghstack-source-id: 9aacd39
ghstack-comment-id: 2608090492
Pull Request resolved: #1598
@danielvegamyhre danielvegamyhre added quantize topic: improvement Use this tag if this PR is an improvement (doesn't fit into any of the other categories) labels Jan 22, 2025
@danielvegamyhre danielvegamyhre requested review from jainapurva and jerryzh168 and removed request for jainapurva January 22, 2025 19:36
target_dtype,
)
fp8_data = _layout.post_process(fp8_data)
tensor_impl_ctr = get_tensor_impl_constructor(type(_layout))

does this have multiple options for float8? if not, just call it directly to reduce # of abstractions the code reader needs to know about


good point, done!
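For readers outside the codebase, the indirection the reviewer suggests removing looks roughly like this. Everything here is a simplified, hypothetical sketch of the registry pattern, not torchao's actual implementation:

```python
# Simplified sketch of a layout -> tensor-impl constructor registry.
# A registry earns its keep when multiple impls exist per layout; with a
# single float8 impl, calling the constructor directly is clearer.
_TENSOR_IMPL_REGISTRY = {}

def register_tensor_impl(layout_cls):
    """Class decorator registering an impl constructor for a layout type."""
    def decorator(impl_cls):
        _TENSOR_IMPL_REGISTRY[layout_cls] = impl_cls
        return impl_cls
    return decorator

def get_tensor_impl_constructor(layout_cls):
    """Look up the constructor registered for this layout type."""
    return _TENSOR_IMPL_REGISTRY[layout_cls]

class Float8Layout:
    pass

@register_tensor_impl(Float8Layout)
class Float8TensorImpl:
    def __init__(self, data, scale, zero_point, layout):
        self.data, self.scale, self.zero_point, self.layout = data, scale, zero_point, layout

layout = Float8Layout()
# Indirect lookup vs. the reviewer's simpler direct call -- same result:
indirect = get_tensor_impl_constructor(type(layout))("fp8_bytes", 0.25, None, layout)
direct = Float8TensorImpl("fp8_bytes", 0.25, None, layout)
```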

target_dtype: torch.dtype,
block_size: Tuple[int, ...],
_layout: Layout = PlainLayout(),
):

a docblock here should explain the difference between from_hp_to_floatx, from_hp_to_fpx, from_hp_to_float8
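One possible shape for that docblock, with wording drawn from the PR description (the class stub and exact text are illustrative, not the merged docstring):

```python
# Hypothetical stub showing a docstring that distinguishes the three
# constructors named in this PR; the class and body are placeholders.
class AffineQuantizedTensorStub:
    @classmethod
    def from_hp_to_float8(cls, input_float, target_dtype, block_size, _layout=None):
        """Create a float8 quantized tensor from a high-precision tensor.

        How this relates to the neighboring constructors:
          - from_hp_to_floatx: generic float path that reused the intx
            primitives (where float8 lived before this refactor).
          - from_hp_to_fpx: sub-byte float dtypes fp1-fp7, e.g. fp6.
          - from_hp_to_float8: dedicated float8 path built on the new
            float8 quantization primitives (this PR).
        """
        raise NotImplementedError
```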

danielvegamyhre added a commit that referenced this pull request Jan 23, 2025
ghstack-source-id: 9aacd39
ghstack-comment-id: 2608090492
Pull Request resolved: #1598
danielvegamyhre added a commit that referenced this pull request Jan 23, 2025
ghstack-source-id: c87842f
ghstack-comment-id: 2608090492
Pull Request resolved: #1598
@@ -422,6 +417,39 @@ def from_hp_to_fpx(
tensor_impl = tensor_impl_ctr(floatx_packed, scale, None, _layout)
return cls(tensor_impl, block_size, original_shape, dtype=input_float.dtype)

@classmethod
def from_hp_to_float8(

Update from_hp_to_floatx with the new float8 logic. For fp1-fp7, we're using from_hp_to_fpx.

danielvegamyhre added a commit that referenced this pull request Jan 23, 2025
ghstack-source-id: c1deeeb
ghstack-comment-id: 2608090492
Pull Request resolved: #1598
danielvegamyhre added a commit that referenced this pull request Jan 24, 2025
ghstack-source-id: 982ea07
ghstack-comment-id: 2608090492
Pull Request resolved: #1598
danielvegamyhre added a commit that referenced this pull request Jan 24, 2025
ghstack-source-id: e0a6b79
ghstack-comment-id: 2608090492
Pull Request resolved: #1598
danielvegamyhre added a commit that referenced this pull request Jan 24, 2025
ghstack-source-id: fccae98
ghstack-comment-id: 2608090492
Pull Request resolved: #1598
danielvegamyhre added a commit that referenced this pull request Jan 24, 2025
ghstack-source-id: 2d1d4f1
ghstack-comment-id: 2608090492
Pull Request resolved: #1598
danielvegamyhre added a commit that referenced this pull request Jan 24, 2025
ghstack-source-id: 0a9fcdb
ghstack-comment-id: 2608090492
Pull Request resolved: #1598
danielvegamyhre added a commit that referenced this pull request Jan 24, 2025
ghstack-source-id: a098c0e
ghstack-comment-id: 2608090492
Pull Request resolved: #1598
danielvegamyhre added a commit that referenced this pull request Jan 24, 2025
ghstack-source-id: b9139dc
ghstack-comment-id: 2608090492
Pull Request resolved: #1598