[super ugly maybe working code] use shim.h instead of Tensor #1548
base: main
Conversation
🔗 Helpful Links: 🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/ao/1548
Note: Links to docs will display an error until the docs builds have been completed.
❗ 1 Active SEV. If your PR is affected, please view it on the hud.
❌ 9 New Failures as of commit 5e2c2d0 with merge base b2fb664.
This comment was automatically generated by Dr. CI and updates every 15 minutes.
torchao/csrc/cuda/tensor_core_tiled_layout/tensor_core_tiled_layout.cu (outdated review thread, resolved)
Force-pushed from e1c39be to 2017d7b
    // note: heap-allocate the opaque impl; taking the address of a
    // temporary (as before) is ill-formed C++
    : lib_(new TorchLibraryOpaque(Library::Kind::IMPL, ns, k, file, line)) {}

StableLibrary& StableLibrary::impl(std::string name, void (*fn)(void **, int64_t, int64_t)) {
  // bridge the stable void** calling convention onto the dispatcher's
  // IValue-based boxed stack
  auto boxed_function = [fn](const c10::OperatorHandle &op, torch::jit::Stack *stack) {
Are these, torch::jit::* and IValue, considered stable?
No! I haven't gotten a chance to document what I'm trying to do, but everything in libtorch.cpp should eventually make its way back into libtorch (not stable!!).
Everything in libtorch.h is supposed to be stable.
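To make that split concrete, here is a minimal sketch of the contract being described; the alias name StableBoxedFn and the comments are illustrative assumptions, not code from this PR:

```cpp
#include <cstdint>

// libtorch.h -- the intended stable surface: plain C/C++ types only;
// no c10::IValue, no torch::jit::Stack, no at::Tensor.
// (Assumed alias matching the fn pointer taken by StableLibrary::impl.)
using StableBoxedFn = void (*)(void **stack, int64_t num_args, int64_t num_outputs);

// libtorch.cpp -- the unstable side, free to use c10/torch::jit types.
// The boxed lambda in the diff above adapts a StableBoxedFn onto the
// dispatcher's IValue stack, so unstable types never leak into the header.
```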
//       boxed_dequantize_tensor_core_tiled_layout>());
// }

class StableLibrary::TorchLibraryOpaque { |
I am guessing that, from a design perspective, this is the interface layer with the registration API that is shipped with different versions of libtorch? But the user code, for custom ops, never relies on at::Tensor?
Yea, this is to allow registering libtorch-agnostic custom ops, which cannot use at::Tensor or IValue, etc. in their schemas.
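A rough sketch of the opaque-wrapper pattern being described here, assuming a pimpl layout; this is illustrative, not the PR's exact code:

```cpp
#include <cstdint>
#include <string>

// Stable header: TorchLibraryOpaque is only forward-declared, so user
// binaries compiled against this header never see torch::Library or any
// other ABI-unstable libtorch type.
class StableLibrary {
 public:
  class TorchLibraryOpaque;  // defined in the .cpp, invisible to users
  StableLibrary& impl(std::string name, void (*fn)(void **, int64_t, int64_t));
 private:
  TorchLibraryOpaque *lib_;  // pimpl: hides torch::Library behind the ABI wall
};
```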
TORCH_LIBRARY_IMPL(torchao, CUDA, m) {
  m.impl("torchao::unpack_tensor_core_tiled_layout", &_unpack_tensor_core_tiled_layout);
  m.impl("torchao::dequantize_tensor_core_tiled_layout", &_dequantize_tensor_core_tiled_layout);

void voidyvoid_boxed_ATH_unpack_tensor_core_tiled_layout(void **stack,
voidyvoid.. lol
On a serious note though, will the user have to write this function with void** stack? It would actually be good if the user could just write against the stable API, like AtenTensorHandle.
Yea, to be clear, there are two portions we expect the user to provide for now (see the sketch below):
- their custom op, which should directly use AtenTensorHandle (we can make that easier by wrapping AtenTensorHandle with a header-only C++ API layer)
- how their custom op should be registered within our dispatcher stack (this is the point of the voidyvoid function). There may be a way to automatically generate this for the user given the schema of their custom op from (1), but that is a next step. Currently, we expect users to provide this.
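A minimal sketch of those two pieces, using the AOTI shim header and a hypothetical op my_unpack_op; the stack layout and all names are assumptions for illustration, not the PR's exact code:

```cpp
#include <cstdint>
#include <torch/csrc/inductor/aoti_torch/c/shim.h>

// (1) The custom op itself, written against the stable C shim: it only
// ever sees AtenTensorHandle, never at::Tensor.
AtenTensorHandle my_unpack_op(AtenTensorHandle packed_w, int64_t inner_k_tiles) {
  AtenTensorHandle out = nullptr;
  // ... allocate `out` via shim factory calls and launch the kernel ...
  return out;
}

// (2) The hand-written "voidyvoid" boxed wrapper: it reads type-erased
// slots off the stack per the op's schema, calls the op, and writes the
// result back. A later step could codegen this from the schema in (1).
void voidyvoid_boxed_my_unpack_op(void **stack, int64_t num_args,
                                  int64_t num_outputs) {
  AtenTensorHandle packed_w = reinterpret_cast<AtenTensorHandle>(stack[0]);
  int64_t inner_k_tiles = reinterpret_cast<int64_t>(stack[1]);
  AtenTensorHandle out = my_unpack_op(packed_w, inner_k_tiles);
  stack[0] = reinterpret_cast<void *>(out);  // assumed convention: outputs overwrite the stack front
}
```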
Force-pushed from 0e979be to 5e2c2d0
This is a PoC change to see what making custom ops use AOTI's shim.h would look like.
What you should expect by the end of this exercise: