Commit b76570d

Recommend cpython agnosticism in cpp custom op tutorial (#3250)
* Recommend python agnosticism in cpp custom op tutorial
* forgot to delete a line
* Fixed tutorial to be clearer and to recommend other path
* Switch to commits instead of master
* Formatting code blocks
* polish + i missed some code blocks earlier
* Adjust advice based on Sam Gross's knowledge
1 parent 15ef015 commit b76570d

File tree

1 file changed: +150 -37 lines changed

advanced_source/cpp_custom_ops.rst

@@ -62,20 +62,78 @@ Using ``cpp_extension`` is as simple as writing the following ``setup.py``:
 
 setup(name="extension_cpp",
       ext_modules=[
-          cpp_extension.CppExtension("extension_cpp", ["muladd.cpp"])],
-      cmdclass={'build_ext': cpp_extension.BuildExtension})
+          cpp_extension.CppExtension(
+                "extension_cpp",
+                ["muladd.cpp"],
+                # define Py_LIMITED_API with min version 3.9 to expose only the stable
+                # limited API subset from Python.h
+                extra_compile_args={"cxx": ["-DPy_LIMITED_API=0x03090000"]},
+                py_limited_api=True)],  # Build 1 wheel across multiple Python versions
+      cmdclass={'build_ext': cpp_extension.BuildExtension},
+      options={"bdist_wheel": {"py_limited_api": "cp39"}}  # 3.9 is minimum supported Python version
+)
 
 If you need to compile CUDA code (for example, ``.cu`` files), then instead use
 `torch.utils.cpp_extension.CUDAExtension <https://pytorch.org/docs/stable/cpp_extension.html#torch.utils.cpp_extension.CUDAExtension>`_.
 Please see `extension-cpp <https://github.com/pytorch/extension-cpp>`_ for an
 example of how this is set up.
 
-Starting with PyTorch 2.6, you can now build a single wheel for multiple CPython
-versions (similar to what you would do for pure python packages). In particular,
-if your custom library adheres to the `CPython Stable Limited API
-<https://docs.python.org/3/c-api/stable.html>`_ or avoids CPython entirely, you
-can build one Python agnostic wheel against a minimum supported CPython version
-through setuptools' ``py_limited_api`` flag, like so:
+The above example represents what we refer to as a CPython agnostic wheel, meaning we are
+building a single wheel that can be run across multiple CPython versions (similar to pure
+Python packages). CPython agnosticism is desirable for minimizing the number of wheels your
+custom library needs to support and release. The minimum version we'd like to support is
+3.9, since it is currently the oldest supported version, so we use the corresponding hexcode
+and specifier throughout the setup code. We suggest building the extension in the same
+environment as the minimum CPython version you'd like to support to minimize unknown behavior,
+so, here, we build the extension in a CPython 3.9 environment. When built, this single wheel
+will be runnable in any CPython environment 3.9+. To achieve this, there are three key lines
+to note.
+
+The first is the specification of ``Py_LIMITED_API`` in ``extra_compile_args`` to the
+minimum CPython version you would like to support:
+
+.. code-block:: python
+
+   extra_compile_args={"cxx": ["-DPy_LIMITED_API=0x03090000"]},
+
+Defining the ``Py_LIMITED_API`` flag helps verify that the extension is in fact
+only using the `CPython Stable Limited API <https://docs.python.org/3/c-api/stable.html>`_,
+which is a requirement for building a CPython agnostic wheel. If this requirement
+is not met, it is possible to build a wheel that looks CPython agnostic but will crash,
+or worse, be silently incorrect, in another CPython environment. Take care to avoid
+using unstable CPython APIs, for example APIs from libtorch_python (in particular
+pytorch/python bindings), and to only use APIs from libtorch (ATen objects, operators,
+and the dispatcher). We strongly recommend defining the ``Py_LIMITED_API`` flag to
+help ascertain that the extension is compliant and safe as a CPython agnostic wheel. Note
+that defining this flag is not a full guarantee that the built wheel is CPython agnostic,
+but it is better than the wild wild west. There are several caveats mentioned in the
+`Python docs <https://docs.python.org/3/c-api/stable.html#limited-api-caveats>`_,
+and you should test and verify yourself that the wheel is truly agnostic for the relevant
+CPython versions.
+
+The second and third lines specifying ``py_limited_api`` inform setuptools that you intend
+to build a CPython agnostic wheel and will influence the naming of the wheel accordingly:
+
+.. code-block:: python
+
+   setup(name="extension_cpp",
+         ext_modules=[
+             cpp_extension.CppExtension(
+                   ...,
+                   py_limited_api=True)],  # Build 1 wheel across multiple Python versions
+         ...,
+         options={"bdist_wheel": {"py_limited_api": "cp39"}}  # 3.9 is minimum supported Python version
+   )
+
+It is necessary to specify ``py_limited_api=True`` as an argument to ``CppExtension``/
+``CUDAExtension`` and also as an option to the ``"bdist_wheel"`` command with the minimal
+supported CPython version (in this case, 3.9). Consequently, the ``setup`` in our
+tutorial would build one properly named wheel that could be installed across multiple
+CPython versions ``>=3.9``.
+
+If your extension uses CPython APIs outside the stable limited set, then you cannot
+build a CPython agnostic wheel! You should build one wheel per CPython version instead,
+like so:
 
 .. code-block:: python
 
@@ -86,28 +144,10 @@ through setuptools' ``py_limited_api`` flag, like so:
       ext_modules=[
           cpp_extension.CppExtension(
               "extension_cpp",
-              ["python_agnostic_code.cpp"],
-              py_limited_api=True)],
+              ["muladd.cpp"])],
       cmdclass={'build_ext': cpp_extension.BuildExtension},
-      options={"bdist_wheel": {"py_limited_api": "cp39"}}
 )
 
-Note that you must specify ``py_limited_api=True`` both within ``setup``
-and also as an option to the ``"bdist_wheel"`` command with the minimal supported
-Python version (in this case, 3.9). This ``setup`` would build one wheel that could
-be installed across multiple Python versions ``python>=3.9``. Please see
-`torchao <https://github.com/pytorch/ao>`_ for an example.
-
-.. note::
-
-   You must verify independently that the built wheel is truly Python agnostic.
-   Specifying ``py_limited_api`` does not check for any guarantees, so it is possible
-   to build a wheel that looks Python agnostic but will crash, or worse, be silently
-   incorrect, in another Python environment. Take care to avoid using unstable CPython
-   APIs, for example APIs from libtorch_python (in particular pytorch/python bindings,)
-   and to only use APIs from libtorch (aten objects, operators and the dispatcher).
-   For example, to give access to custom ops from Python, the library should register
-   the ops through the dispatcher (covered below!).
 
 Defining the custom op and adding backend implementations
 ---------------------------------------------------------
@@ -252,16 +292,89 @@ matters (importing in the wrong order will lead to an error).
 
 To use the custom operator with hybrid Python/C++ registrations, we must
 first load the C++ library that holds the custom operator definition
-and then call the ``torch.library`` registration APIs. This can happen in one
-of two ways:
-
-1. If you're following this tutorial, importing the Python C extension module
-   we created will load the C++ custom operator definitions.
-2. If your C++ custom operator is located in a shared library object, you can
-   also use ``torch.ops.load_library("/path/to/library.so")`` to load it. This
-   is the blessed path for Python agnosticism, as you will not have a Python C
-   extension module to import. See `torchao __init__.py <https://github.com/pytorch/ao/blob/881e84b4398eddcea6fee4d911fc329a38b5cd69/torchao/__init__.py#L26-L28>`_
-   for an example.
+and then call the ``torch.library`` registration APIs. This can happen in
+three ways:
+
+1. The first way to load the C++ library that holds the custom operator definition
+   is to define a dummy Python module for ``_C``. Then, in Python, when you import the
+   module with ``import _C``, the ``.so`` files corresponding to the extension will
+   be loaded and the ``TORCH_LIBRARY`` and ``TORCH_LIBRARY_IMPL`` static initializers
+   will run. One can create a dummy Python module with ``PYBIND11_MODULE`` like below,
+   but you will notice that this does not compile with ``Py_LIMITED_API``, because
+   ``pybind11`` does not promise to only use the stable limited CPython API! With
+   the below code, you sadly cannot build a CPython agnostic wheel for your extension!
+   (Foreshadowing: I wonder what the second way is ;) ).
+
+   .. code-block:: cpp
+
+      // in, say, not_agnostic/csrc/extension_BAD.cpp
+      #include <pybind11/pybind11.h>
+
+      PYBIND11_MODULE(_C, m) {}
+
+   .. code-block:: python
+
+      # in, say, extension/__init__.py
+      from . import _C
+
+2. In this tutorial, because we value being able to build a single wheel across multiple
+   CPython versions, we will replace the unstable ``PYBIND11_MODULE`` call with stable API
+   calls. The below code compiles with ``-DPy_LIMITED_API=0x03090000`` and successfully
+   creates a dummy Python module for our ``_C`` extension so that it can be imported from
+   Python. See `extension_cpp/__init__.py <https://github.com/pytorch/extension-cpp/blob/38ec45e/extension_cpp/__init__.py>`_
+   and `extension_cpp/csrc/muladd.cpp <https://github.com/pytorch/extension-cpp/blob/38ec45e/extension_cpp/csrc/muladd.cpp>`_
+   for more details:
+
+   .. code-block:: cpp
+
+      #include <Python.h>
+
+      extern "C" {
+        /* Creates a dummy empty _C module that can be imported from Python.
+           The import from Python will load the .so consisting of this file
+           in this extension, so that the TORCH_LIBRARY static initializers
+           below are run. */
+        PyObject* PyInit__C(void)
+        {
+            static struct PyModuleDef module_def = {
+                PyModuleDef_HEAD_INIT,
+                "_C",   /* name of module */
+                NULL,   /* module documentation, may be NULL */
+                -1,     /* size of per-interpreter state of the module,
+                           or -1 if the module keeps state in global variables. */
+                NULL,   /* methods */
+            };
+            return PyModule_Create(&module_def);
+        }
+      }
+
+   .. code-block:: python
+
+      # in, say, extension/__init__.py
+      from . import _C
+
+3. If you want to avoid ``Python.h`` entirely in your C++ custom operator, you may
+   use ``torch.ops.load_library("/path/to/library.so")`` in Python to load the ``.so``
+   file(s) compiled from the extension. Note that, with this method, there is no ``_C``
+   Python module created for the extension, so you cannot call ``import _C`` from Python.
+   Instead of relying on the import statement to trigger the custom operators to be
+   registered, ``torch.ops.load_library("/path/to/library.so")`` will do the trick.
+   The challenge then shifts towards understanding where the ``.so`` files are
+   located so that you can load them, which is not always trivial:
+
+   .. code-block:: python
+
+      import torch
+      from pathlib import Path
+
+      so_files = list(Path(__file__).parent.glob("_C*.so"))
+      assert (
+          len(so_files) == 1
+      ), f"Expected one _C*.so file, found {len(so_files)}"
+      torch.ops.load_library(so_files[0])
+
+      from . import ops
 
 
 Adding training (autograd) support for an operator
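
The final ``from . import ops`` line in the added text presupposes a small Python module
that wraps whatever operators the loaded ``.so`` registered with the dispatcher. A minimal
sketch of such a wrapper, assuming the extension's ``TORCH_LIBRARY`` block defines an
operator named ``extension_cpp::mymuladd`` (the operator name is illustrative):

.. code-block:: python

   # Hypothetical extension/ops.py: a thin Python wrapper around the
   # dispatcher-registered operator loaded from the .so above.
   import torch
   from torch import Tensor

   def mymuladd(a: Tensor, b: Tensor, c: float) -> Tensor:
       """Calls the C++ kernel registered as extension_cpp::mymuladd."""
       return torch.ops.extension_cpp.mymuladd.default(a, b, c)

Because the wrapper goes through ``torch.ops``, it works the same whether the operator was
loaded by importing a ``_C`` module or by calling ``torch.ops.load_library`` directly.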