Skip to content

feat: enable conversions between native Python enum types and C++ enums #5555

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 32 commits into from
Mar 25, 2025

Conversation

rwgk
Copy link
Collaborator

@rwgk rwgk commented Mar 8, 2025

Description

This PR requires a PYBIND11_INTERNALS_VERSION bump (from 7 to 8).


For easy access during the review: Updated documentation


Alternative to py::enum_, binding C++ enum types to native Python enum types, typically types in Python's stdlib enum module. For example:

#include <pybind11/pybind11.h>
#include <pybind11/native_enum.h> // Not already included with pybind11.h

enum class altitude : char {
    high = 'h',
    low = 'l'
};

PYBIND11_MODULE(native_enum_demo, m) {
    py::native_enum<altitude>(m, "altitude")
             .value("high", altitude::high)
             .value("low", altitude::low)
             .finalize();
}

The .finalize() call is needed because Python's native enums cannot be built incrementally — all name/values need to be passed at once. To achieve this, py::native_enum acts as a buffer to collect the name/value pairs. The .finalize() call uses the accumulated name/value pairs to build the arguments for constructing a native Python enum type.

The established py::enum_ is not PEP 435 compatible (see also #2332). py::native_enum provides an alternative without breaking backwards compatibility. Ideally, existing uses of py::enum_ should migrate to py::native_enum, but py::enum_ will remain supported indefinitely.

In rare cases it may be necessary to specialize pybind11::detail::type_caster_enum_type_enabled:

#if defined(PYBIND11_HAS_NATIVE_ENUM)                                   
namespace pybind11::detail {                                            
template <typename FancyEnum>                                           
struct type_caster_enum_type_enabled<                                   
    FancyEnum,                                                          
    std::enable_if_t<is_fancy_enum<FancyEnum>::value>> : std::false_type {};
}                                                                       
#endif                                                                  

This specialization is needed only if the custom type caster is templated.

The PYBIND11_HAS_NATIVE_ENUM guard is needed only if backward compatibility with pybind11v2 is required.


This PR sidesteps issues discovered under #2744.

This PR fixes #3643 (comment), more-or-less as a side-effect.


cloc --diff results for include/ between current master and this PR:

-------------------------------------------------------------------------------
Language                     files          blank        comment           code
-------------------------------------------------------------------------------
C/C++ Header
 same                           39           1816           3347          16428
 modified                        5              0              0              6
 added                           2             53             18            332
 removed                         0              1              0              7

I.e., this PR increases the production code size by 2.0%.


This PR is based on google/pybind11clif#30005 (including accumulated follow-on fixes) and google/pybind11clif#30140 (remove_cross_extension_shared_state branch).

Until August 2024, this code was included in PyCLIF-pybind11 global testing in the Google environment, which provided thousands of diverse use cases that needed to satisfy many million unit tests (the PyCLIF-pybind11 success was about 99.999%).

Suggested changelog entry:

``py::native_enum`` was added, for conversions between Python's native (stdlib) enum types and C++ enums.

@rwgk rwgk changed the title Conversions between Python's native (stdlib) enum types and C++ enums. Enable Conversions Between Python's Native (stdlib) Enum Types and C++ Enums Mar 8, 2025
@rwgk rwgk marked this pull request as ready for review March 8, 2025 23:05
@rwgk rwgk requested a review from henryiii as a code owner March 8, 2025 23:05
TEST_SUBMODULE(native_enum, m) {
using namespace test_native_enum;

m += py::native_enum<smallenum>("smallenum", py::native_enum_kind::IntEnum)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This API is pretty different than anything else in pybind11, since most everything else is in the form py::TYPE<CTYPE> varname(scope, "NAME")?

... I understand why, but the += is really weird and is just not doing it for me. I asked ChatGPT what it thought such an API would look like, and it's basically the same thing I was thinking (except I started with commit instead of finalize, but I think finalize is better:

 auto my_enum = py::native_enum<MyEnum>(m, "MyEnum")
        .value("VALUE_A", MyEnum::VALUE_A)
        .value("VALUE_B", MyEnum::VALUE_B)
        .value("VALUE_C", MyEnum::VALUE_C);

    my_enum.finalize(); 

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you please share your ChatGPT prompt and the response? Usually it's giving constructive feedback (e.g. enumerates alternatives), it'd be great to see that.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It didn't because of the way I asked it, but sure: https://chatgpt.com/share/67cd0f1f-6df8-800f-b092-b459cea53485

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Very useful, thanks!

Here is my continuation:

https://chatgpt.com/share/67cd1767-d63c-8008-acb8-4c50cf739e85

It's explaining three options to preserve the safety net mechanism that I have in place. I'll work through that asap.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ugh, I spent half an hour on a really good conversation with ChatGPT, coming to a great conclusion, but I was accidentally not logged in (wrong profile), I tried to login, that failed, and now the convo is lost 😭.

I'll try to see if I can redo it with the same level of authenticity.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It turns out there is no way to get back the original conversation. ChatGPT took a different turn, maybe because my prompt wasn't exactly as before.

But, I think the conclusion is now even better!

Here is the new convo:

And here the conclusion for easy reference:

py::native_enum<MyEnum>(m, "MyEnum")
    .value("VALUE_A", MyEnum::VALUE_A)
    .value("VALUE_B", MyEnum::VALUE_B)
    .value("VALUE_C", MyEnum::VALUE_C)
    .finalize();

I'll take a stab at that. I like it a lot better myself.

@rwgk
Copy link
Collaborator Author

rwgk commented Mar 9, 2025

@virtuald FYI, I made the API change, but still need to change the error messages (they still direct users to use += instead of .finalize()).

Copy link
Contributor

@virtuald virtuald left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I haven't tested it, but the tests you added seem comprehensive enough so I'm sure it's fine.

If it were possible to make this like the other types instead of having its own separate type map in internals, I'd definitely prefer that... it seems like this could introduce special cases that might accidentally introduce bugs? Looking at the type_info struct it of course has a lot of extra things that this wouldn't need, but I suspect maybe we could jam it in?

If you had looked at that previously and decided it wasn't possible, then I accept that and I'm not asking for additional research.

Edit: .. and even if you haven't done that research, it's probably not necessary. I like the new API change though.

@rwgk
Copy link
Collaborator Author

rwgk commented Mar 9, 2025

Assuming GitHub Actions are passing now, this is now feature complete; the error message is updated.

Things I still want to do:

  • Add tests for .finalize().value(...) and .finalize().finalize() error handling.

  • See if there is a reasonably easy way to make some native_enum_data member functions inaccessible through the public API (I need them to be accessible for testing).

@rwgk
Copy link
Collaborator Author

rwgk commented Mar 9, 2025

Looking at the type_info struct it of course has a lot of extra things that this wouldn't need, but I suspect maybe we could jam it in?

I very much rather not do that. I have a lot of bad experiences trying to jam things together.

We have a golden opportunity right now to change the internals struct, to keep the implementation clean, which is why I'm sinking my time into this PR now.

@rwgk
Copy link
Collaborator Author

rwgk commented Mar 10, 2025

All done with

  • error handling update,
  • public API cleanup,
  • PR description update,
  • docs/classes.rst update.

@virtuald Could you please take another look, especially at the docs/classes.rst update?

@timohl It would be awesome if you could look, too.

For easy access: https://pybind11--5555.org.readthedocs.build/en/5555/classes.html#enumerations

@rwgk
Copy link
Collaborator Author

rwgk commented Mar 10, 2025

All done with

  • error handling update,
  • public API cleanup,
  • PR description update,
  • docs/classes.rst update.

@virtuald Could you please take another look, especially at the docs/classes.rst update?

@timohl It would be awesome if you could look, too.

For easy access: https://pybind11--5555.org.readthedocs.build/en/5555/classes.html#enumerations

It crossed my mind, I totally forgot about adding this to the documentation (only the PR description there is interesting):

I'll do that asap. It will not change the code, only the documentation...

unless we decide we want py::native_enum_kind::IntEnum as the default?

@virtuald
Copy link
Contributor

I haven't thought super hard about this, but I don't think IntEnum should be default.


.. warning::

Contrary to Python customs, enum values from the wrappers should not be compared using ``is``, but with ``==`` (see `#1177 <https://github.com/pybind/pybind11/issues/1177>`_ for background).
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hm. Is this warning captured anywhere else? It doesn't appear to be, but if we're keeping the old enum stuff it probably should be somewhere? Maybe this (and other things that were removed) could be added in a separate note?

.. note:: The deprecated `py::enum_` differs from standard python enums in the following ways:  ... 

Copy link
Collaborator Author

@rwgk rwgk Mar 16, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@rwgk
Copy link
Collaborator Author

rwgk commented Mar 10, 2025

I haven't thought super hard about this, but I don't think IntEnum should be default.

With this PR as-is, we always need to specify py::native_enum_kind (I just updated the example in the PR description; the documentation still needs to be updated):

py::native_enum<my_chars>("smallenum", py::native_enum_kind::Enum) ...
py::native_enum<my_ints>("smallenum", py::native_enum_kind::IntEnum) ...

py::native_enum_kind has only two choices, which gives us three API design options:

  • No default (this PR as-is).
  • py::native_enum_kind::Enum as the default.
  • py::native_enum_kind::IntEnum as the default.

Back when I created the pybind11clif PR, I thought having no default was best because it forces people to actively think about the choice (copy-paste-think vs copy-paste-move-on).

No default is still my preference. It's slightly inconvenient when writing enum bindings since you have to pause and make a decision. But then again, if there's a default, there’s an unknown (but non-zero) chance that someone might wind up with the wrong option and only realize it when it’s too late — e.g. after a release has already been made.

Regarding the "unknown": Do others feel differently? For example, is it the case that in 99% of use cases, Enum or IntEnum would be the better default?

@henryiii
Copy link
Collaborator

henryiii commented Mar 10, 2025

Haven't looked at this closely, but I prefer no default or Enum, making IntEnum the default seems really weird, since that's not the case in Python, and generally you want the more limited API the default - IntEnum just adds extra API a user might not want. So making Enum the default seems most reasonable, with no default being a close second. What about StrEnum (Python 3.11+)?

Also not very fond of making enum_ now permanently be spelled native_enum, but don't have a better solution, other than maybe we can remove enum_ in 4.0 in a few years and then restore enum_ as an alias to native_enum in 4.1 or so. Maybe we could even rename enum_ to classic_enum with enum_ just being an alias to that. With that, does native_enum_kind maybe make more sense as enum_kind? Since there isn't one for the old enums? Also, "native" seems a little odd, it's not clear to me that's related to Python. What about std_enum? Though that also isn't really clearly the Python stdlib. Historians might like enum34, as that's what the old backport package was called. :) I'm fine with native_enum, though.

How different is the official enum API to the one we provided?

@rwgk
Copy link
Collaborator Author

rwgk commented Mar 11, 2025

@virtuald wrote:

It doesn't appear to be, but if we're keeping the old enum stuff it probably should be somewhere?

OK, I'll put back some info about enum_.

@Henry wrote:

So making Enum the default seems most reasonable

OK, I'll do that.

What about StrEnum (Python 3.11+)?

I totally missed that, I'll take a look, thanks.

Also not very fond of making enum_ now permanently be spelled native_enum

Yep, but breaking a bunch of things just to insist on a certain name ... it really doesn't seem worth anyone's time to me.

How different is the official enum API to the one we provided?

I didn't look systematically. Based on glancing through #2332 I decided a systematic comparison is too messy.

I'll probably only get back here next weekend. It's turning into more work than I anticipated. I appreciate the feedback.

@henryiii
Copy link
Collaborator

You don't need to add StrEnum, especially in this PR, I just think it would be good to make sure the design would allow it in the future, though. It's another specialization, like IntEnum.

Comment on lines 20 to 25
template <typename Type>
class native_enum : public detail::native_enum_data {
public:
using Underlying = typename std::underlying_type<Type>::type;

native_enum(const object &parent_scope, const char *name, enum_kind kind = enum_kind::Enum)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would it maybe make future extensions to StrEnum or other enum kinds easier if enum_kind kind would be a template parameter instead of a constructor argument?

Currently, kind is only used in enum_kind_to_string, but I could imagine, that other enum kinds would require a different specialization of native_enum or native_enum_data, e.g. to pass or store the string representations.

I am thinking of something like this:

template <typename Type, enum_kind kind = enum_kind::Enum>
class native_enum : public detail::native_enum_data<kind> {
public:
    using Underlying = typename std::underlying_type<Type>::type;

    native_enum(const object &parent_scope, const char *name)
...

and

py::native_enum<Pet::Kind, py::enum_kind::IntEnum>(pet, "Kind")

While I do not have specific implementations of the other Enum kinds in mind, I think this would offer more flexibility later while being only a small change now.
But please tell me if I am overlooking any downsides.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@timohl Thanks for asking that question.

After thinking about it for a few minutes, I realized (face-palm level) that py::enum_kind is going completely in the wrong direction. I eliminated it now: commit 8efb9bf

I have 3 WIP-TODO in that commit.

I also want to try StrEnum again, maybe we can accommodate that via something like

        py::native_enum<greek>(m, "greek", "enum.StrEnum")
            .value("Alpha", greek::Alpha, "alpha")
            .value("Omega", greek::Omega, "omega")
            .finalize();

?

This will have to wait until next weekend, though.

Copy link
Collaborator Author

@rwgk rwgk Mar 23, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Regarding StrEnum, I played around for an hour or so, but then decided this isn't worth extra code. See the commit message.

@rwgk rwgk changed the title Enable Conversions Between Python's Native (stdlib) Enum Types and C++ Enums Enable Conversions Between Native Python Enum Types and C++ Enums Mar 23, 2025
@rwgk
Copy link
Collaborator Author

rwgk commented Mar 23, 2025

@henryiii @timohl @virtuald Could you please take another look? Hopefully this is final now.

  • py::enum_kind is gone. The API is simply py::native_enum<altitude>(m, "altitude", "enum.Enum") now, with "enum.Enum" as the default. See the updated documentation. I believe with this we maximize flexibility and ease of use.

  • The related code changes — relative to what you reviewed before — are slightly messy at first sight but really just boilerplate changes.

  • The only noteworthy new block of code is the import_or_getattr() function in include/pybind11/detail/native_enum_data.h. Most of the code is for comprehensive error reporting.

@timohl
Copy link
Contributor

timohl commented Mar 23, 2025

Does it already support all standard Python enums, e.g., StrEnum, by just changing the string and if not, is there a clear way of extending?
I see the three requirements for the passed type name in the docs and after a quick look through the tests, I see enum.IntEnum and enum.IntFlag.
Could we maybe add small tests for all supported standard Python enums (see Python docs).

I will take a deeper look later today.

@rwgk
Copy link
Collaborator Author

rwgk commented Mar 23, 2025

Does it already support all standard Python enums, e.g., StrEnum, by just changing the string and if not, is there a clear way of extending?

I don't think I shared this before (the convo is from last weekend):

There are only two prompts. In the first I asked ChatGPT to explain StrEnum, in the second I asked "Is there are meaningful mapping to C++ enums?"

My interpretation of the answer: Not really.

But, with the new native_type_name API I believe you could implement a Python type that, roughly speaking, has the same constructor as enum.Enum, and the same behavior as enum.StrEnum.

Is that a wise thing to spend effort on? — I doubt it. — Certainly it's out of the realm I want to spend my effort on.

I see the three requirements for the passed type name in the docs and after a quick look through the tests, I see enum.IntEnum and enum.IntFlag. Could we maybe add small tests for all supported standard Python enums (see Python docs).

C++ enums are different from Python enums, roughly speaking again, a subset. Trying to stretch that subset to fully map to the larger set is a mistake IMO, especially because you can now implement Python types to handle specific use cases.

I added the simple enum.IntFlag test to convince myself that no production code changes are needed. I decided to not add e.g. enum.Flag because it seems too obvious to me that we'd just be turning test_native_enum.py into a pseudo unit test for enum, which isn't a good direction again, IMO.

Copy link
Contributor

@timohl timohl left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for your explanation.
I think the code looks good as is and makes sense for the purpose of creating Python bindings for C++ code.

Just for the documentation, in addition to the three requirements, it would be nice to explicitly list what Python stdlib Enums are working and which are not.
Then, newcomers would not have to study the Python docs or go by trial and error to find if StrEnum or IntFlag works.
Currently, Enum and IntEnum are mentioned in the docs.

docs/classes.rst Outdated

.. code-block:: pycon
* Has a constructor similar to that of enum.Enum_::
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This creates a link to:
docs.python.org/3/library/enum.html#enum.Enum
I think a link to this section should be more relevant to the functional style constructor:
docs.python.org/3/howto/enum.html#functional-api

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done: commit cbb7494

@rwgk
Copy link
Collaborator Author

rwgk commented Mar 24, 2025

Just for the documentation, in addition to the three requirements, it would be nice to explicitly list what Python stdlib Enums are working and which are not. Then, newcomers would not have to study the Python docs or go by trial and error to find if StrEnum or IntFlag works. Currently, Enum and IntEnum are mentioned in the docs.

Done: commit ae07bd2

Thanks for the review!

(I did NOT systematically try the other enum module types, but I know for sure that StrEnum is not compatible with the requirements, and I believe all the other types are obviously not compatible, or are obviously not a fit at all.)

@henryiii
Copy link
Collaborator

henryiii commented Mar 25, 2025

When we write the upgrade guide, we'll want to include py::enum_ -> py::native_enum in it.

Nice PR number. :)

@rwgk
Copy link
Collaborator Author

rwgk commented Mar 25, 2025

Thanks for the reviews! The API changes made this much better.

@rwgk rwgk merged commit f365314 into pybind:master Mar 25, 2025
74 checks passed
@github-actions github-actions bot added the needs changelog Possibly needs a changelog entry label Mar 25, 2025
@rwgk rwgk deleted the native_enum_master branch March 25, 2025 03:33
@henryiii henryiii changed the title Enable Conversions Between Native Python Enum Types and C++ Enums feat: enable conversions between native Python enum types and C++ enums Mar 31, 2025
@henryiii henryiii removed the needs changelog Possibly needs a changelog entry label Apr 1, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants