Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ci: refactor cibuldwheel code #2

Open
wants to merge 32 commits into
base: master
Choose a base branch
from

Conversation

henryiii
Copy link

This reworks the cibuildwheel code to use pyproject.toml and support local runs with:

pipx run cibuildwheel --platform linux python

And gets rid of some hacks (like touching setup.py).

abitrolly added 30 commits June 13, 2022 18:05
There is only `setup.py.in` template
pypa/cibuildwheel#1139
Replacing `apt` with `dnf` and use latest `cibuildwheel`
Because custom Python can not see system installed modules
by pointing to the "libopendht.so.2" library location
@@ -61,11 +33,11 @@ jobs:
steps:
- uses: actions/download-artifact@v3
with:
name: opendht-wheels-linux
name: opendht-wheels
path: dist
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pretty sure missing this was a bug

@abitrolly
Copy link
Owner

I see what you've meant that setuptools supports Cython. https://setuptools.pypa.io/en/latest/userguide/ext_modules.html?highlight=cython#distributing-extensions-compiled-with-cython Why the official Cython docs don't recommend that https://cython.readthedocs.io/en/latest/src/quickstart/build.html#building-a-cython-module-using-setuptools ?

Looks like I need to setup https://test.pypi.org/ to test uploads for real.

@henryiii
Copy link
Author

henryiii commented Jun 22, 2022

Why the official Cython docs don't recommend that

Cython development is not fast, and its docs are even slower. It still recommends python setup.py install which is deprecated, tells you to install Xcode from a DVD, says the "latest" version of minGW is one from 2008, etc. Cythonize is also from Cython and probably has a few more options, and more controls for if Cython is missing (since this all predates pyproject.toml). But I really bet no one has bothered to try to update this in years.

'./configure',
'make',
'make install',
]
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There are many things I find unsettling here.

  1. Python level file, which is supposed to be built in user space, contains system commands that alter my OS.

  2. Encoding bash scripts in TOML like that. The thought that I may need to copy/paste these commands for debugging. TOML supports multiline strings, so why not to use it here to be on par with CIBW_BEFORE_ALL?

  3. dependencies.txt is in parent dir, so dnf install won't install anything and should fail. It if doesn't then that's a problem.

  4. After cd asio-1.21.0 there is no cd back, so it is not clear in which dir cibuidwheel should start the process.

  5. The asio build process is only valid for before all on manylinux_2_28, because AlmaLinux 8 ships with outdated dependency.

So in the context of CI job having such hack is fine, but embedding it in Python project doesn't look good to me. So I think about moving shell parts into standalone build scripts, but then there is already CMake that is for doing builds. Oh my.. :D

Copy link
Author

@henryiii henryiii Jun 23, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

1: No it does not. This only runs inside the docker container, even locally. But you your point, it should be prefixed by linux just in case. macOS and Windows would run on your OS. And these lines wouldn't be valid anyway. Though there are no selected builds for those OSs, so it currently would do nothing if you target them.

2: Multiline strings would require the &&'ing together. But you could use that if you prefer. If you are debugging, you are probably running line by line anyway, or you are running cibuldwheel locally, in which there's no copy/paste. Select one based on personal taste. This is harder to make a mistake and forget the &&.

3: This is run from the parent dir. It's being run by cibuildwheel.

4: before-build is not tied to the per-build runs. It can't be, it runs once while the per-build runs run on each time. They are separate processes (with seperate Pythons, etc). So no need to clean up cd's.

5: You only support manylinux_2_28 right?

This is cibulidwheel config. Just like you had before. Only now it's shorter and can be run locally.

Copy link
Author

@henryiii henryiii Jun 23, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I built the wheels locally to make sure this worked. That's how I was able to do it in a few minutes instead of 7 days of back & forth with the CI, and why it worked first time here as a PR. There were a few oddities I didn't expect (like Cython being required twice).

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Multi-line strings should obviously contain newline characters, which, when copy-pasted into a shell, are equivalent to ;, which is likewise equivalent to && except that one failing command will not prevent newline / ; from running the follow-up commands.

@abitrolly
Copy link
Owner

abitrolly commented Jun 23, 2022

But I see that from a beginners standpoint, Cython documentation approach is more simple.

graph LR
  ic[install Cython yourself] --> is[import in setup.py] --> sp[set paths in setup.py]
Loading

It just becomes more complicated with pyproject.toml.

graph LR
  cp[set Cython as build dependency] --> sbb[set build backend] --> ica[still install Cython manually for CMake] --> sps[set paths in setup.py]
Loading

So in the end it is just more moving parts.

@henryiii
Copy link
Author

The "still install Cython manually for CMake" is due to the buggy build system manually trying to access cython locally. That or Cython literally is running twice on the same input. I'd work on on fixing that rather than not putting a real build dependency in the place for build dependencies - pyproject.toml.

You really should flip the build system and run CMake from setup.py. The current system won't ever work locally without hacks like manually running.

You always have a build backend. You just are implicitly getting the terrible setuptools.build_meta:__legacy__ backend. If you don't trigger PEP 517 you might not get a modern setuptools or even wheel.

[tool.cibuildwheel]
build = 'cp3{9,10}-manylinux_x86_64'
manylinux-x86_64-image = 'manylinux_2_28'
before-all = [
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
before-all = [
linux.before-all = [

(You can also put these in [tool.cibuildwheel.linux] if you like that better, vs. adding linux. to the front)

@henryiii
Copy link
Author

I would absolutely not show this to a beginner. Beginners should learn to do things properly with setup.py calling CMake (pybind/scikit_build_example or pybind/cmake_example), and not see raw python setup.py calls or projects that don't support pip install from the SDist. Beginners should learn to always check pyproject.toml first, to see what system is used. They should not assume setup.py in 2022!

Also, this is not a "python level file". It's a config file. You can stick the config in cibuildwheel.toml, run with --config-file cibuildwheel.toml if you prefer, cibuildwheel just finds this one automatically, and that's what pyproject.toml for, configuring your py(thon) project. No other tool will try to run tool.cibuildwheel commands.

Anyway, thought I'd try to help. Close if you don't care about local builds, simpler configs, fewer hacks, only supporting Microsoft's CI system, etc.

@abitrolly
Copy link
Owner

I had a long running belief that pyproject.toml is the future that kills the non-deterministic setup.py that is haunting the security and code introspection teams. Now you're killing that belief. :D

Is there a good reason why setuptools doesn't support ext_modules configuration in pyproject.toml? So that this thing could be moved there.

      ext_modules = [
          Extension(
              "opendht",
              ["@CURRENT_SOURCE_DIR@/opendht.pyx"],
              include_dirs = ['@PROJECT_SOURCE_DIR@/include'],
              language="c++",
              extra_compile_args=["-std=c++17"],
              extra_link_args=["-std=c++17"],
              libraries=["opendht"],
              library_dirs = ['@CURRENT_BINARY_DIR@', '@PROJECT_BINARY_DIR@']
          ),
      ]

@henryiii
Copy link
Author

henryiii commented Jun 23, 2022

If you are building code, I think you have a lot more security problems than a bit of Python configuration. ;) (And if the code is already built, say in a wheel, the pyproject.toml is not even present any more - that's the security solution).

To the best of my knowledge, building extensions has always been something that has to sit in setup.py - it's never been possible in setup.cfg or now pyproject.toml. You'll normally (unless you generate it like you are doing which is much worse for inspection and security!) need various python things like accessing library (numpy, pybind11, etc) include directories and such. There's too much missing from setuptools (like the ability to select a C++ standard in a cross platform way, parallel builds, cached builds, etc) that I don't think there's anything at all to be gained by making it possible to statically list extensions.

Now I very much believe there is a path forward, it's just not setuptools. I'll be working on scikit-build-core, and extensionlib (possibly a matching PEP), and there's also meson-py. We need a real build system for compiled code (CMake or Meson) with a pyproject.toml configuration. And it's coming. ;)

@abitrolly
Copy link
Owner

We need a real build system for compiled code (CMake or Meson) with a pyproject.toml configuration. And it's coming. ;)

For me the solution is to reduce the clutter, not to throw more tools into it. If people know how the process should look like, they could do all the magic SCons or CMake. For that, we need to remove the tools from equation and follow data-based workflow.

We need to get opendht-2.4.5-cp310-cp310-manylinux_2_28_x86_64.whl. What are the missing data to construct it?


opendht-2.4.5-cp310-cp310-manylinux_2_28_x86_64.whl

$ tree -F --dirsfirst
./
├── opendht-2.4.5.dist-info/
│   ├── METADATA
│   ├── RECORD
│   ├── top_level.txt
│   └── WHEEL
├── opendht.libs/
│   ├── libargon2-cd2f1124.so.1*
│   ├── libffi-67dea7c6.so.6.0.2*
│   ├── libgmp-452f15e0.so.10.3.2*
│   ├── libgnutls-e85eaba6.so.30.28.2*
│   ├── libhogweed-cd4c53be.so.4.5*
│   ├── libidn2-2f4a5893.so.0.3.6*
│   ├── libnettle-37944285.so.6.5*
│   ├── libopendht-740c48d4.so.2.4.5*
│   ├── libp11-kit-4d4a5b6d.so.0.3.0*
│   ├── libtasn1-41874c32.so.6.5.5*
│   └── libunistring-05abdd40.so.2.1.0*
└── opendht.cpython-310-x86_64-linux-gnu.so*

What kind of info is needed to get all the .so in the layout here? Is it just the .pyx and .pxd sources and the version of target Python? If yes, then why I don't see why setup.py is needed here.

@henryiii
Copy link
Author

henryiii commented Jun 24, 2022

Most projects care about making a working SDist. Not every user will be able to access a wheel, like users with anything older than CentOS 8, Alpine Linux, Clear Linux, repackagers like conda-forge and linux distributions need the SDist, etc. If you make an SDist, you need a working Python build system. In your case, since the package must be created from CMake, due to the way it's designed with a generated setup.py, then you could probably generate the dist-info (it has things like a hash of the other wheel contents, but it can be done, see the spec. You could make the .so, with the correct name, and then make a linux wheel, which you'd then need to run through auditwheel.

Without a working Python build system you'll also have to give up cibuildwheel, as it requires either pip wheel or build --wheel to work. Though you probably could implement a local PEP 517 builder pretty easily, see Flit's own build system (the one it uses to build itself). You aren't interested in anything but one linux system, and don't have Python files to move around.

If people know how the process should look like, they could do all the magic SCons or CMake

No, please not more custom code that breaks every time anything changes! I maintain both cmake_example and scikit_build_example in pybind, and I'd love to see people stop using cmake_example - every time something changes, like Apple Silicon cross compiling, PyPy 3.9's changes, Windows on ARM, etc., I can update scikit-build once and everyone immediately benefits, and I can update cmake_example and every single user now has to manually update their code for the breakage. You should not program by copy-paste, but use well maintained tools.

Once I can get scikit-build to avoid the setuptools legacy code (that can often break it), then there should be no reason not to use it to build CMake code.

@aberaud
Copy link

aberaud commented Jun 27, 2022

The setup.py.in allows integration with cmake & autotools (-DOPENDHT_PYTHON or --enable-python will configure building the python bindings, make will build it, and make install will install it). Maybe there are better ways to do that, but setup.py.in seemed to be the easiest. It mostly worked fine for years without much change.

I'm open to merging a modernization of the python build for OpenDHT, or implementing any reasonable change in the build system if that makes people's life easier.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants