Automatically add build-system.requires to nativeBuildInputs #568

Open
GuillaumeDesforges opened this issue Mar 8, 2022 · 31 comments

@GuillaumeDesforges

GuillaumeDesforges commented Mar 8, 2022

EDIT: original title was "flit_core backend missing"

Describe the issue

entrypoints 0.4 uses the flit_core.buildapi build backend, and flit_core is listed under requires in its build-system table.
https://github.com/takluyver/entrypoints/blob/ebdf2d8edc9921427ea07688851999796093c240/pyproject.toml#L2-L3

Trying to build entrypoints in a poetry2nix mkPoetryEnv fails on build with

ModuleNotFoundError: No module named 'flit_core'

Additional context

After entering the Nix shell, I can confirm that flit_core is not available (it does not appear in pip list).
This is odd, because it seems to me that poetry2nix does read the build-backend fields:

getBuildSystemPkgs =

  • default.nix/shell.nix/flake.nix
{
  inputs.flake-utils.url = "github:numtide/flake-utils";
  inputs.poetry2nix.url = "github:nix-community/poetry2nix";

  outputs = { self, nixpkgs, flake-utils, poetry2nix, ... }:
    flake-utils.lib.eachDefaultSystem (system:
      let
        pkgs = import nixpkgs {
          inherit system;
          overlays = [ poetry2nix.overlay ];
        };
      in
      {
        devShell =
          pkgs.poetry2nix.mkPoetryEnv {
            projectDir = ./.;
          };
      });
}
  • pyproject.toml
[tool.poetry]
name = "poetry-nix"
version = "0.1.0"
description = ""
authors = ["Guillaume Desforges <[email protected]>"]

[tool.poetry.dependencies]
python = "^3.8"
entrypoints = "^0.4"

[tool.poetry.dev-dependencies]

[build-system]
requires = ["poetry-core>=1.0.0"]
build-backend = "poetry.core.masonry.api"
  • poetry.lock
[[package]]
name = "entrypoints"
version = "0.4"
description = "Discover and load entry points from installed packages."
category = "main"
optional = false
python-versions = ">=3.6"

[metadata]
lock-version = "1.1"
python-versions = "^3.8"
content-hash = "5d84993b5abc54589527347f1f6dcea0edde667a81ae537884ae1585c42b951a"

[metadata.files]
entrypoints = [
    {file = "entrypoints-0.4-py3-none-any.whl", hash = "sha256:f174b5ff827504fd3cd97cc3f8649f3693f51538c7e4bdf3ef002c8429d42f9f"},
    {file = "entrypoints-0.4.tar.gz", hash = "sha256:b706eddaa9218a19ebcd67b56818f05bb27589b1ca9e8d797b74affad4ccacd4"},
]
@flokli
Contributor

flokli commented Mar 8, 2022

I just ran into this while trying to add structlog to a poetry2nix-packaged environment.

@flokli
Contributor

flokli commented Mar 8, 2022

I read through some of the code and added some traces.

In general, the tooling to parse build-system.build-backend and to add these dependencies seems to be in place.

However, it seems buildSystemPkgs isn't added to buildInputs (due to the ++ lib.optional isDirectory buildSystemPkgs in mk-poetry-dep.nix).

I don't quite understand the bigger picture yet: why does this only apply to isDirectory, and not to isDirectory || isGit || isUrl like in other places? It also seems source is null for some reason.

@adisbladis, any idea?

@flokli
Contributor

flokli commented Mar 8, 2022

I'm able to work around this by manually bringing in flit-core via the overrides mechanism:

poetry2nix.mkPoetryApplication {
  projectDir = ./.;
  python = python310;
  overrides = poetry2nix.overrides.withDefaults (self: super: {
    # …
    # workaround https://github.com/nix-community/poetry2nix/issues/568
    structlog = super.structlog.overridePythonAttrs (old: {
      buildInputs = old.buildInputs or [ ] ++ [ python310.pkgs.flit-core ];
    });
  });
}

So it seems this is indeed just a matter of not automatically bringing in the build system packages for dependencies.

@GuillaumeDesforges
Author

GuillaumeDesforges commented Mar 8, 2022

Yes, I'm using this workaround as well.

I've also seen that the build system dependencies are hard coded for now for a subset of packages.
https://github.com/nix-community/poetry2nix/blob/master/overrides/build-systems.json

It would be nice to parse the requires field in pyproject.toml when available and add its entries to nativeBuildInputs.
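
As a rough illustration of what deriving build inputs from the requires field would involve, here is a minimal Python sketch. It assumes the requires list has already been read out of a package's pyproject.toml, and it only extracts and normalizes the package names (using the PEP 503 normalization rule). The helper names requirement_name and build_requirement_names and the example requirement strings are hypothetical; mapping the resulting names to nixpkgs attributes is not shown.

```python
import re

# Leading distribution name of a PEP 508 requirement string.
_NAME_RE = re.compile(r"^\s*([A-Za-z0-9](?:[A-Za-z0-9._-]*[A-Za-z0-9])?)")


def requirement_name(requirement: str) -> str:
    """Return the normalized package name of one requirement string."""
    match = _NAME_RE.match(requirement)
    if match is None:
        raise ValueError(f"cannot parse requirement: {requirement!r}")
    # PEP 503 normalization: runs of '-', '_' and '.' become a single '-'.
    return re.sub(r"[-_.]+", "-", match.group(1)).lower()


def build_requirement_names(requires: list[str]) -> list[str]:
    """Normalized names for every entry of build-system.requires."""
    return [requirement_name(req) for req in requires]


# Hypothetical build-system.requires values, for illustration only.
print(build_requirement_names(["flit_core >=3.2,<4", "poetry-core>=1.0.0"]))
# → ['flit-core', 'poetry-core']
```

Normalizing first matters because the name written in requires (flit_core) often differs in spelling from the attribute name used elsewhere (flit-core).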

@flokli
Contributor

flokli commented Mar 8, 2022

Yeah, but the tooling should be there to just derive this from pyproject.toml. @GuillaumeDesforges do you want to rename this issue to reflect this?

@GuillaumeDesforges GuillaumeDesforges changed the title flit_core backend missing Automatically add build-system.requires to nativeBuildInputs Mar 8, 2022
@adisbladis
Member

The root cause of this issue is upstream python-poetry/poetry#2789.

@GuillaumeDesforges
Author

Indeed: since Poetry does not write build-system.requires to the lockfile, poetry2nix cannot obtain that information.

@flokli
Contributor

flokli commented Mar 12, 2022

Huuh. Can we link to the upstream issue somewhere in the poetry2nix docs, and provide some guidance on how we seem to be using the overrides mechanism to manually keep track of these dependencies?

@yajo
Contributor

yajo commented Apr 11, 2022

Maybe we could include https://github.com/DavHau/pypi-deps-db and fetch the desired build system from there, just like mach-nix does?

@andersk
Contributor

andersk commented May 19, 2022

The upstream issue python-poetry/poetry#2789 was fixed, but the fix uses ephemeral build environments, and the information needed to reconstruct them is not recorded in the lock file. Perhaps another upstream issue should be opened explaining why it should be recorded?

@yajo
Contributor

yajo commented May 19, 2022

I think upstream understands why it's important, but not how to implement it, due to a technical limitation they explained in python-poetry/poetry#5401 (comment).

Just keep that in mind, but in any case I agree that we should open a new issue and talk with upstream to see what solution we could get to. If solved upstream, it would have obvious benefits there too, so I think they'll be open to it if there's a way to do it.

@nixos-discourse

This issue has been mentioned on NixOS Discourse. There might be relevant details there:

https://discourse.nixos.org/t/why-is-it-so-hard-to-use-a-python-package/19200/11

@FRidh
Contributor

FRidh commented Jun 9, 2022

I suggest doing the pinning of the build systems here in poetry2nix repo and let it update via actions periodically. It won't consider the version constraints that way, but it gets it working for probably 99% of the cases and is still reproducible.

Note though you will probably need to resolve them together, to ensure they can cooperate when needed. I think it is rare this is going to cause issues, but you don't know. This could be done with poetry itself.

erooke added a commit to erooke/jupyenv that referenced this issue Feb 27, 2023
Poetry does not lock down information about the build system used to
build dependencies. As such poetry2nix cannot automatically figure out
build inputs. nix-community/poetry2nix#568

This documents how to help poetry2nix out with figuring out buildinputs.
djacu pushed a commit to tweag/jupyenv that referenced this issue Mar 4, 2023
* Document ModuleNotFoundError

Poetry does not lock down information about the build system used to
build dependencies. As such poetry2nix cannot automatically figure out
build inputs. nix-community/poetry2nix#568

This documents how to help poetry2nix out with figuring out buildinputs.

* Use path syntax for including the overrides.nix

* Use generic placeholder
@charmoniumQ
Contributor

The upstream issue python-poetry/poetry#2789 was fixed, but the fix uses ephemeral build environments, and the information needed to reconstruct them is not recorded in the lock file. Perhaps another upstream issue should be opened explaining why it should be recorded?

See python-poetry/poetry#6154

@colonelpanic8

Looks like python-poetry/poetry#7975 was merged, which may fix this?

Has anyone tried overriding poetry with a git version to see if this fixes things?

@GuillaumeDesforges
Author

The entrypoints example no longer reproduces, probably because of 33db1f3.

If anyone would like this issue to progress, we need another new reproducible example that makes sense.

@brendon-boldt
Contributor

This pyproject.toml fails for the same reason I believe (urllib3 needs hatchling which it can't find). I'd test out the most recent version of poetry in poetry2nix, but I couldn't figure out how to write that override.

[tool.poetry]
name = "test"
version = "0.0.0"
description = ""
authors = []

[tool.poetry.dependencies]
python = "3.10"
urllib3 = "2.0.4"

[tool.poetry.dev-dependencies]

[build-system]
requires = ["poetry-core>=1.0.0"]
build-backend = "poetry.core.masonry.api"

@dmytrokyrychuk

@brendon-boldt I think your problem was resolved by 33db1f3, but that version of poetry2nix is not yet in nixpkgs. I tried your pyproject.toml with the latest poetry2nix flake, and urllib3 was installed correctly. See https://github.com/dmytrokyrychuk/nix-urllib3-test

@Turakar

Turakar commented Nov 28, 2023

I think that the PyPI cmake package is an example of this:

pyproject.toml:

[tool.poetry]
...

[tool.poetry.dependencies]
python = "^3.11,<3.12"
cmake = "^3.27.7"


[build-system]
requires = ["poetry-core"]
build-backend = "poetry.core.masonry.api"

flake.nix:

{
  description = "Application packaged using poetry2nix";

  inputs = {
    flake-utils.url = "github:numtide/flake-utils";
    nixpkgs.url = "github:NixOS/nixpkgs/nixos-unstable";
    poetry2nix = {
      url = "github:nix-community/poetry2nix";
      inputs.nixpkgs.follows = "nixpkgs";
    };
  };

  outputs = { self, nixpkgs, flake-utils, poetry2nix }:
    flake-utils.lib.eachDefaultSystem (system:
      let
        # see https://github.com/nix-community/poetry2nix/tree/master#api for more functions and examples.
        pkgs = nixpkgs.legacyPackages.${system};
        inherit (poetry2nix.lib.mkPoetry2Nix { inherit pkgs; }) mkPoetryApplication;
      in
      {
        packages = {
          myapp = mkPoetryApplication { projectDir = self; };
          default = self.packages.${system}.myapp;
        };

        devShells.default = pkgs.mkShell {
          inputsFrom = [ self.packages.${system}.myapp ];
          packages = [ pkgs.poetry ];
        };
      });
}

Error:

$ nix develop
warning: Git tree '<path>' is dirty
error: builder for '/nix/store/bn6y3qp4mbxybgbbsz1azzka3d1fqpkw-python3.11-cmake-3.27.7.drv' failed with exit code 2;
       last 10 log lines:
       >   File "<frozen importlib._bootstrap>", line 1204, in _gcd_import
       >   File "<frozen importlib._bootstrap>", line 1176, in _find_and_load
       >   File "<frozen importlib._bootstrap>", line 1126, in _find_and_load_unlocked
       >   File "<frozen importlib._bootstrap>", line 241, in _call_with_frames_removed
       >   File "<frozen importlib._bootstrap>", line 1204, in _gcd_import
       >   File "<frozen importlib._bootstrap>", line 1176, in _find_and_load
       >   File "<frozen importlib._bootstrap>", line 1140, in _find_and_load_unlocked
       > ModuleNotFoundError: No module named 'setuptools'
       > 
       > 
       For full logs, run 'nix log /nix/store/bn6y3qp4mbxybgbbsz1azzka3d1fqpkw-python3.11-cmake-3.27.7.drv'.
error: 1 dependencies of derivation '/nix/store/djms9q6r0nl51c29d54kci9dibihzklx-nix-shell-env.drv' failed to build

@adisbladis
Member

See python-poetry/poetry#8752

@adisbladis adisbladis pinned this issue Dec 6, 2023
@GuillaumeDesforges
Author

GuillaumeDesforges commented Dec 7, 2023

poetry is not going to cripple its performance just to add data to the lockfile that it does not need

IMO that makes sense, and it is best to figure out a way that does not rely on poetry changing its behavior.

It seems to me that most packages use only a very limited set of packages for build-system.requires.
Could poetry2nix read pyproject.toml and use a mapping from these names in build-system.requires to packages?

E.g.

{ ... }:
let
  # from nixpkgs
  pythonPackages = ...;
  pyprojectToml = ...;
  depToDrv = {
    # For example
    "poetry-core" = pythonPackages.poetry-core;
  };
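  # Look up each requirement in the mapping (version specifiers would have to be stripped first)
  buildRequires = map (dep: depToDrv.${dep}) pyprojectToml.build-system.requires;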
in
  # add `buildRequires` to the Python build

@adisbladis
Member

adisbladis commented Dec 7, 2023

It seems to me that most packages use only a very limited set of packages for build-system.requires.
Could poetry2nix read pyproject.toml and use a mapping from these names in build-system.requires to packages?

That would be IFD (import from derivation), so not possible.

@charmoniumQ
Contributor

If I understand the issue correctly, poetry2nix, being a Nix program, can only access the internet if we statically know a hash of what we expect as the response. Much of the time, poetry.lock has the necessary hashes, so poetry2nix can go download transitive (build- or run-time) dependencies of the current project.

However, Poetry does not put build-time dependencies of the transitive (build- or run-time) dependencies into poetry.lock, because Poetry does not necessarily build-from-source (sdist). In fact it prefers to use pre-built wheels for performance reasons. So it would have to do extra work (slow everyone down) to find out and lock the build-time dependencies.

PyPI exposes the hash of the source through an HTTP API, which is how Poetry can put a hash of the sdist of every dependency in poetry.lock even when it does not build from source. However, Nix cannot choose its dependencies dynamically: it can't download something to discover the build dependencies, because it has to know them before it starts to download anything. Therefore, this doesn't help us much.


Maybe we should frame our request to Poetry as "please provide a poetry --build-from-source". Does Poetry already know how to build from source if no wheels are available for the platform? If so, this shouldn't be too hard to implement. Once it's building from source, it won't be that hard to augment the lockfile (although the spec has to change to include optional build requirements, if available). This strategy would not hinder any users who don't explicitly enable this option.

@andersk
Contributor

andersk commented Dec 8, 2023

If the reason it would be slow for Poetry to add build-system information to poetry.lock is that this information isn’t available via the PyPI HTTP API, that sounds like an argument for making it available via the PyPI HTTP API.

Maybe we should frame our request to Poetry as "please provide a poetry --build-from-source".

That’s already there as the installer.no-binary setting. But it comes with this note:

“As with all configurations described here, this is a user specific configuration. This means that this is not taken into consideration when a lockfile is generated or dependencies are resolved. This is applied only when selecting which distribution for dependency should be installed into a Poetry managed environment.”

@charmoniumQ
Contributor

Are there any Python package managers that make a better lockfile? Does PDM do any better?

@SemMulder
Contributor

Are there any Python package managers that make a better lockfile? Does PDM do any better?

Apparently, it is quite feasible to roll your own by using the Poetry solver as a library.

See here how conda-lock does it:
https://github.com/conda/conda-lock/blob/4fc441f2f33b9570917d2453d4eadf3cdc1d95f4/conda_lock/pypi_solver.py

That wouldn't be an ideal situation, due to the maintenance burden. But it might be a solution.

@yajo
Contributor

yajo commented Dec 11, 2023

How about a poetry2nix.lock file? We could have a simple script here that generates that. Then, add it to your project closure and move on. Of course, to be locked impurely, just like poetry does.

@GuillaumeDesforges
Author

GuillaumeDesforges commented Dec 11, 2023

A good thing about poetry2nix was the ability to plug it into any Poetry repository, including those to which we don't have write permissions. Requiring a file specific to poetry2nix in the source code repos would make poetry2nix much less useful.

@yajo
Contributor

yajo commented Dec 12, 2023

I didn't say you need to put that file in the source repository... Since that lock file is only relevant for poetry2nix (and only if you build from sources, not from wheels), then it makes sense that it resides in the same repo as the instructions to build with nix. Example:

{ poetry2nix, python310, fetchgitArgs }:
poetry2nix.mkPoetryApplication rec {
  python = python310;
  pyproject = src + "/pyproject.toml";
  poetrylock = src + "/poetry.lock";
  poetry2nixlock = ./poetry2nix.lock;
  src = builtins.fetchgit fetchgitArgs;
}

@MatrixManAtYrService

MatrixManAtYrService commented Dec 27, 2024

The workaround shown above didn't work for me; I think some things have been renamed since then.

I added typer = "^0.15.1" to pyproject.toml, and saw the error:

error: builder for '/nix/store/rpkk037i74csp2hmdmr34bk9zzpx288r-python3.12-click-8.1.8.drv' failed with exit code 2;
       last 25 log lines:
...
       > ModuleNotFoundError: No module named 'flit_core'

(click is a dependency of typer). I ended up fixing it like this:

p2nix = poetry2nix.lib.mkPoetry2Nix { inherit pkgs; };
inherit (p2nix) mkPoetryApplication mkPoetryEditablePackage;
my-app = mkPoetryApplication { 
  projectDir = ./.;
  overrides = p2nix.defaultPoetryOverrides.extend
    (final: prev: {
      click = prev.click.overridePythonAttrs (old: {
        buildInputs = (old.buildInputs or [ ]) ++ [ pkgs.python312Packages.flit-core ];
      });
    });
};

@charmoniumQ
Contributor

charmoniumQ commented Feb 28, 2025

How about a poetry2nix.lock file? We could have a simple script here that generates that. Then, add it to your project closure and move on. Of course, to be locked impurely, just like poetry does.
GuillaumeDesforges

A good thing about poetry2nix was the ability to plug it into any Poetry repository, including those to which we don't have write permissions. Requiring a file specific to poetry2nix in the source code repos would make poetry2nix much less useful.

I didn't say you need to put that file in the source repository... Since that lock file is only relevant for poetry2nix (and only if you build from sources, not from wheels), then it makes sense that it resides in the same repo as the instructions to build with nix.

In other words, the poetry2nix.lock would be contained in the repository of the ultimate user, as opposed to any of the intermediate users. Presumably we would have the ability to write to the ultimate user's repo. E.g., if my ML env uses Numpy, which uses Setuptools, then Numpy would be an intermediate user, but my ML env would be the ultimate user and would have a poetry2nix.lock file saying "setuptools==1.2.3 @ sha256-deadbeef is a build-system.requires entry of one of the packages contained in poetry.lock."

poetry2nix.lock would be written by a non-hermetic tool, but once generated (hopefully relatively infrequently) it (along with poetry.lock) enables the rest of the dependencies to be built-from-source hermetically. We could use one of the following approaches:

  • Simple but slow

    1. Run poetry install with no-binary :all: or pip install --no-binary :all:.
      • The downside is that installing includes building, which is much more work than strictly necessary.
    2. Iterate over the packages of the resulting venv with pip freeze. For each package:
      1. Figure out the URL of the sdist corresponding with that package/version.
      2. Download and add the resulting hash to poetry2nix.lock.
      3. We don't have to worry about the dependencies of that package/version because they will have already been solved by poetry install given we set installer.no-binary :all:.
  • Complicated but faster

    1. Write a function that can determine the build dependencies and run-time dependencies of a given package, using PEP 517 or falling back to default behavior for setuptools projects or manual overrides. As I understand it, the build dependencies are easier to figure out than the run-time dependencies, because Pip has to be able to figure out who to hand off to. Detecting the run-time dependencies only has to be done for a small number of popular build-time dependencies (those resulting from Step 3 or Step 4.2.2).
    2. Create a set called unprocessed-build-deps and a DAG called processed-build-deps.
    3. For each dependency of the ultimate user, figure out its build-time package-constraints (ignoring run-time dependencies, as those are managed by Poetry in poetry.lock), adding those to unprocessed-build-deps. When I say, "package-constraint" I mean the pair of (package name, a set of constraints on the versions of that package).
    4. While the unprocessed-build-deps is not empty,
      1. Pop a package constraint from unprocessed-build-deps (essentially, a tree traversal on the dep graph).
      2. For all versions (findable from PyPI's API) that satisfy all previous known constraints on that package,
        1. We are not handling transitive constraints here, those will be handled in Step 5. This is just to get all the possible options, so over-approximation is acceptable.
        2. Figure out that package's build- and run-time dependency constraints. A run-time dependency of a build-time dependency of the ultimate user is itself a build-time dependency of the ultimate user, similar to the logic for Nixpkgs dependency propagation.
        3. Add the unseen deps (those not already in processed-build-deps) to unprocessed-build-deps.
        4. Add an edge from the current package/version (the one selected in Step 4.2) to each of its build- and run-time dependencies.
    5. Solve the constraints in the set, probably farming out to an external lib like libmamba, libsolv, or whatever Pip and Poetry use here.

Also note that the complicated way would be a more general tool, so it may have more community support from anyone who wants to construct a "dependency graph of Python packages" (researchers, security folks, other package managers). It's basically "do pip freeze without first doing pip install".
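
The traversal in Steps 2 to 4 of the "complicated but faster" approach can be sketched in Python roughly as follows. This is a toy under loud assumptions: the METADATA table is hypothetical in-memory data standing in for whatever a real tool would extract from sdists or an index, the package relationships in it are made up for illustration, and versions, constraints, and the solving of Step 5 are ignored entirely.

```python
from collections import deque

# Hypothetical metadata: package -> (build deps, run-time deps).
# Nothing here touches the network; the entries are illustrative only.
METADATA = {
    "my-ml-env": (["poetry-core"], ["numpy"]),
    "numpy": (["setuptools", "cython"], []),
    "cython": (["setuptools"], []),
    "setuptools": ([], []),
    "poetry-core": ([], []),
}


def build_closure(locked, metadata):
    """Return a DAG (package -> set of deps) of everything needed at build time.

    `locked` lists the packages already pinned in poetry.lock. Only their
    build dependencies seed the worklist (Step 3). For each build dependency,
    both its build and run-time dependencies are followed (Step 4), because
    both must be present to build the original package.
    """
    unprocessed = deque()  # "unprocessed-build-deps"
    processed = {}         # "processed-build-deps", stored as adjacency sets

    for pkg in locked:
        build_deps, _run_deps = metadata[pkg]
        unprocessed.extend(build_deps)

    while unprocessed:
        pkg = unprocessed.popleft()
        if pkg in processed:
            continue  # already visited via another path
        build_deps, run_deps = metadata[pkg]
        deps = set(build_deps) | set(run_deps)
        processed[pkg] = deps
        unprocessed.extend(d for d in deps if d not in processed)

    return processed


closure = build_closure(["my-ml-env", "numpy"], METADATA)
print(sorted(closure))
# → ['cython', 'poetry-core', 'setuptools']
```

The result contains only packages that are missing from poetry.lock, which is the set a poetry2nix.lock file would need to pin.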
