Recent versions of orbax-checkpoint fail during import #2605

Closed
jimmyt857 opened this issue Feb 9, 2025 · 3 comments

jimmyt857 commented Feb 9, 2025

🐞 bug report

Affected Rule

py_library, py_binary, py_test

Is this a regression?

No

Description

We have a monorepo using rules_python. During an attempt to bump some of our pypi package versions, many binaries started failing with the following error:

Traceback (most recent call last):
  File "/home/jimmy/.cache/bazel/_bazel_jimmy/d34f0b413aad92f69f3e71c957bed2d2/execroot/_main/bazel-out/k8-fastbuild/bin/main.runfiles/_main/main.py", line 1, in <module>
    import orbax.checkpoint
    ^^^^^^^^^^^^^^^^^^^^^^^
ModuleNotFoundError: No module named 'orbax.checkpoint'

Importantly, when we run our binaries without Bazel (we're in the process of migrating, so we still have both options), there is no error. So this issue is specific to Bazel and rules_python.

For context, we have had issues with the orbax package in the past. It seems to do some unusual things with sys.path/modules, and my guess is that this conflicts with the way rules_python sets up the virtual environments.
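To compare the Bazel run against the plain venv run, a small diagnostic like the following could be dropped at the top of main.py. It is only a sketch: `describe_package` is a hypothetical helper, not part of orbax or rules_python, and it uses nothing beyond the standard library's `importlib.util.find_spec`.

```python
import importlib.util


def describe_package(name):
    """Return the directories a top-level package is loaded from.

    Returns None when the package cannot be found at all, and an empty
    list when the name resolves to a plain module rather than a package.
    """
    spec = importlib.util.find_spec(name)
    if spec is None:
        return None
    # Namespace packages may list several directories here.
    return list(spec.submodule_search_locations or [])


if __name__ == "__main__":
    # Run once under Bazel and once in the venv, then diff the output.
    print("orbax search locations:", describe_package("orbax"))
```

If the directories printed under Bazel exist but lack some of the sub-packages present in the venv, the problem is in what ends up in the runfiles rather than in the import path itself.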

🔬 Minimal Reproduction

.bazelversion:

7.4.0

MODULE.bazel:

module(name = "my_module")

bazel_dep(name = "aspect_rules_py", version = "1.3.1")
bazel_dep(name = "rules_python", version = "1.1.0")
bazel_dep(name = "rules_uv", version = "0.53.0")

pip = use_extension("@rules_python//python/extensions:pip.bzl", "pip")
pip.parse(
    hub_name = "pip",
    python_version = "3.11",
    requirements_lock = "//:requirements.txt",
)
use_repo(pip, "pip")

BUILD.bazel:

# Note: the issue presents in both the rules_python and aspect_rules_py versions.
# load("@aspect_rules_py//py:defs.bzl", "py_binary")
load("@rules_python//python:py_binary.bzl", "py_binary")
load("@rules_uv//uv:pip.bzl", "pip_compile")

pip_compile(
    name = "generate_requirements_txt",
    requirements_in = "//:requirements.in",
    requirements_txt = "//:requirements.txt",
)

py_binary(
    name = "main",
    srcs = ["main.py"],
    visibility = ["//visibility:public"],
    deps = ["@pip//orbax_checkpoint"],
)

requirements.in:

orbax-checkpoint==0.11.3

main.py:

import orbax.checkpoint  # noqa: F401

🌍 Your Environment

Operating System:

Ubuntu 22.04

Output of bazel version:

Bazelisk version: v1.25.0
Build label: 7.4.0
Build target: @@//src/main/java/com/google/devtools/build/lib/bazel:BazelServer
Build time: Tue Oct 22 17:24:25 2024 (1729617865)
Build timestamp: 1729617865
Build timestamp as int: 1729617865

Rules_python version:
1.1.0

Anything else relevant?
The latest working version of orbax-checkpoint is 0.10.1. Versions 0.11.1, 0.11.2, 0.11.3, and 0.11.4 have the same error, and versions 0.10.2, 0.10.3, and 0.11.0 also fail but with a slightly different error:

Traceback (most recent call last):
  File "/home/jimmy/.cache/bazel/_bazel_jimmy/d34f0b413aad92f69f3e71c957bed2d2/execroot/_main/bazel-out/k8-fastbuild/bin/main.runfiles/_main/main.py", line 1, in <module>
    import orbax.checkpoint  # noqa: F401
    ^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/jimmy/.cache/bazel/_bazel_jimmy/d34f0b413aad92f69f3e71c957bed2d2/execroot/_main/bazel-out/k8-fastbuild/bin/main.runfiles/rules_python~~pip~pip_311_orbax_checkpoint/site-packages/orbax/checkpoint/__init__.py", line 23, in <module>
    from orbax.checkpoint import aggregate_handlers
  File "/home/jimmy/.cache/bazel/_bazel_jimmy/d34f0b413aad92f69f3e71c957bed2d2/execroot/_main/bazel-out/k8-fastbuild/bin/main.runfiles/rules_python~~pip~pip_311_orbax_checkpoint/site-packages/orbax/checkpoint/aggregate_handlers.py", line 25, in <module>
    from orbax.checkpoint import utils
  File "/home/jimmy/.cache/bazel/_bazel_jimmy/d34f0b413aad92f69f3e71c957bed2d2/execroot/_main/bazel-out/k8-fastbuild/bin/main.runfiles/rules_python~~pip~pip_311_orbax_checkpoint/site-packages/orbax/checkpoint/utils.py", line 30, in <module>
    from orbax.checkpoint._src.path import async_utils
ModuleNotFoundError: No module named 'orbax.checkpoint._src.path'

Commands to run the MRE successfully (without error) outside of Bazel:

uv venv --python 3.11.9
source .venv/bin/activate
uv pip sync requirements.txt
uv run main.py

jimmyt857 changed the title from "Recent versions of orbax-checkpoint fail at runtime" to "Recent versions of orbax-checkpoint fail during import" on Feb 9, 2025

aignas (Collaborator) commented Feb 10, 2025

This is a duplicate of #2156, but the symptoms closely match what was discussed, specifically for orbax-checkpoint, in https://bazelbuild.slack.com/archives/CA306CEV6/p1738883357196659.

I am not sure what other packages you have, but my suggestion would be to patch the wheel or contribute to #2156.

aignas closed this as completed Feb 10, 2025

jimmyt857 (Author) commented Feb 10, 2025

Thanks for the fast reply! We already use aspect_rules_py, which is described as a workaround for #2156, and it fails with the same error. I noted that above, although re-reading my post I see it could have been made clearer.

Your response prompted another round of investigation, though, and I found the real problem: google/orbax#1429. They're publishing BUILD files in their wheel.
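For anyone wanting to check whether a given wheel is affected, here is a rough sketch that lists BUILD files inside a wheel archive. A plausible mechanism for the failure is that Bazel treats any directory containing a BUILD file as its own package, and `glob()` does not descend into sub-packages, so files under those directories can be left out of the runfiles. `find_build_files` is a hypothetical helper name, and the wheel filename in the usage note is a placeholder.

```python
import zipfile
from pathlib import PurePosixPath

# File names that Bazel recognises as package markers.
BUILD_NAMES = {"BUILD", "BUILD.bazel"}


def find_build_files(wheel_path):
    """Return the sorted archive members named BUILD or BUILD.bazel."""
    with zipfile.ZipFile(wheel_path) as wheel:
        return sorted(
            name
            for name in wheel.namelist()
            # Skip directory entries; only regular files matter here.
            if not name.endswith("/")
            and PurePosixPath(name).name in BUILD_NAMES
        )
```

Usage would look like `find_build_files("some_downloaded.whl")`; an empty list means the wheel ships no BUILD files.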

aignas (Collaborator) commented Feb 10, 2025

As a workaround you could carry a patch to the wheel to remove the BUILD files.
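One way such a patched wheel might be produced is sketched below. This is an assumption-laden illustration, not a rules_python feature: `strip_build_files` and `is_build_file` are hypothetical helpers that repack the wheel with the standard `zipfile` module, drop BUILD and BUILD.bazel entries, and remove the matching rows from the wheel's RECORD file so the metadata stays consistent. How the resulting wheel is then fed back into the build is left open here.

```python
import csv
import io
import zipfile
from pathlib import PurePosixPath

BUILD_NAMES = {"BUILD", "BUILD.bazel"}


def is_build_file(name):
    """True for a regular archive member named BUILD or BUILD.bazel."""
    return not name.endswith("/") and PurePosixPath(name).name in BUILD_NAMES


def strip_build_files(src_wheel, dst_wheel):
    """Copy src_wheel to dst_wheel without BUILD files.

    dst_wheel must be a different path from src_wheel.
    Returns the list of archive members that were dropped.
    """
    removed = []
    with zipfile.ZipFile(src_wheel) as src, zipfile.ZipFile(
        dst_wheel, "w", compression=zipfile.ZIP_DEFLATED
    ) as dst:
        for info in src.infolist():
            if is_build_file(info.filename):
                removed.append(info.filename)
                continue
            data = src.read(info.filename)
            if info.filename.endswith(".dist-info/RECORD"):
                # Drop RECORD rows that refer to the removed files.
                rows = [
                    row
                    for row in csv.reader(io.StringIO(data.decode("utf-8")))
                    if row and not is_build_file(row[0])
                ]
                out = io.StringIO()
                csv.writer(out, lineterminator="\n").writerows(rows)
                data = out.getvalue().encode("utf-8")
            dst.writestr(info, data)
    return removed
```

The function only filters by file name, so it would also drop a legitimate data file that happens to be called BUILD; checking the returned list before using the new wheel is advisable.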
