[Bug]: Running same binary in container build with `py_image_layer` causes failed imports #526

ianb-pomelo · 2025-02-06T17:39:52Z

What happened?

We recently migrated from version 0.7.1 to 1.2.1 and changed how we build our Docker images, moving from a modified version of the old template to py_image_layer. Overall it has been great except for one thing. We deploy our containers to K8S and have health/status checks on them, and those checks use the same binary that the image normally runs, just in a different mode. Concretely, we are running Dagster: the container runs dagster api grpc, and the health checks run dagster api grpc-health-check. What we found is that because the same binary target is run by two separate invocations in the same container, the venv backing the Python script was re-created on every health check run. This caused the base process to temporarily lose the packages in its venv, so it became unhealthy and failed.

It seems like #522 would fix this since the venv would be stable, but in the meantime, is there a way to work around this?

Version

Development (host) and target OS/architectures: aarch Darwin -> aarch Darwin, Linux x86_64 -> Linux x86_64

Output of bazel --version: 8.0.0

Version of the Aspect rules, or other relevant rules from your WORKSPACE or MODULE.bazel file: 1.2.1

Language(s) and/or frameworks involved: Python 3.11, Docker/rules_oci 1.7.4

How to reproduce

Hard to reliably reproduce since it is a bit of a race condition.

One way would be to have a Python binary that continually tries to import a package, and to build an OCI image from it using `py_image_layer`. Then run the image and `exec` the binary again in another window. One of the two should error out, but it may take several iterations.
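A minimal sketch of such an import-looping binary might look like the following. The names `try_import` and `probe` are hypothetical and not part of any rule set, and this has not been verified to trigger the race. It drops the package from `sys.modules` on each attempt so that every import goes back to the files in the venv rather than to the in-process cache.

```python
import importlib
import sys
import time


def try_import(name: str) -> bool:
    """Import `name` from scratch; return False if the import fails."""
    # Drop the module and its submodules so the import reads the venv again
    # instead of being served from the sys.modules cache.
    for cached in [m for m in sys.modules if m == name or m.startswith(name + ".")]:
        del sys.modules[cached]
    importlib.invalidate_caches()
    try:
        importlib.import_module(name)
    except ImportError:  # ModuleNotFoundError is a subclass of ImportError
        return False
    return True


def probe(name: str, iterations: int, delay: float = 0.0) -> int:
    """Attempt the import `iterations` times and return how many attempts failed."""
    failures = 0
    for i in range(iterations):
        if not try_import(name):
            failures += 1
            print(f"attempt {i}: import of {name!r} failed", flush=True)
        time.sleep(delay)
    return failures


if __name__ == "__main__" and len(sys.argv) > 1:
    # Usage (hypothetical): probe.py <package> [iterations]
    count = int(sys.argv[2]) if len(sys.argv) > 2 else 1000
    sys.exit(1 if probe(sys.argv[1], count, delay=0.01) else 0)
```

Used as the entry point of the binary in the image, this would be started once as the container's main process and once more via `exec`, with a package such as `datasets` as the argument.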

Any other information?

We were getting errors that looked like

ERROR 2025-02-05T17:31:44.082500645Z [resource.labels.containerName: dagster-user-deployments] File "/data/pomelo/dagster.runfiles/.dagster.venv/lib/python3.11/site-packages/dsp/modules/__init__.py", line 22, in <module>
ERROR 2025-02-05T17:31:44.082966540Z [resource.labels.containerName: dagster-user-deployments] from .pyserini import *
ERROR 2025-02-05T17:31:44.082988358Z [resource.labels.containerName: dagster-user-deployments] File "/data/pomelo/dagster.runfiles/.dagster.venv/lib/python3.11/site-packages/dsp/modules/pyserini.py", line 4, in <module>
ERROR 2025-02-05T17:31:44.083429066Z [resource.labels.containerName: dagster-user-deployments] from datasets import Dataset
ERROR 2025-02-05T17:31:44.083460Z [resource.labels.containerName: dagster-user-deployments] ModuleNotFoundError: No module named 'datasets'

despite having datasets included in the binary. After turning off our health checks, the error went away. I also SSH'd into the pod and inspected the packages in the generated venv, and saw that it would repeatedly contain only a subset of the expected packages and then, shortly afterward, all of them.
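The manual inspection described above could be approximated with a small polling script. This is only a sketch: `list_packages` and `watch` are hypothetical helpers, and it assumes that a changing number of entries in the venv's site-packages directory is a good enough signal that the venv is being rebuilt.

```python
import os
import time


def list_packages(site_packages: str) -> set[str]:
    """Return the names of the entries currently present in a site-packages directory."""
    try:
        return set(os.listdir(site_packages))
    except FileNotFoundError:
        # The directory itself may be missing while the venv is being rebuilt.
        return set()


def watch(site_packages: str, samples: int, interval: float = 0.0) -> list[int]:
    """Sample the directory `samples` times; return the entry count at each change."""
    changes: list[int] = []
    previous = None
    for _ in range(samples):
        current = list_packages(site_packages)
        if current != previous:
            # Record and log only when the set of entries differs from the last sample.
            changes.append(len(current))
            print(f"{time.strftime('%H:%M:%S')} {len(current)} entries", flush=True)
            previous = current
        time.sleep(interval)
    return changes
```

Pointed at the site-packages path from the traceback above (`/data/pomelo/dagster.runfiles/.dagster.venv/lib/python3.11/site-packages`) while health checks are running, a stable venv should produce a single log line, while a venv that is being re-created should produce several.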


arrdem · 2025-05-02T18:49:54Z

Will be obviated by #551.

ianb-pomelo added the `bug` label Feb 6, 2025

arrdem self-assigned this May 2, 2025
