[ROCm][CI] Fixes tests for pytorch nightly and python only builds #28979

AndreasKaratzas · 2025-11-19T01:26:03Z

This PR fixes the tests for the following labels:

Pytorch Nightly Dependency Override Check
Python-only Installation Test

It also adds the torchaudio package to the Dockerfile.

- Fixed setup.py logging format (G001)
- Added ROCm support for precompiled wheels
- Improved Dockerfile.rocm test stage with hf_transfer and v1 package
- Fixed pytorch_nightly_dependency.sh to detect ROCm version correctly
- Added source code directory for python_only_compile.sh test

Signed-off-by: Andreas Karatzas <[email protected]>
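For context on the G001 item in the list above: G001 is the flake8-logging-format rule that flags `str.format()` inside a logging call. The snippet below is a minimal sketch of that kind of fix, not the actual setup.py change. The logger name, helper names, and message text are hypothetical and chosen only for illustration.

```python
import io
import logging


def make_logger(stream):
    """Build an isolated logger that writes bare messages to `stream`.

    The logger name "setup_sketch" is hypothetical, not taken from setup.py.
    """
    logger = logging.getLogger("setup_sketch")
    logger.handlers.clear()
    logger.setLevel(logging.INFO)
    logger.propagate = False
    handler = logging.StreamHandler(stream)
    handler.setFormatter(logging.Formatter("%(message)s"))
    logger.addHandler(handler)
    return logger


def log_wheel_location(logger, wheel_path):
    # Flagged by G001 (eager formatting before the call):
    #   logger.info("Using precompiled wheel at {}".format(wheel_path))
    # Preferred: pass the arguments and let logging interpolate lazily,
    # so the string is only built if the record is actually emitted.
    logger.info("Using precompiled wheel at %s", wheel_path)


buf = io.StringIO()
log_wheel_location(make_logger(buf), "/tmp/example.whl")
print(buf.getvalue().strip())
```

The %-style placeholder form keeps the message template constant, which also helps log aggregation tools group identical messages.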

gemini-code-assist

Code Review

This pull request introduces fixes for PyTorch nightly and Python-only build tests on the ROCm platform, and adds torchaudio to the ROCm Docker image. The changes in setup.py and the test scripts are well-implemented, improving the robustness of the build process. My review focuses on optimizing the Dockerfiles. I've suggested combining several RUN instructions to reduce image layers, which is a best practice that improves build times and reduces image size. I've also recommended adding apt-get update and cache cleanup to an apt-get install command to ensure correctness and efficiency.

gemini-code-assist · 2025-11-19T01:28:21Z

docker/Dockerfile.rocm

+RUN mkdir src
+RUN mv vllm src/vllm


To optimize the Docker image size and reduce the number of layers, it's a best practice to chain related commands into a single RUN instruction. The mkdir and mv commands can be combined.

RUN mkdir src && mv vllm src/vllm

gemini-code-assist · 2025-11-19T01:28:21Z

docker/Dockerfile.rocm_base

    && python3 --version && python3 -m pip --version

 RUN pip install -U packaging 'cmake<4' ninja wheel 'setuptools<80' pybind11 Cython
+RUN apt-get install -y libjpeg-dev libsox-dev libsox-fmt-all sox


For correctness and to keep the Docker image size minimal, it's crucial to run apt-get update before apt-get install and to clean up the apt cache in the same RUN layer. This prevents issues with stale package lists and removes unnecessary files.

RUN apt-get update && apt-get install -y libjpeg-dev libsox-dev libsox-fmt-all sox && rm -rf /var/lib/apt/lists/*

gemini-code-assist · 2025-11-19T01:28:21Z

docker/Dockerfile.rocm_base

+RUN git clone ${PYTORCH_AUDIO_REPO} audio
+RUN cd audio && git checkout ${PYTORCH_AUDIO_BRANCH} \
+    && git submodule update --init --recursive \
+    && pip install -r requirements.txt \
+    && python3 setup.py bdist_wheel --dist-dir=dist \
+    && pip install dist/*.whl


To reduce the number of Docker image layers and improve build efficiency, it's recommended to combine the git clone and the subsequent build commands for torchaudio into a single RUN instruction.

RUN git clone ${PYTORCH_AUDIO_REPO} audio && cd audio \
    && git checkout ${PYTORCH_AUDIO_BRANCH} \
    && git submodule update --init --recursive \
    && pip install -r requirements.txt \
    && python3 setup.py bdist_wheel --dist-dir=dist \
    && pip install dist/*.whl

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.


chatgpt-codex-connector · 2025-11-19T01:33:36Z

setup.py

 # If using precompiled, extract and patch package_data (in advance of setup)
 if envs.VLLM_USE_PRECOMPILED:
-    assert _is_cuda(), "VLLM_USE_PRECOMPILED is only supported for CUDA builds"
+    assert _is_cuda() or _is_hip(), (
+        "VLLM_USE_PRECOMPILED is only supported for CUDA or ROCm builds."


Allowing precompiled mode on ROCm installs CUDA wheel

The precompiled path now explicitly allows _is_hip() (ROCm) builds, but the precompiled wheel download below still always targets the CUDA binary (wheels.vllm.ai/...vllm-1.0.0.dev-cp38-abi3-<arch>.whl, no ROCm variant). On ROCm agents where VLLM_USE_PRECOMPILED=1 is used (e.g., python_only_compile.sh in the AMD pipeline), this will install CUDA .so files without the CUDA runtime present, causing import-time failures instead of a clean skip. Either keep the CUDA-only guard or fetch a ROCm-specific wheel before enabling this path on ROCm.
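The second option in the comment above (only enable the precompiled path on ROCm when a ROCm-specific wheel is available) could look roughly like the sketch below. This is an illustration under stated assumptions, not the code in setup.py: `select_precompiled_variant` and `rocm_wheel_url` are hypothetical names, and the results of `_is_cuda()` and `_is_hip()` are passed in as plain booleans so the example stays self-contained.

```python
from typing import Optional


def select_precompiled_variant(
    is_cuda: bool,
    is_hip: bool,
    rocm_wheel_url: Optional[str] = None,
) -> str:
    """Decide which precompiled wheel variant to fetch, or fail early.

    Hypothetical helper: `rocm_wheel_url` stands in for whatever mechanism
    would point at a ROCm build of the wheel, if one exists.
    """
    if is_cuda:
        return "cuda"
    if is_hip:
        if rocm_wheel_url:
            return "rocm"
        # Fail at build time rather than installing CUDA .so files
        # that would break at import time on a ROCm machine.
        raise RuntimeError(
            "VLLM_USE_PRECOMPILED was requested on ROCm, but no ROCm wheel "
            "is configured; refusing to install CUDA binaries."
        )
    raise RuntimeError(
        "VLLM_USE_PRECOMPILED is only supported for CUDA or ROCm builds."
    )


# Usage: a ROCm build with no ROCm wheel configured fails with a clear message.
try:
    select_precompiled_variant(is_cuda=False, is_hip=True)
except RuntimeError as exc:
    print(exc)
```

Raising early trades a late import-time failure for a build-time error that names the missing piece, which is the behavior the comment asks for.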


AndreasKaratzas added 4 commits November 18, 2025 01:39

Merge remote-tracking branch 'upstream/main' into akaratza_ci

65e6376

[ROCm][CI] fix for pytorch/pytorch standalone tests

6364378

Signed-off-by: Andreas Karatzas <[email protected]>

Merge upstream/main into akaratza_ci

d3ff04b

AndreasKaratzas requested review from gshtras and tjtanaa as code owners November 19, 2025 01:26

mergify bot added the ci/build and rocm (Related to AMD ROCm) labels Nov 19, 2025

gemini-code-assist bot reviewed Nov 19, 2025


chatgpt-codex-connector bot reviewed Nov 19, 2025

