chore: AUT-673 Update Docker image version to 26.06-py3 by svcnemo-autobot · Pull Request #5622 · NVIDIA/Megatron-LM

svcnemo-autobot · 2026-07-02T10:55:47Z

Bumped the dev NGC PyTorch base image from 26.04 to 26.06 in both CI pin sites: docker/.ngc_version.dev and the two IMAGE_TYPE:dev BASE_IMAGE rows (amd64 and arm64) in .gitlab/stages/01.build.yml. The LTS pin (25.09) is left untouched, per the bump-base-image skill. This assumes dev-only scope, since the request did not mention LTS. Note that golden-value drift may require a follow-up refresh once functional CI runs.

Bump the dev NGC PyTorch base image to 26.06 in both CI pin sites (GitHub docker/.ngc_version.dev and the GitLab dev BASE_IMAGE rows).
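As a rough illustration of a dev-only bump, the sketch below rewrites the 26.04 tag and leaves the 25.09 LTS row alone. The YAML layout in `SAMPLE_BUILD_YML` and the helper name `bump_pytorch_tag` are hypothetical; the actual contents of .gitlab/stages/01.build.yml and docker/.ngc_version.dev are not shown in this PR conversation.

```python
import re

# Hypothetical stand-in for the build matrix; the real file layout may differ.
SAMPLE_BUILD_YML = """\
- IMAGE_TYPE: dev
  PLATFORM: amd64
  BASE_IMAGE: nvcr.io/nvidia/pytorch:26.04-py3
- IMAGE_TYPE: dev
  PLATFORM: arm64
  BASE_IMAGE: nvcr.io/nvidia/pytorch:26.04-py3
- IMAGE_TYPE: lts
  PLATFORM: amd64
  BASE_IMAGE: nvcr.io/nvidia/pytorch:25.09-py3
"""


def bump_pytorch_tag(text: str, old: str, new: str) -> tuple[str, int]:
    """Replace the NGC PyTorch tag `old` with `new`; return (text, replacements)."""
    # Match only the exact old tag, so rows pinned to another release are untouched.
    pattern = re.compile(rf"(nvcr\.io/nvidia/pytorch:){re.escape(old)}(-py3)")
    return pattern.subn(rf"\g<1>{new}\g<2>", text)


updated, count = bump_pytorch_tag(SAMPLE_BUILD_YML, "26.04", "26.06")
print(count)
print(updated)
```

Matching on the exact old tag rather than on any `pytorch:` reference is what keeps the LTS pin out of scope in this sketch.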

copy-pr-bot · 2026-07-02T10:55:51Z

This pull request requires additional validation before any workflows can run on NVIDIA's runners.

Pull request vetters can view their responsibilities here.

Contributors can view more details about this message here.

github-actions · 2026-07-02T10:55:58Z

This PR has been automatically converted to draft because all PRs must start as drafts.

When you are ready for review, click Ready for Review to begin the review process. This will:

- Add the oncall reviewer (optional reviewer)
- Add required review teams based on your changes

See the contribution guide for more details.

svcnemo-autobot · 2026-07-02T10:56:38Z

/ok to test e12c03f

svcnemo-autobot · 2026-07-02T11:05:20Z

/ok to test b4994eb

NGC PyTorch 26.06 ships torch 2.13.0a0, whose DTensor sharding propagation no longer supports the in-place fused `aten._foreach_lerp_` (torch.optim.Adam moment update) on Replicate-placed DTensors, raising "in-place operations that require placement changes are not supported". This is a torch-side regression from the base-image bump, not a Megatron-FSDP bug; skip the affected combinatorial test until upstream fixes it. Tracking issue to be filed by maintainers.

Skip the combinatorial test_fully_shard on torch 2.13+ where the DTensor in-place _foreach_lerp_ regression breaks the Adam optimizer step.
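One way such a version guard could be written is sketched below. The helper name `torch_version_at_least` is hypothetical and is not taken from the Megatron-LM test suite; it compares only the major and minor components, so a pre-release string such as `2.13.0a0` still counts as 2.13.

```python
import re


def torch_version_at_least(version: str, minimum: tuple[int, int]) -> bool:
    """Return True if `version` has (major, minor) >= `minimum`.

    Only the leading "major.minor" is parsed, so suffixes such as
    "0a0" or "+cu128" do not affect the result.
    """
    match = re.match(r"(\d+)\.(\d+)", version)
    if match is None:
        raise ValueError(f"unrecognized version string: {version!r}")
    return (int(match.group(1)), int(match.group(2))) >= minimum


print(torch_version_at_least("2.13.0a0", (2, 13)))  # pre-release of 2.13
print(torch_version_at_least("2.9.1", (2, 13)))     # older release
```

In a test module this predicate could feed `pytest.mark.skipif(torch_version_at_least(torch.__version__, (2, 13)), reason="DTensor in-place _foreach_lerp_ regression")`. Comparing integer tuples avoids two pitfalls: string comparison would order "2.9" after "2.13", and a strict PEP 440 comparison would place the pre-release `2.13.0a0` before `2.13`, so the guard would not trigger on the 26.06 image.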

svcnemo-autobot · 2026-07-02T12:41:48Z

/ok to test 28b82ef

chore: Update Docker image version to 26.06-py3

e12c03f

Bump the dev NGC PyTorch base image to 26.06 in both CI pin sites (GitHub docker/.ngc_version.dev and the GitLab dev BASE_IMAGE rows).

svcnemo-autobot requested a review from a team as a code owner July 2, 2026 10:55

svcnvidia-nemo-ci marked this pull request as draft July 2, 2026 10:55

svcnemo-autobot added the ci label Jul 2, 2026

copy-pr-bot Bot temporarily deployed to public July 2, 2026 10:57 Inactive

copy-pr-bot Bot had a problem deploying to test July 2, 2026 10:57 Error

nemo-autobot-origin Bot added 3 commits July 2, 2026 10:58

chore: Downgrade transformer-engine to 2.15

0d4b566

Pin the transformer-engine git source to the v2.15 release tag per review feedback on the 26.06 base-image bump.

chore: Update transformer-engine dependency-metadata version to 2.15

bb9ab0c

Match the vendored uv dependency-metadata version to the v2.15 pin.

chore: regenerate uv.lock for ci/implement-f806ad04a816

b4994eb

copy-pr-bot Bot temporarily deployed to public July 2, 2026 11:06 Inactive

copy-pr-bot Bot temporarily deployed to test July 2, 2026 11:06 Inactive

copy-pr-bot Bot temporarily deployed to public July 2, 2026 11:10 Inactive

copy-pr-bot Bot temporarily deployed to public July 2, 2026 11:18 Inactive

nemo-autobot-origin Bot added 2 commits July 2, 2026 12:39

test: apply torch>=2.13 skip guard to mfsdp test_fully_shard

28b82ef

Skip the combinatorial test_fully_shard on torch 2.13+ where the DTensor in-place _foreach_lerp_ regression breaks the Adam optimizer step.

copy-pr-bot Bot temporarily deployed to public July 2, 2026 12:42 Inactive

copy-pr-bot Bot temporarily deployed to test July 2, 2026 12:42 Inactive

copy-pr-bot Bot temporarily deployed to public July 2, 2026 12:45 Inactive

copy-pr-bot Bot temporarily deployed to public July 2, 2026 12:46 Inactive

copy-pr-bot Bot temporarily deployed to public July 2, 2026 12:54 Inactive

balasaajay added the Run functional tests label Jul 2, 2026

svcnemo-autobot wants to merge 6 commits into NVIDIA:main from svcnemo-autobot:ci/implement-f806ad04a816

2 participants
