
feat(xgboost): add parallel support for XGBoost 3.0.5 alongside 3.2.0 #6174

Open
Jyothirmaikottu wants to merge 1 commit into main from xgboost-305-support


@Jyothirmaikottu
Contributor

Add infrastructure to build, test, and release XGBoost 3.0.5 in parallel with 3.2.0. This enables continued CVE patching and maintenance of the 3.0.5 image while the 3.2.0 image is the primary release.

Changes:

  • Add docker/xgboost/3.0-5/ with Dockerfile and requirements.txt restored from the last 3.0.5 state (commit fa20102), pinned to sagemaker-xgboost-container release-3.0.5 branch
  • Add .github/config/image/sagemaker-xgboost-3.0.yml config
  • Add .github/workflows/dispatch-release-sagemaker-xgboost-3.0.yml
  • Update reusable integ tests workflow with --xgboost-version input
  • Update PR workflow to detect 3.2.0 vs 3.0.5 vs shared resource changes
  • Version-gate tests with --xgboost-version pytest flag:
    • Pipe mode tests: restored, skipped on >= 3.2.0
    • GPU tree_method: dynamic via gpu_tree_method fixture
    • xfail markers: gated to 3.2.0 only
    • generate_models.py: pickle vs save_model via XGBOOST_VERSION env var
  • Fix prod_image tag format in 3.2.0 config
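
The version gating listed above could be wired up in a `conftest.py` roughly as follows. This is a minimal sketch, not the code from this PR: the marker names `pipe_mode` and `xfail_on_3_2`, the helper function names, and the `3.2.0` default are all assumptions made for illustration. Only the `--xgboost-version` flag and the two rules (pipe mode tests skipped on >= 3.2.0, xfail markers applied on 3.2.0 only) come from the description.

```python
"""Sketch of a conftest.py that gates tests on an --xgboost-version flag."""

# Assumption: the primary release is used when the flag is not passed.
DEFAULT_XGBOOST_VERSION = "3.2.0"
# From the PR description: pipe mode tests are skipped on >= 3.2.0.
PIPE_MODE_SKIPPED_FROM = (3, 2, 0)
# From the PR description: xfail markers are gated to 3.2.0 only.
XFAIL_VERSION = (3, 2, 0)


def parse_version(text):
    """Turn '3.0.5' into (3, 0, 5) so versions compare numerically."""
    return tuple(int(part) for part in text.strip().split("."))


def pipe_mode_skip_reason(version_text):
    """Return a skip reason for pipe mode tests, or None if they should run."""
    if parse_version(version_text) >= PIPE_MODE_SKIPPED_FROM:
        return f"pipe mode tests are skipped on XGBoost {version_text}"
    return None


def xfail_applies(version_text):
    """Return True only for the version whose known failures are tolerated."""
    return parse_version(version_text) == XFAIL_VERSION


def pytest_addoption(parser):
    """Register the command-line flag with pytest."""
    parser.addoption(
        "--xgboost-version",
        action="store",
        default=DEFAULT_XGBOOST_VERSION,
        help="XGBoost version of the image under test, e.g. 3.0.5",
    )


def pytest_collection_modifyitems(config, items):
    """Attach skip/xfail markers based on the requested version."""
    import pytest  # imported lazily so the helpers above work without pytest

    version = config.getoption("--xgboost-version")
    reason = pipe_mode_skip_reason(version)
    for item in items:
        # 'pipe_mode' and 'xfail_on_3_2' are hypothetical marker names.
        if reason and item.get_closest_marker("pipe_mode"):
            item.add_marker(pytest.mark.skip(reason=reason))
        if xfail_applies(version) and item.get_closest_marker("xfail_on_3_2"):
            item.add_marker(pytest.mark.xfail(reason=f"known failure on {version}"))
```

With a layout like this, a run against the older image would pass `--xgboost-version 3.0.5` to pytest, and tests carrying the hypothetical `pipe_mode` marker would run instead of being skipped.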

Purpose

Test Plan

Test Result


Toggle if you are merging into master Branch

By default, docker image builds and tests are disabled. Two ways to run builds and tests:

  1. Using dlc_developer_config.toml
  2. Using this PR description (currently only supported for PyTorch, TensorFlow, vllm, and base images)
How to use the helper utility for updating dlc_developer_config.toml

Assuming your remote is called origin (you can find out more with git remote -v)...

  • Run default builds and tests for a particular buildspec - also commits and pushes changes to remote; Example:

python src/prepare_dlc_dev_environment.py -b </path/to/buildspec.yml> -cp origin

  • Enable specific tests for a buildspec or set of buildspecs - also commits and pushes changes to remote; Example:

python src/prepare_dlc_dev_environment.py -b </path/to/buildspec.yml> -t sanity_tests -cp origin

  • Restore TOML file when ready to merge

python src/prepare_dlc_dev_environment.py -rcp origin

NOTE: If you are creating a PR for a new framework version, please ensure success of the local, standard, rc, and efa sagemaker tests by updating the dlc_developer_config.toml file:

  • sagemaker_remote_tests = true
  • sagemaker_efa_tests = true
  • sagemaker_rc_tests = true
  • sagemaker_local_tests = true
How to use PR description

Use the code block below to uncomment commands and run the PR CodeBuild jobs. There are two commands available:
  • # /buildspec <buildspec_path>
    • e.g.: # /buildspec pytorch/training/buildspec.yml
    • If this line is commented out, dlc_developer_config.toml will be used.
  • # /tests <test_list>
    • e.g.: # /tests sanity security ec2
    • If this line is commented out, it will run the default set of tests (same as the defaults in dlc_developer_config.toml): sanity, security, ec2, ecs, eks, sagemaker, sagemaker-local.
# /buildspec <buildspec_path>
# /tests <test_list>
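
To illustrate the rules described above (a commented-out line is ignored, and a missing `/tests` line falls back to the default test set), here is a minimal sketch of how such a description block might be interpreted. The function `parse_pr_commands` is hypothetical and is not the repository's actual parser. The default test list is copied from the template text, and a `buildspec` of `None` stands for "use dlc_developer_config.toml".

```python
# Default test set, as listed in the PR template text.
DEFAULT_TESTS = ["sanity", "security", "ec2", "ecs", "eks", "sagemaker", "sagemaker-local"]


def parse_pr_commands(description):
    """Hypothetical parser: return (buildspec_path_or_None, list_of_tests)."""
    buildspec = None  # None means dlc_developer_config.toml is used instead
    tests = list(DEFAULT_TESTS)
    for raw_line in description.splitlines():
        line = raw_line.strip()
        if line.startswith("#"):
            continue  # still commented out, so the command is inactive
        if line.startswith("/buildspec "):
            buildspec = line.split(maxsplit=1)[1]
        elif line.startswith("/tests "):
            tests = line.split()[1:]
    return buildspec, tests


# Both commands commented out: fall back to the TOML file and default tests.
print(parse_pr_commands("# /buildspec <buildspec_path>\n# /tests <test_list>"))
# Both commands uncommented: use the given buildspec and test list.
print(parse_pr_commands("/buildspec pytorch/training/buildspec.yml\n/tests sanity security ec2"))
```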
Toggle if you are merging into main Branch

PR Checklist

  • [] I ran pre-commit run --all-files locally before creating this PR. (Read DEVELOPMENT.md for details).

@Jyothirmaikottu force-pushed the xgboost-305-support branch 8 times, most recently from e666b98 to c0b7e15 on June 2, 2026 00:15