Commit 6a4e165

rsareddy0329, rohangujarathi, Rohan Gujarathi, Narrohag, and Julfried authored
Add sagemaker_utils module (#1667)
* prepare release v2.241.0
* update development version to v2.241.1.dev0
* pipeline definition function doc update (#5074)
Co-authored-by: Rohan Gujarathi <[email protected]>
* feat: add integ tests for training JumpStart models in private hub (#5076)
* feat: add integ tests for training JumpStart models in private hub
* fixed formatting
* remove unused imports
* fix unused imports
* fix unit test failure and fix bug around versioning
* fix formatting
* fix unit tests
* fix model_uri usage issue
* fix some formatting
* separate private hub setup code
* add try catch block
* fix flake8 issue so except clause is not bare
* black formatting
* fix: resolve infinite loop in _find_config on Windows systems (#4970)
* fix: resolve Windows path handling in _find_config
* Replace Path.match("/") with Path.anchor comparison
* Fix infinite loop in _studio.py path traversal
* test: Add tests for the new root path exploration
* Fix formatting style
* Fixed line to long
* Fix docstyle by running black manually
* Fix testcase with \\ when running on non-windows machines
* Fix formatting style
* cleanup unused import
* change: update image_uri_configs 03-11-2025 07:18:09 PST
* Fixing Pytorch training python version in tests (#5084)
* Fixing Pytorch training python version in tests
* Updating Inference test handling
* remove s3 output location requirement from hub class init (#5081)
* remove s3 output location requirement from hub class init
* fix integ test hub
* lint
* fix test
---------
Co-authored-by: Gokul Anantha Narayanan <[email protected]>
* fix: Prevent RunContext overlap between test_run tests (#5083)
Co-authored-by: Gokul Anantha Narayanan <[email protected]>
* Torch upgrade (#5086)
* Fix Flake8 Violations
* UPDATE PYTORCH VERSION TO ADDRESS SECURITY RISK
**Description** Currently used Pytorch version has a possible vulnerability.
Internal - https://tiny.amazon.com/p5i4jla1
**Testing Done** Unit and Integration tests in the CodeBuild
* REvert CPU Versions
* Test Fix
* Codestyle fixes
* debug attempt
* Fixes
* Fix
* Fix
* prepare release v2.242.0
* update development version to v2.242.1.dev0
* add new regions to JUMPSTART_LAUNCHED_REGIONS (#5089)
Co-authored-by: isha chidrawar <[email protected]>
Co-authored-by: Gokul Anantha Narayanan <[email protected]>
* ADD Documentation to ReadtheDocs for Upgrading torch versions (#5090)
* ADD Documentation to ReadtheDocs for Upgrading torch versions
**Description**
**Testing Done** Only documentation updates
* Fix for Codestyle
* Remove unused import
* Flake8 Fix
* CodeStyle Fixes
* feature: Enabled update_endpoint through model_builder (#5085)
* feature: Enabled update_endpoint through model_builder
* fix: fix unit test, black-check, pylint errors
* fix: fix black-check, pylint errors
---------
Co-authored-by: Roja Reddy Sareddy <[email protected]>
* fix: factor in set instance type when building JumpStart models in ModelBuilder. (#5093)
* Remove main function entrypoint in ModelBuilder dependency manager.
* Remove main function entrypoint in ModelBuilder dependency manager.
* fix: factor in set instance type when building JumpStart models in ModelBuilder.
* Remove default instance type from ModelBuilder.
* Restore default instance type. Tweak integ test.
---------
Co-authored-by: Joseph Zhang <[email protected]>
* change: update image_uri_configs 03-21-2025 07:17:55 PST
* Skip tests failed due to deprecated instance type (#5097)
Co-authored-by: pintaoz <[email protected]>
* Feat: Added support for returing most recently created approved model package in a group (#5092)
Co-authored-by: Keshav Chandak <[email protected]>
* change: update image_uri_configs 03-25-2025 07:18:13 PST
* chore: fix integ tests to use latest version of model (#5104)
* change: update image_uri_configs 03-26-2025 07:18:16 PST
* Update Jinja version (#5101)
* Aligned disable_output_compression for @Remote with Estimator (#5094)
* Update transformers version (#5102)
* fix: use temp file in unit tests (#5106)
* fix: fix flaky spark processor integ (#5109)
* fix: fix flaky spark processor integ
* format
* fix: fix flaky clarify model monitor test (#5107)
* chore: move jumpstart region definitions to json file (#5095)
* chore: move jumpstart region definitions to json file
* chore: address formatting issues
* fix: neo regions not ga in 5 regions
* chore: make variable private
---------
Co-authored-by: Erick Benitez-Ramos <[email protected]>
* change: Update for PT 2.5.1, SMP 2.8.0 (#5071)
* Added the config utils - centralized module for managing config file utils
* Add image_uris to sagemaker utils
* Updated python version
* Added other common utils in pysdk to this centralized module
* Test Intelligent Defaults with new utils module
---------
Co-authored-by: ci <ci>
Co-authored-by: Rohan Gujarathi <[email protected]>
Co-authored-by: Rohan Gujarathi <[email protected]>
Co-authored-by: Rohan Narayan <[email protected]>
Co-authored-by: Julian Grimm <[email protected]>
Co-authored-by: sagemaker-bot <[email protected]>
Co-authored-by: Gokul Anantha Narayanan <[email protected]>
Co-authored-by: Ben Crabtree <[email protected]>
Co-authored-by: rrrkharse <[email protected]>
Co-authored-by: IshaChid76 <[email protected]>
Co-authored-by: isha chidrawar <[email protected]>
Co-authored-by: Roja Reddy Sareddy <[email protected]>
Co-authored-by: cj-zhang <[email protected]>
Co-authored-by: Joseph Zhang <[email protected]>
Co-authored-by: pintaoz-aws <[email protected]>
Co-authored-by: pintaoz <[email protected]>
Co-authored-by: Keshav Chandak <[email protected]>
Co-authored-by: Keshav Chandak <[email protected]>
Co-authored-by: Erick Benitez-Ramos <[email protected]>
Co-authored-by: Bruno Pistone <[email protected]>
Co-authored-by: evakravi <[email protected]>
Co-authored-by: Victor Zhu <[email protected]>
1 parent 8326c23 commit 6a4e165

File tree

1,729 files changed (+251,458 / -944 lines)

.gitignore

Lines changed: 4 additions & 4 deletions
@@ -32,9 +32,9 @@ env/
 .python-version
 *.html
 **/_repack_script_launcher.sh
-src/sagemaker/modules/train/container_drivers/sm_train.sh
-src/sagemaker/modules/train/container_drivers/sourcecode.json
-src/sagemaker/modules/train/container_drivers/distributed.json
+legacy/src/sagemaker/modules/train/container_drivers/sm_train.sh
+legacy/src/sagemaker/modules/train/container_drivers/sourcecode.json
+legacy/src/sagemaker/modules/train/container_drivers/distributed.json
 tests/data/**/_repack_model.py
 tests/data/experiment/sagemaker-dev-1.0.tar.gz
-src/sagemaker/serve/tmp_workspace
+legacy/src/sagemaker/serve/tmp_workspace

CHANGELOG.md

Lines changed: 33 additions & 0 deletions
@@ -1,5 +1,38 @@
 # Changelog

+## v2.242.0 (2025-03-14)
+
+### Features
+
+ * add integ tests for training JumpStart models in private hub
+
+### Bug Fixes and Other Changes
+
+ * Torch upgrade
+ * Prevent RunContext overlap between test_run tests
+ * remove s3 output location requirement from hub class init
+ * Fixing Pytorch training python version in tests
+ * update image_uri_configs 03-11-2025 07:18:09 PST
+ * resolve infinite loop in _find_config on Windows systems
+ * pipeline definition function doc update
+
+## v2.241.0 (2025-03-06)
+
+### Features
+
+ * Make DistributedConfig Extensible
+ * support training for JumpStart model references as part of Curated Hub Phase 2
+ * Allow ModelTrainer to accept hyperparameters file
+
+### Bug Fixes and Other Changes
+
+ * Skip tests with deprecated instance type
+ * Ensure Model.is_repack() returns a boolean
+ * Fix error when there is no session to call _create_model_request()
+ * Use sagemaker session's s3_resource in download_folder
+ * Added check for the presence of model package group before creating one
+ * Fix key error in _send_metrics()
+
 ## v2.240.0 (2025-02-25)

 ### Features

VERSION

Lines changed: 1 addition & 1 deletion
@@ -1 +1 @@
-2.240.1.dev0
+2.242.1.dev0

doc/overview.rst

Lines changed: 5 additions & 0 deletions
@@ -30,6 +30,11 @@ To train a model by using the SageMaker Python SDK, you:

 After you train a model, you can save it, and then serve the model as an endpoint to get real-time inferences or get inferences for an entire dataset by using batch transform.

+
+Important Note:
+
+* When using torch to load Models, it is recommended to use version torch>=2.6.0 and torchvision>=0.17.0
+
 Prepare a Training script
 =========================
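The note added to doc/overview.rst recommends torch>=2.6.0 and torchvision>=0.17.0. A minimal sketch of how an environment could be checked against those minimums, using only the standard library. The helper names (`parse_release`, `meets_minimum`, `check_recommended`) are hypothetical and not part of the SageMaker SDK, and the version parsing is deliberately simplified: it compares only the leading numeric release segments and ignores pre-release ordering.

```python
from importlib import metadata

# Minimums taken from the note in doc/overview.rst.
RECOMMENDED = {"torch": "2.6.0", "torchvision": "0.17.0"}


def parse_release(version):
    """Return the leading numeric release segments, e.g. '2.6.0+cu124' -> (2, 6, 0)."""
    release = version.split("+", 1)[0]  # drop a local build tag such as '+cu124'
    parts = []
    for piece in release.split("."):
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        if not digits:  # e.g. 'dev1': stop at the first non-numeric segment
            break
        parts.append(int(digits))
    return tuple(parts)


def meets_minimum(installed, minimum):
    """Compare two versions after padding the shorter one with zeros."""
    a, b = parse_release(installed), parse_release(minimum)
    width = max(len(a), len(b))
    a += (0,) * (width - len(a))
    b += (0,) * (width - len(b))
    return a >= b


def check_recommended(minimums=RECOMMENDED):
    """Map each package to True/False, or None when it is not installed."""
    report = {}
    for name, minimum in minimums.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            report[name] = None
            continue
        report[name] = meets_minimum(installed, minimum)
    return report
```

Calling `check_recommended()` in the target environment returns a small report dictionary; a `None` entry means the package was not found at all. For strict PEP 440 ordering a dedicated version library would be the safer choice.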

doc/requirements.txt

Lines changed: 1 addition & 1 deletion
@@ -2,7 +2,7 @@ sphinx==5.1.1
 sphinx-rtd-theme==0.5.0
 docutils==0.15.2
 packaging==20.9
-jinja2==3.1.4
+jinja2==3.1.6
 schema==0.7.5
 accelerate>=0.24.1,<=0.27.0
 graphene<4.0
File renamed without changes.
File renamed without changes.

legacy/src/sagemaker/_studio.py

Lines changed: 116 additions & 0 deletions
@@ -0,0 +1,116 @@
+# Copyright Amazon.com, Inc. or its affiliates. All Rights Reserved.
+#
+# Licensed under the Apache License, Version 2.0 (the "License"). You
+# may not use this file except in compliance with the License. A copy of
+# the License is located at
+#
+# http://aws.amazon.com/apache2.0/
+#
+# or in the "license" file accompanying this file. This file is
+# distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF
+# ANY KIND, either express or implied. See the License for the specific
+# language governing permissions and limitations under the License.
+"""Provides internal tooling for studio environments."""
+from __future__ import absolute_import
+
+import json
+import logging
+
+from pathlib import Path
+
+STUDIO_PROJECT_CONFIG = ".sagemaker-code-config"
+
+logger = logging.getLogger(__name__)
+
+
+def _append_project_tags(tags=None, working_dir=None):
+    """Appends the project tag to the list of tags, if it exists.
+
+    Args:
+        working_dir: the working directory to start looking.
+        tags: the list of tags to append to.
+
+    Returns:
+        A possibly extended list of tags that includes the project id.
+    """
+    path = _find_config(working_dir)
+    if path is None:
+        return tags
+
+    config = _load_config(path)
+    if config is None:
+        return tags
+
+    additional_tags = _parse_tags(config)
+    if additional_tags is None:
+        return tags
+
+    all_tags = tags or []
+    additional_tags = [tag for tag in additional_tags if tag not in all_tags]
+    all_tags.extend(additional_tags)
+
+    return all_tags
+
+
+def _find_config(working_dir=None):
+    """Gets project config on SageMaker Studio platforms, if it exists.
+
+    Args:
+        working_dir: the working directory to start looking.
+
+    Returns:
+        The project config path, if it exists. Otherwise None.
+    """
+    try:
+        wd = Path(working_dir) if working_dir else Path.cwd()
+
+        path = None
+
+        # Get the root of the current working directory for both Windows and Unix-like systems
+        root = Path(wd.anchor)
+        while path is None and wd != root:
+            candidate = wd / STUDIO_PROJECT_CONFIG
+            if Path.exists(candidate):
+                path = candidate
+            wd = wd.parent
+
+        return path
+    except Exception as e:  # pylint: disable=W0703
+        logger.debug("Could not find the studio project config. %s", e)
+
+
+def _load_config(path):
+    """Parse out the projectId attribute if it exists at path.
+
+    Args:
+        path: path to project config
+
+    Returns:
+        Project config Json, or None if it does not exist.
+    """
+    try:
+        with open(path, "r") as f:
+            content = f.read().strip()
+        config = json.loads(content)
+
+        return config
+    except Exception as e:  # pylint: disable=W0703
+        logger.debug("Could not load project config. %s", e)
+
+
+def _parse_tags(config):
+    """Parse out appropriate attributes and formats as tags.
+
+    Args:
+        config: project config dict
+
+    Returns:
+        List of tags
+    """
+    try:
+        return [
+            {"Key": "sagemaker:project-id", "Value": config["sagemakerProjectId"]},
+            {"Key": "sagemaker:project-name", "Value": config["sagemakerProjectName"]},
+        ]
+    except Exception as e:  # pylint: disable=W0703
+        logger.debug("Could not parse project config. %s", e)
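The commit message describes replacing a `Path.match("/")` check with a `Path.anchor` comparison so that the upward search in `_find_config` also terminates on Windows drives. A minimal standalone sketch of that loop shape follows. The function name `find_upwards` is hypothetical, and unlike the committed code this sketch resolves the start path first; like the committed loop, it stops before inspecting the root directory itself.

```python
from pathlib import Path, PurePosixPath, PureWindowsPath

CONFIG_NAME = ".sagemaker-code-config"  # same marker file name as in _studio.py


def find_upwards(start, name=CONFIG_NAME):
    """Return the nearest `name` in `start` or one of its parents, else None."""
    current = Path(start).resolve()  # assumption: resolve first (not done in the diff)
    root = Path(current.anchor)  # "/" on POSIX, e.g. "C:\\" on a Windows drive
    while current != root:
        candidate = current / name
        if candidate.exists():
            return candidate
        current = current.parent  # at the root, .parent would return the root again
    return None


# The anchor differs per platform, which is why comparing against "/" alone
# never matches on Windows and the walk could loop forever.
posix_anchor = PurePosixPath("/home/user/project").anchor
windows_anchor = PureWindowsPath("C:/Users/user/project").anchor
```

Because `Path.parent` of a root returns the root itself, the loop needs an explicit stop condition; comparing against the anchor gives one that holds for both path flavours.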

legacy/src/sagemaker/accept_types.py

Lines changed: 131 additions & 0 deletions
@@ -0,0 +1,131 @@
+# Copyright Amazon.com, Inc. or its affiliates. All Rights Reserved.
+#
+# Licensed under the Apache License, Version 2.0 (the "License"). You
+# may not use this file except in compliance with the License. A copy of
+# the License is located at
+#
+# http://aws.amazon.com/apache2.0/
+#
+# or in the "license" file accompanying this file. This file is
+# distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF
+# ANY KIND, either express or implied. See the License for the specific
+# language governing permissions and limitations under the License.
+"""This module is for SageMaker accept types."""
+from __future__ import absolute_import
+from typing import List, Optional
+
+from sagemaker.jumpstart import artifacts, utils as jumpstart_utils
+from sagemaker.jumpstart.constants import DEFAULT_JUMPSTART_SAGEMAKER_SESSION
+from sagemaker.jumpstart.enums import JumpStartModelType
+from sagemaker.session import Session
+
+
+def retrieve_options(
+    region: Optional[str] = None,
+    model_id: Optional[str] = None,
+    model_version: Optional[str] = None,
+    hub_arn: Optional[str] = None,
+    tolerate_vulnerable_model: bool = False,
+    tolerate_deprecated_model: bool = False,
+    sagemaker_session: Session = DEFAULT_JUMPSTART_SAGEMAKER_SESSION,
+) -> List[str]:
+    """Retrieves the supported accept types for the model matching the given arguments.
+
+    Args:
+        region (str): The AWS Region for which to retrieve the supported accept types.
+            Defaults to ``None``.
+        model_id (str): The model ID of the model for which to
+            retrieve the supported accept types. (Default: None).
+        model_version (str): The version of the model for which to retrieve the
+            supported accept types. (Default: None).
+        hub_arn (str): The arn of the SageMaker Hub for which to retrieve
+            model details from. (Default: None).
+        tolerate_vulnerable_model (bool): True if vulnerable versions of model
+            specifications should be tolerated (exception not raised). If False, raises an
+            exception if the script used by this version of the model has dependencies with known
+            security vulnerabilities. (Default: False).
+        tolerate_deprecated_model (bool): True if deprecated models should be tolerated
+            (exception not raised). False if these models should raise an exception.
+            (Default: False).
+        sagemaker_session (sagemaker.session.Session): A SageMaker Session
+            object, used for SageMaker interactions. If not
+            specified, one is created using the default AWS configuration
+            chain. (Default: sagemaker.jumpstart.constants.DEFAULT_JUMPSTART_SAGEMAKER_SESSION).
+    Returns:
+        list: The supported accept types to use for the model.
+
+    Raises:
+        ValueError: If the combination of arguments specified is not supported.
+    """
+    if not jumpstart_utils.is_jumpstart_model_input(model_id, model_version):
+        raise ValueError(
+            "Must specify JumpStart `model_id` and `model_version` when retrieving accept types."
+        )
+
+    return artifacts._retrieve_supported_accept_types(
+        model_id=model_id,
+        model_version=model_version,
+        hub_arn=hub_arn,
+        region=region,
+        tolerate_vulnerable_model=tolerate_vulnerable_model,
+        tolerate_deprecated_model=tolerate_deprecated_model,
+        sagemaker_session=sagemaker_session,
+    )
+
+
+def retrieve_default(
+    region: Optional[str] = None,
+    model_id: Optional[str] = None,
+    model_version: Optional[str] = None,
+    hub_arn: Optional[str] = None,
+    tolerate_vulnerable_model: bool = False,
+    tolerate_deprecated_model: bool = False,
+    sagemaker_session: Session = DEFAULT_JUMPSTART_SAGEMAKER_SESSION,
+    model_type: JumpStartModelType = JumpStartModelType.OPEN_WEIGHTS,
+    config_name: Optional[str] = None,
+) -> str:
+    """Retrieves the default accept type for the model matching the given arguments.
+
+    Args:
+        region (str): The AWS Region for which to retrieve the default accept type.
+            Defaults to ``None``.
+        model_id (str): The model ID of the model for which to
+            retrieve the default accept type. (Default: None).
+        model_version (str): The version of the model for which to retrieve the
+            default accept type. (Default: None).
+        hub_arn (str): The arn of the SageMaker Hub for which to retrieve
+            model details from. (Default: None).
+        tolerate_vulnerable_model (bool): True if vulnerable versions of model
+            specifications should be tolerated (exception not raised). If False, raises an
+            exception if the script used by this version of the model has dependencies with known
+            security vulnerabilities. (Default: False).
+        tolerate_deprecated_model (bool): True if deprecated models should be tolerated
+            (exception not raised). False if these models should raise an exception.
+            (Default: False).
+        sagemaker_session (sagemaker.session.Session): A SageMaker Session
+            object, used for SageMaker interactions. If not
+            specified, one is created using the default AWS configuration
+            chain. (Default: sagemaker.jumpstart.constants.DEFAULT_JUMPSTART_SAGEMAKER_SESSION).
+        config_name (Optional[str]): Name of the JumpStart Model config to apply. (Default: None).
+    Returns:
+        str: The default accept type to use for the model.
+
+    Raises:
+        ValueError: If the combination of arguments specified is not supported.
+    """
+    if not jumpstart_utils.is_jumpstart_model_input(model_id, model_version):
+        raise ValueError(
+            "Must specify JumpStart `model_id` and `model_version` when retrieving accept types."
+        )
+
+    return artifacts._retrieve_default_accept_type(
+        model_id=model_id,
+        model_version=model_version,
+        hub_arn=hub_arn,
+        region=region,
+        tolerate_vulnerable_model=tolerate_vulnerable_model,
+        tolerate_deprecated_model=tolerate_deprecated_model,
+        sagemaker_session=sagemaker_session,
+        model_type=model_type,
+        config_name=config_name,
+    )
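Both functions in accept_types.py follow one shape: raise `ValueError` unless the model inputs pass a check, then delegate to an internal lookup. The sketch below imitates that shape without the SDK or network access. Everything in it is a hypothetical stand-in: the `_FAKE_SPECS` table, the model ID and the accept types are invented for illustration, the "both arguments must be set" check is an assumption about what the validation amounts to, and treating the first listed type as the default is a simplification of this sketch only.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical in-memory stand-in for the JumpStart model specs lookup.
_FAKE_SPECS: Dict[Tuple[str, str], List[str]] = {
    ("example-model", "1.0.0"): ["application/json", "text/csv"],
}


def retrieve_options(
    model_id: Optional[str] = None, model_version: Optional[str] = None
) -> List[str]:
    """Validate the inputs, then delegate to the (fake) lookup."""
    if model_id is None or model_version is None:
        raise ValueError(
            "Must specify `model_id` and `model_version` when retrieving accept types."
        )
    try:
        return list(_FAKE_SPECS[(model_id, model_version)])
    except KeyError:
        raise ValueError(f"Unknown model: {model_id} {model_version}") from None


def retrieve_default(
    model_id: Optional[str] = None, model_version: Optional[str] = None
) -> str:
    """Return the first supported accept type (an assumption of this sketch)."""
    return retrieve_options(model_id, model_version)[0]
```

Keeping validation in the public function and the lookup behind it means callers get one consistent error for incomplete arguments, regardless of how the lookup is implemented.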
