Update default versions to TensorFlow 1.6 and MXNet 1.1 (#118)

winstonaws · web-flow · commit 2d1d5cf5f0dd · 2018-04-01T22:45:48.000-07:00
* Update default TF and MXNet versions

* Update changelog

* Bump version to 1.2.0 due to TF / MXNet version changes

* Reorder changelog entries

* Update version info in README
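
For reviewers, the version-selection behaviour described in the README changes below (a major.minor ``framework_version`` such as ``1.1`` runs on the latest supported patch release) can be sketched roughly as follows. This is a minimal illustration, not the SDK's implementation: ``SUPPORTED_VERSIONS``, ``DEFAULT_VERSIONS`` and ``resolve_framework_version`` are hypothetical names, and the version lists are copied from the test fixtures in this change.

```python
# Illustrative sketch only: these names are hypothetical and are not part of
# the SageMaker Python SDK. The version lists mirror tests/conftest.py.
SUPPORTED_VERSIONS = {
    'mxnet': ['0.12.1', '1.0.0', '1.1.0'],
    'tensorflow': ['1.4.1', '1.5.0', '1.6.0'],
}

# Defaults after this change (see the two defaults.py files in the diff).
DEFAULT_VERSIONS = {'mxnet': '1.1', 'tensorflow': '1.6'}


def _version_key(version):
    """Turn '1.10.2' into (1, 10, 2) so versions compare numerically."""
    return tuple(int(part) for part in version.split('.'))


def resolve_framework_version(framework, requested=None):
    """Return the full version that a requested version would run on.

    A full version ('1.1.0') is returned unchanged if it is supported.
    A shorter version ('1.1') resolves to the latest supported patch release.
    """
    supported = SUPPORTED_VERSIONS[framework]
    requested = requested or DEFAULT_VERSIONS[framework]
    if requested in supported:
        return requested
    # Match on 'major.minor.' so that '1.1' does not match '1.10.0'.
    prefix = requested + '.'
    candidates = [v for v in supported if v.startswith(prefix)]
    if not candidates:
        raise ValueError('Unsupported {} version: {}'.format(framework, requested))
    return max(candidates, key=_version_key)
```

Under these assumptions, calling ``resolve_framework_version('mxnet')`` picks up the new default and resolves it to ``1.1.0``, while an explicit ``'1.4'`` for TensorFlow still resolves to ``1.4.1``.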
diff --git a/CHANGELOG.rst b/CHANGELOG.rst
@@ -2,10 +2,13 @@
 CHANGELOG
 =========
 
-1.1.dev4
+1.2.0
 ========
-* feature: Frameworks: Use more idiomatic ECR repository naming scheme
+
 * feature: Add Support for Local Mode
+* feature: Estimators: add support for TensorFlow 1.6.0
+* feature: Estimators: add support for MXNet 1.1.0
+* feature: Frameworks: Use more idiomatic ECR repository naming scheme
 
 1.1.3
 ========
diff --git a/README.rst b/README.rst
@@ -692,23 +692,23 @@ When training and deploying training scripts, SageMaker runs your Python script
 
 SageMaker runs MXNet Estimator scripts in either Python 2.7 or Python 3.5. You can select the Python version by passing a ``py_version`` keyword arg to the MXNet Estimator constructor. Setting this to ``py2`` (the default) will cause your training script to be run on Python 2.7. Setting this to ``py3`` will cause your training script to be run on Python 3.5. This Python version applies to both the Training Job, created by fit, and the Endpoint, created by deploy.
 
-Your MXNet training script will be run on version 1.0.0 (by default) or 0.12 of MXNet, built for either GPU or CPU use. The decision to use the GPU or CPU version of MXNet is made by the ``train_instance_type``, set on the MXNet constructor. If you choose a GPU instance type, your training job will be run on a GPU version of MXNet. If you choose a CPU instance type, your training job will be run on a CPU version of MXNet. Similarly, when you call deploy, specifying a GPU or CPU deploy_instance_type, will control which MXNet build your Endpoint runs.
+Your MXNet training script will be run on version 1.1.0 by default. (See below for how to choose a different version and for the currently supported versions.) The choice between the GPU and CPU build of MXNet is determined by the ``train_instance_type`` set on the MXNet constructor. If you choose a GPU instance type, your training job will be run on a GPU version of MXNet. If you choose a CPU instance type, your training job will be run on a CPU version of MXNet. Similarly, when you call deploy, the GPU or CPU ``deploy_instance_type`` you specify controls which MXNet build your Endpoint runs.
 
 The Docker images have the following dependencies installed:
 
-+-------------------------+--------------+-------------+
-| Dependencies            | MXNet 0.12.1 | MXNet 1.0.0 |
-+-------------------------+--------------+-------------+
-| Python                  |   2.7 or 3.5 |   2.7 or 3.5|
-+-------------------------+--------------+-------------+
-| CUDA                    |          9.0 |         9.0 |
-+-------------------------+--------------+-------------+
-| numpy                   |       1.13.3 |      1.13.3 |
-+-------------------------+--------------+-------------+
++-------------------------+--------------+-------------+-------------+
+| Dependencies            | MXNet 0.12.1 | MXNet 1.0.0 | MXNet 1.1.0 |
++-------------------------+--------------+-------------+-------------+
+| Python                  |   2.7 or 3.5 |  2.7 or 3.5 |  2.7 or 3.5 |
++-------------------------+--------------+-------------+-------------+
+| CUDA                    |          9.0 |         9.0 |         9.0 |
++-------------------------+--------------+-------------+-------------+
+| numpy                   |       1.13.3 |      1.13.3 |      1.13.3 |
++-------------------------+--------------+-------------+-------------+
 
 The Docker images extend Ubuntu 16.04.
 
-You can select version of MXNet by passing a ``framework_version`` keyword arg to the MXNet Estimator constructor. Currently supported versions are ``1.0.0`` and ``0.12.1``. You can also set ``framework_version`` to ``1.0 (default)`` or ``0.12`` which will cause your training script to be run on the latest supported MXNet 1.0 or 0.12 versions respectively.
+You can select the version of MXNet by passing a ``framework_version`` keyword arg to the MXNet Estimator constructor. Currently supported versions are listed in the table above. You can also set ``framework_version`` to only the major and minor version, e.g. ``1.1``, which will cause your training script to be run on the latest supported patch version of that minor version (in this example, 1.1.0).
 
 TensorFlow SageMaker Estimators
 -------------------------------
@@ -717,7 +717,7 @@ TensorFlow SageMaker Estimators allow you to run your own TensorFlow
 training algorithms on SageMaker Learner, and to host your own TensorFlow
 models on SageMaker Hosting.
 
-Supported versions of TensorFlow: ``1.4.1``, ``1.5.0``.
+Supported versions of TensorFlow: ``1.4.1``, ``1.5.0``, ``1.6.0``.
 
 Training with TensorFlow
 ~~~~~~~~~~~~~~~~~~~~~~~~
@@ -752,7 +752,7 @@ Preparing the TensorFlow training script
 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
 
 Your TensorFlow training script must be a **Python 2.7** source file. The current supported TensorFlow
-versions are **1.5.0 (default)** and **1.4.1**. This training script **must contain** the following functions:
+versions are **1.6.0 (default)**, **1.5.0**, and **1.4.1**. This training script **must contain** the following functions:
 
 - ``model_fn``: defines the model that will be trained.
 - ``train_input_fn``: preprocess and load training data.
@@ -1443,47 +1443,35 @@ SageMaker TensorFlow Docker containers
 
 The TensorFlow Docker images support Python 2.7 and have the following Python modules installed:
 
-+------------------------+------------------+------------------+
-| Dependencies           | tensorflow 1.4.1 | tensorflow 1.5.0 |
-+------------------------+------------------+------------------+
-| awscli                 |           1.12.1 |          1.14.35 |
-+------------------------+------------------+------------------+
-| boto3                  |            1.4.7 |           1.5.22 |
-+------------------------+------------------+------------------+
-| botocore               |           1.5.92 |           1.8.36 |
-+------------------------+------------------+------------------+
-| futures                |            2.2.0 |            2.2.0 |
-+------------------------+------------------+------------------+
-| gevent                 |            1.2.2 |            1.2.2 |
-+------------------------+------------------+------------------+
-| grpcio                 |            1.7.0 |            1.9.0 |
-+------------------------+------------------+------------------+
-| numpy                  |           1.13.3 |           1.14.0 |
-+------------------------+------------------+------------------+
-| pandas                 |           0.21.0 |           0.22.0 |
-+------------------------+------------------+------------------+
-| protobuf               |            3.4.0 |            3.5.1 |
-+------------------------+------------------+------------------+
-| requests               |           2.14.2 |           2.18.4 |
-+------------------------+------------------+------------------+
-| scikit-learn           |           0.19.1 |           0.19.1 |
-+------------------------+------------------+------------------+
-| scipy                  |            1.0.0 |            1.0.0 |
-+------------------------+------------------+------------------+
-| six                    |           1.10.0 |           1.10.0 |
-+------------------------+------------------+------------------+
-| sklearn                |              0.0 |              0.0 |
-+------------------------+------------------+------------------+
-| tensorflow             |            1.4.1 |            1.5.0 |
-+------------------------+------------------+------------------+
-| tensorflow-serving-api |            1.4.0 |            1.5.0 |
-+------------------------+------------------+------------------+
-| tensorflow-tensorboard |            0.4.0 |            1.5.1 |
-+------------------------+------------------+------------------+
++------------------------+------------------+------------------+------------------+
+| Dependencies           | tensorflow 1.4.1 | tensorflow 1.5.0 | tensorflow 1.6.0 |
++------------------------+------------------+------------------+------------------+
+| boto3                  |            1.4.7 |           1.5.22 |           1.6.21 |
++------------------------+------------------+------------------+------------------+
+| botocore               |           1.5.92 |           1.8.36 |           1.9.21 |
++------------------------+------------------+------------------+------------------+
+| grpcio                 |            1.7.0 |            1.9.0 |           1.10.0 |
++------------------------+------------------+------------------+------------------+
+| numpy                  |           1.13.3 |           1.14.0 |           1.14.2 |
++------------------------+------------------+------------------+------------------+
+| pandas                 |           0.21.0 |           0.22.0 |           0.22.0 |
++------------------------+------------------+------------------+------------------+
+| protobuf               |            3.4.0 |            3.5.1 |            3.5.2 |
++------------------------+------------------+------------------+------------------+
+| scikit-learn           |           0.19.1 |           0.19.1 |           0.19.1 |
++------------------------+------------------+------------------+------------------+
+| scipy                  |            1.0.0 |            1.0.0 |            1.0.1 |
++------------------------+------------------+------------------+------------------+
+| sklearn                |              0.0 |              0.0 |              0.0 |
++------------------------+------------------+------------------+------------------+
+| tensorflow             |            1.4.1 |            1.5.0 |            1.6.0 |
++------------------------+------------------+------------------+------------------+
+| tensorflow-serving-api |            1.4.0 |            1.5.0 |            1.5.0 |
++------------------------+------------------+------------------+------------------+
 
 The Docker images extend Ubuntu 16.04.
 
-You can select version of TensorFlow by passing a ``framework_version`` keyword arg to the TensorFlow Estimator constructor. Currently supported versions are ``1.5.0`` and ``1.4.1``. You can also set ``framework_version`` to ``1.5 (default)`` or ``1.4`` which will cause your training script to be run on the latest supported TensorFlow 1.5 or 1.4 versions respectively.
+You can select the version of TensorFlow by passing a ``framework_version`` keyword arg to the TensorFlow Estimator constructor. Currently supported versions are listed in the table above. You can also set ``framework_version`` to only the major and minor version, e.g. ``1.6``, which will cause your training script to be run on the latest supported patch version of that minor version (in this example, 1.6.0).
 
 AWS SageMaker Estimators
 ------------------------
diff --git a/setup.py b/setup.py
@@ -11,7 +11,7 @@ def read(fname):
 
 
 setup(name="sagemaker",
-      version="1.1.3",
+      version="1.2.0",
       description="Open source library for training and deploying models on Amazon SageMaker.",
       packages=find_packages('src'),
       package_dir={'': 'src'},
diff --git a/src/sagemaker/mxnet/defaults.py b/src/sagemaker/mxnet/defaults.py
@@ -10,4 +10,4 @@
 # distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF
 # ANY KIND, either express or implied. See the License for the specific
 # language governing permissions and limitations under the License.
-MXNET_VERSION = '1.0'
+MXNET_VERSION = '1.1'
diff --git a/src/sagemaker/tensorflow/defaults.py b/src/sagemaker/tensorflow/defaults.py
@@ -10,4 +10,4 @@
 # distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF
 # ANY KIND, either express or implied. See the License for the specific
 # language governing permissions and limitations under the License.
-TF_VERSION = '1.5'
+TF_VERSION = '1.6'
diff --git a/tests/conftest.py b/tests/conftest.py
@@ -56,21 +56,21 @@ def sagemaker_session(sagemaker_client_config, sagemaker_runtime_config, boto_co
                    sagemaker_runtime_client=runtime_client)
 
 
-@pytest.fixture(scope='module', params=["1.4", "1.4.1", "1.5", "1.5.0"])
+@pytest.fixture(scope='module', params=['1.4', '1.4.1', '1.5', '1.5.0', '1.6', '1.6.0'])
 def tf_version(request):
     return request.param
 
 
-@pytest.fixture(scope='module', params=["0.12", "0.12.1", "1.0", "1.0.0"])
+@pytest.fixture(scope='module', params=['0.12', '0.12.1', '1.0', '1.0.0', '1.1', '1.1.0'])
 def mxnet_version(request):
     return request.param
 
 
-@pytest.fixture(scope='module', params=["1.4.1", "1.5.0"])
+@pytest.fixture(scope='module', params=['1.4.1', '1.5.0', '1.6.0'])
 def tf_full_version(request):
     return request.param
 
 
-@pytest.fixture(scope='module', params=["0.12.1", "1.0.0"])
+@pytest.fixture(scope='module', params=['0.12.1', '1.0.0', '1.1.0'])
 def mxnet_full_version(request):
     return request.param
diff --git a/tests/integ/test_mxnet_train.py b/tests/integ/test_mxnet_train.py
@@ -52,16 +52,28 @@ def test_attach_deploy(mxnet_training_job, sagemaker_session):
         predictor.predict(data)
 
 
-def test_async_fit(sagemaker_session, mxnet_full_version):
+def test_deploy_model(mxnet_training_job, sagemaker_session):
+    endpoint_name = 'test-mxnet-deploy-model-{}'.format(sagemaker_timestamp())
+
+    with timeout_and_delete_endpoint_by_name(endpoint_name, sagemaker_session, minutes=20):
+        desc = sagemaker_session.sagemaker_client.describe_training_job(TrainingJobName=mxnet_training_job)
+        model_data = desc['ModelArtifacts']['S3ModelArtifacts']
+        script_path = os.path.join(DATA_DIR, 'mxnet_mnist', 'mnist.py')
+        model = MXNetModel(model_data, 'SageMakerRole', entry_point=script_path, sagemaker_session=sagemaker_session)
+        predictor = model.deploy(1, 'ml.m4.xlarge', endpoint_name=endpoint_name)
 
-    training_job_name = ""
+        data = numpy.zeros(shape=(1, 1, 28, 28))
+        predictor.predict(data)
+
+
+def test_async_fit(sagemaker_session):
     endpoint_name = 'test-mxnet-attach-deploy-{}'.format(sagemaker_timestamp())
 
     with timeout(minutes=5):
         script_path = os.path.join(DATA_DIR, 'mxnet_mnist', 'mnist.py')
         data_path = os.path.join(DATA_DIR, 'mxnet_mnist')
 
-        mx = MXNet(entry_point=script_path, role='SageMakerRole', framework_version=mxnet_full_version,
+        mx = MXNet(entry_point=script_path, role='SageMakerRole',
                    train_instance_count=1, train_instance_type='ml.c4.xlarge',
                    sagemaker_session=sagemaker_session)
 
@@ -84,20 +96,6 @@ def test_async_fit(sagemaker_session, mxnet_full_version):
         predictor.predict(data)
 
 
-def test_deploy_model(mxnet_training_job, sagemaker_session):
-    endpoint_name = 'test-mxnet-deploy-model-{}'.format(sagemaker_timestamp())
-
-    with timeout_and_delete_endpoint_by_name(endpoint_name, sagemaker_session, minutes=20):
-        desc = sagemaker_session.sagemaker_client.describe_training_job(TrainingJobName=mxnet_training_job)
-        model_data = desc['ModelArtifacts']['S3ModelArtifacts']
-        script_path = os.path.join(DATA_DIR, 'mxnet_mnist', 'mnist.py')
-        model = MXNetModel(model_data, 'SageMakerRole', entry_point=script_path, sagemaker_session=sagemaker_session)
-        predictor = model.deploy(1, 'ml.m4.xlarge', endpoint_name=endpoint_name)
-
-        data = numpy.zeros(shape=(1, 1, 28, 28))
-        predictor.predict(data)
-
-
 def test_failed_training_job(sagemaker_session, mxnet_full_version):
     with timeout(minutes=15):
         script_path = os.path.join(DATA_DIR, 'mxnet_mnist', 'failure_script.py')
diff --git a/tests/integ/test_tf.py b/tests/integ/test_tf.py
@@ -55,14 +55,12 @@ def test_tf(sagemaker_session, tf_full_version):
         assert dict_result == list_result
 
 
-def test_tf_async(sagemaker_session, tf_full_version):
-    training_job_name = ""
+def test_tf_async(sagemaker_session):
     with timeout(minutes=5):
         script_path = os.path.join(DATA_DIR, 'iris', 'iris-dnn-classifier.py')
 
         estimator = TensorFlow(entry_point=script_path,
                                role='SageMakerRole',
-                               framework_version=tf_full_version,
                                training_steps=1,
                                evaluation_steps=1,
                                hyperparameters={'input_tensor_name': 'inputs'},