111 changes: 111 additions & 0 deletions examples/tf_vision/README.md
@@ -0,0 +1,111 @@
# TensorFlow SavedModel Inference Service

In this example, we show how to use a pre-trained TensorFlow MobileNet V2 model in the SavedModel format to perform real-time inference using MMS.

# Objective

1. Demonstrate how to package a pre-trained TensorFlow SavedModel for MMS
2. Demonstrate how to create a custom service with pre-processing and post-processing

# Prerequisites

Install TensorFlow:

```bash
pip install tensorflow==1.15
```
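
To confirm the installation, you can print the installed TensorFlow version:

```bash
python -c "import tensorflow as tf; print(tf.__version__)"
```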

## Step 1 - Download the pre-trained MobileNet V2 Model

You will need the model files for the export. Check this example's directory in case they are already downloaded; otherwise, you can `curl` the files or download them in your browser:

```bash
cd multi-model-server/examples/tf_vision

curl -O http://download.tensorflow.org/models/object_detection/ssd_mobilenet_v1_coco_2017_11_17.tar.gz
tar -xvf ssd_mobilenet_v1_coco_2017_11_17.tar.gz
cp ssd_mobilenet_v1_coco_2017_11_17/saved_model/saved_model.pb .
```
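
Optionally, before exporting, you can inspect the downloaded SavedModel's input and output signatures with TensorFlow's `saved_model_cli` utility (installed with TensorFlow):

```bash
saved_model_cli show --dir ssd_mobilenet_v1_coco_2017_11_17/saved_model --all
```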


## Step 2 - Prepare the signature file

Define the model input name and shape in a `signature.json` file. The signature for this example looks like the following:

```json
{
  "inputs": [
    {
      "data_precision": "UINT8",
      "data_name": "inputs",
      "data_shape": [
        1,
        224,
        224,
        3
      ]
    }
  ]
}
```
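
For illustration only, here is a minimal sketch of how a handler could read this signature and derive the expected input dimensions for pre-processing (the file name and keys follow the example above; the actual service classes in this folder may read it differently):

```python
import json

# Read the packaged signature and pull out the expected NHWC input shape.
with open("signature.json") as f:
    signature = json.load(f)

input_def = signature["inputs"][0]
_, height, width, channels = input_def["data_shape"]
print(input_def["data_name"], height, width, channels)  # inputs 224 224 3
```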

## Step 3 - Create custom service class

We provide custom service class template code in this folder:
1. [model_handler.py](./model_handler.py) - A generic base service class.
2. [tensorflow_saved_model_service.py](./tensorflow_saved_model_service.py) - A TensorFlow SavedModel base service class.
3. [tensorflow_vision_service.py](./tensorflow_vision_service.py) - A TensorFlow vision service class.
4. [image.py](./image.py) - Utilities for image reshaping.

In this example, you can simply use the provided `tensorflow_vision_service.py` as the model archive entry point.
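
For orientation, the entry-point pattern MMS expects looks roughly like the sketch below (a simplified illustration, not the exact contents of `tensorflow_vision_service.py`): a module-level `handle(data, context)` function that initializes a service object once and then dispatches each batch of requests to it.

```python
from model_handler import ModelHandler  # base class provided in this folder


class TensorflowVisionService(ModelHandler):
    # Stand-in for the real service class, which overrides preprocess/
    # inference/postprocess to decode images, run the SavedModel, and
    # format the predictions.
    pass


_service = TensorflowVisionService()


def handle(data, context):
    # MMS calls this function for every request batch.
    if not _service.initialized:
        _service.initialize(context)
    if data is None:
        return None
    return _service.handle(data, context)
```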

## Step 4 - Package the model with `model-archiver` CLI utility

In this step, we package the following:
1. the pre-trained TensorFlow SavedModel we downloaded in Step 1
2. the `signature.json` file we prepared in Step 2
3. the custom model service files described in Step 3

We use the `model-archiver` command line interface (CLI) provided by MMS.
Install `model-archiver` if you have not already:

```bash
pip install model-archiver
```

This tool creates a `.mar` file that is provided to MMS for serving inference requests. In the following command, we specify `tensorflow_vision_service:handle` as the model archive entry point.

```bash
cd multi-model-server/examples
model-archiver --model-name mobilenetv2 --model-path tf_vision --handler tensorflow_vision_service:handle
```
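
If archiving succeeds, a `mobilenetv2.mar` file should appear in the current directory (`model-archiver` exports to the working directory unless `--export-path` is given), which is the `examples` folder used as the model store in the next step:

```bash
ls mobilenetv2.mar
```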

## Step 5 - Start the Inference Service

Start the inference service by providing the `mobilenetv2.mar` file we created in Step 4.

MMS extracts the resources (signature, SavedModel) we packaged into the `.mar` file and uses the extended custom service to start the inference server.

By default, the server starts on localhost at port 8080.

```bash
cd multi-model-server
multi-model-server --start --model-store examples --models ssd=mobilenetv2.mar
```
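
To verify that the server is up, you can hit the health-check endpoint on the inference port:

```bash
curl http://127.0.0.1:8080/ping
```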

Awesome! We have successfully exported a pre-trained TensorFlow SavedModel, extended MMS with custom pre-processing and post-processing, and started an inference service.

**Note**: In this example, MMS loads the `.mar` file from the local file system. However, you can also store the archive on network-accessible storage such as AWS S3 and reference it with an `http://` or `https://` URL; MMS can load the model archive from such locations as well.
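
For example, assuming you have uploaded the archive to an S3 bucket (the bucket name and path below are placeholders), the server could be started with a URL instead of a local file name:

```bash
multi-model-server --start --model-store examples \
    --models ssd=https://my-bucket.s3.amazonaws.com/models/mobilenetv2.mar
```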

## Step 6 - Test sample inference

Let us try the inference server we just started. Open another terminal on the same host. Download a sample image, or use any JPEG.

You can also use this image of three dogs on a beach.
![3 dogs on beach](../../docs/images/3dogs.jpg)

Use `curl` to make a prediction call by passing the downloaded image as input to the prediction request.

```bash
cd multi-model-server
curl -X POST http://127.0.0.1:8080/predictions/ssd -T docs/images/3dogs.jpg
```
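
When you are done, you can stop the server:

```bash
multi-model-server --stop
```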
72 changes: 72 additions & 0 deletions examples/tf_vision/image.py
@@ -0,0 +1,72 @@
# Copyright 2020 Amazon.com, Inc. or its affiliates. All Rights Reserved.
# Licensed under the Apache License, Version 2.0 (the "License").
# You may not use this file except in compliance with the License.
# A copy of the License is located at
# http://www.apache.org/licenses/LICENSE-2.0
# or in the "license" file accompanying this file. This file is distributed
# on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either
# express or implied. See the License for the specific language governing
# permissions and limitations under the License.

"""
Image utils
"""
from io import BytesIO
import cv2

import numpy as np
from PIL import Image


def transform_shape(img_arr, dim_order='NHWC'):
    """
    Rearrange image numpy array shape to 'NCHW' or 'NHWC' which
    is valid for TF model input.
    Input image array should have dim_order of 'HWC'.

    :param img_arr: numpy array
        Image in numpy format with shape (height, width, channel)
    :param dim_order: str
        Output image dimension order. Valid values are 'NCHW' and 'NHWC'

    :return: numpy array
        Image in numpy array format with dim_order shape
    """
    assert dim_order in ('NCHW', 'NHWC'), "dim_order must be 'NCHW' or 'NHWC'."
    if dim_order == 'NCHW':
        img_arr = np.transpose(img_arr, (2, 0, 1))
    output = np.expand_dims(img_arr, axis=0)
    return output


def read(buf):
    """
    Read and decode an image to a numpy array.
    Input image numpy should have dim_order of 'HWC'.

    :param buf: image bytes
        Binary image data as bytes.
    :return: numpy array
        A numpy array containing the image.
    """
    return np.array(Image.open(BytesIO(buf)))


def resize(src, new_width, new_height, interp=2):
    """
    Resizes image to new_width and new_height.
    Input image numpy array should have dim_order of 'HWC'.

    :param src: numpy array
        Source image in numpy array format
    :param new_width: int
        Width in pixels for the resized image
    :param new_height: int
        Height in pixels for the resized image
    :param interp: int
        Interpolation method for all resizing operations

    :return: numpy array
        A numpy array containing the resized image.
    """
    # cv2.resize expects dsize as (width, height)
    return cv2.resize(src, dsize=(new_width, new_height), interpolation=interp)
97 changes: 97 additions & 0 deletions examples/tf_vision/model_handler.py
@@ -0,0 +1,97 @@
# Copyright 2020 Amazon.com, Inc. or its affiliates. All Rights Reserved.
# Licensed under the Apache License, Version 2.0 (the "License").
# You may not use this file except in compliance with the License.
# A copy of the License is located at
# http://www.apache.org/licenses/LICENSE-2.0
# or in the "license" file accompanying this file. This file is distributed
# on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either
# express or implied. See the License for the specific language governing
# permissions and limitations under the License.

"""
ModelHandler defines a base model handler.
"""
import logging
import time


class ModelHandler(object):
    """
    A base Model handler implementation.
    """

    def __init__(self):
        self.error = None
        self._context = None
        self._batch_size = 0
        self.initialized = False

    def initialize(self, context):
        """
        Initialize model. This will be called during model loading time

        :param context: Initial context contains model server system properties.
        :return:
        """
        self._context = context
        self._batch_size = context.system_properties["batch_size"]
        self.initialized = True

    def preprocess(self, batch):
        """
        Transform raw input into model input data.

        :param batch: list of raw requests, should match batch size
        :return: list of preprocessed model input data
        """
        assert self._batch_size == len(batch), "Invalid input batch size: {}".format(len(batch))
        return None

    def inference(self, model_input):
        """
        Internal inference methods

        :param model_input: transformed model input data
        :return: list of inference output in NDArray
        """
        return None

    def postprocess(self, inference_output):
        """
        Return predict result in batch.

        :param inference_output: list of inference output
        :return: list of predict results
        """
        return ["OK"] * self._batch_size

    def handle(self, data, context):
        """
        Custom service entry point function.

        :param data: list of objects, raw input from request
        :param context: model server context
        :return: list of outputs to be sent back to the client
        """
        self.error = None  # reset earlier errors

        try:
            preprocess_start = time.time()
            data = self.preprocess(data)
            inference_start = time.time()
            data = self.inference(data)
            postprocess_start = time.time()
            data = self.postprocess(data)
            end_time = time.time()

            metrics = context.metrics
            metrics.add_time("PreprocessTime", round((inference_start - preprocess_start) * 1000, 2))
            metrics.add_time("InferenceTime", round((postprocess_start - inference_start) * 1000, 2))
            metrics.add_time("PostprocessTime", round((end_time - postprocess_start) * 1000, 2))

            return data
        except Exception as e:
            logging.error(e, exc_info=True)
            request_processor = context.request_processor
            request_processor.report_status(500, "Unknown inference error")
            return [str(e)] * self._batch_size