111 changes: 111 additions & 0 deletions examples/tf_vision/README.md
@@ -0,0 +1,111 @@
# TensorFlow SavedModel Inference Service

In this example, we show how to use a pre-trained TensorFlow MobileNet V2 model in the SavedModel format to perform real-time inference using MMS.

# Objective

1. Demonstrate how to package a pre-trained TensorFlow SavedModel for MMS
2. Demonstrate how to create a custom service with pre-processing and post-processing

# Prerequisites

Install TensorFlow:

```bash
pip install tensorflow==1.15
```
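
To confirm the installation, you can print the installed TensorFlow version:

```bash
python -c "import tensorflow as tf; print(tf.__version__)"
```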

## Step 1 - Download the pre-trained MobileNet V2 Model

You will need the model files for the export. Check this example's directory in case they are already downloaded; otherwise, you can `curl` the files or download them in your browser:

```bash
cd multi-model-server/examples/tf_vision

curl -O http://download.tensorflow.org/models/object_detection/ssd_mobilenet_v1_coco_2017_11_17.tar.gz
tar -xvf ssd_mobilenet_v1_coco_2017_11_17.tar.gz
cp ssd_mobilenet_v1_coco_2017_11_17/saved_model/saved_model.pb .
```
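
Optionally, before exporting, you can inspect the downloaded SavedModel's input and output signatures with TensorFlow's `saved_model_cli` utility (installed with TensorFlow):

```bash
saved_model_cli show --dir ssd_mobilenet_v1_coco_2017_11_17/saved_model --all
```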


## Step 2 - Prepare the signature file

Define the model input name and shape in a `signature.json` file. The signature for this example looks like the following:

```json
{
  "inputs": [
    {
      "data_precision": "UINT8",
      "data_name": "inputs",
      "data_shape": [
        1,
        224,
        224,
        3
      ]
    }
  ]
}
```
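
For illustration only, here is a minimal sketch of how a handler could read this signature and derive the expected input dimensions for pre-processing (the file name and keys follow the example above; the actual service classes in this folder may read it differently):

```python
import json

# Read the packaged signature and pull out the expected NHWC input shape.
with open("signature.json") as f:
    signature = json.load(f)

input_def = signature["inputs"][0]
_, height, width, channels = input_def["data_shape"]
print(input_def["data_name"], height, width, channels)  # inputs 224 224 3
```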

## Step 3 - Create custom service class

We provide custom service class template code in this folder:
1. [model_handler.py](./model_handler.py) - A generic base service class.
2. [tensorflow_saved_model_service.py](./tensorflow_saved_model_service.py) - A TensorFlow SavedModel base service class.
3. [tensorflow_vision_service.py](./tensorflow_vision_service.py) - A TensorFlow vision service class.
4. [image.py](./image.py) - Utilities for image reshaping.

In this example, you can simply use the provided `tensorflow_vision_service.py` as the model archive entry point.
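
For orientation, the entry-point pattern MMS expects looks roughly like the sketch below (a simplified illustration, not the exact contents of `tensorflow_vision_service.py`): a module-level `handle(data, context)` function that initializes a service object once and then dispatches each batch of requests to it.

```python
from model_handler import ModelHandler  # base class provided in this folder


class TensorflowVisionService(ModelHandler):
    # Stand-in for the real service class, which overrides preprocess/
    # inference/postprocess to decode images, run the SavedModel, and
    # format the predictions.
    pass


_service = TensorflowVisionService()


def handle(data, context):
    # MMS calls this function for every request batch.
    if not _service.initialized:
        _service.initialize(context)
    if data is None:
        return None
    return _service.handle(data, context)
```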

## Step 4 - Package the model with `model-archiver` CLI utility

In this step, we package the following:
1. the pre-trained TensorFlow SavedModel we downloaded in Step 1
2. the `signature.json` file we prepared in Step 2
3. the custom model service files described in Step 3

We use the `model-archiver` command line interface (CLI) provided by MMS.
Install `model-archiver` if you have not already:

```bash
pip install model-archiver
```

This tool creates a `.mar` file that is provided to MMS for serving inference requests. In the following command, we specify `tensorflow_vision_service:handle` as the model archive entry point.

```bash
cd multi-model-server/examples
model-archiver --model-name mobilenetv2 --model-path tf_vision --handler tensorflow_vision_service:handle
```
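
If archiving succeeds, a `mobilenetv2.mar` file should appear in the current directory (`model-archiver` exports to the working directory unless `--export-path` is given), which is the `examples` folder used as the model store in the next step:

```bash
ls mobilenetv2.mar
```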

## Step 5 - Start the Inference Service

Start the inference service by providing the `mobilenetv2.mar` file we created in Step 4.

MMS extracts the resources (signature, SavedModel) we packaged into the `.mar` file and uses the extended custom service to start the inference server.

By default, the server starts on localhost at port 8080.

```bash
cd multi-model-server
multi-model-server --start --model-store examples --models ssd=mobilenetv2.mar
```
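
To verify that the server is up, you can hit the health-check endpoint on the inference port:

```bash
curl http://127.0.0.1:8080/ping
```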

Awesome! We have successfully exported a pre-trained TensorFlow SavedModel, extended MMS with custom pre-processing and post-processing, and started an inference service.

**Note**: In this example, MMS loads the `.mar` file from the local file system. However, you can also store the archive on network-accessible storage such as AWS S3 and reference it with an `http://` or `https://` URL; MMS can load the model archive from such locations as well.
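
For example, assuming you have uploaded the archive to an S3 bucket (the bucket name and path below are placeholders), the server could be started with a URL instead of a local file name:

```bash
multi-model-server --start --model-store examples \
    --models ssd=https://my-bucket.s3.amazonaws.com/models/mobilenetv2.mar
```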

## Step 6 - Test sample inference

Let us try the inference server we just started. Open another terminal on the same host. Download a sample image, or use any JPEG.

You can also use this image of three dogs on a beach.
![3 dogs on beach](../../docs/images/3dogs.jpg)

Use `curl` to make a prediction call by passing the downloaded image as input to the prediction request.

```bash
cd multi-model-server
curl -X POST http://127.0.0.1:8080/predictions/ssd -T docs/images/3dogs.jpg
```
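
When you are done, you can stop the server:

```bash
multi-model-server --stop
```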
72 changes: 72 additions & 0 deletions examples/tf_vision/image.py
@@ -0,0 +1,72 @@
# Copyright 2020 Amazon.com, Inc. or its affiliates. All Rights Reserved.
# Licensed under the Apache License, Version 2.0 (the "License").
# You may not use this file except in compliance with the License.
# A copy of the License is located at
# http://www.apache.org/licenses/LICENSE-2.0
# or in the "license" file accompanying this file. This file is distributed
# on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either
# express or implied. See the License for the specific language governing
# permissions and limitations under the License.

"""
Image utils
"""
from io import BytesIO
import cv2

import numpy as np
from PIL import Image


def transform_shape(img_arr, dim_order='NHWC'):
    """
    Rearrange image numpy array shape to 'NCHW' or 'NHWC' which
    is valid for TF model input.
    Input image array should have dim_order of 'HWC'.

    :param img_arr: numpy array
        Image in numpy format with shape (height, width, channel)
    :param dim_order: str
        Output image dimension order. Valid values are 'NCHW' and 'NHWC'

    :return: numpy array
        Image in numpy array format with dim_order shape
    """
    assert dim_order in ('NCHW', 'NHWC'), "dim_order must be 'NCHW' or 'NHWC'."
    if dim_order == 'NCHW':
        img_arr = np.transpose(img_arr, (2, 0, 1))
    output = np.expand_dims(img_arr, axis=0)
    return output


def read(buf):
    """
    Read and decode an image to a numpy array.
    Input image numpy should have dim_order of 'HWC'.

    :param buf: image bytes
        Binary image data as bytes.
    :return: numpy array
        A numpy array containing the image.
    """
    return np.array(Image.open(BytesIO(buf)))


def resize(src, new_width, new_height, interp=2):
    """
    Resizes image to new_width and new_height.
    Input image numpy array should have dim_order of 'HWC'.

    :param src: numpy array
        Source image in numpy array format
    :param new_width: int
        Width in pixels for the resized image
    :param new_height: int
        Height in pixels for the resized image
    :param interp: int
        Interpolation method for all resizing operations

    :return: numpy array
        A numpy array containing the resized image.
    """
    # cv2.resize expects dsize as (width, height)
    return cv2.resize(src, dsize=(new_width, new_height), interpolation=interp)
97 changes: 97 additions & 0 deletions examples/tf_vision/model_handler.py
@@ -0,0 +1,97 @@
# Copyright 2020 Amazon.com, Inc. or its affiliates. All Rights Reserved.
# Licensed under the Apache License, Version 2.0 (the "License").
# You may not use this file except in compliance with the License.
# A copy of the License is located at
# http://www.apache.org/licenses/LICENSE-2.0
# or in the "license" file accompanying this file. This file is distributed
# on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either
# express or implied. See the License for the specific language governing
# permissions and limitations under the License.

"""
ModelHandler defines a base model handler.
"""
import logging
import time


class ModelHandler(object):
    """
    A base Model handler implementation.
    """

    def __init__(self):
        self.error = None
        self._context = None
        self._batch_size = 0
        self.initialized = False

    def initialize(self, context):
        """
        Initialize model. This will be called during model loading time

        :param context: Initial context contains model server system properties.
        :return:
        """
        self._context = context
        self._batch_size = context.system_properties["batch_size"]
        self.initialized = True

    def preprocess(self, batch):
        """
        Transform raw input into model input data.

        :param batch: list of raw requests, should match batch size
        :return: list of preprocessed model input data
        """
        assert self._batch_size == len(batch), "Invalid input batch size: {}".format(len(batch))
        return None

    def inference(self, model_input):
        """
        Internal inference methods

        :param model_input: transformed model input data
        :return: list of inference output in NDArray
        """
        return None

    def postprocess(self, inference_output):
        """
        Return predict result in batch.

        :param inference_output: list of inference output
        :return: list of predict results
        """
        return ["OK"] * self._batch_size

    def handle(self, data, context):
        """
        Custom service entry point function.

        :param data: list of objects, raw input from request
        :param context: model server context
        :return: list of outputs to be sent back to the client
        """
        self.error = None  # reset earlier errors

        try:
            preprocess_start = time.time()
            data = self.preprocess(data)
            inference_start = time.time()
            data = self.inference(data)
            postprocess_start = time.time()
            data = self.postprocess(data)
            end_time = time.time()

            metrics = context.metrics
            metrics.add_time("PreprocessTime", round((inference_start - preprocess_start) * 1000, 2))
            metrics.add_time("InferenceTime", round((postprocess_start - inference_start) * 1000, 2))
            metrics.add_time("PostprocessTime", round((end_time - postprocess_start) * 1000, 2))

            return data
        except Exception as e:
            logging.error(e, exc_info=True)
            request_processor = context.request_processor
            request_processor.report_status(500, "Unknown inference error")
            return [str(e)] * self._batch_size