
Commit abcaf1a

Authored by: abishekchiffon, shivamshriwas, harshbafna, dhaniram kshirsagar, dhaniram-kshirsagar
Add request envelope to support Multiserving frameworks (#749)
* Updated image classifier default handler; updated custom resnet and mnist handlers; updated docs
* Documentation update: removed stale Transformer_readme.md
* Updated docs
* Doc restructure and code fixes
* Updated example as per fix in default handlers
* Enhanced base and custom handler examples
* Added missing check for manifest
* Fixed some typos
* Removed commented code
* Refactored BaseHandler
* Added unit tests
* Fixed gitignore in this branch
* Fixed a bug with Image Segmenter
* Updated Object Detector to reuse functionality; consistency
* Fixed pylint errors
* Backwards compat for index_names.json
* Fixed Image Segmenter again
* Made the compat layer in text actually compatible
* Removed batching from text classifier
* Added comments per review
* Addressed doc feedback
* Updated docs about batching
* Initial commit of envelopes
* Got the end-to-end working
* Undid a change for local stuff
* Fixed a few broken tests
* Fixed error introduced due to conflict resolution via web-based merge tool
* Corrected code comment
* Updated object detection & text classification handlers; updated docs
* Fixed python linting errors
* Updated index-to-name JSON for text classifier
* Fixed object detector handler for batch support
* Fixed the batch inference output
* Updated expected output as per new handler changes
* Updated text classification mar name in sanity suite
* Updated text classifier mar name and removed bert scripted models
* Updated model zoo with new text classification url
* Added model_name while registering model in sanity suite
* Updated text classification model name
* Added upgrade option for installing python dependencies in install utils
* Added upgrade option for installing python dependencies and extra numpy package in regression suite
* Refactored pytests in regression suite for better performance and reporting
* Merge upstream
* Got the end-to-end working
* Merge upstream (2)
* Fixed a few broken tests
* Undid a bad merge
* Minor fix in torch-archiver command
* Reverted postprocess removal
* Updated mar files in model zoo to use updated handlers
* Updated regression suite to use updated mar files
* Suppressed pylint warning in UT
* Fixed resnet-152 mar name and expected output
* Updated inference tests data; added tolerance value for resnet152 models
* Added custom handler in vgg11 example (#559)
* Added custom handler for vgg11
* Added readme for vgg11 example
* Fixed typo in readme
* Updated model zoo
* Reverted changes for scripted vgg11 mar file
* Added vgg11 model to regression test suite
* Disabled pylint check in UT
* Updated expected response for vgg11 inference in regression suite
* Updated expected output for densenet scripted
* Fixed bad file format
* Fixed the 'no newman npm' issue in regression test suite, solution suggested in PR 757
* Fixed the metrics bug #772 for test_envelopes

Co-authored-by: Shivam Shriwas <[email protected]>
Co-authored-by: Harsh Bafna <[email protected]>
Co-authored-by: dhaniram kshirsagar <[email protected]>
Co-authored-by: dhaniram-kshirsagar <[email protected]>
Co-authored-by: Henry Tappen <[email protected]>
Co-authored-by: harshbafna <[email protected]>
Co-authored-by: Aaqib <[email protected]>
Co-authored-by: Henry Tappen <[email protected]>
Co-authored-by: Geeta Chauhan <[email protected]>
1 parent 40f4258 commit abcaf1a

File tree

20 files changed: +351 additions, -48 deletions

.gitignore

Lines changed: 3 additions & 1 deletion
@@ -4,6 +4,8 @@ dist/
 *__pycache__*
 *.egg-info/
 .idea
+*htmlcov*
+.coverage
 .github/actions/
 .github/.DS_Store
-.DS_Store
+.DS_Store

docs/default_handlers.md

Lines changed: 4 additions & 1 deletion
@@ -46,4 +46,7 @@ For more details see [examples](https://github.com/pytorch/serve/tree/master/exa
 - [object_detector](https://github.com/pytorch/serve/tree/master/examples/object_detector/index_to_name.json)

 # Contributing
-If you'd like to edit or create a new default_handler class, make sure to run and update the unit tests in [unit_tests](https://github.com/pytorch/serve/tree/master/ts/torch_handler/unit_tests). As always, make sure to run [torchserve_sanity.py](https://github.com/pytorch/serve/tree/master/torchserve_sanity.py) before submitting.
+If you'd like to edit or create a new default_handler class, you need to take the following steps:
+1. Write a new class derived from BaseHandler. Add it as a separate file in `ts/torch_handler/`
+1. Update `model-archiver/model_packaging.py` to add in your class's name
+1. Run and update the unit tests in [unit_tests](https://github.com/pytorch/serve/tree/master/ts/torch_handler/unit_tests). As always, make sure to run [torchserve_sanity.py](https://github.com/pytorch/serve/tree/master/torchserve_sanity.py) before submitting.
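The doc change above describes deriving a custom handler from BaseHandler. A minimal standalone sketch of that pattern follows; the base class here is a simplified stand-in that mirrors the preprocess/inference/postprocess convention, not the real `ts.torch_handler.base_handler.BaseHandler`, and `UpperCaseHandler` is a hypothetical example:

```python
# Standalone sketch of the handler pattern; BaseHandler here is a
# simplified stand-in, not the actual TorchServe class.

class BaseHandler:
    """Stand-in base: wires preprocess -> inference -> postprocess."""
    def initialize(self, context):
        self.initialized = True

    def preprocess(self, data):
        return data

    def inference(self, data):
        return data

    def postprocess(self, data):
        return data

    def handle(self, data, context):
        data = self.preprocess(data)
        data = self.inference(data)
        return self.postprocess(data)


class UpperCaseHandler(BaseHandler):
    """Hypothetical custom handler: upper-cases each text input."""
    def inference(self, data):
        return [s.upper() for s in data]


handler = UpperCaseHandler()
print(handler.handle(["hello", "world"], context=None))  # ['HELLO', 'WORLD']
```

A real handler would override `inference` to run the loaded model; the point of the sketch is that subclasses only replace the stages they need.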

docs/request_envelopes.md

Lines changed: 18 additions & 0 deletions
@@ -0,0 +1,18 @@
+# Introduction
+
+Many model serving systems provide a signature for request bodies. Examples include:
+
+- [Seldon](https://docs.seldon.io/projects/seldon-core/en/v1.1.0/graph/protocols.html)
+- [KFServing](https://github.com/kubeflow/kfserving/tree/master/docs)
+- [Google Cloud AI Platform](https://cloud.google.com/ai-platform/prediction/docs/online-predict)
+
+Data scientists use these multi-framework systems to manage deployments of many different models, possibly written in different languages and frameworks. The platforms offer additional analytics on top of model serving, including skew detection, explanations and A/B testing. These platforms need a well-structured signature both to standardize calls across different frameworks and to understand the input data. To simplify support for many frameworks, though, these platforms will simply pass the request body along to the underlying model server.
+
+TorchServe currently has no fixed request body signature. Envelopes allow you to automatically translate from the fixed signature required by your model orchestrator to a flat Python list.
+
+# Usage
+1. When you write a handler, always expect a plain Python list containing data ready to go into `preprocess`. Crucially, you should assume that your handler code looks the same locally and in your model orchestrator.
+1. When you deploy TorchServe behind a model orchestrator, make sure to set the corresponding `service_envelope` in your `config.properties` file. For example, if you're using Google Cloud AI Platform, which has a JSON format, you'd add `service_envelope=json` to your `config.properties` file.
+
+# Contributing
+Add new files under `ts/torch_handler/request_envelope`. Only include one class per file. The key used in `config.properties` will be the name of the .py file you write your class in.
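The new doc above says an envelope translates an orchestrator's fixed signature into a flat Python list for the handler. An illustrative sketch of that idea, using the Google Cloud AI Platform-style `{"instances": [...]}` / `{"predictions": [...]}` body shape; the `JSONEnvelope` class and its method names here are hypothetical, not the actual TorchServe envelope API:

```python
import json

# Sketch of the envelope idea: unwrap an orchestrator's JSON signature
# into a flat list for the handler, then rewrap the handler's outputs.

class JSONEnvelope:
    def __init__(self, handle_fn):
        self._handle_fn = handle_fn  # the wrapped handler entry point

    def handle(self, data, context):
        body = json.loads(data)              # e.g. '{"instances": [1, 2, 3]}'
        flat_inputs = body["instances"]      # flat Python list for the handler
        outputs = self._handle_fn(flat_inputs, context)
        return json.dumps({"predictions": outputs})


def double_handler(inputs, context):
    """Hypothetical handler: plain list in, plain list out."""
    return [x * 2 for x in inputs]


envelope = JSONEnvelope(double_handler)
print(envelope.handle('{"instances": [1, 2, 3]}', None))
# {"predictions": [2, 4, 6]}
```

The handler itself never sees the JSON wrapper, which is what lets the same handler code run both locally and behind an orchestrator.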

frontend/modelarchive/src/main/java/org/pytorch/serve/archive/Manifest.java

Lines changed: 9 additions & 0 deletions
@@ -62,6 +62,7 @@ public static final class Model {
         private String description;
         private String modelVersion;
         private String handler;
+        private String envelope;
         private String requirementsFile;

         public Model() {}
@@ -113,6 +114,14 @@ public String getHandler() {
         public void setHandler(String handler) {
             this.handler = handler;
         }
+
+        public String getEnvelope() {
+            return envelope;
+        }
+
+        public void setEnvelope(String envelope) {
+            this.envelope = envelope;
+        }
     }

     public enum RuntimeType {

frontend/server/src/main/java/org/pytorch/serve/util/ConfigManager.java

Lines changed: 5 additions & 0 deletions
@@ -74,6 +74,7 @@ public final class ConfigManager {
     private static final String TS_MAX_REQUEST_SIZE = "max_request_size";
     private static final String TS_MAX_RESPONSE_SIZE = "max_response_size";
     private static final String TS_DEFAULT_SERVICE_HANDLER = "default_service_handler";
+    private static final String TS_SERVICE_ENVELOPE = "service_envelope";
     private static final String TS_MODEL_SERVER_HOME = "model_server_home";
     private static final String TS_MODEL_STORE = "model_store";
     private static final String TS_SNAPSHOT_STORE = "snapshot_store";
@@ -314,6 +315,10 @@ public String getTsDefaultServiceHandler() {
         return getProperty(TS_DEFAULT_SERVICE_HANDLER, null);
     }

+    public String getTsServiceEnvelope() {
+        return getProperty(TS_SERVICE_ENVELOPE, null);
+    }
+
     public Properties getConfiguration() {
         return (Properties) prop.clone();
     }

frontend/server/src/main/java/org/pytorch/serve/util/codec/ModelRequestEncoder.java

Lines changed: 13 additions & 0 deletions
@@ -43,10 +43,23 @@ protected void encode(ChannelHandlerContext ctx, BaseModelRequest msg, ByteBuf o
                 buf = handler.getBytes(StandardCharsets.UTF_8);
             }

+            // TODO: this might be a bug. If handler isn't specified, this
+            // will repeat the model path
             out.writeInt(buf.length);
             out.writeBytes(buf);

             out.writeInt(request.getGpuId());
+
+            String envelope = request.getEnvelope();
+            if (envelope != null) {
+                buf = envelope.getBytes(StandardCharsets.UTF_8);
+            } else {
+                buf = new byte[0];
+            }
+
+            out.writeInt(buf.length);
+            out.writeBytes(buf);
+
         } else if (msg instanceof ModelInferenceRequest) {
             out.writeByte('I');
             ModelInferenceRequest request = (ModelInferenceRequest) msg;
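The encoder change above writes the envelope name as a length-prefixed UTF-8 field (a 4-byte length, then the bytes, with length 0 when no envelope is set). A small Python sketch of that wire convention, assuming big-endian integers as Netty's `ByteBuf.writeInt` uses by default; the function names are illustrative, not part of either codebase:

```python
import struct

# Sketch of the length-prefixed string field used by the encoder above:
# a 4-byte big-endian length followed by the UTF-8 bytes (length 0 if absent).

def write_string_field(value):
    data = value.encode("utf-8") if value is not None else b""
    return struct.pack(">i", len(data)) + data


def read_string_field(buf, offset=0):
    (length,) = struct.unpack_from(">i", buf, offset)
    start = offset + 4
    return buf[start:start + length].decode("utf-8"), start + length


wire = write_string_field("json")
value, _ = read_string_field(wire)
print(value)  # json
print(write_string_field(None))  # b'\x00\x00\x00\x00'
```

Writing a zero length for the missing envelope keeps the frame layout fixed, so the Python backend can always read the field without negotiating protocol versions.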

frontend/server/src/main/java/org/pytorch/serve/util/messages/ModelLoadModelRequest.java

Lines changed: 6 additions & 0 deletions
@@ -11,6 +11,7 @@ public class ModelLoadModelRequest extends BaseModelRequest {
     private String modelPath;

     private String handler;
+    private String envelope;
     private int batchSize;
     private int gpuId;

@@ -19,6 +20,7 @@ public ModelLoadModelRequest(Model model, int gpuId) {
         this.gpuId = gpuId;
         modelPath = model.getModelDir().getAbsolutePath();
         handler = model.getModelArchive().getManifest().getModel().getHandler();
+        envelope = model.getModelArchive().getManifest().getModel().getEnvelope();
         batchSize = model.getBatchSize();
     }

@@ -30,6 +32,10 @@ public String getHandler() {
         return handler;
     }

+    public String getEnvelope() {
+        return envelope;
+    }
+
     public int getBatchSize() {
         return batchSize;
     }

frontend/server/src/main/java/org/pytorch/serve/wlm/ModelManager.java

Lines changed: 2 additions & 0 deletions
@@ -155,6 +155,8 @@ private ModelArchive createModelArchive(
             archive.getManifest().getModel().setHandler(configManager.getTsDefaultServiceHandler());
         }

+        archive.getManifest().getModel().setEnvelope(configManager.getTsServiceEnvelope());
+
         archive.validate();

         return archive;

test/postman/increased_timeout_inference.json

Lines changed: 1 addition & 1 deletion
@@ -16,4 +16,4 @@
   },
   "tolerance":5
  }
-]
+]

ts/model_loader.py

Lines changed: 64 additions & 39 deletions
@@ -34,7 +34,7 @@ class ModelLoader(object):
     __metaclass__ = ABCMeta

     @abstractmethod
-    def load(self, model_name, model_dir, handler, gpu_id, batch_size):
+    def load(self, model_name, model_dir, handler, gpu_id, batch_size, envelope=None):
         """
         Load model from file.

@@ -43,6 +43,7 @@ def load(self, model_name, model_dir, handler, gpu_id, batch_size):
         :param handler:
         :param gpu_id:
         :param batch_size:
+        :param envelope:
         :return: Model
         """
         # pylint: disable=unnecessary-pass
@@ -54,7 +55,7 @@ class TsModelLoader(ModelLoader):
     TorchServe 1.0 Model Loader
     """

-    def load(self, model_name, model_dir, handler, gpu_id, batch_size):
+    def load(self, model_name, model_dir, handler, gpu_id, batch_size, envelope=None):
         """
         Load TorchServe 1.0 model from file.

@@ -63,6 +64,7 @@ def load(self, model_name, model_dir, handler, gpu_id, batch_size):
         :param handler:
         :param gpu_id:
         :param batch_size:
+        :param envelope:
         :return:
         """
         logging.debug("Loading model - working dir: %s", os.getcwd())
@@ -74,48 +76,71 @@ def load(self, model_name, model_dir, handler, gpu_id, batch_size):
         with open(manifest_file) as f:
             manifest = json.load(f)

+        function_name = None
         try:
-            temp = handler.split(":", 1)
-            module_name = temp[0]
-            function_name = None if len(temp) == 1 else temp[1]
-            if module_name.endswith(".py"):
-                module_name = module_name[:-3]
-            module_name = module_name.split("/")[-1]
-            module = importlib.import_module(module_name)
-        # pylint: disable=unused-variable
-        except ImportError as e:
-            module_name = ".{0}".format(handler)
-            module = importlib.import_module(module_name, 'ts.torch_handler')
-            function_name = None
+            module, function_name = self._load_handler_file(handler)
+        except ImportError:
+            module = self._load_default_handler(handler)

         if module is None:
             raise ValueError("Unable to load module {}, make sure it is added to python path".format(module_name))
-        if function_name is None:
-            function_name = "handle"
-        if hasattr(module, function_name):
-            entry_point = getattr(module, function_name)
-            service = Service(model_name, model_dir, manifest, entry_point, gpu_id, batch_size)

-            service.context.metrics = metrics
-            # initialize model at load time
-            entry_point(None, service.context)
+        envelope_class = None
+        if envelope is not None:
+            envelope_class = self._load_default_envelope(envelope)
+
+        function_name = function_name or "handle"
+        if hasattr(module, function_name):
+            entry_point, initialize_fn = self._get_function_entry_point(module, function_name)
         else:
-            model_class_definitions = list_classes_from_module(module)
-            if len(model_class_definitions) != 1:
-                raise ValueError("Expected only one class in custom service code or a function entry point {}".format(
-                    model_class_definitions))
-
-            model_class = model_class_definitions[0]
-            model_service = model_class()
-            handle = getattr(model_service, "handle")
-            if handle is None:
-                raise ValueError("Expect handle method in class {}".format(str(model_class)))
-
-            service = Service(model_name, model_dir, manifest, model_service.handle, gpu_id, batch_size)
-            initialize = getattr(model_service, "initialize")
-            if initialize is not None:
-                model_service.initialize(service.context)
-            else:
-                raise ValueError("Expect initialize method in class {}".format(str(model_class)))
+            entry_point, initialize_fn = self._get_class_entry_point(module)
+
+        if envelope_class is not None:
+            envelope_instance = envelope_class(entry_point)
+            entry_point = envelope_instance.handle
+
+        service = Service(model_name, model_dir, manifest, entry_point, gpu_id, batch_size)
+        service.context.metrics = metrics
+        initialize_fn(service.context)

         return service
+
+    def _load_handler_file(self, handler):
+        temp = handler.split(":", 1)
+        module_name = temp[0]
+        function_name = None if len(temp) == 1 else temp[1]
+        if module_name.endswith(".py"):
+            module_name = module_name[:-3]
+        module_name = module_name.split("/")[-1]
+        module = importlib.import_module(module_name)
+        return module, function_name
+
+    def _load_default_handler(self, handler):
+        module_name = ".{0}".format(handler)
+        module = importlib.import_module(module_name, 'ts.torch_handler')
+        return module
+
+    def _load_default_envelope(self, envelope):
+        module_name = ".{0}".format(envelope)
+        module = importlib.import_module(module_name, 'ts.torch_handler.request_envelope')
+        envelope_class = list_classes_from_module(module)[0]
+        return envelope_class
+
+    def _get_function_entry_point(self, module, function_name):
+        entry_point = getattr(module, function_name)
+        initialize_fn = lambda ctx: entry_point(None, ctx)
+        return entry_point, initialize_fn
+
+    def _get_class_entry_point(self, module):
+        model_class_definitions = list_classes_from_module(module)
+        if len(model_class_definitions) != 1:
+            raise ValueError("Expected only one class in custom service code or a function entry point {}".format(
+                model_class_definitions))
+
+        model_class = model_class_definitions[0]
+        model_service = model_class()
+
+        if not hasattr(model_service, "handle"):
+            raise ValueError("Expect handle method in class {}".format(str(model_class)))
+
+        return model_service.handle, model_service.initialize
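The refactored `load()` above resolves a handler entry point and then, when an envelope is configured, replaces it with the envelope instance's bound `handle`. A standalone sketch of that wiring, with no TorchServe imports; `EchoEnvelope` and `model_handle` are hypothetical stand-ins for what `_load_default_envelope` and the handler module would provide:

```python
# Standalone sketch of the wrapping flow in the refactored load():
# resolve an entry point, then optionally wrap it in an envelope's handle.

class EchoEnvelope:
    """Stand-in envelope: tags inputs/outputs so the wrapping is visible."""
    def __init__(self, handle_fn):
        self._handle_fn = handle_fn

    def handle(self, data, context):
        unwrapped = [item["body"] for item in data]   # orchestrator -> flat list
        results = self._handle_fn(unwrapped, context)
        return [{"prediction": r} for r in results]   # flat list -> orchestrator


def model_handle(data, context):
    """Hypothetical handler entry point: plain list in, plain list out."""
    return [x + 1 for x in data]


# Mirrors load(): the entry point is replaced by the envelope's bound handle,
# so Service sees a single callable either way.
entry_point = model_handle
envelope_class = EchoEnvelope  # as if _load_default_envelope returned it
if envelope_class is not None:
    entry_point = envelope_class(entry_point).handle

print(entry_point([{"body": 1}, {"body": 2}], None))
# [{'prediction': 2}, {'prediction': 3}]
```

Because the wrapping happens before `Service` is constructed, the rest of the serving path is unchanged whether or not an envelope is in play.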
