
Commit 10b98ac

n8mellis and bmunday3 authored
Added new Edge inference API (#47)
* Added new Edge inference API. Also regenerated all Python protobuf/gRPC files based on Protobuf v21.x, which includes the new format using Python stub files (*.pyi).
* Modified README to reflect new methods, updated docstrings for Edge Inference and Edge Job API methods, and added a convenience import for InputSource.

Co-authored-by: bmunday3 <[email protected]>
1 parent 9aacf67 commit 10b98ac
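In short, the edge job methods previously exposed directly on `EdgeClient` now live under a `jobs` namespace, alongside the new `inferences` API (both are shown in the README diff below):

```python
from modzy import EdgeClient

client = EdgeClient('localhost', 55000)

# Before this commit:
#   job = client.submit_text("ed542963de", "1.0.27", {"input.txt": "this is awesome"})
# After this commit, job methods live under the jobs namespace:
job = client.jobs.submit_text("ed542963de", "1.0.27", {"input.txt": "this is awesome"})
```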

40 files changed (+3604 / -4033 lines)

README.md

+91 -10
@@ -213,36 +213,103 @@ To use **`client.models.deploy()`** there are 4 fields that are required:
 
 ## Running Inferences at the Edge
 
-The SDK provides support for running inferences on edge devices through Modzy's Edge Client. The inference workflow is almost identical to the previously outlined workflow:
+The SDK provides support for running inferences on edge devices through Modzy's Edge Client. The inference workflow is almost identical to the previously outlined workflow and provides functionality for interacting with both the Job and Inferences APIs:
 
-### Initialize *Edge* Client
+### Initialize Edge Client
 
 ```python
-from modzy.edge.client import EdgeClient
+from modzy import EdgeClient
 
 # Initialize edge client
 # Use 'localhost' for local inferences, otherwise use the device's full IP address
 client = EdgeClient('localhost', 55000)
 ```
 
-### Submit Inference Job
+### Submit Inference with *Job* API
 
 Modzy Edge supports `text`, `embedded`, and `aws-s3` input types.
 
 ```python
 # Submit a text job to a Sentiment Analysis model deployed on an edge device by providing a model ID, version, and raw text data, then wait for completion
-job = client.submit_text("ed542963de", "1.0.27", {"input.txt": "this is awesome"})
+job = client.jobs.submit_text("ed542963de", "1.0.27", {"input.txt": "this is awesome"})
 # Block until results are ready
-final_job_details = client.block_until_complete(job)
-results = client.get_results(job)
+final_job_details = client.jobs.block_until_complete(job)
+results = client.jobs.get_results(job)
 ```
 
-### Query Details about Edge Jobs
+### Query Details about Inference with *Job* API
 
 ```python
 # Get job details for a particular job
-job_details = client.get_job_details(job)
+job_details = client.jobs.get_job_details(job)
 
 # Get job details for all jobs run on your Modzy Edge instance
-all_job_details = client.get_all_job_details()
+all_job_details = client.jobs.get_all_job_details()
+```
+
+### Submit Inference with *Inference* API
+
+The SDK provides several methods for interacting with Modzy's Inference API:
+
+* **Synchronous**: This convenience method wraps two SDK methods and is optimal for use cases that require real-time or sequential results (i.e., prediction results are needed to inform an action before a new inference is submitted)
+* **Asynchronous**: This method combines two SDK methods and is optimal for submitting large batches of data and querying results at a later time (i.e., when real-time inference is not required)
+* **Streaming**: This convenience method runs multiple synchronous inferences consecutively and allows users to submit iterable objects to be processed sequentially in real time
+
+*Synchronous (image-based model example)*
+
+```python
+from modzy import EdgeClient
+from modzy.edge import InputSource
+
+image_bytes = open("image_path.jpg", "rb").read()
+input_object = InputSource(
+    key="image",  # input filename defined by model author
+    data=image_bytes,
+)
+
+with EdgeClient('localhost', 55000) as client:
+    inference = client.inferences.run("<model-id>", "<model-version>", input_object, explain=False, tags=None)
+    results = inference.result.outputs
+```
+
+*Asynchronous (image-based model example - submit batch of images in folder)*
+
+```python
+import os
+from modzy import EdgeClient
+from modzy.edge import InputSource
+
+# submit inferences
+img_folder = "./images"
+inferences = []
+for img in os.listdir(img_folder):
+    input_object = InputSource(
+        key="image",  # input filename defined by model author
+        data=open(os.path.join(img_folder, img), 'rb').read(),
+    )
+    with EdgeClient('localhost', 55000) as client:
+        inference = client.inferences.perform_inference("<model-id>", "<model-version>", input_object, explain=False, tags=None)
+        inferences.append(inference)
+
+# query results
+with EdgeClient('localhost', 55000) as client:
+    results = [client.inferences.block_until_complete(inference.identifier) for inference in inferences]
+```
+
+*Stream*
+
+```python
+import os
+from modzy import EdgeClient
+from modzy.edge import InputSource
+
+# generate requests iterator to pass to stream method
+img_folder = "./images"
+requests = []
+for img in os.listdir(img_folder):
+    input_object = InputSource(
+        key="image",  # input filename defined by model author
+        data=open(os.path.join(img_folder, img), 'rb').read(),
+    )
+    with EdgeClient('localhost', 55000) as client:
+        requests.append(client.inferences.build_inference_request("<model-id>", "<model-version>", input_object, explain=False, tags=None))
+
+# submit list of inference requests to streaming API
+with EdgeClient('localhost', 55000) as client:
+    streaming_results = client.inferences.stream(requests)
 ```
 
 # SDK Code Examples
@@ -285,6 +352,20 @@ Modzy's SDK is built on top of the [Modzy HTTP/REST API](https://docs.modzy.com/
 |Get job details|client.jobs.get()|[api/jobs/:job-id](https://docs.modzy.com/reference/get-job-details) |
 |Get results|job.get_result()|[api/results/:job-id](https://docs.modzy.com/reference/get-results) |
 |Get the job history|client.jobs.get_history()|[api/jobs/history](https://docs.modzy.com/reference/list-the-job-history) |
+|Submit a Job with Edge Client (Embedded)|EdgeClient.jobs.submit_embedded()|[Python/edge/jobs](https://docs.modzy.com/docs/edgeclientjobssubmit_embedded) |
+|Submit a Job with Edge Client (Text)|EdgeClient.jobs.submit_text()|[Python/edge/jobs](https://docs.modzy.com/docs/edgeclientjobssubmit_text) |
+|Submit a Job with Edge Client (AWS S3)|EdgeClient.jobs.submit_aws_s3()|[Python/edge/jobs](https://docs.modzy.com/docs/edgeclientjobssubmit_aws_s3) |
+|Get job details with Edge Client|EdgeClient.jobs.get_job_details()|[Python/edge/jobs](https://docs.modzy.com/docs/edgeclientjobsget_job_details) |
+|Get all job details with Edge Client|EdgeClient.jobs.get_all_job_details()|[Python/edge/jobs](https://docs.modzy.com/docs/edgeclientjobsget_all_job_details) |
+|Hold until job is complete with Edge Client|EdgeClient.jobs.block_until_complete()|[Python/edge/jobs](https://docs.modzy.com/docs/edgeclientjobsblock_until_complete) |
+|Get results with Edge Client|EdgeClient.jobs.get_results()|[Python/edge/jobs](https://docs.modzy.com/docs/edgeclientjobsget_results) |
+|Build inference request with Edge Client|EdgeClient.inferences.build_inference_request()|[Python/edge/inferences](https://docs.modzy.com/docs/edgeclientinferencesbuild_inference_request) |
+|Perform inference with Edge Client|EdgeClient.inferences.perform_inference()|[Python/edge/inferences](https://docs.modzy.com/docs/edgeclientinferencesperform_inference) |
+|Get inference details with Edge Client|EdgeClient.inferences.get_inference_details()|[Python/edge/inferences](https://docs.modzy.com/docs/edgeclientinferencesget_inference_details) |
+|Run synchronous inferences with Edge Client|EdgeClient.inferences.run()|[Python/edge/inferences](https://docs.modzy.com/docs/edgeclientinferencesrun) |
+|Hold until inference completes with Edge Client|EdgeClient.inferences.block_until_complete()|[Python/edge/inferences](https://docs.modzy.com/docs/edgeclientinferencesblock_until_complete) |
+|Stream inferences with Edge Client|EdgeClient.inferences.stream()|[Python/edge/inferences](https://docs.modzy.com/docs/edgeclientinferencesstream) |
+
 
 # Support
 
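The table above lists `EdgeClient.jobs.submit_embedded()`, which gets no README example in this commit. A minimal sketch by analogy with `submit_text`; the exact parameter shape (a mapping of input filename to raw bytes) is an assumption here, not confirmed by the diff:

```python
from modzy import EdgeClient

client = EdgeClient('localhost', 55000)

# Assumed to mirror submit_text: model ID, version, and a filename-to-bytes mapping
image_bytes = open("image_path.jpg", "rb").read()
job = client.jobs.submit_embedded("<model-id>", "<model-version>", {"image": image_bytes})

final_job_details = client.jobs.block_until_complete(job)
results = client.jobs.get_results(job)
```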
modzy/_util.py

+53 -46
@@ -1,17 +1,20 @@
 # -*- coding: utf-8 -*-
-import json
 import pathlib
 import time
-from .error import NetworkError, InternalServerError
+from base64 import b64encode
+
 from requests.adapters import HTTPAdapter
 from requests.packages.urllib3.util.retry import Retry
-from base64 import b64encode
+
+from .error import InternalServerError, NetworkError
+
 
 def encode_data_uri(bytes_like, mimetype='application/octet-stream'):
     encoded = b64encode(bytes_like).decode('ascii')
     data_uri = 'data:{};base64,{}'.format(mimetype, encoded)
     return data_uri
 
+
 def file_to_bytes(file_like):
     if hasattr(file_like, 'read'):  # File-like object
         if hasattr(file_like, 'seekable') and file_like.seekable():
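As a point of reference for the hunk above, `encode_data_uri` is self-contained and easy to sanity-check; a quick usage sketch:

```python
from modzy._util import encode_data_uri

# base64 of b'hello' is 'aGVsbG8='
uri = encode_data_uri(b'hello', mimetype='text/plain')
print(uri)  # data:text/plain;base64,aGVsbG8=
```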
@@ -35,6 +38,7 @@ def file_to_bytes(file_like):
     with open(path, 'rb') as file:
         return file.read()
 
+
 def file_to_chunks(file_like, chunk_size):
     file = None
     should_close = False
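From the context lines, `file_to_bytes` accepts an open file-like object or a path and returns the file's contents as bytes, while `file_to_chunks` is its generator counterpart. A small usage sketch (the 1 MiB chunk size is arbitrary):

```python
from modzy._util import file_to_bytes, file_to_chunks

# Read an entire file into memory
data = file_to_bytes("image_path.jpg")

# Or stream it in fixed-size chunks
total = sum(len(chunk) for chunk in file_to_chunks("image_path.jpg", 1024 * 1024))
assert total == len(data)
```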
@@ -67,20 +71,24 @@ def file_to_chunks(file_like, chunk_size):
         if should_close:
             file.close()
 
+
 def bytes_to_chunks(byte_array, chunk_size):
     for i in range(0, len(byte_array), chunk_size):
         yield byte_array[i:i + chunk_size]
 
+
 def depth(d):
     if d and isinstance(d, dict):
         return max(depth(v) for k, v in d.items()) + 1
     return 0
 
+
 '''
 Model Deployment (models.deploy()) specific utilities
 '''
-def load_model(client, logger, identifier, version):
 
+
+def load_model(client, logger, identifier, version):
     start = time.time()
     # Before loading the model we need to ensure that it has been pulled.
     percentage = -1
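Both small helpers in this hunk are shown in full above, so their behavior can be illustrated directly:

```python
from modzy._util import bytes_to_chunks, depth

# bytes_to_chunks yields fixed-size slices of a byte string
print(list(bytes_to_chunks(b'abcdefgh', 3)))  # [b'abc', b'def', b'gh']

# depth reports dict nesting depth (0 for empty dicts and non-dicts)
print(depth({'input': {'sources': {'0001': {}}}}))  # 3
```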
@@ -89,9 +97,9 @@ def load_model(client, logger, identifier, version):
             res = client.http.get(f"/models/{identifier}/versions/{version}/container-image")
             new_percentage = res.get("percentage")
         except NetworkError:
-            continue
+            continue
         except InternalServerError:
-            continue
+            continue
 
         if new_percentage != percentage:
             logger.info(f'Loading model at {new_percentage}%')
@@ -112,25 +120,25 @@ def load_model(client, logger, identifier, version):
     try:
         res = client.http.post(f"/models/{identifier}/versions/{version}/load-process")
     except NetworkError:
-        return
+        return
     except InternalServerError:
-        return
+        return
 
-    logger.info(f'Loading container image took [{1000*(time.time()-start)} ms]')
+    logger.info(f'Loading container image took [{1000 * (time.time() - start)} ms]')
 
-def upload_input_example(client, logger, identifier, version, model_data_metadata, input_sample_path):
 
+def upload_input_example(client, logger, identifier, version, model_data_metadata, input_sample_path):
     start = time.time()
 
     input_filename = model_data_metadata['inputs'][0]['name']
     files = {'file': open(input_sample_path, 'rb')}
     params = {'name': input_filename}
     res = client.http.post(f"/models/{identifier}/versions/{version}/testInput", params=params, file_data=files)
 
-    logger.info(f'Uploading sample input took [{1000*(time.time()-start)} ms]')
+    logger.info(f'Uploading sample input took [{1000 * (time.time() - start)} ms]')
 
-def run_model(client, logger, identifier, version):
 
+def run_model(client, logger, identifier, version):
     start = time.time()
     res = client.http.post(f"/models/{identifier}/versions/{version}/run-process")
 
@@ -155,50 +163,49 @@ def run_model(client, logger, identifier, version):
         raise ValueError(f'Sample inference test failed with error {test_output["error"]}. Check model container and try again.')
 
     sample_input = {'input': {'accessKeyID': '<accessKeyID>',
-                              'region': '<region>',
-                              'secretAccessKey': '<secretAccessKey>',
-                              'sources': {'0001': {'input': {'bucket': '<bucket>',
-                                                             'key': '/path/to/s3/input'}}},
-                              'type': 'aws-s3'},
-                    'model': {'identifier': identifier, 'version':version}
+                              'region': '<region>',
+                              'secretAccessKey': '<secretAccessKey>',
+                              'sources': {'0001': {'input': {'bucket': '<bucket>',
+                                                             'key': '/path/to/s3/input'}}},
+                              'type': 'aws-s3'},
+                    'model': {'identifier': identifier, 'version': version}
                     }
-
+
     formatted_sample_output = {'jobIdentifier': '<uuid>',
-                               'total': '<number of inputs>',
-                               'completed': '<total number of completed inputs>',
-                               'failed': '<number of failed inputs>',
-                               'finished': '<true or false>',
-                               'submittedByKey': '<api key>',
-                               'results': {'<input-id>': {'model': None,
-                                                          'userIdentifier': None,
-                                                          'status': test_output['status'],
-                                                          'engine': test_output['engine'],
-                                                          'error': test_output['error'],
-                                                          'startTime': test_output['startTime'],
-                                                          'endTime': test_output['endTime'],
-                                                          'updateTime': test_output['updateTime'],
-                                                          'inputSize': test_output['inputSize'],
-                                                          'accessKey': None,
-                                                          'teamIdentifier': None,
-                                                          'accountIdentifier': None,
-                                                          'timeMeters': None,
-                                                          'datasourceCompletedTime': None,
-                                                          'elapsedTime': test_output['elapsedTime'],
-                                                          'results.json': test_output['results.json']}
-                                           }
-                               }
+                               'total': '<number of inputs>',
+                               'completed': '<total number of completed inputs>',
+                               'failed': '<number of failed inputs>',
+                               'finished': '<true or false>',
+                               'submittedByKey': '<api key>',
+                               'results': {'<input-id>': {'model': None,
+                                                          'userIdentifier': None,
+                                                          'status': test_output['status'],
+                                                          'engine': test_output['engine'],
+                                                          'error': test_output['error'],
+                                                          'startTime': test_output['startTime'],
+                                                          'endTime': test_output['endTime'],
+                                                          'updateTime': test_output['updateTime'],
+                                                          'inputSize': test_output['inputSize'],
+                                                          'accessKey': None,
+                                                          'teamIdentifier': None,
+                                                          'accountIdentifier': None,
+                                                          'timeMeters': None,
+                                                          'datasourceCompletedTime': None,
+                                                          'elapsedTime': test_output['elapsedTime'],
+                                                          'results.json': test_output['results.json']}
+                                           }
+                               }
 
     sample_input_res = client.http.put(f"/models/{identifier}/versions/{version}/sample-input", json_data=sample_input)
     sample_output_res = client.http.put(f"/models/{identifier}/versions/{version}/sample-output", json_data=formatted_sample_output)
 
-    logger.info(f'Inference test took [{1000*(time.time()-start)} ms]')
+    logger.info(f'Inference test took [{1000 * (time.time() - start)} ms]')
 
-def deploy_model(client, logger, identifier, version):
 
+def deploy_model(client, logger, identifier, version):
     start = time.time()
     status = {'status': 'active'}
 
     res = client.http.patch(f"/models/{identifier}/versions/{version}", status)
 
-    logger.info(f'Model Deployment took [{1000*(time.time()-start)} ms]')
-
+    logger.info(f'Model Deployment took [{1000 * (time.time() - start)} ms]')

modzy/edge/__init__.py

+1
@@ -0,0 +1 @@
+from modzy.edge.proto.inferences.api.v1.inferences_pb2 import InputSource
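This one-line module re-exports the generated protobuf message under a short path, which is what the README examples above import. Equivalent usage:

```python
# Before this commit, the full generated-protobuf path was required:
# from modzy.edge.proto.inferences.api.v1.inferences_pb2 import InputSource

# The convenience import added here:
from modzy.edge import InputSource

input_object = InputSource(key="input.txt", data=b"this is awesome")
```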
