Commit b6dd747

segment-anything example (#327)
* segment-anything demo
* lint
* lint
* handle exceptions on load_model
* handle window size better
* fix typo
1 parent 07aaa43 commit b6dd747

8 files changed: +495 -1 lines changed

js/README.md

+3-1
@@ -44,4 +44,6 @@ Click links for README of each examples.
 
 ### Simple Applications
 
-* [OpenAI Whisper](ort-whisper) - demonstrates how to run [whisper tiny.en](https://github.com/openai/whisper) in your browser using [onnxruntime-web](https://github.com/microsoft/onnxruntime) and the browser's audio interfaces.
+* [OpenAI Whisper](ort-whisper) - demonstrates how to run [whisper tiny.en](https://github.com/openai/whisper) in your browser using [onnxruntime-web](https://github.com/microsoft/onnxruntime) and the browser's audio interfaces.
+
+* [Facebook Segment-Anything](segment-anything) - demonstrates how to run [segment-anything](https://github.com/facebookresearch/segment-anything) in your browser using [onnxruntime-web](https://github.com/microsoft/onnxruntime/js) with webgpu.

js/segment-anything/.eslintrc.js

+17
@@ -0,0 +1,17 @@
+module.exports = {
+    root: true,
+    ignorePatterns: ["node_modules/", "dist/", "models/"],
+    "env": {
+        "browser": true,
+        "commonjs": true,
+        "es2021": true
+    },
+    "extends": "eslint:recommended",
+    "overrides": [
+    ],
+    "parserOptions": {
+        "ecmaVersion": "latest"
+    },
+    "rules": {
+    }
+}

js/segment-anything/EgyptianCat.png

346 KB

js/segment-anything/README.md

+63
@@ -0,0 +1,63 @@
+# Run Segment-Anything in your browser using WebGPU and onnxruntime-web
+
+This example demonstrates how to run [Segment-Anything](https://github.com/facebookresearch/segment-anything) in your
+browser using [onnxruntime-web](https://github.com/microsoft/onnxruntime) and WebGPU.
+
+Segment-Anything is an encoder/decoder model. The encoder creates embeddings, and the decoder uses those embeddings to create the segmentation mask.
+
+The decoder can run in onnxruntime-web using WebAssembly with a latency of ~200 ms.
+
+The encoder is much more compute intensive and takes ~45 s using WebAssembly, which is not practical.
+Using WebGPU we can speed up the encoder ~50 times, which makes it feasible to run inside the browser, even on an integrated GPU.
+
+## Usage
+
+### Installation
+First, install the required dependencies by running the following command in your terminal:
+```sh
+npm install
+```
+
+### Build the code
+Next, bundle the code using webpack by running:
+```sh
+npm run build
+```
+This generates the bundle file `./dist/bundle.min.js`.
+
+### Create an ONNX Model
+
+We use [samexporter](https://github.com/vietanhdev/samexporter) to export the encoder and decoder to ONNX.
+Install samexporter:
+```sh
+pip install git+https://github.com/vietanhdev/samexporter
+```
+Download the PyTorch model from [Segment-Anything](https://github.com/facebookresearch/segment-anything). We use the smallest flavor (vit_b).
+```sh
+curl -o models/sam_vit_b_01ec64.pth https://dl.fbaipublicfiles.com/segment_anything/sam_vit_b_01ec64.pth
+```
+Export both the encoder and the decoder to ONNX:
+```sh
+python -m samexporter.export_encoder --checkpoint models/sam_vit_b_01ec64.pth \
+    --output models/sam_vit_b_01ec64.encoder.onnx \
+    --model-type vit_b
+
+python -m samexporter.export_decoder --checkpoint models/sam_vit_b_01ec64.pth \
+    --output models/sam_vit_b_01ec64.decoder.onnx \
+    --model-type vit_b \
+    --return-single-mask
+```
+### Start a web server
+Use the npm package `light-server` to serve the current folder at http://localhost:8888/.
+To start the server, run:
+```sh
+npx light-server -s . -p 8888
+```
+
+### Point your browser at the web server
+Once the web server is running, open your browser and navigate to http://localhost:8888/.
+You should now be able to run Segment-Anything in your browser.
+
+## TODO
+* add support for fp16
+* add support for MobileSam
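
The README above describes a slow encoder (~45 s with WebAssembly) and a fast decoder (~200 ms). One common way to exploit that split is to run the encoder once per image and reuse the embeddings for every decoder call. The sketch below illustrates only that idea. `createSegmenter`, `runEncoder`, `runDecoder`, and the `{ x, y }` point prompt are hypothetical names introduced for illustration; they are not taken from this commit, and the stand-in callbacks do no real inference.

```javascript
/**
 * Wrap an expensive encoder and a cheap decoder so that the encoder runs
 * once per image and the decoder runs once per prompt.
 * `runEncoder` and `runDecoder` are hypothetical async callbacks; in a real
 * page they would wrap the exported encoder and decoder models.
 */
function createSegmenter(runEncoder, runDecoder) {
  let cachedEmbedding = null;

  return {
    // Expensive step: compute and cache the embeddings for a new image.
    async setImage(image) {
      cachedEmbedding = await runEncoder(image);
    },
    // Cheap step: reuse the cached embeddings for each prompt.
    async segment(point) {
      if (cachedEmbedding === null) {
        throw new Error("call setImage() before segment()");
      }
      return runDecoder(cachedEmbedding, point);
    },
  };
}

// Usage with stand-in callbacks that only count how often they are called.
async function demo() {
  let encoderCalls = 0;
  let decoderCalls = 0;
  const segmenter = createSegmenter(
    async (image) => { encoderCalls += 1; return `embedding(${image})`; },
    async (embedding, point) => { decoderCalls += 1; return { embedding, point }; }
  );
  await segmenter.setImage("EgyptianCat.png");
  await segmenter.segment({ x: 10, y: 20 });
  await segmenter.segment({ x: 30, y: 40 });
  return { encoderCalls, decoderCalls };
}

demo().then((counts) => console.log(counts)); // { encoderCalls: 1, decoderCalls: 2 }
```

With this shape, a second prompt on the same image costs only one decoder call, which is what makes interactive use plausible given the latencies quoted in the README.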

js/segment-anything/index.html

+88
@@ -0,0 +1,88 @@
+<html>
+
+<head>
+    <meta charset="utf-8">
+    <meta name="viewport" content="width=device-width, initial-scale=1">
+    <link href="https://cdn.jsdelivr.net/npm/bootstrap@5.2.3/dist/css/bootstrap.min.css" rel="stylesheet"
+        integrity="sha384-rbsA2VBKQhggwzxH7pPCaAqO46MgnOM80zW1RWuH61DGLwZJEdK2Kadq2F9CUG65" crossorigin="anonymous">
+    <script src="./dist/bundle.min.js"></script>
+
+    <style>
+        /* Add rounded corners to blocks */
+        .rounded-block {
+            border-radius: 10px;
+            border: 1px solid #ccc;
+            padding: 20px;
+            position: relative;
+        }
+
+        /* Move text inside the border */
+        .rounded-block h4 {
+            position: absolute;
+            top: 0%;
+            left: 50%;
+            transform: translate(-50%, -50%);
+            padding: 5px 10px;
+            background-color: white;
+            font-size: 18px;
+        }
+
+        .higlight {
+            display: inline-block;
+        }
+
+        .higlight:hover {
+            border: 1px solid red;
+        }
+    </style>
+
+</head>
+
+<body>
+    <title>segment anything example</title>
+    <div class="container-fluid">
+        <h2>segment anything example</h2>
+
+        <br>
+
+        <div style="display: none;">
+            <img id="original-image" src="EgyptianCat.png" />
+        </div>
+
+        <div>
+            <div class="row">
+                <div class="col">
+                    <canvas id="img_canvas"></canvas>
+                </div>
+                <div class="col">
+                    <div class="rounded-block" style="margin-top: 40px; max-width: 200px;">
+                        <h4>Latencies</h4>
+                        <div style="margin-top: 10px;">
+                            encoder: <div id="encoder_latency" class="higlight"></div>
+                        </div>
+                        <div style="margin-top: 10px;">
+                            decoder: <div id="decoder_latency" class="higlight"></div>
+                        </div>
+                    </div>
+                    <div style="margin-top: 40px;">
+                        <form>
+                            <div class="form-group ">
+                                <input title="Upload Image" type="file" id="file-in" name="file-in"
+                                    accept=".jpg, .png, .jpeg, .gif, .bmp, .tif, .tiff, image/*">
+                            </div>
+                        </form>
+                    </div>
+                    <div style="margin-top: 30px;">
+                        <div>Other providers:</div>
+                        <a href="index.html?provider=wasm">wasm</a>
+                        <a href="index.html?provider=webgpu">webgpu</a>
+                        <a href="index.html?provider=webnn">webnn</a>
+                    </div>
+
+                </div>
+
+                <p class="text-lg-start">
+                <div id="status" style="font: 1em consolas;"></div>
+                </p>
+            </div>
+</body>
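
The "Other providers" links in `index.html` switch the execution provider through a `provider` query parameter. The bundled script that reads this parameter is not shown in this part of the diff, so the following is only a minimal sketch of how such a parameter could be read and validated. `getProvider`, `KNOWN_PROVIDERS`, and the `"webgpu"` fallback are assumptions made for illustration.

```javascript
// Execution providers offered by the links in index.html.
const KNOWN_PROVIDERS = ["wasm", "webgpu", "webnn"];

/**
 * Read the `provider` query parameter from a page URL.
 * Falls back to `fallback` when the parameter is missing or unknown.
 * NOTE: the default of "webgpu" is an assumption made for this sketch.
 */
function getProvider(href, fallback = "webgpu") {
  const requested = new URL(href).searchParams.get("provider");
  return KNOWN_PROVIDERS.includes(requested) ? requested : fallback;
}

// In the browser this would be called as getProvider(window.location.href).
console.log(getProvider("http://localhost:8888/index.html?provider=wasm")); // wasm
console.log(getProvider("http://localhost:8888/index.html"));               // webgpu
```

Validating against a fixed list keeps an arbitrary query string from being passed on as a provider name.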
