Commit dfa685f

improve the segment-anything example (#385)
* improve the example
* remove some text
* remove provider selection from ui
* point to sd-turbo
* make wasm work on int8
1 parent e09a4a6 · commit dfa685f

7 files changed (+534 -371 lines)

js/README.md (+2)

@@ -47,3 +47,5 @@ Click links for README of each examples.
 * [OpenAI Whisper](ort-whisper) - demonstrates how to run [whisper tiny.en](https://github.com/openai/whisper) in your browser using [onnxruntime-web](https://github.com/microsoft/onnxruntime) and the browser's audio interfaces.

 * [Facebook Segment-Anything](segment-anything) - demonstrates how to run [segment-anything](https://github.com/facebookresearch/segment-anything) in your browser using [onnxruntime-web](https://github.com/microsoft/onnxruntime/js) with webgpu.
+
+* [Stable Diffusion Turbo](sd-turbo) - demonstrates how to run [Stable Diffusion Turbo](https://huggingface.co/stabilityai/sd-turbo) in your browser using [onnxruntime-web](https://github.com/microsoft/onnxruntime/js) with webgpu.

js/segment-anything/README.md (+29 -45)

@@ -1,63 +1,47 @@
-# Run Segment-Anything in your browser using webgpu and onnxruntime-web
+# Segment-Anything: Browser-Based Image Segmentation with WebGPU and ONNX Runtime Web

-This example demonstrates how to run [Segment-Anything](https://github.com/facebookresearch/segment-anything) in your
-browser using [onnxruntime-web](https://github.com/microsoft/onnxruntime) and webgpu.
+This repository contains an example of running [Segment-Anything](https://github.com/facebookresearch/segment-anything), an encoder/decoder model for image segmentation, in a browser using [ONNX Runtime Web](https://github.com/microsoft/onnxruntime) with WebGPU.

-Segment-Anything is a encoder/decoder model. The encoder creates embeddings and using the embeddings the decoder creates the segmentation mask.
+You can try out the live demo [here](https://guschmue.github.io/ort-webgpu/segment-anything/index.html).

-One can run the decoder in onnxruntime-web using WebAssembly with latencies at ~200ms.
+## Model Overview

-The encoder is much more compute intensive and takes ~45sec using WebAssembly what is not practical.
-Using webgpu we can speedup the encoder ~50 times and it becomes visible to run it inside the browser, even on a integrated GPU.
+Segment-Anything creates embeddings for an image using an encoder. These embeddings are then used by the decoder to create and update the segmentation mask. The decoder can run in ONNX Runtime Web using WebAssembly with latencies at ~200ms.

-## Usage
+The encoder is more compute-intensive, taking ~45sec in WebAssembly, which is not practical. However, by using WebGPU, we can speed up the encoder, making it feasible to run it inside the browser, even on an integrated GPU.
+
+## Getting Started
+
+### Prerequisites
+
+Ensure that you have [Node.js](https://nodejs.org/) installed on your machine.

 ### Installation
-First, install the required dependencies by running the following command in your terminal:
+
+1. Install the required dependencies:
+
 ```sh
 npm install
 ```

-### Build the code
-Next, bundle the code using webpack by running:
+### Building the Project
+
+1. Bundle the code using webpack:
+
 ```sh
 npm run build
 ```
-this generates the bundle file `./dist/bundle.min.js`

-### Create an ONNX Model
+This command generates the bundle file `./dist/index.js`.

-We use [samexporter](https://github.com/vietanhdev/samexporter) to export encoder and decoder to onnx.
-Install samexporter:
-```sh
-pip install https://github.com/vietanhdev/samexporter
-```
-Download the pytorch model from [Segment-Anything](https://github.com/facebookresearch/segment-anything). We use the smallest flavor (vit_b).
-```sh
-curl -o models/sam_vit_b_01ec64.pth https://dl.fbaipublicfiles.com/segment_anything/sam_vit_b_01ec64.pth
-```
-Export both encoder and decoder to onnx:
-```sh
-python -m samexporter.export_encoder --checkpoint models/sam_vit_b_01ec64.pth \
-    --output models/sam_vit_b_01ec64.encoder.onnx \
-    --model-type vit_b
-
-python -m samexporter.export_decoder --checkpoint models/sam_vit_b_01ec64.pth \
-    --output models/sam_vit_b_01ec64.decoder.onnx \
-    --model-type vit_b \
-    --return-single-mask
-```
-### Start a web server
-Use NPM package `light-server` to serve the current folder at http://localhost:8888/.
-To start the server, run:
-```sh
-npx light-server -s . -p 8888
-```
+### The ONNX Model

-### Point your browser at the web server
-Once the web server is running, open your browser and navigate to http://localhost:8888/.
-You should now be able to run Segment-Anything in your browser.
+The model used in this project is hosted on [Hugging Face](https://huggingface.co/schmuell/sam-b-fp16). It was created using [samexporter](https://github.com/vietanhdev/samexporter).

-## TODO
-* add support for fp16
-* add support for MobileSam
+### Running the Project
+
+Start a web server to serve the current folder at http://localhost:8888/. To start the server, run:
+
+```sh
+npm run dev
+```
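
The README above explains the encoder/decoder split, but the application code that drives the two models is not part of this excerpt. As a rough sketch only, the snippet below shows how such a two-stage pipeline could be wired up with onnxruntime-web; the model file names and tensor names (`input_image`, `image_embeddings`, `point_coords`, `point_labels`, `masks`) are assumptions in the style of samexporter exports, not values taken from this commit, and the real decoder may expect additional inputs (e.g. `mask_input`, `orig_im_size`).

```js
// Hypothetical sketch only -- not the code shipped in this commit.
// Depending on the onnxruntime-web version, the WebGPU build may need to be
// imported from "onnxruntime-web/webgpu" instead.
import * as ort from "onnxruntime-web";

export async function segment(imageTensor, clickX, clickY) {
  // Encoder: compute-heavy, so prefer WebGPU and fall back to wasm.
  const encoder = await ort.InferenceSession.create("models/sam_b_encoder.onnx", {
    executionProviders: ["webgpu", "wasm"],
  });
  // Decoder: cheap enough (~200ms) to run on WebAssembly.
  const decoder = await ort.InferenceSession.create("models/sam_b_decoder.onnx", {
    executionProviders: ["wasm"],
  });

  // Stage 1: create the image embeddings once per image.
  const encOut = await encoder.run({ input_image: imageTensor });

  // Stage 2: rerun only the decoder for every click point.
  const decOut = await decoder.run({
    image_embeddings: encOut.image_embeddings,
    point_coords: new ort.Tensor("float32", new Float32Array([clickX, clickY]), [1, 1, 2]),
    point_labels: new ort.Tensor("float32", new Float32Array([1]), [1, 1]),
  });
  return decOut.masks; // low-resolution mask to draw onto a canvas
}
```

Splitting the sessions this way mirrors the latency argument in the README: the expensive encoder runs once per image, ideally on WebGPU, while the cheap decoder can be rerun interactively on every click, even under WebAssembly.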

js/segment-anything/index.html (+14 -13)

@@ -3,9 +3,9 @@
 <head>
   <meta charset="utf-8">
   <meta name="viewport" content="width=device-width, initial-scale=1">
-  <link href="https://cdn.jsdelivr.net/npm/bootstrap@5.2.3/dist/css/bootstrap.min.css" rel="stylesheet"
-    integrity="sha384-rbsA2VBKQhggwzxH7pPCaAqO46MgnOM80zW1RWuH61DGLwZJEdK2Kadq2F9CUG65" crossorigin="anonymous">
-  <script src="./dist/bundle.min.js"></script>
+  <link href="https://cdn.jsdelivr.net/npm/bootstrap@5.3.1/dist/css/bootstrap.min.css" rel="stylesheet"
+    integrity="sha384-4bw+/aepP/YC94hEpVNVgiZdgIC5+VKNBQNGCHeKRQN+PtmoHDEXuppvnDJzQIu9" crossorigin="anonymous" />
+  <script type="module" src="dist/index.js"></script>

   <style>
     /* Add rounded corners to blocks */

@@ -23,7 +23,7 @@
       left: 50%;
       transform: translate(-50%, -50%);
       padding: 5px 10px;
-      background-color: white;
+      background-color: #212529;
       font-size: 18px;
     }

@@ -38,7 +38,7 @@

 </head>

-<body>
+<body data-bs-theme="dark">
   <title>segment anything example</title>
   <div class="container-fluid">
     <h2>segment anything example</h2>

@@ -71,15 +71,16 @@ <h4>Latencies</h4>
           accept=".jpg, .png, .jpeg, .gif, .bmp, .tif, .tiff|image/*">
       </div>
     </form>
+    <div class="form-group ">
+      <button id="cut-button" type="button" class="btn btn-primary">Cut</button>
+      <button id="clear-button" type="button" class="btn btn-primary">Clear</button>
+    </div>
+    <div style="margin-top: 30px;">
+      <div>Other providers:</div>
+      <a href="index.html?provider=wasm&model=sam_b_int8">wasm</a>
+      <a href="index.html?provider=webgpu&model=sam_b">webgpu</a>
+    </div>
   </div>
-  <div style="margin-top: 30px;">
-    <div>Other providers:</div>
-    <a href="index.html?provider=wasm">wasm</a>
-    <a href="index.html?provider=webgpu">webgpu</a>
-    <a href="index.html?provider=webnn">webnn</a>
-  </div>
-
-  </div>

   <p class="text-lg-start">
     <div id="status" style="font: 1em consolas;"></div>
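
The commit replaces the in-page provider selector with plain links that pass `provider` and `model` as query parameters (for example `?provider=wasm&model=sam_b_int8`, matching the "make wasm work on int8" note in the commit message). The script that reads these parameters is not shown in this excerpt, so the following is only a plausible sketch of how they might be consumed; the model URL pattern and file names are hypothetical, although the README points at the `schmuell/sam-b-fp16` Hugging Face repository.

```js
// Illustrative sketch only -- the commit's dist/index.js is not part of this diff.
// Runs at the top level of a <script type="module">, so top-level await is allowed.
import * as ort from "onnxruntime-web";

const params = new URLSearchParams(window.location.search);
const provider = params.get("provider") || "webgpu"; // "wasm" is paired with the int8 model
const model = params.get("model") || "sam_b";

// Hypothetical model location and file naming.
const modelUrl = `https://huggingface.co/schmuell/sam-b-fp16/resolve/main/${model}.onnx`;

const session = await ort.InferenceSession.create(modelUrl, {
  executionProviders: [provider],
});
console.log(`loaded ${model} with the ${provider} execution provider`, session.inputNames);
```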
