
Commit f196adf

Merge pull request #70 from woonyee28/mlperf-inference-results-scc24

Results on system 4xH100

Parents: a50a424 + 9505f45

File tree

22 files changed: +1235 -0 lines
Lines changed: 1 addition & 0 deletions

TBD
Lines changed: 3 additions & 0 deletions

| Model               | Scenario | Accuracy (CLIP_SCORE, FID_SCORE) | Throughput (samples/s) | Latency (ms) |
|---------------------|----------|----------------------------------|------------------------|--------------|
| stable-diffusion-xl | offline  | (15.70418, 233.56896)            | 2.667                  | -            |
Lines changed: 7 additions & 0 deletions

{
  "starting_weights_filename": "https://github.com/mlcommons/cm4mlops/blob/main/script/get-ml-model-stable-diffusion/_cm.json#L174",
  "retraining": "no",
  "input_data_types": "int32",
  "weight_data_types": "int8",
  "weight_transformations": "quantization, affine fusion"
}
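
Measurement metadata files like this one are plain JSON, so they can be sanity-checked locally before submission. A minimal sketch, assuming the file above is saved as `model-info.json` (a hypothetical name; the actual filename in the measurements directory may differ):

```bash
# Validate the measurements JSON and pretty-print it.
# "model-info.json" is a placeholder; substitute the real filename.
python3 -m json.tool model-info.json

# Optionally pull out the reported weight precision (requires jq).
jq -r '.weight_data_types' model-info.json   # expected: int8
```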
Lines changed: 59 additions & 0 deletions

This experiment was generated using the [MLCommons Collective Mind automation framework (CM)](https://github.com/mlcommons/cm4mlops).

*Check the [CM MLPerf docs](https://docs.mlcommons.org/inference) for more details.*

## Host platform

* OS version: Linux-6.5.0-27-generic-x86_64-with-glibc2.29
* CPU version: x86_64
* Python version: 3.8.10 (default, Sep 11 2024, 16:02:53) [GCC 9.4.0]
* MLCommons CM version: 2.4.0

## CM Run Command

See the [CM installation guide](https://docs.mlcommons.org/inference/install/).

```bash
pip install -U cmind

cm rm cache -f

cm pull repo mlcommons@cm4mlops --checkout=114709c8f6dbefa9ce5f8a599d55b349b5464bca

cm run script \
    --tags=run-mlperf,inference,_r4.1-dev,_short,_scc24-main \
    --model=sdxl \
    --implementation=nvidia \
    --framework=tensorrt \
    --category=datacenter \
    --scenario=Offline \
    --execution_mode=test \
    --device=cuda \
    --quiet \
    --clean \
    --batch-size=4 \
    --target_qps=40
```
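
For tuning on the 4xH100 system, the same command can simply be re-run with different values for the flags already shown above (for example `--batch-size` and `--target_qps`). A hedged sketch of such a sweep, reusing only flags from the command above:

```bash
# Sketch: try a few batch sizes with otherwise identical test-run settings.
# Only flags already used in the command above are varied here.
for bs in 2 4 8; do
  cm run script \
      --tags=run-mlperf,inference,_r4.1-dev,_short,_scc24-main \
      --model=sdxl \
      --implementation=nvidia \
      --framework=tensorrt \
      --category=datacenter \
      --scenario=Offline \
      --execution_mode=test \
      --device=cuda \
      --quiet \
      --clean \
      --batch-size=${bs} \
      --target_qps=40
done
```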

*Note that if you want to use the [latest automation recipes](https://docs.mlcommons.org/inference) for MLPerf (CM scripts), you should simply reload mlcommons@cm4mlops without a checkout and clean the CM cache as follows:*

```bash
cm rm repo mlcommons@cm4mlops
cm pull repo mlcommons@cm4mlops
cm rm cache -f
```

## Results

Platform: 8297ae0eca20-nvidia-gpu-TensorRT-scc24-main

Model Precision: int8

### Accuracy Results

`CLIP_SCORE`: `15.70418` (required accuracy for the closed division: `>= 31.68632` and `<= 31.81332`)
`FID_SCORE`: `233.56896` (required accuracy for the closed division: `>= 23.01086` and `<= 23.95008`)

### Performance Results

`Samples per second`: `2.66695`
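
The throughput reported here comes from the LoadGen summary of the Offline run. A minimal sketch for pulling it out of the logs, assuming the run produced the standard `mlperf_log_summary.txt` (the exact location under the CM results tree may differ):

```bash
# Locate the LoadGen summary and print the Offline throughput line.
# The search root "." is an assumption; adjust to where CM stored this run.
find . -name mlperf_log_summary.txt -exec grep -H "Samples per second" {} \;
```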

open/NTUHPC/measurements/8297ae0eca20-nvidia-gpu-TensorRT-scc24-main/stable-diffusion-xl/offline/accuracy_console.out

Whitespace-only changes.
