Commit 8149ec6

Merge pull request #83 from zixianwang2022/mlperf-inference-results-scc24
UCSD Optimized Results on system MI210
2 parents: 6af1d5f + 516097c

31 files changed: +1104 -0 lines
@@ -0,0 +1,3 @@
| Model               | Scenario   | Accuracy             | Throughput   | Latency (in ms)   |
|---------------------|------------|----------------------|--------------|-------------------|
| stable-diffusion-xl | offline    | (15.22477, 84.24318) | 0.848        | -                 |
@@ -0,0 +1,59 @@
This experiment is generated using the [MLCommons Collective Mind automation framework (CM)](https://github.com/mlcommons/cm4mlops).

*Check the [CM MLPerf docs](https://docs.mlcommons.org/inference) for more details.*

## Host platform

* OS version: Linux-5.14.0-427.42.1.el9_4.x86_64-x86_64-with-glibc2.35
* CPU version: x86_64
* Python version: 3.10.15 (main, Oct 3 2024, 07:27:34) [GCC 11.2.0]
* MLCommons CM version: 3.4.1

## CM Run Command

See the [CM installation guide](https://docs.mlcommons.org/inference/install/).

```bash
pip install -U cmind

cm rm cache -f

cm pull repo mlcommons@cm4mlops --checkout=b32ded2a4c3039ad16dadc734bee03dd1a97f228

cm run script \
  --tags=run-mlperf,inference,_r4.1-dev,_scc24-main \
  --model=sdxl \
  --framework=pytorch \
  --category=datacenter \
  --scenario=Offline \
  --execution_mode=test \
  --device=rocm \
  --quiet \
  --precision=float16 \
  --adr.mlperf-implementation.tags=_branch.multinode-test,_repo.https://github.com/zixianwang2022/mlperf-scc24 \
  --adr.mlperf-implementation.version=custom \
  --env.CM_GET_PLATFORM_DETAILS=no \
  --target_qps=1.8
```

*Note that if you want to use the [latest automation recipes](https://docs.mlcommons.org/inference) for MLPerf (CM scripts), you should simply reload mlcommons@cm4mlops without the checkout and clean the CM cache as follows:*

```bash
cm rm repo mlcommons@cm4mlops
cm pull repo mlcommons@cm4mlops
cm rm cache -f
```

## Results

Platform: aqua-reference-rocm-pytorch-v2.6.0.dev20241118-scc24-main

Model Precision: fp32

### Accuracy Results
`CLIP_SCORE`: `15.22477`, Required accuracy for closed division `>= 31.68632` and `<= 31.81332`
`FID_SCORE`: `84.24318`, Required accuracy for closed division `>= 23.01086` and `<= 23.95008`

### Performance Results
`Samples per second`: `0.847525`
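The accuracy numbers above can be checked mechanically against the closed-division bounds quoted in this report. A minimal sketch, not part of the CM tooling, with all constants copied from the results above:

```python
# Scores and closed-division bounds copied from the results above.
CLIP_SCORE = 15.22477
FID_SCORE = 84.24318

CLIP_BOUNDS = (31.68632, 31.81332)
FID_BOUNDS = (23.01086, 23.95008)

def within(score, bounds):
    """True if score falls inside the inclusive [low, high] range."""
    low, high = bounds
    return low <= score <= high

clip_ok = within(CLIP_SCORE, CLIP_BOUNDS)
fid_ok = within(FID_SCORE, FID_BOUNDS)
print(f"CLIP within closed-division bounds: {clip_ok}")
print(f"FID within closed-division bounds: {fid_ok}")
```

With these numbers both checks come out false, consistent with this submission living under the `open/UCSD/...` path rather than the closed division.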

Diff for: open/UCSD/measurements/aqua-reference-rocm-pytorch-v2.6.0.dev20241118-scc24-main/stable-diffusion-xl/offline/accuracy_console.out

+12
@@ -0,0 +1,12 @@
INFO:main:Namespace(sut_server=['http://10.0.0.14:8008', 'http://10.0.0.12:8008'], dataset='coco-1024', dataset_path='/root/CM/repos/local/cache/61dd835801c542a3/install', profile='stable-diffusion-xl-pytorch', scenario='Offline', max_batchsize=1, threads=1, accuracy=True, find_peak_performance=False, backend='pytorch', model_name='stable-diffusion-xl', output='/root/CM/repos/local/cache/d549713c4a534705/test_results/aqua-reference-rocm-pytorch-v2.6.0.dev20241118-scc24-main/stable-diffusion-xl/offline/accuracy', qps=None, model_path='/root/CM/repos/local/cache/c4b6bbbebe504f28/stable_diffusion_fp16', dtype='fp16', device='cuda', latent_framework='torch', mlperf_conf='mlperf.conf', user_conf='/root/CM/repos/mlcommons@cm4mlops/script/generate-mlperf-inference-user-conf/tmp/6626c9658bff4d2291e3121038a4cfca.conf', audit_conf='audit.config', ids_path='/root/CM/repos/local/cache/61dd835801c542a3/install/sample_ids.txt', time=None, count=10, debug=False, performance_sample_count=5000, max_latency=None, samples_per_query=8)
WARNING:backend-pytorch:Model path not provided, running with default hugging face weights
This may not be valid for official submissions
Keyword arguments {'safety_checker': None} are not expected by StableDiffusionXLPipeline and will be ignored.
Loading pipeline components...: 0%| | 0/7 [00:00<?, ?it/s]Using the `SDPA` attention implementation on multi-gpu setup with ROCM may lead to performance issues due to the FA backend. Disabling it to use alternative backends.
Loading pipeline components...: 57%|█████▋ | 4/7 [00:00<00:00, 12.44it/s]Loading pipeline components...: 86%|████████▌ | 6/7 [00:00<00:00, 7.56it/s]Loading pipeline components...: 100%|██████████| 7/7 [00:00<00:00, 9.15it/s]
RETURNED from requests.post on predict at time 1731969667.0400865
BEFORE lg.QuerySamplesComplete(response)
AFTER lg.QuerySamplesComplete(response)
RETURNED from requests.post on predict at time 1731969689.698808
BEFORE lg.QuerySamplesComplete(response)
AFTER lg.QuerySamplesComplete(response)
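The `sut_server` list in the log above shows queries being fanned out over HTTP to two remote SUT endpoints. A hypothetical sketch of one way a client could rotate across them (round-robin shown purely for illustration; the actual scheduling in the zixianwang2022/mlperf-scc24 branch may differ):

```python
from itertools import cycle

# Endpoints copied from the Namespace(...) log line above.
SUT_SERVERS = ["http://10.0.0.14:8008", "http://10.0.0.12:8008"]

def assign_round_robin(num_queries, servers):
    """Assign each query index to a server in round-robin order."""
    rr = cycle(servers)
    return [next(rr) for _ in range(num_queries)]

# With two servers, each one receives every other query.
assignments = assign_round_robin(4, SUT_SERVERS)
```

Each assigned server would then receive a `requests.post` to its predict endpoint, matching the `RETURNED from requests.post on predict` lines in the log.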
@@ -0,0 +1,7 @@
{
  "starting_weights_filename": "https://github.com/mlcommons/inference/tree/master/text_to_image#download-model",
  "retraining": "no",
  "input_data_types": "fp32",
  "weight_data_types": "fp32",
  "weight_transformations": "no"
}
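The model metadata above is plain JSON, so it can be sanity-checked with the standard library alone. A minimal sketch; the expected key set is taken from the fragment above, and the file name is not assumed:

```python
import json

# Metadata copied verbatim from the diff above.
metadata = json.loads("""
{
  "starting_weights_filename": "https://github.com/mlcommons/inference/tree/master/text_to_image#download-model",
  "retraining": "no",
  "input_data_types": "fp32",
  "weight_data_types": "fp32",
  "weight_transformations": "no"
}
""")

# Verify exactly the keys declared in the fragment are present.
EXPECTED_KEYS = {
    "starting_weights_filename", "retraining", "input_data_types",
    "weight_data_types", "weight_transformations",
}
assert set(metadata) == EXPECTED_KEYS
```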
