mlcommons · zixianwang2022 · Nov 20, 2024
@@ -1,3 +1,3 @@
 | Model               | Scenario   | Accuracy             |   Throughput | Latency (in ms)   |
 |---------------------|------------|----------------------|--------------|-------------------|
-| stable-diffusion-xl | offline    | (15.22477, 84.24318) |        0.848 | -                 |
+| stable-diffusion-xl | offline    | (15.22522, 84.25505) |        1.578 | -                 |
@@ -33,7 +33,8 @@ cm run script \
 	--adr.mlperf-implementation.tags=_branch.multinode-test,_repo.https://github.com/zixianwang2022/mlperf-scc24 \
 	--adr.mlperf-implementation.version=custom \
 	--env.CM_GET_PLATFORM_DETAILS=no \
-	--target_qps=1.8
+	--target_qps=1.8 \
+	--rerun
 ```
 *Note that if you want to use the [latest automation recipes](https://docs.mlcommons.org/inference) for MLPerf (CM scripts),
  you should simply reload mlcommons@cm4mlops without checkout and clean CM cache as follows:*
@@ -52,8 +53,8 @@ Platform: aqua-reference-rocm-pytorch-v2.6.0.dev20241118-scc24-main
 Model Precision: fp32
 
 ### Accuracy Results 
-`CLIP_SCORE`: `15.22477`, Required accuracy for closed division `>= 31.68632` and `<= 31.81332`
-`FID_SCORE`: `84.24318`, Required accuracy for closed division `>= 23.01086` and `<= 23.95008`
+`CLIP_SCORE`: `15.22522`, Required accuracy for closed division `>= 31.68632` and `<= 31.81332`
+`FID_SCORE`: `84.25505`, Required accuracy for closed division `>= 23.01086` and `<= 23.95008`
 
 ### Performance Results 
-`Samples per second`: `0.847525`
+`Samples per second`: `1.5779`
@@ -1,12 +1,13 @@
-INFO:main:Namespace(sut_server=['http://10.0.0.14:8008', 'http://10.0.0.12:8008'], dataset='coco-1024', dataset_path='/root/CM/repos/local/cache/61dd835801c542a3/install', profile='stable-diffusion-xl-pytorch', scenario='Offline', max_batchsize=1, threads=1, accuracy=True, find_peak_performance=False, backend='pytorch', model_name='stable-diffusion-xl', output='/root/CM/repos/local/cache/d549713c4a534705/test_results/aqua-reference-rocm-pytorch-v2.6.0.dev20241118-scc24-main/stable-diffusion-xl/offline/accuracy', qps=None, model_path='/root/CM/repos/local/cache/c4b6bbbebe504f28/stable_diffusion_fp16', dtype='fp16', device='cuda', latent_framework='torch', mlperf_conf='mlperf.conf', user_conf='/root/CM/repos/mlcommons@cm4mlops/script/generate-mlperf-inference-user-conf/tmp/6626c9658bff4d2291e3121038a4cfca.conf', audit_conf='audit.config', ids_path='/root/CM/repos/local/cache/61dd835801c542a3/install/sample_ids.txt', time=None, count=10, debug=False, performance_sample_count=5000, max_latency=None, samples_per_query=8)
+INFO:main:Namespace(sut_server=['http://10.0.0.14:8008', 'http://10.0.0.12:8008'], dataset='coco-1024', dataset_path='/root/CM/repos/local/cache/61dd835801c542a3/install', profile='stable-diffusion-xl-pytorch', scenario='Offline', max_batchsize=1, threads=1, accuracy=True, find_peak_performance=False, backend='pytorch', model_name='stable-diffusion-xl', output='/root/CM/repos/local/cache/d549713c4a534705/test_results/aqua-reference-rocm-pytorch-v2.6.0.dev20241118-scc24-main/stable-diffusion-xl/offline/accuracy', qps=None, model_path='/root/CM/repos/local/cache/c4b6bbbebe504f28/stable_diffusion_fp16', dtype='fp16', device='cuda', latent_framework='torch', mlperf_conf='mlperf.conf', user_conf='/root/CM/repos/mlcommons@cm4mlops/script/generate-mlperf-inference-user-conf/tmp/1608e150c4d94edb9537a0fe9198425f.conf', audit_conf='audit.config', ids_path='/root/CM/repos/local/cache/61dd835801c542a3/install/sample_ids.txt', time=None, count=10, debug=False, performance_sample_count=5000, max_latency=None, samples_per_query=8)
 WARNING:backend-pytorch:Model path not provided, running with default hugging face weights
 This may not be valid for official submissions
 Keyword arguments {'safety_checker': None} are not expected by StableDiffusionXLPipeline and will be ignored.
 Loading pipeline components...:   0%|          | 0/7 [00:00<?, ?it/s]Using the `SDPA` attention implementation on multi-gpu setup with ROCM may lead to performance issues due to the FA backend. Disabling it to use alternative backends.
-Loading pipeline components...:  57%|█████▋    | 4/7 [00:00<00:00, 12.44it/s]Loading pipeline components...:  86%|████████▌ | 6/7 [00:00<00:00,  7.56it/s]Loading pipeline components...: 100%|██████████| 7/7 [00:00<00:00,  9.15it/s]
-RETURNED from requests.post on predict at time 	 1731969667.0400865
+Loading pipeline components...:  57%|█████▋    | 4/7 [00:00<00:00, 16.65it/s]Loading pipeline components...:  86%|████████▌ | 6/7 [00:00<00:00, 11.90it/s]Loading pipeline components...: 100%|██████████| 7/7 [00:00<00:00, 10.73it/s]
+:::MLLOG {"key": "error_invalid_config", "value": "Multiple conf files are used. This is not valid for official submission.", "time_ms": 1732142436869.237178, "namespace": "mlperf::logging", "event_type": "POINT_IN_TIME", "metadata": {"is_error": true, "is_warning": false, "file": "test_settings_internal.cc", "line_no": 539, "pid": 30316, "tid": 30316}}
+RETURNED from requests.post on predict at time 	 1732142751.4806168
 BEFORE lg.QuerySamplesComplete(response)
 AFTER lg.QuerySamplesComplete(response)
-RETURNED from requests.post on predict at time 	 1731969689.698808
+RETURNED from requests.post on predict at time 	 1732142752.913671
 BEFORE lg.QuerySamplesComplete(response)
 AFTER lg.QuerySamplesComplete(response)