Releases · mlcommons/inference
Inference v5.1.1
What's Changed
- Fix small llm readme by @pgmpablo157321 in #2220
- [DLRMv2] Updating DLRMv2 dataset size by @keithachorn-intel in #2222
- Remove incorrect automation commands by @anandhu-eng in #2218
- Fix Typo in Interactive Latencies by @mrmhodak in #2147
- Host speech2text artifacts in MLC storage bucket by @pgmpablo157321 in #2223
- Add missing interactive configurations by @pgmpablo157321 in #2224
- Update compliance table by @pgmpablo157321 in #2243
- Add deepseek dataset sources by @pgmpablo157321 in #2242
- Adding mlperf.conf setting for Whisper by @keithachorn-intel in #2238
- Fix Docs by @arjunsuresh in #2229
- Incorrect Regex for RougeLSum by @hvagadia in #2230
- update eval_accuracy.py and deepseek thresholds by @viraatc in #2233
- Add llama3.1-8b-edge as a separated benchmark by @pgmpablo157321 in #2231
- [Whisper] accuracy threshold by @wu6u3tw in #2259
- [Whisper] fix regex part by @wu6u3tw in #2260
- Update download path for llama3.1_8b dataset by @pgmpablo157321 in #2261
- Add FAQ addressing Whisper input padding. by @keithachorn-intel in #2255
- Update version generate_final_report.py by @pgmpablo157321 in #2269
- Add interactive scenario in the TEST06, bump loadgen version to 5.1 by @nvzhihanj in #2272
- Pinning vllm for speech-to-text reference by @keithachorn-intel in #2273
- Fix SingleStream llama3.1-8b typo by @pgmpablo157321 in #2274
- Update download path for DeepSeek-R1 Dataset by @pgmpablo157321 in #2275
- Update documentation by @anandhu-eng in #2279
- Address issue that logger.info not captured by stdout; remove redundant logging by @nvzhihanj in #2278
- Bugfix: Remove TEST01 for interactive scenario; add TEST06 for them by @nvzhihanj in #2281
- Fix list of workloads (LLMs) not requiring compliance TEST01 by @psyhtest in #2283
- Bugfix: Fix ds-r1 acc checker output format not captured by submission checker by @nvzhihanj in #2285
- Fixing 5.1 Submission Date by @mrmhodak in #2288
- fix: update ds-r1 truncate max-output-len to 20k (was 32k) by @viraatc in #2290
- Fix llama3.1-8b edge metrics and datasets by @pgmpablo157321 in #2300
- Add interactive scenario to final report to llama3.1 models by @pgmpablo157321 in #2299
- Allow more flexible datatypes in measurements file by @pgmpablo157321 in #2298
- Update evaluation.py by @taran2210 in #2303
- Only require Server or Interactive for closed by @pgmpablo157321 in #2304
- [Whisper] Adding n_token return for compliance fix by @keithachorn-intel in #2305
- Fix checking power directory by @anandhu-eng in #2306
- Only check for token latency requirements for server scenario by @pgmpablo157321 in #2313
- Use server SUT for SingleStream by @pgmpablo157321 in #2314
- Update the default value for repository arg by @anandhu-eng in #2317
- Update preprocess_submission.py | Skip inferring offline scenario if … by @arjunsuresh in #2316
- Fix: add llama3.1-8b-edge to generate_final_report by @pgmpablo157321 in #2319
- Allow lowercase 'interactive' as scenario name by @psyhtest in #2315
- Use sample latency as the metric for llama3.1_8b_edge SingleStream by @pgmpablo157321 in #2324
- Remove rclone references and update download instructions for DeepSeek-R1, Llama 3.1 8b, and Whisper by @anivar in #2289
- Hide long time untested implementations from docs by @anandhu-eng in #2328
- Initial draft for SCC 25 documentation by @anandhu-eng in #2331
- fix for fstring by @anandhu-eng in #2332
- Update automation run commands - v5.1_dev by @anandhu-eng in #2333
- Fixes for docs by @anandhu-eng in #2334
- Update submission_checker.py | Fixes #2325 by @arjunsuresh in #2326
- Fixes for scc doc by @anandhu-eng in #2339
- Update link to AMD readme for SCC by @anandhu-eng in #2340
- Add additional information for dataset and model downloads by @anandhu-eng in #2343
- Automation Docs: Provide correct syntax for NVIDIA batch size by @anandhu-eng in #2335
- Fix lookup of required accuracy delta by @psyhtest in #2337
- Llama3.1-405b: Add commands for model/dataset download using R2-downl… by @anandhu-eng in #2351
- Update commands to include R2-downloader for RGAT model download by @anandhu-eng in #2350
- Update minimum disk space required for PointPainting by @anandhu-eng in #2353
- Automation Docs: Add information about launching docker in privileged… by @anandhu-eng in #2352
- Remove Nvidia folder from compliance tree by @pgmpablo157321 in #2354
- Provide help - insufficient max locked memory error for Nvidia runs by @anandhu-eng in #2355
- fix: correct top-p and min-output-len for llama3.1-405b reference implementation by @viraatc in #2349
- Fix minor typos in reference_mlperf_perf.sh and reference_mlperf_accuracy.sh by @naveenmiriyaluredhat in #2327
- [LoadGen] Time to Output Token -> Time per Output Token by @wangshangsam in #2360
- [Whisper] Add labels' in the whisper output by @wu6u3tw in #2252
- Update gptj model download command to support R2 downloader by @anandhu-eng in #2368
- Update MLCFlow model and dataset download commands to support R2 downloader by @anandhu-eng in #2369
- Discard duplicate information about external model download by @anandhu-eng in #2370
- MIXTRAL - Update MLCFlow dataset and model download commands to suppo… by @anandhu-eng in #2371
- Update DLRM v2 assets download commands to support R2 downloader by @anandhu-eng in #2372
- SDXL: Update MLC commands to support model download through R2 by @anandhu-eng in #2373
- Update inference docs page: migration to R2 by @anandhu-eng in #2374
- Update MLC commands to support downloads through R2 by @anandhu-eng in #2367
- Replaced shell commands with Python for Windows compliance script compatibility by @sujik18 in #2344
New Contributors
- @hvagadia made their first contribution in #2230
- @wu6u3tw made their first contribution in #2259
- @taran2210 made their first contribution in #2303
- @anivar made their first contribution in #2289
- @naveenmiriyaluredhat made their first contribution in #2327
- @wangshangsam made their first contribution in #2360
- @sujik18 made their first contribution in #2344
Full Changelog: v5.1...v5.1.1
Inference v5.1
What's Changed
- Mixtral fix: match reference with standalone script by @pgmpablo157321 in #2054
- Add automotive submission checker by @pgmpablo157321 in #2051
- Edit mlperf.conf for pointpainting - automotive by @anandhu-eng in #2052
- Automotive benchmark table by @pgmpablo157321 in #2050
- Update backend_pytorch_native.py | Fixes #2056 by @arjunsuresh in #2057
- Update truncate_accuracy_log.py | Remove a wrong ERROR message in logs by @arjunsuresh in #2061
- Changes for final report generation - PointPainting by @anandhu-eng in #2063
- Fix mlperf.conf link for equal issue mode by @anandhu-eng in #2069
- Update results.cc | Add another significant digit to percentile laten… by @arjunsuresh in #2066
- Create benchmark checklist for pointpainting by @anandhu-eng in #2068
- Fix np.memmap usage, add flag to force not using memmap by @nv-alicheng in #2081
- Add device map - pointpainting automotive by @anandhu-eng in #2087
- PointPainting Documentation update by @rod409 in #2089
- Update verify_performance.py | Support 99.9 percentile latency in TEST01 for pointpainting by @arjunsuresh in #2071
- Match server scenario to standalone implementation by @pgmpablo157321 in #2086
- Add reference model details by @anandhu-eng in #2084
- [405B] Set max_tokens to 2k by @attafosu in #2088
- Add parameter number and FLOPs values by @anandhu-eng in #2090
- Update benchmark-checklist.md - PointPainting by @anandhu-eng in #2083
- Report improvement - support output of IDs to a json file by @arjunsuresh in #2059
- Fixed GPTJ accuracy checker by @nvzhihanj in #2093
- Fix SDXL, Retinanet and GPTJ accuracy checker by @nvzhihanj in #2094
- Update auto-update-dev.yml | update docs as well by @arjunsuresh in #2096
- Update submission_checker.py | Prevent empty accuracy in open division by @arjunsuresh in #2097
- Added information about GitHub tests currently live by @anandhu-eng in #2091
- Docs update, fix download links for llama models by @arjunsuresh in #2055
- Add MLC Automation commands by @anandhu-eng in #2115
- Remove llama3.1 user conf unnecessary and misleading lines by @pgmpablo157321 in #2114
- Update docs by @arjunsuresh in #2118
- Update verify_performance.py | Fix compliance test for extra percenti… by @arjunsuresh in #2120
- Update default version of final report script by @pgmpablo157321 in #2124
- Update accuracy_igbh.py | Fixes 2119 by @arjunsuresh in #2123
- 🔄 synced file(s) with mlcommons/power-dev by @mlcommons-bot in #2125
- Final report cosmetic fix by @pgmpablo157321 in #2134
- Update loadgen package name in classification_and_detection setup by @annietllnd in #2131
- Add exception for github-actions[bot] to cla.yml by @nathanwasson in #2135
- Final report cosmetic fix by @pgmpablo157321 in #2141
- Update submission_checker.py | Fix open model unit in Results by @arjunsuresh in #2144
- Add Llama 3.1 to special unit dict by @pgmpablo157321 in #2150
- [Post Mortem] Log number of errors in detail log by @pgmpablo157321 in #2164
- Docs update by @nathanwasson in #2137
- [Post Mortem] Check all systems and measurements folders have results by @pgmpablo157321 in #2166
- [Post Mortem] Add calibration check to submission checker by @pgmpablo157321 in #2185
- Add find peak performance documentation by @pgmpablo157321 in #2186
- [Post Mortem] Check equal issue for open division + check accuracy run covers all the dataset by @pgmpablo157321 in #2170
- add deepseek-r1 multi-backend reference implementation by @viraatc in #2198
- fix: update sglang docker for deepseek-r1 by @viraatc in #2201
- fix: update eval_accuracy to handle mlperf_log_accuracy.json by @viraatc in #2202
- Use existing submission generation workflow from mlperf-automations repo by @anandhu-eng in #2199
- Add deepseek configuration + v5.1 Readme by @pgmpablo157321 in #2203
- Updated readme with mlc commands for model,dataset,accuracy and submission generation by @anandhu-eng in #2143
- Docs - Update disk space for reference implementation by @anandhu-eng in #2159
- Update Waymo access instructions by @nathanwasson in #2148
- Update verify_performance.py | Refactor the code by @arjunsuresh in #2073
- Update mlperf.conf with final deepseek configuration by @pgmpablo157321 in #2208
- Add whisper reference implementation by @pgmpablo157321 in #2193
- Llama3.1-8b reference implementation by @pgmpablo157321 in #2190
- Skip Imagenet calibration dataset download in GH actions by @anandhu-eng in #2209
- Add v5.1 submission checker by @pgmpablo157321 in #2204
- Update Llama 3.1 model access instructions by @nathanwasson in #2149
- Rename audit.conf to audit.config by @arjunsuresh in #2127
- Quick fix: correct metrics by @pgmpablo157321 in #2211
- [Whisper] Updating dataset for repacked dev-all. by @keithachorn-intel in #2212
- Partial fix for compliance TEST01 update by @keithachorn-intel in #2215
- Completion of compliance TEST01 fix by @keithachorn-intel in #2217
- Fix typo in automation command by @anandhu-eng in #2219
- Fix Readme for inference v5.1 by @pgmpablo157321 in #2216
New Contributors
- @annietllnd made their first contribution in #2131
Full Changelog: v5.0.1...v5.1
Inference v5.0
Merge automotive reference into main branch (#2047)
- Automotive reference implementation sketch
- WIP: automotive reference implementation
- WIP: add segmentation, dataset, and util functions to reference
- WIP: reference implementation with issues during post processing
- WIP: update dockerfile, remove pdb breaks
- WIP: reference implementation that runs samples
- Update README.md with initial docker runs
- WIP: add accuracy checker to reference
- Fix: set lidar detector to evaluation mode
- Remove unnecessary mlperf conf load and implement dataset get_item_count
- Add CPU-only version of reference
- Update dockerfile and display overall mAP
- Code cleanup
- Update README.md
- Fix package versions in dockerfile
- Handle frames if model predicts no objects
- Allow accuracy check on subset of datasets
- Updates automotive reference implementation (#2045), which includes the changes listed above
- Fix whitespace merge errors
- Add code removed during merge
- Change NMS thresholds in CPU-only code

Co-authored-by: Pablo Gonzalez, Arjun Suresh, arjunsuresh, Miro, mlcommons-bot
Inference v5.0
Update format.yml to use GITHUB_TOKEN & GitHub Actions bot (#2044)
Inference v4.1
Fix delta perf check (#1804)
Inference v4.0
Fix typo in README.md
v3.1 Inference
Fix code hyperlink text
v3.0 Inference
Includes the latest changes for inference 3.0
v2.1 Inference
Inference 2.1
- Includes annotations generated by the openimages.py v2.1 script
v2.0 Inference