Conversation

@vpetrovicTT (Collaborator)

Add Flux (FLUX.1-dev, FLUX.1-schnell) and Motif (Motif-Image-6B-Preview) model readiness for Blackhole QuietBox/GE devices
Follow the proper BH device types (P150X4, P150X8, P300X2)...

tstescoTT and others added 30 commits December 16, 2025 20:42
# Release v0.5.0

Co-authored-by: Djordje Madic <[email protected]>
Co-authored-by: Zeljana Torlak <[email protected]>
Co-authored-by: Filip Ivanovic <[email protected]>
Co-authored-by: Lana Jovanovic <[email protected]>
Co-authored-by: Igor Djuric <[email protected]>
Co-authored-by: Stephen Osborne <[email protected]>
Co-authored-by: Adam Roberge <[email protected]>
Co-authored-by: Nidhin Jose <[email protected]>
Co-authored-by: Marko Jeremic <[email protected]>
Co-authored-by: Benjamin Goel <[email protected]>
Co-authored-by: Samuel Adesoye <[email protected]>
Co-authored-by: Rico Zhu <[email protected]>
Co-authored-by: Aleksandar Cvejic <[email protected]>
Co-authored-by: Aniruddha Tupe <[email protected]>
Co-authored-by: Sam Tisi <[email protected]>
Co-authored-by: Pavle Popovic <[email protected]>
Co-authored-by: Veljko Maksimovic <[email protected]>
# Conflicts:
#	README.md
#	model_specs_output.json
…tal table on separate page (#1775)

* update landing page README.md and Model Support generation script, move experimental models to separate page

* address #1520 P150x8 not showing up in model support table

* truncate docker image tag str in display table to avoid column width being too large
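
A minimal sketch of the truncation described above; the helper name and the 24-character limit are illustrative, not the actual implementation:

```python
def truncate_tag(tag: str, max_len: int = 24) -> str:
    """Shorten a docker image tag for display so table columns stay narrow."""
    if len(tag) <= max_len:
        return tag
    # Keep the start of the tag, which usually carries the version info.
    return tag[: max_len - 1] + "…"
```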

* ruff format
…ver (#1757)

* Remove USER instruction from build stage for vllm cloud

* Fix tt-media-server Dockerfile

* Add PYTHON_ENV_DIR to PATH

* hardcode path

* try new fix

* revert

* try same fix on media

* try with root user

* try with same fix on vllm cloud

* Initial

* Use pip to install uv

* Run builder stage as root

* add chmod for venv

* Copy uv build results to runtime

* Optimize two into one layer

* try new fix

* polish

* revert

---------

Co-authored-by: Djordje Madic <[email protected]>
* Implement aiperf benchmarking

* Fix report generation

* Add documentation for aiperf

* For AIPerf, generate one report with detailed latency percentiles and another with a throughput comparison (stacking on top of the existing vLLM benchmark)

* Add proper warm-up for AIPerf, and fix the combined table so that it only includes Text benchmarks
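
A sketch of the warm-up idea, assuming an OpenAI-compatible completions endpoint; the URL, model name, and request count are illustrative:

```python
import requests

def warm_up_server(base_url: str, model: str, n_requests: int = 3) -> None:
    """Send a few untimed requests so the measured run is not skewed
    by model compilation or cache warm-up."""
    for _ in range(n_requests):
        requests.post(
            f"{base_url}/v1/completions",
            json={"model": model, "prompt": "warm-up", "max_tokens": 8},
            timeout=120,
        )

# warm_up_server("http://localhost:8000", "meta-llama/Llama-3.1-8B")
```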

* Enable image benchmarking using AIPerf

* Fix AIPerf image benchmark parsing to correctly extract image parameters and display targets

* Now searches for all 3 benchmark types

* Unified tables first, detailed tables second

* Clean up benchmark report generation

* Enable limit-samples-mode for aiperf and unify output directory for all 3 benchmarks

* Add --device and --model arguments to ensure CLI consistency

* Add documentation for changes in benchmarking

* Fix linting error

* Fix indentation error

* Run ruff format to automatically fix the formatting

* Rename run_benchmarks_aiperf.py -> run_aiperf_benchmarks.py

* Fix missing import: use benchmark_generate_report_helper alias

* Change report generation to separate tables per tool (text first, then image)

* Run ruff format to fix formatting

* Change genai-perf -> genai for final report

* Add support for text + image benchmarking for GenAI-Perf

* Remove duplicate aiperf runner file after rebase

* Restore CNN benchmark support and logging in summary_report.py

- Add back create_cnn_display_dict() function for CNN display formatting
- Restore CNN results processing in generate_report()
- Re-add logger import and informational log statements for debugging
- Ensures all task types (text, image, audio, embedding, CNN) are fully supported
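
create_cnn_display_dict() is named in the commit but its body is not shown; a minimal sketch of what such a formatter might look like, with hypothetical field names (the real keys live in summary_report.py):

```python
def create_cnn_display_dict(result: dict) -> dict:
    """Map a raw CNN benchmark result onto the columns shown in the report.
    Key names here are illustrative."""
    return {
        "Model": result.get("model", "N/A"),
        "Device": result.get("device", "N/A"),
        "Batch Size": result.get("batch_size", "N/A"),
        "Throughput (img/s)": round(result.get("throughput", 0.0), 2),
    }
```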

* Restore audio/embedding/CNN support and galaxy_t3k pattern to match dev branch

- Restore audio/embedding/CNN benchmark support in run_reports.py
  * Add back imports for create_audio_display_dict, create_embedding_display_dict, create_cnn_display_dict
  * Add back audio_sections, embedding_sections, cnn_sections lists
  * Restore processing for all task types in vLLM, AIPerf, and GenAI-Perf sections
  * Update section combining to include all task types

- Restore galaxy_t3k device pattern in summary_report.py
  * Add galaxy_t3k back to image_pattern and text_pattern device regex
  * This was inadvertently removed in commit 2654355e during rebase

These changes ensure non-VLM functionality matches dev branch exactly while preserving GenAI-Perf VLM additions.

* Use create_image_generation_display_dict for CNN to match dev branch

- Restore create_image_generation_display_dict function in summary_report.py
- Update run_reports.py to use create_image_generation_display_dict for CNN display
- Update section combining comment to match dev branch wording
- Ensures CNN display format matches dev branch exactly

* Add 20-second wait for the server to start

---------

Co-authored-by: Djordje Madic <[email protected]>
…#1773)

* try quick fix

* use uv pip

* change for media

* forge fix

* changes

* fix: use uv pip and run installs as root for permission fixes

- Switch from pip to uv pip to match tt-metal commit 29d59d1
- Run all pip installs as root to avoid permission denied errors
- Add --index-strategy unsafe-best-match for vllm to find cmake>=3.26.1
- Fix ownership after installs before switching to non-root user

* fix: recreate venv symlinks in runtime stage to fix broken Python symlinks

* fix: add venv symlink fix to media-server runtime stage

* fix: runtime issues - venv pip bootstrap and Media symlinks

- workflows/run_workflows.py: Add --upgrade-deps and --clear flags to
  ensure pip is properly installed in workflow venvs, fixing
  FileNotFoundError for pip
- tt-media-server/Dockerfile: Improve venv symlink fix in runtime stage
  by also removing pip symlinks and updating activate script VIRTUAL_ENV path
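
The --upgrade-deps and --clear behaviour maps onto the stdlib venv module; a sketch, assuming Python >= 3.9 (where upgrade_deps is available) and an illustrative venv path:

```python
import venv

# clear=True recreates the venv from scratch; upgrade_deps=True upgrades
# pip/setuptools in the fresh venv so `python -m pip` is guaranteed to work.
builder = venv.EnvBuilder(clear=True, with_pip=True, upgrade_deps=True)
builder.create("/path/to/workflow_venv")
```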

* fix: use ensurepip to bootstrap pip in workflow venvs

On externally-managed Python (PEP 668), venv may not include pip by
default. Use python -m ensurepip --upgrade to ensure pip is available
before installing uv.
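
A sketch of that bootstrap order; the venv path is illustrative:

```python
import subprocess

venv_python = "/path/to/workflow_venv/bin/python"

# On PEP 668 "externally managed" Pythons the venv may ship without pip,
# so bootstrap it with ensurepip before anything that shells out to pip.
subprocess.run([venv_python, "-m", "ensurepip", "--upgrade"], check=True)
subprocess.run([venv_python, "-m", "pip", "install", "uv"], check=True)
```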

* test Milos's changes

* revert media changes

* remove unnecessary comment

* temp change: try without sym links

* temp change: try without local share uv

* revert

* tmp change without symlinks and local share uv

* revert

* Revert forge and delete unnecessary instructions in vllm

* ruff format

---------

Co-authored-by: Aleksandar Cvejic <[email protected]>
* Add GenAI-Perf detailed percentiles section to benchmark reports

- Created genai_perf_benchmark_generate_report() function parallel to AIPerf
- Generates detailed percentile tables (mean, P50, P99) for TTFT, TPOT, E2EL
- Supports both text and image benchmarks
- Reuses existing aiperf_release_markdown() for consistent formatting
- Integrated into main report generation workflow

* Fix GenAI-Perf detailed percentiles extraction

- Changed from using benchmark_generate_report_helper() to direct JSON processing
- Now extracts ISL, OSL, Concurrency from filename (like AIPerf does)
- Properly extracts percentile data (median, p99) from JSON
- Separates text and image benchmarks by filename pattern
- Fixes missing data in GenAI-Perf detailed percentiles tables
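
A sketch of the filename/JSON extraction described above; the filename pattern and JSON keys are assumptions, not the actual GenAI-Perf schema:

```python
import json
import re
from pathlib import Path

# Assumed filename convention, e.g. "..._isl-128_osl-128_concurrency-32.json".
PATTERN = re.compile(r"isl-(\d+)_osl-(\d+)_concurrency-(\d+)")

def extract_percentiles(path: Path) -> dict:
    """Pull ISL/OSL/concurrency from the filename and median/p99 latencies
    from the result JSON. Keys below are illustrative."""
    match = PATTERN.search(path.name)
    if match is None:
        raise ValueError(f"unexpected filename: {path.name}")
    isl, osl, concurrency = map(int, match.groups())
    data = json.loads(path.read_text())
    ttft = data["time_to_first_token"]
    return {
        "isl": isl,
        "osl": osl,
        "concurrency": concurrency,
        "ttft_p50_ms": ttft["p50"],
        "ttft_p99_ms": ttft["p99"],
    }
```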

* Generate detailed percentile reports for GenAI-Perf benchmarks

* Add image dimension columns to detailed percentile tables for image benchmarks

* Fix images_per_prompt field name to match standard convention

* Fix sort key to use images_per_prompt instead of images

* Run ruff format on run_reports.py

* resolve merge conflict

* local uv share

* final change

---------

Co-authored-by: Aleksandar Cvejic <[email protected]>
* Check tt liveness instead of waiting 5 seconds
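
A sketch of polling a liveness endpoint instead of sleeping a fixed 5 seconds; the endpoint path and timeout are assumptions:

```python
import time
import requests

def wait_for_liveness(base_url: str, timeout_s: float = 60.0) -> None:
    """Poll the server until it reports live, instead of sleeping blindly."""
    deadline = time.monotonic() + timeout_s
    while time.monotonic() < deadline:
        try:
            if requests.get(f"{base_url}/tt-liveness", timeout=2).ok:
                return
        except requests.ConnectionError:
            pass  # server not accepting connections yet
        time.sleep(0.5)
    raise TimeoutError(f"server at {base_url} never became live")
```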

* Keep server logs for 1 day

* Make test fail

* Fix summary when test fails

* Remove bottleneck on purpose

* Rename const
…1729)

* refactor: use ModelSpec JSON for model registration instead of env vars

* load ModelSpec JSON once at import time and use impl_id for model registration
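
A sketch of the import-time pattern described above; the registry shape is illustrative, though impl_id and model_specs_output.json come from this PR:

```python
import json
from pathlib import Path

# Loaded once at import time so every consumer sees the same spec,
# with no per-call environment-variable lookups.
_SPECS = json.loads(Path("model_specs_output.json").read_text())

def register_models(registry: dict) -> None:
    """Register each model under its impl_id instead of reading env vars."""
    for spec in _SPECS:
        registry[spec["impl_id"]] = spec
```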

---------

Co-authored-by: Benjamin Goel <[email protected]>
* add qwen image

* format

* add qwen-image-2512

* ruff format

* ruff format modelspec

* use self.settings

* cleanup flux model spec
…ls and benchmarks (#1797)

* Add DeepSeek-R1-0528 model and eval config (64k)

* add default commits from pprajapati/vllm_tracing

* adding dual and quad WH Galaxy device types in inference-server for DeepSeek-R1-0528

* fix non-DeepSeek-R1-0528 unintended changes

* adding deepseek_r1_galaxy_impl

* register TTDeepseekV3ForCausalLM

* ruff format
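
The TTDeepseekV3ForCausalLM registration presumably goes through vLLM's out-of-tree model registry; a sketch, with a hypothetical import path for the model class:

```python
from vllm import ModelRegistry

# Hypothetical import path; the real module lives in the TT model code.
from tt_models.deepseek_v3 import TTDeepseekV3ForCausalLM

ModelRegistry.register_model("TTDeepseekV3ForCausalLM", TTDeepseekV3ForCausalLM)
```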

---------

Co-authored-by: Mark O'Connor <[email protected]>
…ing name, ID, log file path, and service port. (#1801)
* Add model readiness check before job creation

Check if the model is ready before creating a video job.

* Check if model is ready before submitting job

Add model readiness check before job submission.

* Fix indentation for HTTPException raise

* Remove model readiness check from fine tuning

* Remove model readiness check in video job submission

Removed model readiness check before job creation.

* Add model readiness check before job creation

Check if the model is ready before creating a job.

* Check if model is ready before listing jobs

* Remove redundant model readiness checks
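
A sketch of the readiness gate described above, assuming a FastAPI server; is_model_ready, the route, and the 503 detail text are illustrative:

```python
from fastapi import FastAPI, HTTPException

app = FastAPI()

def is_model_ready() -> bool:
    """Hypothetical readiness probe; the real check lives in the server."""
    return True

@app.post("/jobs")
def create_job(payload: dict):
    # Reject job creation early instead of letting it fail mid-run.
    if not is_model_ready():
        raise HTTPException(status_code=503, detail="Model is not ready")
    return {"status": "queued", "job": payload}
```
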
* Removed uv install since it is part of the base metal image

* cleanup
* Use vllm bench serve for vLLM http client

* Remove TODO about truncate_prompt_tokens

* Consolidate older vLLM HTTP and vLLM embeddings venvs

* Add BENCHMARKS_VLLM venv type
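
A sketch of invoking the new client path; `vllm bench serve` is a real subcommand, but the flag names below mirror the old benchmark_serving.py script and should be checked against the installed vLLM version:

```python
import subprocess

cmd = [
    "vllm", "bench", "serve",
    "--model", "meta-llama/Llama-3.1-8B-Instruct",  # illustrative model
    "--base-url", "http://localhost:8000",
    "--dataset-name", "random",
    "--random-input-len", "128",
    "--random-output-len", "128",
    "--num-prompts", "32",
]
subprocess.run(cmd, check=True)
```
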
* Stdout logs

* Refactor build_docker_images to have a separate function for listing sha combinations

* Apply suggestion from @bgoelTT

Co-authored-by: Benjamin Goel <[email protected]>

---------

Co-authored-by: Benjamin Goel <[email protected]>
* feat: add video client

* feat: enable video inference running for eval/benchmark

* feat: add model spec and perf

* feat: update benchmark flow for video generation

* test: add unit test for video_client

* feat: update test_media_client_factory

* fix: update test

* test: update test
github-actions bot commented Feb 12, 2026

✅ Test Coverage Report

Coverage of Changed Lines

| Metric    | Value     |
|-----------|-----------|
| Coverage  | 100%      |
| Threshold | 50%       |
| Status    | ✅ PASSED |

💡 This checks coverage of newly added/modified lines only, not total codebase coverage.

github-actions bot commented Feb 12, 2026

✅ Test Results - PASSED

Summary

| Component           | Total | Passed | Skipped | Failed | Status |
|---------------------|-------|--------|---------|--------|--------|
| tt-inference-server | 392   | 392    | 0       | 0      | ✅     |
| tt-media-server     | 467   | 467    | 0       | 0      | ✅     |
| Overall             | 859   | 859    | 0       | 0      | ✅     |

Details

  • Python Version: 3.10
  • Workflow: Test Gate
  • Commit: 3e942fa
  • Run ID: 21958214792

🎉 All tests passed! This PR is ready for review.

@vpetrovicTT changed the base branch from main to dev February 12, 2026 17:58
@vpetrovicTT linked an issue Feb 12, 2026 that may be closed by this pull request

Development

Successfully merging this pull request may close these issues.

[Model Readiness Support]: Flux on BH QB GE