Add tokens/sec throughput logging and multi-size/TP support to evo2 classifier by balvisio · Pull Request #1664 · NVIDIA-BioNeMo/bionemo-recipes

balvisio · 2026-06-25T00:43:33Z

Description

Add tokens/sec throughput logging and multi-size/TP support to evo2 classifier

Log throughput/tokens_per_sec (+ samples/batches per sec) to W&B/TensorBoard via LoggerConfig.log_throughput_to_tensorboard (--log-token-throughput, --throughput-window-size).
Add 7b and 40b sequence-classification providers plus a CLASSIFIER_PROVIDER_OPTIONS registry, selectable with --model-size.
Add --tensor-model-parallel-size and --activation-checkpointing CLI flags.
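For readers unfamiliar with windowed throughput metrics, here is a minimal sketch of how a tokens/sec figure averaged over the last N steps could be computed. It is not the code added in this PR: the `ThroughputWindow` class, its `update` method, and the `throughput/samples_per_sec` and `throughput/batches_per_sec` key names are assumptions made for illustration. Only `throughput/tokens_per_sec` and the idea of a window size come from the description above.

```python
import time
from collections import deque


class ThroughputWindow:
    """Rolling throughput over the most recent `window_size` steps.

    Hypothetical helper for illustration; not the implementation in this PR.
    """

    def __init__(self, window_size=10, clock=time.perf_counter):
        if window_size < 1:
            raise ValueError("window_size must be >= 1")
        self._clock = clock
        self._last = clock()  # time of the previous update (or of construction)
        self._steps = deque(maxlen=window_size)  # (seconds, tokens, samples)

    def update(self, tokens, samples):
        """Record one finished step and return the windowed rates as a dict."""
        now = self._clock()
        self._steps.append((now - self._last, tokens, samples))
        self._last = now
        seconds = sum(s for s, _, _ in self._steps)
        if seconds <= 0:
            return {}  # clock did not advance; no meaningful rate yet
        return {
            "throughput/tokens_per_sec": sum(t for _, t, _ in self._steps) / seconds,
            # The two key names below are assumed, not taken from the PR.
            "throughput/samples_per_sec": sum(n for _, _, n in self._steps) / seconds,
            "throughput/batches_per_sec": len(self._steps) / seconds,
        }
```

Calling `update(tokens=..., samples=...)` once per training step yields a dict that can be handed to whichever logger is in use. In a multi-GPU run the token count passed in would presumably need to be the global count across data-parallel ranks for the number to reflect whole-job throughput.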

Type of changes

Bug fix (non-breaking change which fixes an issue)
New feature (non-breaking change which adds functionality)
Refactor
Documentation update
Other (please describe):

CI Pipeline Configuration

Configure CI behavior by applying the relevant labels. By default, only basic unit tests are run.

ciflow:skip - Skip all CI tests for this PR
ciflow:notebooks - Run Jupyter notebooks execution tests
ciflow:slow - Run slow single GPU integration tests marked as @pytest.mark.slow
ciflow:all - Run all tests, including unit tests, slow tests, notebooks, and every recipe/model directory.

Unit tests marked as @pytest.mark.multi_gpu or @pytest.mark.distributed are not run in the PR pipeline.

For more details, see CONTRIBUTING

Note

By default, only basic unit tests are run. Add the appropriate labels to enable additional test coverage.

Authorizing CI Runs

We use copy-pr-bot to manage authorization of CI
runs on NVIDIA's compute resources.

If a pull request is opened by a trusted user and contains only trusted changes, the pull request's code will automatically be copied to a pull-request/ prefixed branch in the source repository (e.g. pull-request/123).
If a pull request is opened by an untrusted user or contains untrusted changes, an NVIDIA org member must leave an /ok to test comment on the pull request to trigger CI. This will need to be done for each new commit.

Triggering Code Rabbit AI Review

To trigger a code review from CodeRabbit, comment on a pull request with one of these commands:

@coderabbitai review - Triggers a standard review
@coderabbitai full review - Triggers a comprehensive review

See https://docs.coderabbit.ai/reference/review-commands for a full list of commands.

Pre-submit Checklist

I have tested these changes locally
I have updated the documentation accordingly
I have added/updated tests as needed
All existing tests pass successfully

copy-pr-bot · 2026-06-25T00:43:36Z

This pull request requires additional validation before any workflows can run on NVIDIA's runners.

Pull request vetters can view their responsibilities here.

Contributors can view more details about this message here.

coderabbitai · 2026-06-25T00:43:41Z

Important

Review skipped

Auto reviews are disabled on this repository. Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: c4f3ed94-f014-4e49-a3fc-51595c35fb1e

You can disable this status message by setting the reviews.review_status to false in the CodeRabbit configuration file.


Comment @coderabbitai help to get the list of available commands.

jstjohn

Thanks!

copy-pr-bot · 2026-06-28T20:08:38Z

/ok to test

@balvisio, there was an error processing your request: E1

See the following link for more information: https://docs.gha-runners.nvidia.com/cpr/e/1/

…lassifier

- Log throughput/tokens_per_sec (+ samples/batches per sec) to W&B/TensorBoard via LoggerConfig.log_throughput_to_tensorboard (--log-token-throughput, --throughput-window-size).
- Add 7b and 40b sequence-classification providers plus a CLASSIFIER_PROVIDER_OPTIONS registry, selectable with --model-size.
- Add --tensor-model-parallel-size and --activation-checkpointing CLI flags.
- LoRA tutorial: surface MODEL_SIZE, TENSOR_PARALLEL_SIZE (default 1) and an activation-checkpointing toggle, make the global batch data-parallel aware, and add a 7b/40b checkpoint download reference.

1b behavior is unchanged.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Signed-off-by: Bruno Alvisio <balvisio@nvidia.com>
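The commit message says the tutorial's global batch is now "data-parallel aware" but does not show the formula. A plausible reading, sketched below under stated assumptions, is that the number of data-parallel replicas is the world size divided by the tensor-parallel size, and the global batch scales with that replica count instead of with the raw GPU count. The function names and the `grad_accum_steps` parameter are hypothetical, and the sketch assumes pipeline and context parallelism are both 1.

```python
def data_parallel_size(world_size, tensor_parallel_size=1):
    """Number of data-parallel replicas when only TP and DP are in use.

    Assumes pipeline- and context-parallel sizes of 1 (an assumption of this
    sketch, not something stated in the PR).
    """
    if world_size < 1 or tensor_parallel_size < 1:
        raise ValueError("sizes must be positive")
    if world_size % tensor_parallel_size != 0:
        raise ValueError("world_size must be divisible by tensor_parallel_size")
    return world_size // tensor_parallel_size


def global_batch_size(micro_batch_size, world_size,
                      tensor_parallel_size=1, grad_accum_steps=1):
    """Global batch = micro batch x data-parallel replicas x accumulation steps."""
    dp = data_parallel_size(world_size, tensor_parallel_size)
    return micro_batch_size * dp * grad_accum_steps
```

For example, with 8 GPUs, a tensor-parallel size of 2 and a micro batch of 2, there are 4 data-parallel replicas and the global batch is 8, whereas at a tensor-parallel size of 1 the same micro batch gives a global batch of 16.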

balvisio · 2026-06-28T21:40:12Z

/ok to test 87bb2a1

balvisio requested review from jstjohn, jwilber, pstjohn, savitha-eng and trvachov as code owners June 25, 2026 00:43

balvisio force-pushed the balvisio/evo2-classifier-throughput-and-tp branch 2 times, most recently from e69e3d9 to 164b97d Compare June 25, 2026 01:06

jstjohn approved these changes Jun 27, 2026

View reviewed changes

balvisio force-pushed the balvisio/evo2-classifier-throughput-and-tp branch 2 times, most recently from a16561b to f48868e Compare June 28, 2026 18:31

balvisio force-pushed the balvisio/evo2-classifier-throughput-and-tp branch from f48868e to 87bb2a1 Compare June 28, 2026 21:39

balvisio added this pull request to the merge queue Jun 28, 2026

Merged via the queue into main with commit 04d65c9 Jun 28, 2026
8 checks passed

balvisio deleted the balvisio/evo2-classifier-throughput-and-tp branch June 28, 2026 22:46

balvisio merged 1 commit into main from balvisio/evo2-classifier-throughput-and-tp
