Add tokens/sec throughput logging and multi-size/TP support to evo2 classifier #1664

Merged
balvisio merged 1 commit into main from balvisio/evo2-classifier-throughput-and-tp
Jun 28, 2026

Conversation

@balvisio balvisio commented Jun 25, 2026

Collaborator

Description

Add tokens/sec throughput logging and multi-size/TP support to evo2 classifier

  • Log throughput/tokens_per_sec (+ samples/batches per sec) to W&B/TensorBoard via LoggerConfig.log_throughput_to_tensorboard (--log-token-throughput, --throughput-window-size).
  • Add 7b and 40b sequence-classification providers plus a CLASSIFIER_PROVIDER_OPTIONS registry, selectable with --model-size.
  • Add --tensor-model-parallel-size and --activation-checkpointing CLI flags.
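
To make the first bullet concrete, here is a minimal sketch of a windowed tokens/sec metric. It is an illustration only: it is not the code in this PR and not how LoggerConfig computes throughput internally. The class name ThroughputWindow and its methods are hypothetical, and the assumption is that --throughput-window-size sets how many recent steps are averaged.

```python
from collections import deque


class ThroughputWindow:
    """Rolling tokens/sec over the most recent `window_size` training steps.

    Hypothetical helper for illustration; not part of the PR or of any library.
    """

    def __init__(self, window_size: int = 10):
        if window_size < 1:
            raise ValueError("window_size must be >= 1")
        # Each entry is (tokens_processed_in_step, step_duration_in_seconds).
        # deque(maxlen=...) silently drops the oldest entry once it is full.
        self._steps = deque(maxlen=window_size)

    def update(self, num_tokens: int, step_seconds: float) -> float:
        """Record one step and return the windowed tokens/sec."""
        self._steps.append((num_tokens, step_seconds))
        total_tokens = sum(tokens for tokens, _ in self._steps)
        total_seconds = sum(seconds for _, seconds in self._steps)
        # Guard against a zero-length window duration.
        return total_tokens / total_seconds if total_seconds > 0 else 0.0


# In a training loop, step_seconds would typically come from time.perf_counter()
# deltas, and num_tokens from global_batch_size * sequence_length.
window = ThroughputWindow(window_size=2)
first = window.update(num_tokens=8192, step_seconds=0.5)
second = window.update(num_tokens=8192, step_seconds=1.5)
third = window.update(num_tokens=8192, step_seconds=0.5)  # first step has been evicted
```

Averaging total tokens over total time, rather than averaging per-step rates, keeps a single slow step from being underweighted in the reported value.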

Type of changes

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Refactor
  • Documentation update
  • Other (please describe):

CI Pipeline Configuration

Configure CI behavior by applying the relevant labels. By default, only basic unit tests are run.

  • ciflow:skip - Skip all CI tests for this PR
  • ciflow:notebooks - Run Jupyter notebooks execution tests
  • ciflow:slow - Run slow single GPU integration tests marked as @pytest.mark.slow
  • ciflow:all - Run all tests, including unit tests, slow tests, notebooks, and every recipe/model directory.

Unit tests marked as @pytest.mark.multi_gpu or @pytest.mark.distributed are not run in the PR pipeline.

For more details, see CONTRIBUTING

Note

By default, only basic unit tests are run. Add the appropriate labels to enable additional test coverage.

Authorizing CI Runs

We use copy-pr-bot to manage authorization of CI
runs on NVIDIA's compute resources.

  • If a pull request is opened by a trusted user and contains only trusted changes, the pull request's code will
    automatically be copied to a pull-request/ prefixed branch in the source repository (e.g. pull-request/123).
  • If a pull request is opened by an untrusted user or contains untrusted changes, an NVIDIA org member must leave an
    /ok to test comment on the pull request to trigger CI. This will need to be done for each new commit.

Triggering Code Rabbit AI Review

To trigger a code review from CodeRabbit, comment on a pull request with one of its review commands.

See https://docs.coderabbit.ai/reference/review-commands for a full list of commands.

Pre-submit Checklist

  • I have tested these changes locally
  • I have updated the documentation accordingly
  • I have added/updated tests as needed
  • All existing tests pass successfully

copy-pr-bot Bot commented Jun 25, 2026

This pull request requires additional validation before any workflows can run on NVIDIA's runners.

Pull request vetters can view their responsibilities here.

Contributors can view more details about this message here.

coderabbitai Bot commented Jun 25, 2026

Contributor

Important

Review skipped

Auto reviews are disabled on this repository. Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: c4f3ed94-f014-4e49-a3fc-51595c35fb1e

You can disable this status message by setting the reviews.review_status to false in the CodeRabbit configuration file.

Comment @coderabbitai help to get the list of available commands.

@balvisio force-pushed the balvisio/evo2-classifier-throughput-and-tp branch 2 times, most recently from e69e3d9 to 164b97d, on June 25, 2026 01:06

@jstjohn jstjohn left a comment

Collaborator

Thanks!

@balvisio force-pushed the balvisio/evo2-classifier-throughput-and-tp branch 2 times, most recently from a16561b to f48868e, on June 28, 2026 18:31

copy-pr-bot Bot commented Jun 28, 2026

/ok to test

@balvisio, there was an error processing your request: E1

See the following link for more information: https://docs.gha-runners.nvidia.com/cpr/e/1/

…lassifier

- Log throughput/tokens_per_sec (+ samples/batches per sec) to W&B/TensorBoard
  via LoggerConfig.log_throughput_to_tensorboard (--log-token-throughput,
  --throughput-window-size).
- Add 7b and 40b sequence-classification providers plus a
  CLASSIFIER_PROVIDER_OPTIONS registry, selectable with --model-size.
- Add --tensor-model-parallel-size and --activation-checkpointing CLI flags.
- LoRA tutorial: surface MODEL_SIZE, TENSOR_PARALLEL_SIZE (default 1) and an
  activation-checkpointing toggle, make the global batch data-parallel aware,
  and add a 7b/40b checkpoint download reference. 1b behavior is unchanged.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Signed-off-by: Bruno Alvisio <balvisio@nvidia.com>
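
The flags and the "data-parallel aware" global batch described in the commit message above can be sketched as follows. This is a hedged illustration, not the PR's code: the registry values are placeholder strings rather than real provider classes, the "1b"/"7b"/"40b" keys, flag types, and defaults are assumptions, and the batch arithmetic assumes only tensor and data parallelism (no pipeline or context parallelism).

```python
import argparse

# Placeholder registry: the real CLASSIFIER_PROVIDER_OPTIONS presumably maps a
# model size to a provider class; strings are used here purely for illustration.
CLASSIFIER_PROVIDER_OPTIONS = {
    "1b": "placeholder-1b-provider",
    "7b": "placeholder-7b-provider",
    "40b": "placeholder-40b-provider",
}


def build_parser() -> argparse.ArgumentParser:
    """Build a parser with the three flags named in the PR (types are assumed)."""
    parser = argparse.ArgumentParser()
    parser.add_argument(
        "--model-size", choices=sorted(CLASSIFIER_PROVIDER_OPTIONS), default="1b"
    )
    parser.add_argument("--tensor-model-parallel-size", type=int, default=1)
    parser.add_argument("--activation-checkpointing", action="store_true")
    return parser


def data_parallel_size(world_size: int, tensor_parallel_size: int) -> int:
    """Number of model replicas when each replica spans `tensor_parallel_size` GPUs."""
    if world_size % tensor_parallel_size != 0:
        raise ValueError("world_size must be divisible by tensor_parallel_size")
    return world_size // tensor_parallel_size


def global_batch_size(
    micro_batch_size: int,
    world_size: int,
    tensor_parallel_size: int,
    grad_accum_steps: int = 1,
) -> int:
    """Global batch scales with the data-parallel size, not the raw GPU count."""
    replicas = data_parallel_size(world_size, tensor_parallel_size)
    return micro_batch_size * replicas * grad_accum_steps


args = build_parser().parse_args(
    ["--model-size", "7b", "--tensor-model-parallel-size", "2"]
)
provider = CLASSIFIER_PROVIDER_OPTIONS[args.model_size]
batch = global_batch_size(
    micro_batch_size=4,
    world_size=8,
    tensor_parallel_size=args.tensor_model_parallel_size,
)
```

With 8 GPUs and a tensor-parallel size of 2, only 4 replicas consume distinct data, so the global batch is half of what it would be with a tensor-parallel size of 1 at the same micro-batch size.
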
@balvisio force-pushed the balvisio/evo2-classifier-throughput-and-tp branch from f48868e to 87bb2a1 on June 28, 2026 21:39
@balvisio

Collaborator Author

/ok to test 87bb2a1

@balvisio balvisio added this pull request to the merge queue Jun 28, 2026
Merged via the queue into main with commit 04d65c9 Jun 28, 2026
8 checks passed
@balvisio balvisio deleted the balvisio/evo2-classifier-throughput-and-tp branch June 28, 2026 22:46