Feature/add multitask diffusion transformer policy implementation #2545
base: main
Conversation
Add multitask diffusion transformer policy
s1lent4gnt
left a comment
Thanks for the PR, @brysonjones — nice work! 🙌
Here are my comments from the first review pass:
I think it would be better to keep everything in a single file, modeling_multi_task_dit.py, and remove the modules/ directory for now. It will be easier to maintain.
To match the original LBM paper, we should remove DINOv3 and stick to the CLIP-based setup. However, I’m fine with keeping the flow-matching objective since it’s used in the Boston Dynamics blog post. Maybe we can add a short comment in the code to clarify this difference from the paper.
We can probably simplify things by using a single config for everything instead of multiple configs.
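For illustration, a single unified config along those lines might look like the sketch below, with comments marking parameters that only apply to one objective. The field names here are hypothetical and do not reflect the actual lerobot config:

```python
# Illustrative sketch only; field names are hypothetical and do not
# correspond to the actual lerobot MultiTaskDiT config.
from dataclasses import dataclass


@dataclass
class MultiTaskDiTConfig:
    # Shared transformer backbone
    hidden_dim: int = 768
    num_layers: int = 12
    num_heads: int = 12

    # Action prediction
    action_horizon: int = 16
    action_dim: int = 7

    # Vision-language conditioning (CLIP-based, matching the LBM paper)
    vision_encoder: str = "openai/clip-vit-base-patch32"

    # Training objective: "flow_matching" follows the Boston Dynamics blog
    # post, whereas the original LBM paper uses a diffusion objective.
    objective: str = "flow_matching"
    # Only used when objective == "diffusion"
    num_diffusion_steps: int = 100
    # Only used when objective == "flow_matching"
    flow_sigma_min: float = 1e-4
```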
s1lent4gnt
left a comment
Comments
src/lerobot/policies/multi_task_dit/processor_multi_task_dit.py (outdated comment, resolved)
Hey @s1lent4gnt! Thanks for the review and feedback. I think all of this is reasonable and makes sense on my side. Will work on the updates and push them through soon 👍
…emoving inheritance structure
…g, then adding comments to denote where some parameters are only used for specific objectives
@s1lent4gnt I think all these points should be addressed now; let me know what you think!
…f in the modeling code for multitask dit
@s1lent4gnt Just worked through moving the tokenization of the task to the pre-processor. I think this is cleaner as well!
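As a rough illustration of that change, tokenizing the task string in a pre-processing step could look like the sketch below. This is not the actual lerobot processor interface; it assumes a CLIP tokenizer from the transformers library, and the batch keys are hypothetical:

```python
# Illustrative sketch of task tokenization as a pre-processing step;
# not the actual lerobot processor API, and the batch keys are hypothetical.
from transformers import AutoTokenizer


class TokenizeTaskStep:
    def __init__(self, model_name: str = "openai/clip-vit-base-patch32", max_length: int = 77):
        self.tokenizer = AutoTokenizer.from_pretrained(model_name)
        self.max_length = max_length

    def __call__(self, batch: dict) -> dict:
        # Tokenize the natural-language task description outside the model,
        # so the policy only ever sees token ids and an attention mask.
        tokens = self.tokenizer(
            batch["task"],
            padding="max_length",
            truncation=True,
            max_length=self.max_length,
            return_tensors="pt",
        )
        batch["task_tokens"] = tokens["input_ids"]
        batch["task_attention_mask"] = tokens["attention_mask"]
        return batch
```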
Great work @brysonjones!
Thank you for the review and help getting this ready, @s1lent4gnt! Everything is good from my side; let me know if there are any other adjustments to make before merging in 👍
Seems like there were a few missing parts where the tests failed (missing additions to the docs ToC, and a conditional import for the transformers lib). I have updated and pushed those through.
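For reference, a conditional import guard is a common way to handle an optional transformers dependency. The sketch below is illustrative only and not necessarily the exact code added in this PR:

```python
# Illustrative sketch of a conditional transformers import; the helper
# name and error message are hypothetical, not the code added in the PR.
try:
    import transformers  # noqa: F401

    _TRANSFORMERS_AVAILABLE = True
except ImportError:
    _TRANSFORMERS_AVAILABLE = False


def require_transformers() -> None:
    """Raise a clear error if the optional `transformers` dependency is missing."""
    if not _TRANSFORMERS_AVAILABLE:
        raise ImportError(
            "The multi_task_dit policy requires the `transformers` package. "
            "Install it with `pip install transformers`."
        )
```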
Signed-off-by: Bryson Jones <[email protected]>
What this does
This PR adds an implementation of the Multitask Diffusion Transformer Policy, which was shown in a demo of Boston Dynamics' Atlas robot performing whole-body manipulation tasks.
I wanted to dive into the research behind this method and build an open-source implementation for the community to leverage and build from.
I will also be releasing a blog post with the details of this work as it gets merged in! (Note: the blog post link will be broken until it is released.)
How it was tested
test_multi_task_dit_policy.py to validate any install and import errors
How to checkout & try? (for the reviewer)
Run the test script:
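The exact command is not shown here; assuming the test lives under tests/policies/ (the path is a guess), an invocation along these lines should work:

```bash
# Assumed path; adjust to wherever test_multi_task_dit_policy.py lives in the repo
pytest tests/policies/test_multi_task_dit_policy.py -v
```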
Train a policy:
```bash
lerobot-train \
    --policy.type=multi_task_dit \
    --dataset.repo_id={{your_dataset_name}} \
    --dataset.root={{your/dataset/path}} \
    --output_dir=outputs/train/multi_task_dit \
    --job_name=multi_task_dit_training_test \
    --policy.device=cuda \
    --batch_size=16 \
    --steps=10000 \
    --save_freq=1000 \
    --wandb.enable=true \
    --policy.repo_id=YOUR_HF_USERNAME/multi_task_dit_policy_test
```