Feature/add multitask diffusion transformer policy implementation #2545
base: main
Conversation
Add multitask diffusion transformer policy
s1lent4gnt
left a comment
Thanks for the PR, @brysonjones — nice work! 🙌
Here are my comments from the first review pass:
I think it would be better to keep everything in a single file, modeling_multi_task_dit.py, and remove the modules/ directory for now. It will be easier to maintain.
To match the original LBM paper, we should remove DINOv3 and stick to the CLIP-based setup. However, I’m fine with keeping the flow-matching objective since it’s used in the Boston Dynamics blog post. Maybe we can add a short comment in the code to clarify this difference from the paper.
We can probably simplify things by using a single config for everything instead of multiple configs.
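For illustration, a single unified config along those lines might look like the sketch below, with comments marking parameters that only apply to one objective. The field names here are hypothetical and do not reflect the actual lerobot config:

```python
# Illustrative sketch only; field names are hypothetical and do not
# correspond to the actual lerobot MultiTaskDiT config.
from dataclasses import dataclass


@dataclass
class MultiTaskDiTConfig:
    # Shared transformer backbone
    hidden_dim: int = 768
    num_layers: int = 12
    num_heads: int = 12

    # Action prediction
    action_horizon: int = 16
    action_dim: int = 7

    # Vision-language conditioning (CLIP-based, matching the LBM paper)
    vision_encoder: str = "openai/clip-vit-base-patch32"

    # Training objective: "flow_matching" follows the Boston Dynamics blog
    # post, whereas the original LBM paper uses a diffusion objective.
    objective: str = "flow_matching"
    # Only used when objective == "diffusion"
    num_diffusion_steps: int = 100
    # Only used when objective == "flow_matching"
    flow_sigma_min: float = 1e-4
```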
s1lent4gnt
left a comment
Comments
src/lerobot/policies/multi_task_dit/processor_multi_task_dit.py (outdated comment, resolved)
Hey @s1lent4gnt! Thanks for the review and feedback. I think all of this is reasonable and makes sense on my side. Will work on the updates and push them through soon 👍
…emoving inheritance structure
…g, then adding comments to denote where some parameters are only used for specific objectives
@s1lent4gnt I think all these points should be addressed now; let me know what you think!
…f in the modeling code for multitask dit
@s1lent4gnt Just worked through moving the tokenization of the task to the pre-processor. I think this is cleaner as well!
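As a rough illustration of that change, tokenizing the task string in a pre-processing step could look like the sketch below. This is not the actual lerobot processor interface; it assumes a CLIP tokenizer from the transformers library, and the batch keys are hypothetical:

```python
# Illustrative sketch of task tokenization as a pre-processing step;
# not the actual lerobot processor API, and the batch keys are hypothetical.
from transformers import AutoTokenizer


class TokenizeTaskStep:
    def __init__(self, model_name: str = "openai/clip-vit-base-patch32", max_length: int = 77):
        self.tokenizer = AutoTokenizer.from_pretrained(model_name)
        self.max_length = max_length

    def __call__(self, batch: dict) -> dict:
        # Tokenize the natural-language task description outside the model,
        # so the policy only ever sees token ids and an attention mask.
        tokens = self.tokenizer(
            batch["task"],
            padding="max_length",
            truncation=True,
            max_length=self.max_length,
            return_tensors="pt",
        )
        batch["task_tokens"] = tokens["input_ids"]
        batch["task_attention_mask"] = tokens["attention_mask"]
        return batch
```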
Great work @brysonjones!
Thank you for the review and help getting this ready, @s1lent4gnt! Everything is good from my side; let me know if there are any other adjustments to make before merging in 👍
Seems like there were a few missing parts where the tests failed (missing additions to the docs ToC, and a conditional import for the transformers lib). I have updated and pushed those through.
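For reference, a conditional import guard is a common way to handle an optional transformers dependency. The sketch below is illustrative only and not necessarily the exact code added in this PR:

```python
# Illustrative sketch of a conditional transformers import; the helper
# name and error message are hypothetical, not the code added in the PR.
try:
    import transformers  # noqa: F401

    _TRANSFORMERS_AVAILABLE = True
except ImportError:
    _TRANSFORMERS_AVAILABLE = False


def require_transformers() -> None:
    """Raise a clear error if the optional `transformers` dependency is missing."""
    if not _TRANSFORMERS_AVAILABLE:
        raise ImportError(
            "The multi_task_dit policy requires the `transformers` package. "
            "Install it with `pip install transformers`."
        )
```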
Signed-off-by: Bryson Jones <[email protected]>
What this does
This PR adds an implementation of the Multitask Diffusion Transformer Policy, which was shown in a demo of Boston Dynamics' Atlas robot performing whole-body manipulation tasks.
I wanted to dive into the research behind this method and build an open-source implementation for the community to leverage and build from.
I will also be releasing a blog post with the details of this work as it gets merged in! (Note: the blog post link will be broken until it is released.)
How it was tested
test_multi_task_dit_policy.py to validate any install and import errors
How to checkout & try? (for the reviewer)
Run the test script:
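The exact command is not shown here; assuming the test lives under tests/policies/ (the path is a guess), an invocation along these lines should work:

```bash
# Assumed path; adjust to wherever test_multi_task_dit_policy.py lives in the repo
pytest tests/policies/test_multi_task_dit_policy.py -v
```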
Train a policy:
```bash
lerobot-train \
    --policy.type=multi_task_dit \
    --dataset.repo_id={{your_dataset_name}} \
    --dataset.root={{your/dataset/path}} \
    --output_dir=outputs/train/multi_task_dit \
    --job_name=multi_task_dit_training_test \
    --policy.device=cuda \
    --batch_size=16 \
    --steps=10000 \
    --save_freq=1000 \
    --wandb.enable=true \
    --policy.repo_id=YOUR_HF_USERNAME/multi_task_dit_policy_test
```