Add compiled autograd tutorial #3026
Conversation
🔗 Helpful Links: 🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/tutorials/3026
Note: Links to docs will display an error until the docs builds have been completed. ✅ No failures as of commit 14c7499 with merge base 19fffda. This comment was automatically generated by Dr. CI and updates every 15 minutes.
Force-pushed from b86d2c5 to ffa0a81
Thank you for following the template - I really appreciate it! I added some editorial/formatting suggestions. Can you please take a look and let me know if you have any questions?
######################################################################
# Compiled Autograd addresses certain limitations of AOTAutograd
# ------------
should this be a title or just a paragraph?
paragraph to clarify that it's a new section after "Compiling the forward and backward pass using different flags"
Force-pushed from 4b696bb to a9d3bb3
Force-pushed from a9d3bb3 to 271b8f2
There is still an issue on line 54 that prevents the tutorial from building. Can you check?
* Added runtime overhead at the start of the backward for cache lookup
* More prone to recompiles and graph breaks in dynamo due to the larger capture
.. note:: Compiled Autograd is under active development and is not yet compatible with all existing PyTorch features. For the latest status on a particular feature, refer to `Compiled Autograd Landing Page <https://docs.google.com/document/d/11VucFBEewzqgkABIjebZIzMvrXr3BtcY1aGKpX61pJY>`_.
I'm not sure why it's called Landing Page. Should it be called a Compiled Autograd Roadmap? Also, it feels like it would be better to point to issues on GitHub with a specific label, perhaps, or a GitHub Project even?
I'm expecting the content of this doc to change rapidly at the moment, so I'll keep it as a Google Doc for now. It's also something we did for dynamic shapes and custom operators.
The end goal, once development is more stable, is to move it into the docs.
Hi @svekars, how can I merge the PR?
Hi, I am concerned about CompiledAutograd's progress. After a series of DDP-related optimizations, does the model have compute-communication overlap when CompiledAutograd is enabled? Is there any recent progress related to DDP for CompiledAutograd?
@yitingw1 When enabling CompiledAutograd, we should also enable the new CompiledDDP; right now it is not enabled automatically. As for the overlapping, the answer is yes if the new CompiledDDP is enabled. The overlapping strategy is the same as the DDPOptimizer and eager DDP. The difference from the DDPOptimizer is that the new CompiledDDP, with help from CompiledAutograd, should produce fewer if not zero graph breaks. However, whether it performs better or not depends on the model.
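To illustrate the setup discussed above, here is a minimal sketch of running a DDP model under torch.compile with compiled autograd turned on. The global flag torch._dynamo.config.compiled_autograd is an assumption (it is not named in this thread), and the switch for the new CompiledDDP is omitted because the thread does not spell it out.

# Minimal sketch (not from the PR). Assumptions: torch._dynamo.config.compiled_autograd
# is the global toggle for compiled autograd; the separate "new CompiledDDP" switch
# mentioned above is not shown because its flag is not named in this thread.
import os
import torch
import torch.distributed as dist
from torch.nn.parallel import DistributedDataParallel as DDP

def train_step(model, x):
    loss = model(x).sum()
    loss.backward()   # backward runs under compiled autograd when the flag is set

if __name__ == "__main__":
    # Single-process process group just so DDP can be constructed for the sketch.
    os.environ.setdefault("MASTER_ADDR", "127.0.0.1")
    os.environ.setdefault("MASTER_PORT", "29500")
    dist.init_process_group("gloo", rank=0, world_size=1)

    torch._dynamo.config.compiled_autograd = True   # assumed global toggle
    model = DDP(torch.nn.Linear(10, 10))
    step = torch.compile(train_step)
    step(model, torch.randn(8, 10))                 # forward, compiled backward, DDP gradient hooks

    dist.destroy_process_group()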
def train(model, x):
    model = torch.compile(model)
    loss = model(x).sum()
    with torch._dynamo.compiled_autograd.enable(torch.compile(fullgraph=True)):
Do we have to ask the user to wrap both the forward and the backward inside torch._dynamo.compiled_autograd.enable, as both DDP and FSDP require users to do so?
No, this is the second, "more flexible" API.
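For reference, this is roughly what the snippet quoted above looks like when only the backward is wrapped; the loss.backward() body is inferred, since the quoted diff cuts off at the with line.

# Sketch of the context-manager API quoted above. Assumption: the body of the `with`
# block is `loss.backward()` -- the quoted diff ends at the `with` line, so the body
# shown here is inferred rather than copied from the PR.
import torch

def train(model, x):
    model = torch.compile(model)    # forward compiled as usual, outside the context manager
    loss = model(x).sum()
    # Only the backward runs under compiled autograd.
    with torch._dynamo.compiled_autograd.enable(torch.compile(fullgraph=True)):
        loss.backward()

train(torch.nn.Linear(10, 10), torch.randn(8, 10))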
@svekars can we do that in a separate PR? We don't even have a compile section right now, and I'd like to add in the torch.compile tutorial as well as the TORCH_LOGS one.
New description looks good to me!
Fixes #3034
Description
Add a tutorial for compiled autograd, a PyTorch 2.4 feature.