add adam optimizer benchmark #1764
base: main
Conversation
Hi @khushi-411, thanks for contributing! Unfortunately your implementation does not benchmark the Adam optimizer through Thunder.
To trace the optimizer step we need to provide it to Thunder in a functional form. So I think what @crcrpar intended for issue #1213 was a benchmark of the following function: https://github.com/pytorch/pytorch/blob/b0042286d48e2d202019d3defd3b53086efb1e6e/torch/optim/adam.py#L866
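For illustration, a minimal sketch of tracing that functional form with thunder.jit, assuming Thunder can handle the in-place functional step (the tensor shapes and hyperparameters below are illustrative, not the code in this PR):

```python
import torch
import thunder
from torch.optim.adam import adam  # the functional Adam linked above

def adam_step(params, grads, exp_avgs, exp_avg_sqs, state_steps):
    # One functional, in-place Adam update over flat lists of tensors.
    adam(
        params, grads, exp_avgs, exp_avg_sqs, [], state_steps,
        amsgrad=False, beta1=0.9, beta2=0.999, lr=1e-3,
        weight_decay=0.0, eps=1e-8, maximize=False,
    )

# Illustrative inputs: a few 1024x1024 parameters and matching optimizer state.
params = [torch.randn(1024, 1024) for _ in range(4)]
grads = [torch.randn_like(p) for p in params]
exp_avgs = [torch.zeros_like(p) for p in params]
exp_avg_sqs = [torch.zeros_like(p) for p in params]
state_steps = [torch.tensor(1.0) for _ in params]

jitted_step = thunder.jit(adam_step)  # trace the optimizer step through Thunder
jitted_step(params, grads, exp_avgs, exp_avg_sqs, state_steps)
```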
Hi @riccardofelluga! Thank you for reviewing the PR and for your suggestions. I've made the updates, please take another look whenever you have time. :-)
So far so good! Big improvement from last time, though there are still a couple of things to address.
Side question: why is this named after litgpt? In the end, the function you are benchmarking comes from torch.
Great work! It's getting into good shape now, just a couple of nits and the requires_grad situation to sort out before crossing the finish line.
Is it really useful to benchmark the backward function of the optimizer step?
Thank you, @riccardofelluga, for all your useful suggestions!
No, I don't think so: even if we computed gradients through the optimizer step, they wouldn't be useful (at least in general cases like this). One reason I thought to explicitly declare …
Indeed! I think the best solution here would be to parametrize only for …
That error comes from using the decorator …
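On the backward question above: if none of the benchmark tensors have requires_grad set, only the optimizer step itself is measured and there is nothing to differentiate. A hedged sketch of how that could be parametrized with the pytest-benchmark fixture (assumed shapes, names, and executors, not the PR's actual test):

```python
import pytest
import torch
import thunder
from torch.optim.adam import adam  # functional Adam from torch

def make_state(n=4, shape=(1024, 1024)):
    # requires_grad stays False on every tensor, so there is no backward to time.
    params = [torch.randn(shape) for _ in range(n)]
    grads = [torch.randn_like(p) for p in params]
    exp_avgs = [torch.zeros_like(p) for p in params]
    exp_avg_sqs = [torch.zeros_like(p) for p in params]
    state_steps = [torch.tensor(1.0) for _ in params]
    return params, grads, exp_avgs, exp_avg_sqs, state_steps

def adam_step(params, grads, exp_avgs, exp_avg_sqs, state_steps):
    adam(
        params, grads, exp_avgs, exp_avg_sqs, [], state_steps,
        amsgrad=False, beta1=0.9, beta2=0.999, lr=1e-3,
        weight_decay=0.0, eps=1e-8, maximize=False,
    )

@pytest.mark.parametrize("executor", ("eager", "thunder"))
def test_adam_step_benchmark(benchmark, executor):
    args = make_state()
    fn = thunder.jit(adam_step) if executor == "thunder" else adam_step
    benchmark(fn, *args)  # pytest-benchmark times repeated calls to fn(*args)
```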
Before submitting
What does this PR do?
Fixes a part of #1213
Hi Team! This PR adds a benchmark for the Adam optimizer in Thunder, covering both training and inference.
PR review
Anyone in the community is free to review the PR once the tests have passed.
If we didn't discuss your PR in GitHub issues, there's a high chance it will not be merged.
Did you have fun?
Yes, indeed 🎉
Benchmarking Results
Command to run