[DeepSeek] remove numpy, avoid tolist in gatherd_idxs #1019
a453229 to 1f1b16d
Thanks much for the PR! Indeed would look much nicer! Do you mind testing your PR with I am seeing this result:
Hmm, weird, yes I'm seeing the same. I hadn't tested Let me look into it and I'll ping you when I figure it out.
Ah, it was just the
Apologies for that; should have tested more carefully.
Looks great! Thank you for the nice code!
@garrett361 Nice add! Do you notice any speed boost from removing the CPU sync?
Thanks @EugenHotaj! Good question, but I didn't time it and the
Removes the `numpy` usage and `tolist` CUDA sync when computing `gatherd_idxs`.
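For readers unfamiliar with the pattern, here is a minimal sketch of the general idea, not the code from this PR. It assumes a hypothetical layout in which received tokens are stored rank-major (`counts[r, e]` tokens from rank `r` for local expert `e`) and need to be regrouped expert-major. The function names, the `counts` layout, and the `num_tokens` argument are all illustrative assumptions. The loop version calls `tolist()`, which forces a device-to-host sync on CUDA tensors; the vectorized version stays in `torch` ops and passes `output_size` to `torch.repeat_interleave` so the output shape does not have to be read back from the device.

```python
import torch


def gather_idxs_loop(counts: torch.Tensor) -> torch.Tensor:
    """Reference version (hypothetical): Python loops over counts.tolist()."""
    num_ranks, num_experts = counts.shape
    counts_list = counts.tolist()  # on CUDA this blocks until the device catches up
    # Start offset of each (rank, expert) chunk in the rank-major source layout.
    starts, running = [], 0
    for r in range(num_ranks):
        row = []
        for e in range(num_experts):
            row.append(running)
            running += counts_list[r][e]
        starts.append(row)
    # Walk the chunks in expert-major order and collect source indices.
    idxs = []
    for e in range(num_experts):
        for r in range(num_ranks):
            idxs.extend(range(starts[r][e], starts[r][e] + counts_list[r][e]))
    return torch.tensor(idxs, dtype=torch.long)


def gather_idxs_vectorized(counts: torch.Tensor, num_tokens: int) -> torch.Tensor:
    """Same result using only tensor ops; num_tokens is assumed known from shapes."""
    num_ranks, num_experts = counts.shape
    flat = counts.reshape(-1)                     # chunk sizes, rank-major
    src_starts = torch.cumsum(flat, 0) - flat     # exclusive prefix sum
    # Permutation that lists the chunks expert-major instead of rank-major.
    order = (
        torch.arange(num_ranks * num_experts, device=counts.device)
        .reshape(num_ranks, num_experts)
        .t()
        .reshape(-1)
    )
    sizes = flat[order]
    dst_starts = torch.cumsum(sizes, 0) - sizes
    # Each output position p in chunk k reads source index src_start_k + (p - dst_start_k).
    shift = torch.repeat_interleave(
        src_starts[order] - dst_starts, sizes, output_size=num_tokens
    )
    return shift + torch.arange(num_tokens, device=counts.device)


counts = torch.tensor([[1, 2], [2, 1]])
print(gather_idxs_vectorized(counts, num_tokens=6))
```

In this sketch `num_tokens` would typically come from the shape of the token tensor itself, which is available on the host without a sync. Whether this gives a measurable speedup depends on the surrounding code and was not timed here.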