Start to cleanup/unify accelerate and common back-ends (Part 1/N) #1777
Sorry for the massive diff. I'm going to stop adding to this one as I think it's mergeable. I will share some benchmarks shortly. There is still some work to do to remove the rest of accelerate, but it's nearly there. After that, I think it would be good to split the "CPU" part of common out into a CPU back-end (which mirrors
QMM benchmarks, M3 Max
Unary ops M3 Max TLDR
Binary op on M3 Max Observations:
This looks great to me!!! Fantastic job.
Can't wait to add this to the CPU compile as well. It's gonna be beautiful.
constexpr std::array<uint32_t, 8> shifts_ = {{0, 8, 16, 24, 0, 8, 16, 24}};
auto shifts(*(simd::Simd<uint32_t, S>*)&shifts_);
auto l = simd::Simd<uint32_t, 4>(*w++);
auto r = simd::Simd<uint32_t, 4>(*w);
Sorry for posting post-merge, but the following is easier to describe in the context of this PR:
Clang on the latest FreeBSD releases, 13.4 and 14.2, is unhappy with the simd::Simd calls in L167-L168:
error: implicit instantiation of undefined template 'mlx::core::simd::Simd<unsigned int, 4>'
cf. https://buildkite.com/julialang/yggdrasil/builds/17042#0194c6a1-5b8d-4a7c-8cef-8e098d29b750/6-32274
Not sure how to resolve this...
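For context on the error text itself: Clang prints "implicit instantiation of undefined template" when code needs a complete type from a class template whose matching definition is not visible, for example when the primary template is only declared and no specialization covers the requested width. The sketch below reproduces that pattern in isolation. The namespace `repro`, the width-1 specialization, and `load_one` are hypothetical and are not taken from MLX; whether this is the actual cause on FreeBSD is an assumption that would need checking against the real headers.

```cpp
#include <cstdint>

namespace repro {

// Primary template: declared but never defined. Any (T, N) pair without a
// matching specialization below is therefore an incomplete type.
template <typename T, int N>
struct Simd;

// Hypothetical specialization for the only width this "platform" provides.
template <typename T>
struct Simd<T, 1> {
  T value;
  Simd(T v) : value(v) {}
};

// Compiles: Simd<uint32_t, 1> is a complete type.
inline uint32_t load_one(const uint32_t* w) {
  Simd<uint32_t, 1> v(*w);
  return v.value;
}

// Does not compile if uncommented, because no definition of
// Simd<uint32_t, 4> is visible. Clang reports:
//   implicit instantiation of undefined template
//   'repro::Simd<unsigned int, 4>'
//
// inline uint32_t load_four(const uint32_t* w) {
//   Simd<uint32_t, 4> v(*w);
//   return v.value;
// }

} // namespace repro
```

If the real failure follows this pattern, the usual options are to make a definition for the requested width visible on that platform or to guard the call site so it only uses widths that are defined there.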
Might be good to file an issue. That way we won't forget about this and we can help debug it.
Begin to clean up accelerate and common back-ends. This is just a small step and not the final API by any means, but I think it is a step in the right direction.
mlx::core::simd: Simd<T, N> generic type and many free functions
eval with primitives eval_cpu in both accelerate and common
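To make the "generic type and many free functions" idea concrete, here is a minimal portable sketch of a fixed-width `Simd<T, N>` wrapper with a few lane-wise free functions. It is not the MLX implementation: the namespace `sketch`, the array-backed storage, and the particular set of functions (`load`, `store`, `operator>>`, `operator&`, `sum`) are assumptions chosen only for illustration.

```cpp
#include <cstdint>

namespace sketch {

// Hypothetical generic fallback: N lanes of T stored in a plain array.
template <typename T, int N>
struct Simd {
  T lanes[N] = {};

  Simd() = default;

  // Broadcast a scalar to every lane.
  Simd(T v) {
    for (int i = 0; i < N; ++i) lanes[i] = v;
  }

  T operator[](int i) const { return lanes[i]; }
  T& operator[](int i) { return lanes[i]; }
};

// Free functions operate lane by lane, so a back-end with real vector
// instructions could overload or specialize them without touching callers.
template <typename T, int N>
Simd<T, N> load(const T* p) {
  Simd<T, N> out;
  for (int i = 0; i < N; ++i) out[i] = p[i];
  return out;
}

template <typename T, int N>
void store(T* p, const Simd<T, N>& x) {
  for (int i = 0; i < N; ++i) p[i] = x[i];
}

template <typename T, int N>
Simd<T, N> operator>>(const Simd<T, N>& a, const Simd<T, N>& b) {
  Simd<T, N> out;
  for (int i = 0; i < N; ++i) out[i] = a[i] >> b[i];
  return out;
}

template <typename T, int N>
Simd<T, N> operator&(const Simd<T, N>& a, const Simd<T, N>& b) {
  Simd<T, N> out;
  for (int i = 0; i < N; ++i) out[i] = a[i] & b[i];
  return out;
}

// Horizontal reduction: add all lanes together.
template <typename T, int N>
T sum(const Simd<T, N>& x) {
  T acc = T(0);
  for (int i = 0; i < N; ++i) acc += x[i];
  return acc;
}

} // namespace sketch
```

With this sketch, broadcasting one packed 32-bit word to four lanes, shifting the lanes right by 0, 8, 16, and 24, and masking with `0xff` yields the word's four bytes, one per lane.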
TODOs for this PR:
TODOs for future PRs
Simd<T, N>