-
Notifications
You must be signed in to change notification settings - Fork 381
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
fix: Add x86_64
archspec support for Windows
#3803
base: main
Are you sure you want to change the base?
Conversation
Signed-off-by: Julien Jerphanion <[email protected]>
Hi Mamba folks, is there any particular testing that would be helpful to get this PR over the line? Or is it just waiting for someone to review it |
@intentionally-left-nil: we need someone to test and review it, ideally on several Windows machines with different architectures. We welcome your feedback greatly! 🙂 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM although having feedback from people who can test it would be nice.
The nitpicking comments can be ignored if you prefer to keep the exhaustive check list for each version of x86_64.
bool avx512cd = cpu_info[1] & (1 << 28); | ||
bool avx512bw = cpu_info[1] & (1 << 30); | ||
bool avx512vl = cpu_info[1] & (1 << 31); | ||
if (avx512f && avx512dq && avx512cd && avx512bw && avx512vl) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nitpicking: avx512 dq/cd/bw/vl are all implemented on top of avx512f, so no need to check for this value.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree, but I propose to keep the logic in line with the one for unix system as currently implemented a few lines above because it is cheap and it does not make assumption (also, I would take time to make sure that the above is correctly implemented, because more instructions seem to be present for each architecture).
bool bmi2 = cpu_info[1] & (1 << 8); | ||
bool fma_ = cpu_info[1] & (1 << 12); | ||
bool avx = cpu_info[1] & (1 << 28); | ||
if (bmi && avx2 && bmi2 && fma_ && avx) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Similar remark here: avx2 is implmeented on top of avx, so no need to check for avx.
bool sse41 = cpu_info[1] & (1 << 19); | ||
bool sse42 = cpu_info[1] & (1 << 20); | ||
bool popcnt = cpu_info[1] & (1 << 23); | ||
if (sse3 && ssse3 && sse41 && sse42 && popcnt) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Similar remark here: sse42 implies sse41 which implies ssse3 which implies sse3,so checking sse42 should be enough (I haven't checked for popcnt).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same remark here, the support for all these instructions should be read from cpu_info[2] after call to cpuid with eax=1.
Signed-off-by: Julien Jerphanion <[email protected]> Co-authored-by: Johan Mabille <[email protected]>
I tested on my Windows machine, it has an AMD Ryzen 9 5900X , running Windows 11. I built this branch locally using Visual Studio (preview)
I was told |
|
Perhaps I compiled something wrong but my results on Windows 11 with an i9-13950HX:
Compiler: |
@nbeerbower I think you're testing the wrong binary, the output version is 2.0.6.rc2, while it should be 2.0.6. |
if (cpu_info[0] < 7) | ||
{ | ||
return "x86_64"; | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure of this, eax>7 is required to get supported instructions for v4, but instructions for lower versions (such as v3) are checked with a lower value of eax (see my other comment below).
bool avx2 = cpu_info[1] & (1 << 5); | ||
bool bmi2 = cpu_info[1] & (1 << 8); | ||
bool fma_ = cpu_info[1] & (1 << 12); | ||
bool avx = cpu_info[1] & (1 << 28); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The support for fma and avx should be checked after a call to cpuid with eax = 1, and read from ecx (i.e. cpu_info[2]), see https://en.wikipedia.org/wiki/CPUID#EAX=1:_Processor_Info_and_Feature_Bits
Sorry for not catching it during the first review.
bool sse41 = cpu_info[1] & (1 << 19); | ||
bool sse42 = cpu_info[1] & (1 << 20); | ||
bool popcnt = cpu_info[1] & (1 << 23); | ||
if (sse3 && ssse3 && sse41 && sse42 && popcnt) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same remark here, the support for all these instructions should be read from cpu_info[2] after call to cpuid with eax=1.
Signed-off-by: Julien Jerphanion <[email protected]> Co-authored-by: Johan Mabille <[email protected]>
Signed-off-by: Julien Jerphanion <[email protected]>
Fix #3795.