Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix: Add x86_64 archspec support for Windows #3803

Open
wants to merge 4 commits into
base: main
Choose a base branch
from

Conversation

jjerphan
Copy link
Member

@jjerphan jjerphan commented Feb 5, 2025

Fix #3795.

@jjerphan jjerphan added the release::bug_fixes For PRs fixing bugs label Feb 5, 2025
@jjerphan jjerphan requested a review from Klaim February 5, 2025 12:37
@jjerphan jjerphan marked this pull request as ready for review February 5, 2025 12:37
@intentionally-left-nil
Copy link

Hi Mamba folks, is there any particular testing that would be helpful to get this PR over the line? Or is it just waiting for someone to review it

@jjerphan
Copy link
Member Author

@intentionally-left-nil: we need someone to test and review it, ideally on several Windows machines with different architectures.

We welcome your feedback greatly! 🙂

Copy link
Member

@JohanMabille JohanMabille left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM although having feedback from people who can test it would be nice.

The nitpicking comments can be ignored if you prefer to keep the exhaustive check list for each version of x86_64.

libmamba/src/core/virtual_packages.cpp Show resolved Hide resolved
libmamba/src/core/virtual_packages.cpp Outdated Show resolved Hide resolved
bool avx512cd = cpu_info[1] & (1 << 28);
bool avx512bw = cpu_info[1] & (1 << 30);
bool avx512vl = cpu_info[1] & (1 << 31);
if (avx512f && avx512dq && avx512cd && avx512bw && avx512vl)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nitpicking: avx512 dq/cd/bw/vl are all implemented on top of avx512f, so no need to check for this value.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree, but I propose to keep the logic in line with the one for unix system as currently implemented a few lines above because it is cheap and it does not make assumption (also, I would take time to make sure that the above is correctly implemented, because more instructions seem to be present for each architecture).

bool bmi2 = cpu_info[1] & (1 << 8);
bool fma_ = cpu_info[1] & (1 << 12);
bool avx = cpu_info[1] & (1 << 28);
if (bmi && avx2 && bmi2 && fma_ && avx)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Similar remark here: avx2 is implmeented on top of avx, so no need to check for avx.

bool sse41 = cpu_info[1] & (1 << 19);
bool sse42 = cpu_info[1] & (1 << 20);
bool popcnt = cpu_info[1] & (1 << 23);
if (sse3 && ssse3 && sse41 && sse42 && popcnt)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Similar remark here: sse42 implies sse41 which implies ssse3 which implies sse3,so checking sse42 should be enough (I haven't checked for popcnt).

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same remark here, the support for all these instructions should be read from cpu_info[2] after call to cpuid with eax=1.

Signed-off-by: Julien Jerphanion <[email protected]>

Co-authored-by: Johan Mabille <[email protected]>
@Klaim
Copy link
Member

Klaim commented Feb 13, 2025

I tested on my Windows machine, it has an AMD Ryzen 9 5900X , running Windows 11.

I built this branch locally using Visual Studio (preview) v17.13.0-pre.2.1, compiler version 19.43.34618 for x64.
I obtained:

mamba info -v

       libmamba version : 2.0.6
          mamba version : 2.0.6
           curl version : libcurl/8.11.1 Schannel zlib/1.3.1 libssh2/1.11.1
     libarchive version : libarchive 3.7.7 zlib/1.3.1 liblzma/5.6.3 bz2lib/1.0.8 liblz4/1.10.0 libzstd/1.5.6
       envs directories : C:\Users\Klaim\micromamba\envs
          package cache : C:\Users\Klaim\micromamba\pkgs
                          C:\Users\Klaim\.mamba\pkgs
                          C:\Users\Klaim\AppData\Roaming\.mamba\pkgs
            environment : mamba-pr-testing (active)
           env location : C:\Users\Klaim\micromamba\envs\mamba-pr-testing
      user config files : C:\Users\Klaim\.mambarc
 populated config files :
       virtual packages : __win=10.0.22631=0
                          __archspec=1=x86_64_v2
                          __cuda=12.7=0
               channels : https://conda.anaconda.org/conda-forge/noarch
                          https://conda.anaconda.org/conda-forge/win-64
       base environment : C:\Users\Klaim\micromamba
               platform : win-64

I was told __archspec should be v3 for this case? (I have no idea what is the correct value for this cpu)

@jjerphan
Copy link
Member Author

AMD Ryzen 9 5900X qualifies for x86_64_v3.

@nbeerbower
Copy link

Perhaps I compiled something wrong but my results on Windows 11 with an i9-13950HX:


       libmamba version : 2.0.6.rc2
          mamba version : 2.0.6.rc2
           curl version : libcurl/8.11.1 Schannel zlib/1.3.1 libssh2/1.11.1
     libarchive version : libarchive 3.7.7 zlib/1.3.1 liblzma/5.6.3 bz2lib/1.0.8 liblz4/1.10.0 libzstd/1.5.6
       envs directories : C:\Users\Nick\.local\share\mamba\envs
          package cache : C:\Users\Nick\.local\share\mamba\pkgs
                          C:\Users\Nick\.mamba\pkgs
                          C:\Users\Nick\AppData\Roaming\.mamba\pkgs
            environment : build_env (active)
           env location : C:\Users\Nick\.local\share\mamba\envs\build_env
      user config files : C:\Users\Nick\.mambarc
 populated config files :
       virtual packages : __win=10.0.26100=0
                          __archspec=1=x86_64
                          __cuda=12.4=0
               channels : https://conda.anaconda.org/conda-forge/noarch
                          https://conda.anaconda.org/conda-forge/win-64
       base environment : C:\Users\Nick\.local\share\mamba
               platform : win-64

Compiler: Microsoft (R) C/C++ Optimizing Compiler Version 19.29.30154 for x64

@JohanMabille
Copy link
Member

@nbeerbower I think you're testing the wrong binary, the output version is 2.0.6.rc2, while it should be 2.0.6.

if (cpu_info[0] < 7)
{
return "x86_64";
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure of this, eax>7 is required to get supported instructions for v4, but instructions for lower versions (such as v3) are checked with a lower value of eax (see my other comment below).

bool avx2 = cpu_info[1] & (1 << 5);
bool bmi2 = cpu_info[1] & (1 << 8);
bool fma_ = cpu_info[1] & (1 << 12);
bool avx = cpu_info[1] & (1 << 28);
Copy link
Member

@JohanMabille JohanMabille Feb 14, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The support for fma and avx should be checked after a call to cpuid with eax = 1, and read from ecx (i.e. cpu_info[2]), see https://en.wikipedia.org/wiki/CPUID#EAX=1:_Processor_Info_and_Feature_Bits

Sorry for not catching it during the first review.

bool sse41 = cpu_info[1] & (1 << 19);
bool sse42 = cpu_info[1] & (1 << 20);
bool popcnt = cpu_info[1] & (1 << 23);
if (sse3 && ssse3 && sse41 && sse42 && popcnt)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same remark here, the support for all these instructions should be read from cpu_info[2] after call to cpuid with eax=1.

jjerphan and others added 2 commits February 14, 2025 10:17
Signed-off-by: Julien Jerphanion <[email protected]>

Co-authored-by: Johan Mabille <[email protected]>
Signed-off-by: Julien Jerphanion <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
release::bug_fixes For PRs fixing bugs
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Archspec not implemented for Windows
5 participants