Enable builds without direct torch.cuda availability and support sm89 / sm90. #5

wanderingai · 2023-11-22T01:15:53Z

This PR allows the monarch_cuda kernel to be built for compute capabilities sm80, sm89, and sm90, which cover the following GPUs:

A100
H100
L40
RTX 6000 Ada

Additionally, setup.py is updated so that builds depend on nvcc availability rather than on direct torch.cuda availability, which allows more flexible builds.
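As a rough illustration of checking for nvcc without touching torch.cuda, here is a minimal sketch. The helper name `nvcc_available` and the reliance on the `CUDA_HOME` environment variable are assumptions for illustration; the actual setup.py in this PR may detect the compiler differently.

```python
import os
import shutil


def nvcc_available(cuda_home=None):
    """Return True if an nvcc executable can be located (hypothetical helper).

    Looks in <cuda_home>/bin first, where cuda_home defaults to the
    CUDA_HOME environment variable, then falls back to searching PATH.
    No GPU or torch.cuda call is involved.
    """
    cuda_home = cuda_home or os.environ.get("CUDA_HOME")
    if cuda_home:
        candidate = os.path.join(cuda_home, "bin", "nvcc")
        if os.path.isfile(candidate) and os.access(candidate, os.X_OK):
            return True
    # Fall back to whatever nvcc is reachable through PATH, if any.
    return shutil.which("nvcc") is not None
```

A check like this only needs the CUDA toolkit to be installed, so it can pass inside a docker build stage where no GPU is visible.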

Update:

The compiler flags have been abstracted to support both PTX and SASS builds, while defaulting to the original Ampere-based PTX-only build, i.e. -gencode=arch=compute_80,code=compute_80.
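The flag abstraction described above could look roughly like the following sketch. The function name `gencode_flags` and its parameters are hypothetical and not taken from the PR; only the `-gencode=arch=...,code=...` syntax is standard nvcc usage.

```python
def gencode_flags(archs=("80",), ptx=True, sass=False):
    """Build nvcc -gencode flags for the given compute capabilities.

    Hypothetical helper: `archs` holds capability numbers as strings
    (e.g. "80", "89", "90"). PTX targets use code=compute_XX, SASS
    targets use code=sm_XX. The defaults reproduce the original
    Ampere PTX-only build.
    """
    flags = []
    for arch in archs:
        if sass:
            # Precompiled machine code for this exact architecture.
            flags.append(f"-gencode=arch=compute_{arch},code=sm_{arch}")
        if ptx:
            # Intermediate PTX, JIT-compiled by the driver at load time.
            flags.append(f"-gencode=arch=compute_{arch},code=compute_{arch}")
    return flags
```

With no arguments this returns the single default flag quoted above; passing `archs=("80", "89", "90")` with `sass=True` would add SASS targets for all three architectures.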

Successfully tested by building a docker image and running tests under tests/:


DanFu09 · 2023-11-22T04:17:53Z

It looks like this PR is introducing some race conditions - when I install using this branch, some tests fail:

pytest -s -q tests/test_flashfftconv.py
Running 1120 items in this shard
......................................................................................................................................................................................................................
F.....................................................................................................................................................................................................................
................................................................F.F...F...............................................................................................................................................
......................................................................................................................................................................................................................
......................................................................................................................................................................................................................
..................................................


michaelfeil · 2024-01-16T13:29:21Z

@wanderingai Love this PR, it's an improvement over the previous setup.py. Can this be merged, in the worst case without sm_90 flags by default?

Enable sm89 and sm90 builds and allow builds without direct cuda availability.

9461b6b

DanFu09 self-requested a review November 22, 2023 04:18

Support both ptx and sass builds and default to original ampere-based ptx build.

b1f4f83
