Commit a97cf56

syed-ahmed authored and facebook-github-bot committed
Alignas Array struct (pytorch#14920)
Summary:
This PR aligns the Array struct such that cuda vector performance improvements can be utilized. I tested this by using it on our Philox header. Note how the vector store instruction gets used for cuda vector types and when using alignas on Array, vs when not using alignas on Array.

With cuda vector type (uint4, uint2, float4): https://godbolt.org/z/UaWOmR
With alignas: https://godbolt.org/z/Eeh0t5
Without alignas: https://godbolt.org/z/QT63gq

Pull Request resolved: pytorch#14920

Differential Revision: D13406751

Pulled By: soumith

fbshipit-source-id: 685b1010ef1f576dde30c278b1e9b642f87c843d
1 parent 7e2b074 commit a97cf56

1 file changed: +4 -0 lines changed

aten/src/ATen/cuda/Array.h

Lines changed: 4 additions & 0 deletions
@@ -7,7 +7,11 @@
 namespace at { namespace cuda {
 
 template <typename T, int size>
+#ifndef __HIP_PLATFORM_HCC__
+struct alignas(16) Array {
+#else
 struct Array {
+#endif
   T data[size];
 
   C10_HOST_DEVICE T operator[](int i) const {
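
The effect of `alignas(16)` on the struct layout can be checked on the host without a CUDA toolchain. The sketch below is illustrative only: `PlainArray`, `AlignedArray`, and the helper functions are hypothetical stand-ins, not part of ATen, and the checks assume a mainstream platform where `uint32_t` has 4-byte alignment. It shows that the attribute raises the struct's alignment requirement to 16 bytes, which is the guarantee a compiler needs before it can consider a single 16-byte vector store for the whole struct.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical host-only stand-ins for at::cuda::Array (names are illustrative).
template <typename T, int size>
struct PlainArray {
  T data[size];
};

template <typename T, int size>
struct alignas(16) AlignedArray {
  T data[size];
};

// True when the address is a multiple of 16 bytes.
inline bool is_16_byte_aligned(const void* p) {
  return reinterpret_cast<std::uintptr_t>(p) % 16 == 0;
}

// alignas(16) raises the alignment requirement of the struct to 16 bytes.
static_assert(alignof(AlignedArray<std::uint32_t, 4>) == 16,
              "aligned variant must be 16-byte aligned");
// Assumption: without alignas, the struct keeps the element's smaller alignment.
static_assert(alignof(PlainArray<std::uint32_t, 4>) < 16,
              "plain variant is expected to have a smaller alignment");
// sizeof is rounded up to a multiple of the alignment, so small arrays get padded.
static_assert(sizeof(AlignedArray<std::uint32_t, 4>) == 16, "no padding needed");
static_assert(sizeof(AlignedArray<std::uint32_t, 1>) == 16, "padded from 4 to 16 bytes");

// An automatic (stack) object must honor the declared alignment.
inline bool aligned_array_on_stack_is_aligned() {
  AlignedArray<std::uint32_t, 4> a{};
  return is_16_byte_aligned(&a) && is_16_byte_aligned(a.data);
}
```

One consequence of this design is visible in the last `static_assert`: an array smaller than 16 bytes is padded up to 16, so the alignment guarantee trades some memory for the ability to use wider stores.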
