
@AlexanderSinn (Member) commented Jan 28, 2026

Summary

This PR aims to provide a unified interface for writing kernels that use shared memory and __syncthreads on CUDA, HIP, and SYCL without needing #ifdefs.

The number of threads per block is always a 1D value known at compile time, while the number of blocks can be 1D, 2D, or 3D, using the built-in platform indices such as blockIdx.y.
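To make the execution model above concrete, here is a minimal host-only C++ sketch. It is not the interface added by this PR: Dim3, block_sum, and launch_block_sums are hypothetical names invented for illustration, and the GPU is emulated with plain loops. It assumes a compile-time 1D thread count (a template parameter), a block grid of up to three dimensions, one scratch array per block standing in for shared memory, and a barrier between phases, which the emulation gets for free by finishing one loop over all threads before starting the next.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for the built-in gridDim/blockIdx triple.
struct Dim3 { int x, y, z; };

// Emulates one block: sums Threads consecutive inputs starting at offset.
// Threads is a compile-time 1D block size, as described in the summary.
template <int Threads>
double block_sum(const std::vector<double>& in, std::size_t offset)
{
    static_assert(Threads > 0 && (Threads & (Threads - 1)) == 0,
                  "this sketch assumes a power-of-two block size");

    // Plays the role of per-block shared memory.
    std::array<double, Threads> shared{};

    // Phase 1: each emulated thread loads one element.
    for (int tid = 0; tid < Threads; ++tid) {
        shared[tid] = in[offset + tid];
    }
    // Leaving the loop corresponds to a __syncthreads-style barrier.

    // Phase 2: tree reduction, one barrier per halving step.
    for (int stride = Threads / 2; stride > 0; stride /= 2) {
        for (int tid = 0; tid < stride; ++tid) {
            shared[tid] += shared[tid + stride];
        }
        // Barrier: every thread finishes this step before the next begins.
    }
    return shared[0];
}

// Emulates a launch over a 1D, 2D, or 3D grid of blocks.
// Returns one partial sum per block, ordered with x varying fastest.
template <int Threads>
std::vector<double> launch_block_sums(Dim3 grid, const std::vector<double>& in)
{
    const std::size_t nblocks =
        static_cast<std::size_t>(grid.x) * grid.y * grid.z;
    assert(in.size() == nblocks * Threads);

    std::vector<double> out(nblocks);
    for (int bz = 0; bz < grid.z; ++bz) {
        for (int by = 0; by < grid.y; ++by) {
            for (int bx = 0; bx < grid.x; ++bx) {
                // Flatten the 3D block index into a linear block id.
                const std::size_t block =
                    (static_cast<std::size_t>(bz) * grid.y + by) * grid.x + bx;
                out[block] = block_sum<Threads>(in, block * Threads);
            }
        }
    }
    return out;
}
```

For example, launch_block_sums<4>(Dim3{2, 3, 1}, in) emulates a 2x3 grid of blocks with 4 threads each and expects in to hold 24 values. On a real device the inner loops over tid would disappear, with each thread executing the loop body once and an actual barrier separating the phases.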

TODO:

  • Performance-test some existing kernels
  • Port/simplify existing kernels to use this interface
  • Write documentation
  • Add tests

Additional background

Checklist

The proposed changes:

  • fix a bug or incorrect behavior in AMReX
  • add new capabilities to AMReX
  • changes answers in the test suite to more than roundoff level
  • are likely to significantly affect the results of downstream AMReX users
  • include documentation in the code and/or rst files, if appropriate

@ax3l added the GPU label on Jan 29, 2026