Skip to content

Conversation

@samhatfield
Copy link
Collaborator

@samhatfield samhatfield commented Mar 31, 2025

In the CPU version, ZIA, which stores the input to the Legendre transform for all fields (local to LTINV) has dimensions (R%NLEI1,KLEI2) = (R%NSMAX+4,NFIELDS). The first dimension seems odd, but basically we have R%NSMAX+1 values going from 0 to R%NSMAX. We then add one more row, which I'm not sure the reason for yet. Then finally we add two more rows because ZIA will be used to also store gradients which are computed using a central differencing +1/-1 approach, so we need two more at the beginning and end so that every "normal" point has a gradient calculated. So R%NSMAX+4 in total.

In the GPU version we are currently allocating the equivalent array PIA of LTINV with dimensions (2*IF_READIN,R%NTMAX+3,D%NUMP) (unlike the CPU version which is defined inside a loop over zonal wavenumber, PIA stores data for all zonal wavenumbers, hence the third dimension).

R%NTMAX = R%NSMAX (which is another issue...) so PIA only has R%NSMAX+3 elements in the second dimension. However, it still needs another one like in the CPU version so that the gradient is calculated correctly. In fact the loop for calculating north-south derivatives of scalar fields accesses element R%NSMAX+4 so there is an out-of-bounds array access currently. See the loop in SPNSDE (look at the KM=0, JN=0 iteration and note that PF is a pointer to a slice of ZIA over the first dimension).

According to @lukasm91, this out-of-bounds probably wasn't dangerous as in all cases we can see, the out-of-bounds access is multipled by zero (e.g. ZEPSNM(1,0) and F%RLAPIN. Still, it's good to have the dimensions consistent with the loop limits.

This PR enlarges PIA so it matches ZIA.

Note that R%NTMAX = R%NSMAX for all intents and purposes so I replaced some NTMAX with NSMAX to be consistent with the CPU version.

Thanks @l90lpa for starting this discussion in #242.

This is to make it consistent with the CPU version and to avoid
out-of-bounds array accesses when computing spectral gradients (e.g. in
UVTVD and SPNSDE).. I have also changed a few instances of R%NTMAX to
R%NSMAX also just to ensure consistency with the CPU version.
@samhatfield samhatfield added bug Something isn't working gpu labels Mar 31, 2025
@samhatfield samhatfield requested a review from lukasm91 March 31, 2025 15:35
Copy link
Collaborator

@lukasm91 lukasm91 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes I think this makes sense. We can later think about if there is a way to make the indexing in PIA more natural (or at least document it)

#endif
DO KMLOC=1,D_NUMP
DO JN=0,R_NSMAX+3
DO JN=0,R_NSMAX+4
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @samhatfield, I might be misunderstanding your change, so apologies if so. Changing the second dimension of PIA to be R%NSMAX+4 makes sense to me. However, I think this subroutine in its original form was already writing to R%NSMAX+4 and so didn't need changing? In the original code JN is in the interval [0, R_NSMAX+3], but whenever JN is used for indexing it is used like JN+1 which lies in the intervals [1, R_NSMAX+4]. Where as with the new loop bound and if-condition JN+1 will lie in the intervals [1, R_NSMAX+5]?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think you are right, and yes I also did it this way in my attempt to revert pia (which I believe is more natural than what we do here, because indexing is very complex): main...lukasm91:ectrans:fix_pia

@l90lpa
Copy link
Contributor

l90lpa commented Apr 10, 2025

@samhatfield I've just created this issue, #253, because I think there is a similar issue for PIA POA in the dir_trans code.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

bug Something isn't working gpu

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants