-
Notifications
You must be signed in to change notification settings - Fork 47
Resize PIA 2nd dimension to R%NSMAX+4 #246
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: develop
Are you sure you want to change the base?
Conversation
This is to make it consistent with the CPU version and to avoid out-of-bounds array accesses when computing spectral gradients (e.g. in UVTVD and SPNSDE).. I have also changed a few instances of R%NTMAX to R%NSMAX also just to ensure consistency with the CPU version.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes I think this makes sense. We can later think about if there is a way to make the indexing in PIA more natural (or at least document it)
| #endif | ||
| DO KMLOC=1,D_NUMP | ||
| DO JN=0,R_NSMAX+3 | ||
| DO JN=0,R_NSMAX+4 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hi @samhatfield, I might be misunderstanding your change, so apologies if so. Changing the second dimension of PIA to be R%NSMAX+4 makes sense to me. However, I think this subroutine in its original form was already writing to R%NSMAX+4 and so didn't need changing? In the original code JN is in the interval [0, R_NSMAX+3], but whenever JN is used for indexing it is used like JN+1 which lies in the intervals [1, R_NSMAX+4]. Where as with the new loop bound and if-condition JN+1 will lie in the intervals [1, R_NSMAX+5]?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think you are right, and yes I also did it this way in my attempt to revert pia (which I believe is more natural than what we do here, because indexing is very complex): main...lukasm91:ectrans:fix_pia
|
@samhatfield I've just created this issue, #253, because I think there is a similar issue for |
In the CPU version,
ZIA, which stores the input to the Legendre transform for all fields (local toLTINV) has dimensions(R%NLEI1,KLEI2) = (R%NSMAX+4,NFIELDS). The first dimension seems odd, but basically we haveR%NSMAX+1values going from0toR%NSMAX. We then add one more row, which I'm not sure the reason for yet. Then finally we add two more rows becauseZIAwill be used to also store gradients which are computed using a central differencing +1/-1 approach, so we need two more at the beginning and end so that every "normal" point has a gradient calculated. SoR%NSMAX+4in total.In the GPU version we are currently allocating the equivalent array
PIAofLTINVwith dimensions(2*IF_READIN,R%NTMAX+3,D%NUMP)(unlike the CPU version which is defined inside a loop over zonal wavenumber,PIAstores data for all zonal wavenumbers, hence the third dimension).R%NTMAX = R%NSMAX(which is another issue...) soPIAonly hasR%NSMAX+3elements in the second dimension. However, it still needs another one like in the CPU version so that the gradient is calculated correctly. In fact the loop for calculating north-south derivatives of scalar fields accesses elementR%NSMAX+4so there is an out-of-bounds array access currently. See the loop inSPNSDE(look at theKM=0,JN=0iteration and note thatPFis a pointer to a slice ofZIAover the first dimension).According to @lukasm91, this out-of-bounds probably wasn't dangerous as in all cases we can see, the out-of-bounds access is multipled by zero (e.g.
ZEPSNM(1,0)andF%RLAPIN. Still, it's good to have the dimensions consistent with the loop limits.This PR enlarges
PIAso it matchesZIA.Note that
R%NTMAX = R%NSMAXfor all intents and purposes so I replaced someNTMAXwithNSMAXto be consistent with the CPU version.Thanks @l90lpa for starting this discussion in #242.