Description
In preparation for the transition to the next-generation CCPP framework code generator, capgen, we should try to convert the blocked GFS DDTs used in FV3atm, and any code related to them, to contiguous arrays. For threading, we can use new CCPP framework capabilities in ccpp_prebuild that send chunks of the contiguous arrays to the physics during the time integration (run) phase for OpenMP parallel processing.
Most NWP models have moved to contiguous arrays + chunking as described above, because performance comparisons (for example at UCAR/CISL) showed better performance. In addition, the code is less complicated.
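To make the "contiguous arrays + chunking" idea concrete, here is a minimal sketch in Python. It is not the ccpp_prebuild API or FV3atm code. The names `chunk_bounds`, `toy_physics`, and `run_phase` are hypothetical, and a thread pool stands in for the OpenMP parallel region. It only illustrates one flat array being processed as disjoint index ranges instead of being stored as separate blocks.

```python
from concurrent.futures import ThreadPoolExecutor


def chunk_bounds(ncols, chunk_size):
    """Return half-open (start, end) index ranges that cover 0..ncols."""
    return [(start, min(start + chunk_size, ncols))
            for start in range(0, ncols, chunk_size)]


def toy_physics(temperature, start, end):
    """Hypothetical stand-in for a physics scheme; updates one chunk in place."""
    for i in range(start, end):
        temperature[i] += 1.0


def run_phase(temperature, chunk_size, nthreads):
    """Process one contiguous array chunk by chunk, one task per chunk."""
    bounds = chunk_bounds(len(temperature), chunk_size)
    with ThreadPoolExecutor(max_workers=nthreads) as pool:
        # Chunks are disjoint index ranges, so tasks never write the same element.
        list(pool.map(lambda b: toy_physics(temperature, *b), bounds))
    return bounds


# One contiguous array holding all horizontal columns (no per-block storage).
temperature = [280.0 + 0.1 * i for i in range(10)]
bounds = run_phase(temperature, chunk_size=4, nthreads=2)
```

In this sketch the chunk size is purely a runtime parameter of the run phase. The data layout itself does not change when the chunk size or thread count changes, which is the main simplification over blocked storage.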
Solution
Implement the above solution. Ideally, all regression tests pass with bit-for-bit (b4b) identical results. To preserve b4b results, this may mean making fewer changes than would otherwise be possible.
Compare the performance of the new code against the existing code.
This will open up the possibility of further optimizing the code in a follow-up PR.
Alternatives
Implement blocked data structures in the next generation CCPP framework code generator.
@FernandoAndrade-NOAA I'm reopening this Issue to reference in an upcoming PR that builds upon #2183.
(In #2183, @climbfuji mentions that this effort was "working towards" #2294, not completing it.)
Related to
NCAR/ccpp-framework#314