GridTools version 1.1.0
GridTools
In GridTools v1.1.0 we set the default C++ standard to C++14 and drop compatibility with C++11. For CUDA builds, this requires at least CUDA 9.0.
Changes since v1.0.0
Full introduction of the SID concept
The backend is completely restructured based on the SID (stencil iterable data) concept. There should be no user-facing changes as long as user code only used the documented public API (*). The changes separate the backend implementation from the core library to allow non-intrusive extension of the library with new backends. Additionally, the maintainability of the GridTools infrastructure is significantly improved.
Performance should be improved in general, but might be worse for specific computations. We did not observe a common pattern behind the improvements or degradations.
(*) One change might trigger different behavior (though the old behavior was not documented): temporary fields are now implicitly 3-dimensional. Prior to this version, a user could have abused a 2D temporary field to accumulate values between k-levels.
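The accumulation pattern mentioned in the footnote can be written against a full 3D field instead. The following is a plain C++14 sketch under stated assumptions, not GridTools code: `field3d` and `accumulate_in_k` are hypothetical names, and a `std::vector` stands in for a storage. It only shows the idea that every k-level gets its own slot for the running sum, so nothing is carried implicitly from one level to the next.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a 3D field; not a GridTools type.
struct field3d {
    std::size_t ni, nj, nk;
    std::vector<double> data;

    field3d(std::size_t ni_, std::size_t nj_, std::size_t nk_, double init = 0.0)
        : ni(ni_), nj(nj_), nk(nk_), data(ni_ * nj_ * nk_, init) {}

    // Element access with i as the fastest-varying index.
    double &operator()(std::size_t i, std::size_t j, std::size_t k) {
        assert(i < ni && j < nj && k < nk);
        return data[(k * nj + j) * ni + i];
    }
    double operator()(std::size_t i, std::size_t j, std::size_t k) const {
        assert(i < ni && j < nj && k < nk);
        return data[(k * nj + j) * ni + i];
    }
};

// Running sum along k: out(i,j,k) = in(i,j,0) + ... + in(i,j,k).
// The value of the level below is read explicitly from the 3D output.
field3d accumulate_in_k(const field3d &in) {
    field3d out(in.ni, in.nj, in.nk);
    for (std::size_t k = 0; k < in.nk; ++k)
        for (std::size_t j = 0; j < in.nj; ++j)
            for (std::size_t i = 0; i < in.ni; ++i) {
                const double below = (k == 0) ? 0.0 : out(i, j, k - 1);
                out(i, j, k) = below + in(i, j, k);
            }
    return out;
}
```

For an input filled with 1.0 on three k-levels, the result holds 1.0, 2.0 and 3.0 on levels 0, 1 and 2 of every column.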
New
- New example illustrating the type-erasure pattern for computations. #1318
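For readers unfamiliar with the pattern, here is a minimal generic C++14 sketch of type erasure. It is not the example added in #1318 and uses no GridTools API: `any_computation`, `add_one` and `times_two` are hypothetical names. The assumption is only that each concrete computation type offers a `run()` member; the wrapper hides the concrete type behind a small virtual interface so that callers do not need to see the template machinery.

```cpp
#include <cassert>
#include <memory>
#include <utility>

// Hypothetical type-erased handle: holds any object that provides run().
class any_computation {
    // Internal interface, invisible to users of any_computation.
    struct concept_t {
        virtual ~concept_t() = default;
        virtual void run() = 0;
    };
    // Adapter that forwards to the wrapped concrete type.
    template <class T>
    struct model_t final : concept_t {
        T impl;
        explicit model_t(T x) : impl(std::move(x)) {}
        void run() override { impl.run(); }
    };
    std::unique_ptr<concept_t> m_self;

  public:
    template <class T>
    explicit any_computation(T x)
        : m_self(std::make_unique<model_t<T>>(std::move(x))) {}

    void run() {
        assert(m_self); // a moved-from handle must not be run
        m_self->run();
    }
};

// Two unrelated concrete types, standing in for heavily templated computations.
struct add_one {
    int *target;
    void run() { *target += 1; }
};
struct times_two {
    int *target;
    void run() { *target *= 2; }
};
```

Both `any_computation{add_one{&v}}` and `any_computation{times_two{&v}}` have the same static type, so they can be declared in a header and defined in a separate translation unit, which is the usual motivation for applying the pattern to expensive-to-compile templates.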
Deprecation (support will be removed in GridTools v2.0.0)
- Using gridtools::c_bindings is deprecated. Switch to the standalone cpp_bindgen: https://github.com/GridTools/cpp_bindgen.
- global_accessor is deprecated, use in_accessor (without extents) instead.
- make_global_parameter with backend as template parameter is deprecated. The backend is not needed anymore.
Fixes / Cleanup
- Fix performance for CUDA 9.2 / 10.0 #1281 #1327 #1339
- Use C++14 features. #1307
- Use multiple threads in storage initialization. #1300
- Remove dependency on boost::mpl and boost::fusion
- Fixes required to compile GridTools with HIP-Clang. Full support for AMD GPUs via HIP-Clang will come in a future release. #1363
- Fix a bug in communication. #1355
- The global_parameter doesn't require pre-allocated storage (as it is now passed via constant memory in case of CUDA), therefore global_parameter is a lightweight wrapper around the value type, which can be created without overhead, e.g. when passing it to computation.run().
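To illustrate what "lightweight wrapper around the value type" can mean, here is a small self-contained C++14 sketch. It reuses the names from the note inside a hypothetical `sketch` namespace and is an assumption about the general shape, not the GridTools implementation; `scale_sum` is an invented stand-in for a computation that consumes the parameter.

```cpp
#include <cassert>
#include <type_traits>

namespace sketch { // hypothetical namespace; not the GridTools implementation

    // A plain value wrapper: no allocation and no storage object behind it.
    template <class T>
    struct global_parameter {
        T value;
    };

    // Factory without any backend template parameter.
    template <class T>
    constexpr global_parameter<T> make_global_parameter(T value) {
        return {value};
    }

    // Invented stand-in for a computation that reads the parameter at every point.
    template <class T>
    double scale_sum(global_parameter<T> factor, const double *in, int n) {
        assert(n >= 0);
        double sum = 0.0;
        for (int i = 0; i < n; ++i)
            sum += factor.value * in[i];
        return sum;
    }

} // namespace sketch

// For simple value types the wrapper can be copied like the value itself.
static_assert(std::is_trivially_copyable<sketch::global_parameter<double>>::value,
    "the wrapper adds nothing on top of the wrapped value");
```

Because the wrapper is just the value, it can be created directly at the call site, for example `sketch::scale_sum(sketch::make_global_parameter(2.0), data, n)`.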