Debugging non-reproducibility of ACCESS-NRI 0.3.1 -> 0.4.0 #173
!test repro |
✅ The Bitwise Reproducibility Check Succeeded ✅ When comparing:
Further informationThe experiment can be found on Gadi at The checksums generated by this The checksums compared against are found here https://github.com/ACCESS-NRI/access-om3-configs/tree/90a3e99186d6c8548b4892bbde46b08067299949/testing/checksum |
!test repro |
✅ The Bitwise Reproducibility Check Succeeded ✅ When comparing:
Further informationThe experiment can be found on Gadi at The checksums generated by this The checksums compared against are found here https://github.com/ACCESS-NRI/access-om3-configs/tree/90a3e99186d6c8548b4892bbde46b08067299949/testing/checksum |
!test repro |
✅ The Bitwise Reproducibility Check Succeeded ✅ When comparing:
Further informationThe experiment can be found on Gadi at The checksums generated by this The checksums compared against are found here https://github.com/ACCESS-NRI/access-om3-configs/tree/90a3e99186d6c8548b4892bbde46b08067299949/testing/checksum |
!test repro |
❌ The Bitwise Reproducibility Check Failed ❌ When comparing:
Further informationThe experiment can be found on Gadi at The checksums generated by this The checksums compared against are found here https://github.com/ACCESS-NRI/access-om3-configs/tree/90a3e99186d6c8548b4892bbde46b08067299949/testing/checksum |
!test repro |
❌ The Bitwise Reproducibility Check Failed ❌ When comparing:
Further informationThe experiment can be found on Gadi at The checksums generated by this The checksums compared against are found here https://github.com/ACCESS-NRI/access-om3-configs/tree/90a3e99186d6c8548b4892bbde46b08067299949/testing/checksum |
…versions used in 0.4.0. This also includes updating ESMF from 8.5.0 to 8.7.0. Unfortunately, the new versions of CMEPS/CDEPS require updating ESMF, and the old versions don't work with the updated ESMF. This makes it very difficult to test just updating ESMF.
!test repro |
❌ The Bitwise Reproducibility Check Failed ❌ When comparing:
Further informationThe experiment can be found on Gadi at The checksums generated by this The checksums compared against are found here https://github.com/ACCESS-NRI/access-om3-configs/tree/90a3e99186d6c8548b4892bbde46b08067299949/testing/checksum |
Summarising what this PR shows: MOM
CESM-share
CICE
CMEPS/CDEPS
I'll open a separate issue/PR with suggestions for how to set the MOM6 parameters for the update to 0.4.0. I'm a little worried that using the new MOM supergrid functionality in CICE causes MOM velocity truncations. Should we dig into this a little before we commit to using it in ACCESS-OM3? @anton-seaice, @chrisb13? |
Thanks for exploring this and laying it out so clearly. Can we conclude that there's a problem with the supergrid implementation in CICE ( Is it expected that the CICE and ESMF updates break reproducibility? |
Possibly... I think it's unclear at this stage, but we are going to revert back to the old grid while we investigate.
I'll defer to @anton-seaice re CICE. Regarding the ESMF update from 8.5.0 to 8.7.0, I'd say no. Looking at the changelog, there's only one reported bfb change between these two releases (in 8.6.0) and that should only be observed when not using strict floating point compiler options. We use |
It's not immediately clear that the answers should have changed. ACCESS-NRI/CICE@12dd204...e68e05b There are some updates in there which are not bit-for-bit, but none look like they should impact our configurations.
Yes, let's drop the commit for now and make a new issue to investigate it.
!test repro |
✅ The Bitwise Reproducibility Check Succeeded ✅ When comparing:
Further informationThe experiment can be found on Gadi at The checksums generated by this The checksums compared against are found here https://github.com/ACCESS-NRI/access-om3-configs/tree/90a3e99186d6c8548b4892bbde46b08067299949/testing/checksum |
This passed repro tests (see 47ac794 and repro test) which suggests that the answer changes that arise from the CMEPS/CDEPS/ESMF updates in ACCESS-OM3 0.4.0 come from CMEPS/CDEPS rather than ESMF (since the ESMF changelog reports full bfb reproducibility between ESMF 8.6.0 and 8.7.0). CMEPS changes: ESCOMP/CMEPS@ffb5737...959e9a0 |
!test repro |
✅ The Bitwise Reproducibility Check Succeeded ✅ When comparing:
Further informationThe experiment can be found on Gadi at The checksums generated by this The checksums compared against are found here https://github.com/ACCESS-NRI/access-om3-configs/tree/90a3e99186d6c8548b4892bbde46b08067299949/testing/checksum |
!test repro |
❌ The Bitwise Reproducibility Check Failed ❌ When comparing:
Further informationThe experiment can be found on Gadi at The checksums generated by this The checksums compared against are found here https://github.com/ACCESS-NRI/access-om3-configs/tree/90a3e99186d6c8548b4892bbde46b08067299949/testing/checksum |
Closing as I think we've learnt what we wanted to about the loss of historical repro when we updated from 0.3.1 to 0.4.0. |
DO NOT MERGE
Companion PR to ACCESS-NRI/ACCESS-OM3#47 to debug and document where we lost reproducibility when we updated from 0.3.1 to 0.4.0.
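For anyone following along who hasn't looked at what a repro check boils down to, here is a minimal Python sketch of the comparison step. It assumes the checksums are available as a flat mapping from field name to checksum string; that layout, the field names, the checksum values, and the function names are all hypothetical and are not taken from the actual test tooling or the checksum files linked above.

```python
def diff_checksums(baseline, candidate):
    """Return {name: (baseline_value, candidate_value)} for each differing entry.

    An entry present on only one side is reported with None for the other side.
    """
    differences = {}
    for name in sorted(set(baseline) | set(candidate)):
        old = baseline.get(name)
        new = candidate.get(name)
        if old != new:
            differences[name] = (old, new)
    return differences


def is_bitwise_reproducible(baseline, candidate):
    """True only if every checksum matches exactly (no tolerance is applied)."""
    return not diff_checksums(baseline, candidate)


# Hypothetical checksums, for illustration only.
baseline = {"temp": "A1B2C3", "salt": "0FFE12", "u": "77AA01"}
candidate = {"temp": "A1B2C3", "salt": "0FFE12", "u": "77AA02"}

for name, (old, new) in diff_checksums(baseline, candidate).items():
    print(f"{name}: {old} -> {new}")
```

Reporting which fields differ, rather than only pass/fail, can help narrow down which component a change came from, although a difference in one field does not by itself identify the component responsible.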