Fix FATES branch runs #2955

Open: wants to merge 4 commits into base: master

@samsrabin (Collaborator) commented Feb 6, 2025

Description of changes

In FATES branch runs, this prevents attempts to re-allocate arrays that are already allocated (in the restart file).

This prevents run failures. However, it exposes a failure in the COMPARE_base_hybrid step of ERI tests, due to a bad hist_mfilt. This PR will also fix that.

Remaining tasks:

  • Fix COMPARE_base_hybrid failure.
  • Add FATES ERI tests to aux_clm and fates test suites.

Notes for testing

Baseline comparisons may fail for the following testmods:

  • Fates
  • FatesColdAllVars
  • FatesColdDryDepSatPhen
  • FatesColdHydro
  • FatesColdMeganSatPhen
  • FatesColdSatPhen
  • fire_emis
  • glcMEC_long

However, this is only because I have changed hist_mfilt to 1 for those testmods (from 365 for the Fates testmods, from 30 for fire_emis, and from 12 for glcMEC_long). This ensures that the dates in the history filenames are the same for both parts of the ERI test.
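
To illustrate why hist_mfilt can change the dates in history filenames, here is a toy model in Python. It is a sketch under stated assumptions, not CTSM's actual history-file logic: it assumes one sample per day, up to mfilt samples per file, and that each file is stamped with the date of its first sample. The function name and the run dates are hypothetical.

```python
from datetime import date, timedelta


def last_file_stamp(start: date, n_days: int, mfilt: int) -> date:
    """Date stamp of the last history file in a toy model.

    Assumptions (not taken from CTSM): one sample per day, up to `mfilt`
    samples per file, and each file stamped with its first sample's date.
    """
    if n_days < 1 or mfilt < 1:
        raise ValueError("n_days and mfilt must be positive")
    # Number of completely filled files that precede the last file.
    files_before_last = (n_days - 1) // mfilt
    return start + timedelta(days=files_before_last * mfilt)


# Two hypothetical runs that end on the same day but start on different days.
long_run = (date(2002, 1, 1), 50)
short_run = (date(2002, 1, 26), 25)

for mfilt in (365, 1):
    stamps = [last_file_stamp(start, n, mfilt) for start, n in (long_run, short_run)]
    print(mfilt, stamps[0].isoformat(), stamps[1].isoformat(), stamps[0] == stamps[1])
```

In this toy model, a large mfilt ties the stamp of the last file to the run's start date, so two runs with different start dates produce differently named files. With mfilt set to 1, the stamp depends only on the date of the last sample.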

When this is next in the merge queue:

  1. Rebase onto the latest tag.
  2. Check out the commit from before changing mfilt. Run aux_clm and fates suites, comparing to the latest baseline, expecting no diffs.
  3. Check out the tip of this branch. Run aux_clm and fates suites, comparing to the latest baseline, expecting baseline comparison failures (a) only for those tests and (b) only because of missing baseline files.

Specific notes

Contributors other than yourself, if any: @ckoven

CTSM Issues Fixed:

Are answers expected to change (and if so in what way)? No, although see "Notes for testing" above.

Any User Interface Changes (namelist or namelist defaults changes)? No

Does this create a need to change or add documentation? Did you do so? No

Testing performed, if any:

  ERI_Ld60.f45_f45_mg37.I2000Clm50FatesCruRsGs.derecho_intel.clm-Fates (Overall: FAIL) details:
    PASS ERI_Ld60.f45_f45_mg37.I2000Clm50FatesCruRsGs.derecho_intel.clm-Fates CREATE_NEWCASE
    PASS ERI_Ld60.f45_f45_mg37.I2000Clm50FatesCruRsGs.derecho_intel.clm-Fates XML
    PASS ERI_Ld60.f45_f45_mg37.I2000Clm50FatesCruRsGs.derecho_intel.clm-Fates SETUP
    PASS ERI_Ld60.f45_f45_mg37.I2000Clm50FatesCruRsGs.derecho_intel.clm-Fates SHAREDLIB_BUILD time=220
    PASS ERI_Ld60.f45_f45_mg37.I2000Clm50FatesCruRsGs.derecho_intel.clm-Fates MODEL_BUILD time=46
    PASS ERI_Ld60.f45_f45_mg37.I2000Clm50FatesCruRsGs.derecho_intel.clm-Fates SUBMIT
    PASS ERI_Ld60.f45_f45_mg37.I2000Clm50FatesCruRsGs.derecho_intel.clm-Fates RUN time=260
    FAIL ERI_Ld60.f45_f45_mg37.I2000Clm50FatesCruRsGs.derecho_intel.clm-Fates COMPARE_base_hybrid
    PASS ERI_Ld60.f45_f45_mg37.I2000Clm50FatesCruRsGs.derecho_intel.clm-Fates COMPARE_base_rest
    PASS ERI_Ld60.f45_f45_mg37.I2000Clm50FatesCruRsGs.derecho_intel.clm-Fates MEMLEAK
    PASS ERI_Ld60.f45_f45_mg37.I2000Clm50FatesCruRsGs.derecho_intel.clm-Fates SHORT_TERM_ARCHIVER
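
For longer test lists, the failing phases can be pulled out of status lines like those above with a short script. The sketch below assumes only the line shape visible here (`STATUS test_name PHASE`, optionally followed by `time=N`) and only the PASS and FAIL statuses; the names `PhaseResult`, `parse_status_line`, and `failing_phases` are hypothetical.

```python
from typing import NamedTuple, Optional


class PhaseResult(NamedTuple):
    status: str
    test: str
    phase: str
    time_s: Optional[int]


def parse_status_line(line: str) -> Optional[PhaseResult]:
    """Parse 'STATUS test PHASE [time=N]'; return None for any other line."""
    parts = line.split()
    if len(parts) < 3 or parts[0] not in {"PASS", "FAIL"}:
        return None
    time_s = None
    for extra in parts[3:]:
        if extra.startswith("time="):
            time_s = int(extra[len("time="):])
    return PhaseResult(parts[0], parts[1], parts[2], time_s)


def failing_phases(lines):
    """Names of the phases reported as FAIL, in input order."""
    results = (parse_status_line(line) for line in lines)
    return [r.phase for r in results if r is not None and r.status == "FAIL"]


if __name__ == "__main__":
    test = "ERI_Ld60.f45_f45_mg37.I2000Clm50FatesCruRsGs.derecho_intel.clm-Fates"
    sample = [
        f"  {test} (Overall: FAIL) details:",
        f"    PASS {test} RUN time=260",
        f"    FAIL {test} COMPARE_base_hybrid",
        f"    PASS {test} COMPARE_base_rest",
    ]
    print(failing_phases(sample))
```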

From /glade/derecho/scratch/samrabin/tests_0206-090735de/ERI_Ld60.f45_f45_mg37.I2000Clm50FatesCruRsGs.derecho_intel.clm-Fates.0206-090735de/TestStatus.log:

    ERI_Ld60.f45_f45_mg37.I2000Clm50FatesCruRsGs.derecho_intel.clm-Fates.0206-090735de.cpl.hi.2002-02-20-00000.nc.base matched ERI_Ld60.f45_f45_mg37.I2000Clm50FatesCruRsGs.derecho_intel.clm-Fates.0206-090735de.cpl.hi.2002-02-20-00000.nc.hybrid
FAIL
WARNING
Expected to compare 2 hist files, but only compared 1. It's possible
that the hist_file_extension entry in config_archive.xml is not correct
for some of your components.

@samsrabin added the bug, test: aux_clm, and test: fates labels Feb 6, 2025
@samsrabin added this to the ctsm6.0.0 (code freeze) milestone Feb 6, 2025
@samsrabin added the next label Feb 6, 2025

@samsrabin (Collaborator, Author) commented Feb 6, 2025

Two files are getting produced in each run, but only the cpl.hi output file has matching dates:

ERI_Ld60.f45_f45_mg37.I2000Clm50FatesCruRsGs.derecho_intel.clm-Fates.0206-090735de.cpl.hi.2002-02-20-00000.nc.base
ERI_Ld60.f45_f45_mg37.I2000Clm50FatesCruRsGs.derecho_intel.clm-Fates.0206-090735de.cpl.hi.2002-02-20-00000.nc.hybrid

The clm2.h0 output file has mismatched dates:

ERI_Ld60.f45_f45_mg37.I2000Clm50FatesCruRsGs.derecho_intel.clm-Fates.0206-090735de.clm2.h0.2002-01-28-00000.nc.base
ERI_Ld60.f45_f45_mg37.I2000Clm50FatesCruRsGs.derecho_intel.clm-Fates.0206-090735de.clm2.h0.2002-01-02-00000.nc.hybrid
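
The comparison above can be automated by extracting the stream name and date stamp from each filename. This is a minimal sketch that assumes filenames end in `.<component>.<stream>.<YYYY-MM-DD-SSSSS>.nc.<base|hybrid>`, as the four names listed here do; the regular expression and function names are hypothetical and are not part of CIME.

```python
import re
from collections import defaultdict

# Matches e.g. '.clm2.h0.2002-01-28-00000.nc.base' at the end of a filename.
HIST_RE = re.compile(
    r"\.(?P<stream>[^.]+\.[^.]+)"
    r"\.(?P<date>\d{4}-\d{2}-\d{2}-\d{5})"
    r"\.nc\.(?P<kind>base|hybrid)$"
)


def dates_by_stream(filenames):
    """Map each stream (e.g. 'clm2.h0') to {'base': date, 'hybrid': date}."""
    found = defaultdict(dict)
    for name in filenames:
        match = HIST_RE.search(name)
        if match:
            found[match["stream"]][match["kind"]] = match["date"]
    return dict(found)


def mismatched_streams(filenames):
    """Streams whose base and hybrid date stamps differ or are missing."""
    return sorted(
        stream
        for stream, dates in dates_by_stream(filenames).items()
        if dates.get("base") is None or dates.get("base") != dates.get("hybrid")
    )


if __name__ == "__main__":
    prefix = (
        "ERI_Ld60.f45_f45_mg37.I2000Clm50FatesCruRsGs."
        "derecho_intel.clm-Fates.0206-090735de."
    )
    files = [
        prefix + "cpl.hi.2002-02-20-00000.nc.base",
        prefix + "cpl.hi.2002-02-20-00000.nc.hybrid",
        prefix + "clm2.h0.2002-01-28-00000.nc.base",
        prefix + "clm2.h0.2002-01-02-00000.nc.hybrid",
    ]
    print(mismatched_streams(files))
```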

The entire ref2 run is completing, so it's not that. It just seems like either it's not writing the outputs or the archive step is messing up.

@ekluzek any idea why this might be happening?

@samsrabin (Collaborator, Author) commented:

Figured it out: mfilt is set wrong. Fixing.

@samsrabin samsrabin marked this pull request as ready for review February 10, 2025 20:04
@samsrabin samsrabin requested review from rgknox and glemieux February 10, 2025 20:04