Hera is moving from scratch1,2 to scratch3,4 - modulefiles updated #2579

Open: wants to merge 2 commits into base: develop

RatkoVasic-NOAA
Collaborator

Commit Queue Requirements:

  • Fill out all sections of this template.
  • All sub component pull requests have been reviewed by their code managers.
  • Run the full Intel+GNU RT suite (compared to current baselines) on either Hera/Derecho/Hercules
  • Commit 'test_changes.list' from previous step

Description:

Hera is switching from the old scratch1 and scratch2 disks to scratch3 and scratch4.
The spack-stack currently used by UFS-WM (1.6.0/fms-2024.01) is now installed at a new location:
/contrib/spack-stack/spack-stack-1.6.0/envs/

Commit Message:

* UFSWM/modulefiles

Branch passed all tests on Hera (logs attached).

Priority:

  • Normal

Git Tracking

UFSWM:

Sub component Pull Requests:

  • None

UFSWM Blocking Dependencies:

  • None

Changes

  • No Baseline Changes.

Input data Changes:

  • None.

Library Changes/Upgrades:

Libraries already installed on new disks.


Testing Log:

  • RDHPCS
    • Hera
    • Orion
    • Hercules
    • Jet
    • GaeaC5
    • GaeaC6
    • Derecho
  • WCOSS2
    • Dogwood/Cactus
    • Acorn
  • CI
  • opnReqTest (complete task if unnecessary)

@DavidHuber-NOAA
Collaborator

Could the UPP hash also be updated with this PR? I know that will update the spack-stack version to 1.8.0 for the UPP, but it will point to a /contrib installation.

@RatkoVasic-NOAA
Collaborator Author

> Could the UPP hash also be updated with this PR? I know that will update the spack-stack version to 1.8.0 for the UPP, but it will point to a /contrib installation.

@DavidHuber-NOAA I'm OK with that, but it doesn't depend on me. If the UFS-WM code managers (@jkbk2004) are OK with it, then they can add it while merging.

@ulmononian
Collaborator

@BinLiu-NOAA @BijuThomas-NOAA I'm not sure when the UFS-WM hash will be updated in HAFS, but FYI: these Hera modulefile changes will be needed when it is.

@jkbk2004
Collaborator

jkbk2004 commented Feb 6, 2025

@FernandoAndrade-NOAA can you try running all GNU cases of this PR on Hera? We may need to confirm whether we see the same issue. It could be a permission issue on my side.

@jkbk2004
Collaborator

jkbk2004 commented Feb 6, 2025

@RatkoVasic-NOAA I am not sure whether the issue is on my side, but rt.sh fails with an error like this:

Also make sure that all modulefiles written in TCL start with the string
#%Module

Executing this command requires loading "gnu/13.3.0" which failed while
processing the following module(s):

    Module fullname   Module Filename
    ---------------   ---------------
    stack-gcc/13.3.0  /contrib/spack-stack/spack-stack-1.6.0/envs/gnu-fms-2024.01/install/modulefiles/Core/stack-gcc/13.3.0.lua
    modules.fv3       /scratch1/NCEPDEV/stmp2/Jong.Kim/FV3_RT/rt_2158566/control_c48_gnu/modulefiles/modules.fv3.lua
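
Since a permission problem was raised as one possible cause, a quick way to test that idea is to check whether the modulefiles named in the error exist and are readable by the current user. The following is only a sketch: `check_modulefiles` is a hypothetical helper, not part of rt.sh, and it only tests the files themselves, not every parent directory or dependent module.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of rt.sh): report whether each given path
# exists and is readable by the current user.
# Returns non-zero if any path is missing or unreadable.
check_modulefiles() {
    local status=0 path
    for path in "$@"; do
        if [ ! -e "$path" ]; then
            printf '%-11s %s\n' "MISSING" "$path"
            status=1
        elif [ ! -r "$path" ]; then
            printf '%-11s %s\n' "UNREADABLE" "$path"
            status=1
        else
            printf '%-11s %s\n' "OK" "$path"
        fi
    done
    return "$status"
}

# Example call with the two files listed in the error message:
# check_modulefiles \
#   /contrib/spack-stack/spack-stack-1.6.0/envs/gnu-fms-2024.01/install/modulefiles/Core/stack-gcc/13.3.0.lua \
#   /scratch1/NCEPDEV/stmp2/Jong.Kim/FV3_RT/rt_2158566/control_c48_gnu/modulefiles/modules.fv3.lua
```

If both files report `OK`, a plain file-permission problem on those two paths becomes less likely, and the shell environment is the next thing to look at.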

@RatkoVasic-NOAA
Collaborator Author

> @RatkoVasic-NOAA I am not sure the issue is on my side but rt.sh has an issue like

@jkbk2004 please give me the path and log file, so I can try to replicate your error.

@jkbk2004
Collaborator

jkbk2004 commented Feb 6, 2025

> > @RatkoVasic-NOAA I am not sure the issue is on my side but rt.sh has an issue like
>
> @jkbk2004 please give me the path and log file, so I can try to replicate your error.

Path is /scratch2/NCEPDEV/marine/Jong.Kim/UFS-RT/rt-2579/tests/logs/log_hera/run_control_c48_gnu.log

@RatkoVasic-NOAA
Collaborator Author

> Path is /scratch2/NCEPDEV/marine/Jong.Kim/UFS-RT/rt-2579/tests/logs/log_hera/run_control_c48_gnu.log

That run directory does not exist anymore:
ll /scratch1/NCEPDEV/stmp2/Jong.Kim/FV3_RT/rt_2158566/control_c48_gnu

But I looked in your /scratch1/NCEPDEV/stmp2/Jong.Kim/FV3_RT/rt_2479708/control_c48_gnu/modulefiles directory, and the modules load for me:

Hera:/scratch2/NCEPDEV/marine/Jong.Kim/UFS-RT/rt-2579/tests/logs/log_hera>module use /scratch1/NCEPDEV/stmp2/Jong.Kim/FV3_RT/rt_2479708/control_c48_gnu/modulefiles
Hera:/scratch2/NCEPDEV/marine/Jong.Kim/UFS-RT/rt-2579/tests/logs/log_hera>module load modules.fv3
Hera:/scratch2/NCEPDEV/marine/Jong.Kim/UFS-RT/rt-2579/tests/logs/log_hera>module list

Currently Loaded Modules:
  1) stack-gcc/13.3.0      8) libjpeg/2.1.0      15) zstd/1.5.2              22) fms/2024.01.02        29) ip/4.3.0           36) libxcrypt/4.4.35        43) nccmp/1.9.0.1
  2) gnu/13.3.0            9) jasper/2.0.32      16) c-blosc/1.21.5          23) bacio/2.4.1           30) sp/2.5.0           37) sqlite/3.43.2           44) modules.fv3
  3) openmpi/4.1.6        10) zlib/1.2.13        17) netcdf-c/4.9.2          24) crtm-fix/2.4.0.1_emc  31) w3emc/2.10.0       38) util-linux-uuid/2.38.1
  4) stack-openmpi/4.1.6  11) libpng/1.6.37      18) netcdf-fortran/4.6.1    25) git-lfs/2.10.0        32) gftl/1.10.0        39) python/3.10.13
  5) nghttp2/1.57.0       12) pkg-config/0.27.1  19) parallel-netcdf/1.12.2  26) crtm/2.4.0.1          33) gftl-shared/1.6.1  40) mapl/2.40.3-esmf-8.6.0
  6) curl/8.4.0           13) hdf5/1.14.0        20) parallelio/2.5.10       27) g2/3.5.1              34) fargparse/1.5.0    41) scotch/7.0.4
  7) cmake/3.23.1         14) snappy/1.1.10      21) esmf/8.6.0              28) g2tmpl/1.13.0         35) gettext/0.19.8.1   42) ufs_common

Please run module purge and then repeat those three commands (module use ..., module load ..., module list). Let's see whether it has something to do with your environment.
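
The clean-environment check described above could be wrapped up roughly as follows. This is a sketch, not a tested procedure: `reproduce_module_load` is a hypothetical wrapper name, and it assumes a shell where the `module` command is already defined, as in a login shell on Hera.

```shell
#!/usr/bin/env bash
# Hypothetical wrapper: purge the environment, then repeat the three
# commands (module use, module load, module list) for one run directory.
reproduce_module_load() {
    local run_dir=$1
    # "module" is normally a shell function, so it must already be defined here.
    if ! type module >/dev/null 2>&1; then
        echo "module command not found in this shell" >&2
        return 1
    fi
    module purge
    module use "$run_dir/modulefiles"
    module load modules.fv3
    module list
}

# Example call with the run directory inspected above:
# reproduce_module_load /scratch1/NCEPDEV/stmp2/Jong.Kim/FV3_RT/rt_2479708/control_c48_gnu
```

If the load succeeds after the purge but fails without it, something already loaded in the original shell is the likely culprit.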

@ulmononian
Collaborator

maybe for another PR, but UFS-WM baselines & input data need to be moved to scratch3/4. also, dprefix, DISKNM, STMP, & PTMP for hera in rt.sh (https://github.com/RatkoVasic-NOAA/ufs-weather-model/blob/e9c789f1c7566fa527038d77190683c480a91cec/tests/rt.sh#L789-L792) should be updated.
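
To see what still points at the old disks, one option is to search the working copy for scratch1 and scratch2 paths. The sketch below uses a hypothetical helper, `find_old_scratch_refs`, and the search pattern is an assumption about how the paths are written. It only lists candidates; it does not say what the new values should be.

```shell
#!/usr/bin/env bash
# Hypothetical helper: list lines that still reference /scratch1 or /scratch2.
# Follows grep's exit status: 0 if any match is found, 1 if none.
find_old_scratch_refs() {
    # [^0-9]|$ keeps names such as /scratch10 from matching by accident.
    grep -rnE '/scratch[12]([^0-9]|$)' "$@"
}

# Example call from the top of a ufs-weather-model checkout:
# find_old_scratch_refs tests/rt.sh modulefiles
```

An exit status of 1 with no output means no remaining references were found in the given paths.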

Successfully merging this pull request may close these issues.

Hera is moving from scratch1,2 disks to scratch3,4 - update modulefiles
5 participants