Spack for GEOSgcm Benchmark
This document outlines how to install GEOSgcm using Spack.
Note: in this example, we use the gcc@14.2.0 compiler. We chose it mainly because of
odd behavior currently seen when testing the Intel oneAPI compilers with Spack.
To clone Spack:
git clone -c feature.manyFiles=true --depth=2 https://github.com/spack/spack.git
Next, once you have Spack cloned, you should enable the shell integration by sourcing the correct setup-env file. For example, for bash:
export SPACK_ROOT=$HOME/spack
. $SPACK_ROOT/share/spack/setup-env.sh
Of course, update the SPACK_ROOT variable to point to the location of your Spack clone.
We recommend putting these lines in ~/.bashrc or similar for ease of use; otherwise,
run them each time you open a new shell.
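If you do put the setup in ~/.bashrc, a small guard keeps new shells from printing errors when the clone has been moved or is missing. The following is a minimal sketch only: the function name load_spack and the default location $HOME/spack are assumptions for illustration, not part of Spack.

```shell
# Hypothetical helper for ~/.bashrc: source Spack's shell integration
# only if the setup script exists under the given root.
load_spack() {
    root="${1:-$HOME/spack}"   # assumed default location of the clone
    setup="$root/share/spack/setup-env.sh"
    if [ -f "$setup" ]; then
        export SPACK_ROOT="$root"
        . "$setup"
    else
        echo "Spack setup script not found: $setup" >&2
        return 1
    fi
}

# Example (adjust the path to your clone):
#   load_spack "$HOME/spack"
```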
On some systems, the default TMPDIR might be limited or in a problematic
location. If so, update it to a location with more space and add to ~/.bashrc:
mkdir -p $HOME/tmpdir
echo 'export TMPDIR=$HOME/tmpdir' >> ~/.bashrc
The Spack documentation at https://spack.readthedocs.io/en/latest/getting_started.html#system-prerequisites lists the system prerequisites for Spack. For example, on Ubuntu:
sudo apt update
sudo apt install bzip2 ca-certificates g++ gcc gfortran git gzip lsb-release patch python3 tar unzip xz-utils zstd
If you are using a different operating system, you might need to translate these into the packaging system for your OS. The above link has a similar list for RHEL.
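On any distribution, you can check which of these tools are still missing before running Spack. The sketch below is illustrative: missing_tools is a hypothetical helper, and the mapping from package names to command names (for example xz-utils to xz, lsb-release to lsb_release) is an assumption that may differ on your system. ca-certificates provides no command, so it is not checked.

```shell
# Hypothetical helper: print each command from the argument list
# that cannot be found on PATH.
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || echo "$tool"
    done
}

# Example (command names assumed from the Ubuntu package list above):
#   missing_tools bzip2 g++ gcc gfortran git gzip lsb_release patch \
#       python3 tar unzip xz zstd
```

An empty result means every listed command was found.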
Spack configuration files are (by default) in ~/.spack. We will be updating
the packages.yaml and repos.yaml files.
mkdir -p ~/.spack
The complete packages.yaml file we will use is:
packages:
  all:
    providers:
      mpi: [openmpi]
      blas: [intel-oneapi-mkl]
      lapack: [intel-oneapi-mkl]
  hdf5:
    variants: +fortran +szip +hl +threadsafe +mpi
  netcdf-c:
    variants: +dap
  esmf:
    variants: ~pnetcdf ~xerces
  cdo:
    variants: ~proj ~fftw3
  pflogger:
    variants: +mpi
  pfunit:
    variants: +mpi +fhamcrest
  fms:
    variants: ~gfs_phys +pic ~yaml constants=GEOS precision=32,64 +deprecated_io
  mapl:
    variants: +extdata2g +fargparse +pflogger +pfunit ~pnetcdf
Copy this into ~/.spack/packages.yaml.
This not only hardcodes the compiler and MPI stack (see below), but also sets the default variants for the packages Spack will build, which simplifies the spack commands used later.
Spack is currently changing how it handles compilers. Previously, packages.yaml (seen above)
needed:
packages:
  all:
    compiler: [gcc@=14.2.0]
but that now generates warnings from Spack like:
==> Warning: The packages:all:compiler preference has been deprecated in Spack v1.0, and is currently ignored. It will be removed from config in Spack v1.2.
Moreover, Spack used to use a file called ~/.spack/linux/compilers.yaml to manage compilers. That is now defunct. Compiler information
is controlled in packages.yaml now.
As seen above, the packages.yaml file is used to configure the packages that are available to spack.
In this example, we are telling it we want:
- GNU Fortran as our Fortran compiler
- Open MPI as our MPI stack
- Intel MKL as our BLAS and LAPACK stack
GEOSgcm does not require these specific compilers; we use GCC 14 here as a reliable default. GEOSgcm supports the following Fortran compilers:
- GCC 13+
- Intel Fortran Classic (ifort) 2021.6+
- Intel oneAPI Fortran (ifx) 2025.0+
And MPI stacks that have been tested are:
- Open MPI
- Intel MPI
- MPICH
- MPT
MVAPICH2 might also work, but issues have been seen when using it on Discover, and time has not yet been invested in determining their cause.
NOTE 1: This version of GEOSgcm has a requirement for Intel MKL for the BLAS and LAPACK. This is being worked on, but for now it is required.
NOTE 2: GEOSgcm does not have a strong dependence on the C and C++ compilers; we mainly focus on Fortran compilers in our testing.
Some testing with Open MPI has found that the following variants might be useful, or even
needed, in packages.yaml on systems using SLURM. For example, using Open MPI 4.1:
  openmpi:
    require:
    - "@4.1.7"
    - "%gcc@14.2.0"
    buildable: True
    variants: +legacylaunchers +internal-hwloc +internal-libevent +internal-pmix schedulers=slurm fabrics=ucx
  slurm:
    buildable: False
    externals:
    - spec: slurm@23.11.10
      prefix: /usr
  ucx:
    buildable: False
    externals:
    - spec: ucx@1.18.0
      prefix: /usr
NOTE 1: You will probably want to change the SLURM and UCX external specs to match the versions of SLURM and UCX on your system.
NOTE 2: Testing has found that the system UCX is about the only "good" way we have found for Open MPI to see
things like Infiniband interfaces (e.g., mlx5_0).
We need an additional repo that has the recipe package.py file for GEOSgcm.
First clone the repository with:
git clone https://github.com/GMAO-SI-Team/geosesm-spack.git
Now, assuming that was in the home directory, you can add it to your
repos.yaml file with:
spack repo add $HOME/geosesm-spack/spack_repo/geosesm
This should result in a repos.yaml file that looks like:
repos:
- /home/ubuntu/geosesm-spack/spack_repo/geosesm
where, of course, the path for your $HOME might be different.
It's possible changes might be made to the geosesm-spack repo as time goes on. If updates are needed, you can update the repo with:
cd $HOME/geosesm-spack
git pull
To install GCC 14.2.0 with Spack run:
spack install gcc@14.2.0
You then also need to tell Spack where the compiler is with:
spack compiler find $(spack location -i gcc@14.2.0)
When you do that, you'll see it in the ~/.spack/packages.yaml file.
If your system package manager provides GCC 14.2.0, you can instead install it that way. Then running:
spack compiler find
might find it. You can check whether it did by looking at
~/.spack/packages.yaml, which should then contain entries
for gcc.
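A quick way to make that check from the command line is sketched below. The helper has_gcc_entry is hypothetical, and it assumes the compiler shows up as a line of the form "spec: gcc@<version>" in packages.yaml; if your Spack version writes the entry differently, adjust the pattern. A line such as "%gcc@14.2.0" in a require list is deliberately not counted.

```shell
# Hypothetical helper: succeed if the given packages.yaml contains a
# "spec: gcc@..." line (assumed form of a registered gcc entry).
has_gcc_entry() {
    grep -Eq '^[[:space:]]*(-[[:space:]]*)?spec:[[:space:]]*gcc@' \
        "${1:-$HOME/.spack/packages.yaml}"
}

# Example:
#   has_gcc_entry && echo "gcc is registered" || echo "gcc not found"
```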
You can now install GEOSgcm with:
spack install geosgcm@12.0.0-rc.2 %gcc@14.2.0
On a test install, this built about 150 packages.
HugeBCs is a portable set of boundary conditions for the GEOSgcm model. In the example
below, we will be using the c1440 resolution.
You can get HugeBCs from the following link:
NEED A LINK HERE
First, change to the install directory of GEOSgcm:
cd $(spack location -i geosgcm %gcc@14.2.0)/bin
We will be setting up an experiment using the create_expt.py script in HugeBCs.
In the following example, the name of the experiment will be test-c1440:
/path/to/HugeBCs-GitV12/scripts/create_expt.py test-c1440 \
--expdir /path/to/experiment/directory \
--horz c1440 --vert 181 --heartbeat 300 \
--ocean CS --landbcs NL3 --emission OPS --nooserver
This sets up the experiment in the directory /path/to/experiment/directory/test-c1440. The other options say:
- --horz c1440: use c1440 horizontal resolution
- --vert 181: use 181 vertical levels
- --heartbeat 300: set the heartbeat to 300 seconds
- --ocean CS: use the cubed-sphere ocean
- --landbcs NL3: use the NL3 land boundary conditions
- --emission OPS: use the OPS emissions
- --nooserver: do not use the output server (oserver), as we will not be outputting any data
Next change to the experiment directory:
cd /path/to/experiment/directory/test-c1440
And now we will use the makeoneday.bash script to set up the experiment:
/path/to/HugeBCs-GitV12/scripts/makeoneday.bash link v12bench 10dy
This tells the experiment to:
- link: link restarts into the scratch/ directory
- v12bench: use the v12 benchmark settings
- 10dy: run for 10 days
When all is done, you will have a script called gcm_run.j that runs the experiment. With SLURM,
you would submit it with:
sbatch gcm_run.j
However, you will first need to edit the top of the script
so that it knows about the Spack environment, and perhaps
to set the correct #SBATCH directives.
By default, the gcm_run.j script will not know about the spack environment. You need to add
the following lines to the top of the script:
limit stacksize unlimited
limit coredumpsize 0
source /path/to/spack/share/spack/setup-env.csh
spack load geosgcm %gcc@14.2.0
where the latter two lines are the ones you need to add. Note:
gcm_run.j is a csh script, so you need to source the setup-env.csh
file (even if your working shell is bash).
This will ensure that the geosgcm package is loaded and the
libraries, etc. are found.
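If you set up many experiments, the edit can be scripted. The following is a sketch under stated assumptions: add_spack_setup is a hypothetical helper, and it assumes gcm_run.j already contains a line beginning with "limit coredumpsize", as in the excerpt above. If no such line exists, the script is left unchanged.

```shell
# Hypothetical helper: insert the Spack setup lines after the first
# "limit coredumpsize" line of a csh run script.
add_spack_setup() {
    script="$1"      # path to gcm_run.j
    spack_root="$2"  # path to your Spack clone
    spec="$3"        # spec to load, e.g. "geosgcm %gcc@14.2.0"
    tmp="$(mktemp)" || return 1
    awk -v root="$spack_root" -v spec="$spec" '
        { print }
        /^limit coredumpsize/ && !inserted {
            print "source " root "/share/spack/setup-env.csh"
            print "spack load " spec
            inserted = 1
        }
    ' "$script" > "$tmp" && cat "$tmp" > "$script"
    status=$?
    rm -f "$tmp"
    return "$status"
}

# Example (paths are placeholders):
#   add_spack_setup gcm_run.j /path/to/spack "geosgcm %gcc@14.2.0"
```

The result is written back with cat rather than mv so that the permissions of the original script are preserved.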
By default, a c1440 experiment uses 38400 cores, which corresponds to the following in AGCM.rc:
NX: 80
NY: 480
and in your gcm_run.j script, you will see:
#SBATCH --ntasks=38400
On NCCS machines this will be:
#SBATCH --nodes=320 --ntasks-per-node=120
You can change the number of cores used by changing the NX and NY values in AGCM.rc and
the --ntasks value in the gcm_run.j script. But, as detailed in AGCM.rc:
# Some rules of thumb when changing processor counts:
#
# 1) Start from the NX/NY that come from gcm_setup
# 2) The gcm will not run with AGCM_IM/NX or AGCM_JM/NY/6 less than or equal to 3
# 3) if you want more cores, double NY (for 2x), then double NX(for 4x) and so on...
# 4) if you want fewer cores, half NX (for 0.5), then half NY for (0.25) and so on...
# 5) when you hit an odd number we need to get more creative
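Whenever you change NX and NY, the --ntasks value must equal NX times NY, as in the default 80 x 480 = 38400. The arithmetic can be sketched as below; core_count and node_count are hypothetical helper names, and the 120 tasks per node is taken from the NCCS example above and will differ on other systems.

```shell
# Total MPI tasks for a given NX/NY decomposition (NX * NY).
core_count() {
    echo $(( $1 * $2 ))
}

# Nodes needed for a task count at a given tasks-per-node, rounded up.
node_count() {
    echo $(( ($1 + $2 - 1) / $2 ))
}

# Examples:
#   core_count 80 480      # default c1440 layout: 38400
#   node_count 38400 120   # 320 nodes at 120 tasks per node
#   core_count 80 960      # rule 3, double NY for 2x: 76800
#   core_count 40 480      # rule 4, halve NX for 0.5x: 19200
```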