Spack for GEOSgcm Benchmark

Matt Thompson edited this page Jul 31, 2025 · 10 revisions

This document outlines how to install GEOSgcm using Spack.

Note: in this example, we use the gcc@14.2.0 compiler, mainly because of odd behavior currently seen in testing with the Intel oneAPI compilers and Spack.

Cloning Spack

To clone Spack:

git clone -c feature.manyFiles=true --depth=2 https://github.com/spack/spack.git

Shell configuration

Once you have cloned Spack, enable the shell integration by sourcing the appropriate setup-env file. For example, for bash:

export SPACK_ROOT=$HOME/spack
. $SPACK_ROOT/share/spack/setup-env.sh

Of course, update the SPACK_ROOT variable to point to the location of your spack clone.

We recommend adding these lines to ~/.bashrc (or similar) for ease of use; otherwise, run them each time you open a new shell.

Add TMPDIR to bashrc (optional)

On some systems, the default TMPDIR might be limited in size or in a problematic location. If so, point it at a location with more space and add the setting to ~/.bashrc:

mkdir -p $HOME/tmpdir
echo 'export TMPDIR=$HOME/tmpdir' >> ~/.bashrc

Install packages needed by Spack

Spack has a set of system prerequisites, listed at https://spack.readthedocs.io/en/latest/getting_started.html#system-prerequisites. For example, on Ubuntu:

sudo apt update
sudo apt install bzip2 ca-certificates g++ gcc gfortran git gzip lsb-release patch python3 tar unzip xz-utils zstd

If you are using a different operating system, you might need to translate these into the packaging system for your OS. The above link has a similar list for RHEL.
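Before moving on, it can help to confirm that the tools from the list above are actually on your PATH. The sketch below is not part of Spack: `missing_tools` is a hypothetical helper name, and the command names passed to it are our assumption about which executables the Ubuntu packages provide (for example, `xz` from xz-utils and `lsb_release` from lsb-release). The ca-certificates package installs no executable, so it is not checked.

```shell
# Hypothetical helper (not part of Spack): print each named command
# that cannot be found on PATH, one per line.
missing_tools() {
    for tool in "$@"; do
        if ! command -v "$tool" >/dev/null 2>&1; then
            echo "$tool"
        fi
    done
}

# Assumed executable names for the Ubuntu packages listed above.
missing_tools bzip2 g++ gcc gfortran git gzip lsb_release patch \
    python3 tar unzip xz zstd
```

If this prints nothing, every tool was found; any name it prints still needs to be installed.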

Update spack configuration files

Spack configuration files are (by default) in ~/.spack. We will be updating the packages.yaml and repos.yaml files.

Make spack configuration directory

mkdir -p ~/.spack

packages.yaml

The complete packages.yaml file we will use is:

packages:
  all:
    providers:
      mpi: [openmpi]
      blas: [intel-oneapi-mkl]
      lapack: [intel-oneapi-mkl]
  hdf5:
    variants: +fortran +szip +hl +threadsafe +mpi
  netcdf-c:
    variants: +dap
  esmf:
    variants: ~pnetcdf ~xerces
  cdo:
    variants: ~proj ~fftw3
  pflogger:
    variants: +mpi
  pfunit:
    variants: +mpi +fhamcrest
  fms:
    variants: ~gfs_phys +pic ~yaml constants=GEOS precision=32,64 +deprecated_io
  mapl:
    variants: +extdata2g +fargparse +pflogger +pfunit ~pnetcdf

Copy this into ~/.spack/packages.yaml.

This sets the preferred MPI, BLAS, and LAPACK providers (see below) as well as the default variants for the packages Spack will build, which keeps the spack commands used later short.

Spack 1.0 Differences

Spack is currently changing how it handles compilers. As a result, packages.yaml (seen above) used to need:

packages:
  all:
    compiler: [gcc@=14.2.0]

but that now generates warnings from Spack like:

==> Warning: The packages:all:compiler preference has been deprecated in Spack v1.0, and is currently ignored. It will be removed from config in Spack v1.2.

Moreover, Spack used to use a file called ~/.spack/linux/compilers.yaml to manage compilers. That is now defunct. Compiler information is controlled in packages.yaml now.

Compiler and MPI

As seen above, the packages.yaml file is used to configure the packages that are available to spack.

In this example, we are telling it we want:

  • GNU Fortran as our Fortran compiler
  • Open MPI as our MPI stack
  • Intel MKL as our BLAS and LAPACK stack

Supported compilers and MPI stacks

GEOSgcm does not require these specific compilers. We use GCC 14 here as a reliable default. GEOSgcm supports the following Fortran compilers:

  • GCC 13+
  • Intel Fortran Classic (ifort) 2021.6+
  • Intel oneAPI Fortran (ifx) 2025.0+

And MPI stacks that have been tested are:

  • Open MPI
  • Intel MPI
  • MPICH
  • MPT

MVAPICH2 might also work, but issues have been seen when using it on Discover, and the root cause has not yet been investigated.

NOTE 1: This version of GEOSgcm has a requirement for Intel MKL for the BLAS and LAPACK. This is being worked on, but for now it is required.

NOTE 2: GEOSgcm does not have a strong dependence on the C and C++ compilers; we mainly focus on Fortran compilers in our testing.

Possible changes for Open MPI and SLURM

Some testing with Open MPI has found that the following variants might be useful, or even needed, in packages.yaml on systems using SLURM. For example, using Open MPI 4.1:

  openmpi:
    require:
    - "@4.1.7"
    - "%gcc@14.2.0"
    buildable: True
    variants: +legacylaunchers +internal-hwloc +internal-libevent +internal-pmix schedulers=slurm fabrics=ucx
  slurm:
    buildable: False
    externals:
    - spec: slurm@23.11.10
      prefix: /usr
  ucx:
    buildable: False
    externals:
    - spec: ucx@1.18.0
      prefix: /usr

NOTE 1: You will probably want to change the SLURM and UCX external specs to match the versions of SLURM and UCX installed on your system.

NOTE 2: In our testing, using the system UCX is about the only reliable way we have found for Open MPI to see things like InfiniBand interfaces (e.g., mlx5_0).
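To fill in the external specs, you need the versions actually installed on your system. A minimal sketch, assuming the SLURM client tools and the `ucx_info` utility are on your PATH; `print_external_versions` is a hypothetical helper name. Both `srun --version` and `ucx_info -v` print a version string that you can copy into the corresponding `spec:` line.

```shell
# Hypothetical helper: print the installed SLURM and UCX versions,
# or a hint on stderr for any tool that is not on PATH.
print_external_versions() {
    if command -v srun >/dev/null 2>&1; then
        srun --version
    else
        echo "srun not found on PATH" >&2
    fi
    if command -v ucx_info >/dev/null 2>&1; then
        ucx_info -v
    else
        echo "ucx_info not found on PATH" >&2
    fi
}

print_external_versions
```

The `prefix:` entries should likewise match where these tools live on your system; `/usr` in the example is only correct if they are installed there.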

repos.yaml

We need an additional repo that has the recipe package.py file for GEOSgcm.

First clone the repository with:

git clone https://github.com/GMAO-SI-Team/geosesm-spack.git

Now, assuming you cloned it into your home directory, you can add it to your repos.yaml file with:

spack repo add $HOME/geosesm-spack/spack_repo/geosesm

This should result in a repos.yaml file that looks like:

repos:
- /home/ubuntu/geosesm-spack/spack_repo/geosesm

where, of course, the path for your $HOME might be different.

Updating geosesm-spack

It's possible changes might be made to the geosesm-spack repo as time goes on. If updates are needed, you can update the repo with:

cd $HOME/geosesm-spack
git pull

Install GCC 14.2.0

With Spack

To install GCC 14.2.0 with Spack run:

spack install gcc@14.2.0

You then also need to tell Spack where the compiler is with:

spack compiler find $(spack location -i gcc@14.2.0)

After you do that, the compiler will appear in the ~/.spack/packages.yaml file.

With system package manager

If your system provides GCC 14.2.0, you can instead install it via your system package manager. Then running:

spack compiler find

might find it. You can check by looking at ~/.spack/packages.yaml, which should now contain entries for gcc.
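Whichever route you take, you can confirm that Spack has registered the compiler before starting the long build. A minimal sketch, assuming the output of `spack compiler list` contains the string `gcc@14.2.0` for a registered compiler; `have_gcc14` is a hypothetical helper name.

```shell
# Hypothetical helper: succeed if "gcc@14.2.0" appears in Spack's
# compiler listing (assumes the listing prints that spec string).
have_gcc14() {
    spack compiler list | grep -q 'gcc@14\.2\.0'
}

if command -v spack >/dev/null 2>&1; then
    if have_gcc14; then
        echo "gcc@14.2.0 is registered with Spack"
    else
        echo "gcc@14.2.0 not found; re-run 'spack compiler find'" >&2
    fi
else
    echo "spack not found; source setup-env.sh first" >&2
fi
```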

Install GEOSgcm

You can now install GEOSgcm with:

spack install geosgcm@12.0.0-rc.2 %gcc@14.2.0

On a test install, this built about 150 packages.
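Because this build pulls in so many dependencies, it can be worth previewing what Spack intends to build before starting, and confirming the result afterwards. A minimal sketch using `spack spec` (which concretizes a spec and prints its dependency tree without installing anything) and `spack find` (which lists installed packages); the variable and function names are hypothetical.

```shell
# The spec used in the install command above.
GEOS_SPEC="geosgcm@12.0.0-rc.2 %gcc@14.2.0"

# Preview the concretized dependency tree without installing.
# $GEOS_SPEC is intentionally unquoted so that bash/sh splits it
# into two arguments.
preview_geosgcm() { spack spec $GEOS_SPEC; }

# After the install, list the installed geosgcm package(s).
confirm_geosgcm() { spack find geosgcm; }
```

Run `preview_geosgcm` before `spack install` and `confirm_geosgcm` once the install has finished.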

Running GEOSgcm

Getting HugeBCs

HugeBCs is a portable set of boundary conditions for the GEOSgcm model. In the example below, we will be using the c1440 resolution.

You can get HugeBCs from the following link:

NEED A LINK HERE

Creating an experiment

First, change to the install directory of GEOSgcm:

cd $(spack location -i geosgcm %gcc@14.2.0)/bin

We will be setting up an experiment using the create_expt.py script in HugeBCs.

In the following example, the name of the experiment will be test-c1440:

/path/to/HugeBCs-GitV12/scripts/create_expt.py test-c1440 \
   --expdir /path/to/experiment/directory \
   --horz c1440 --vert 181 --heartbeat 300 \
   --ocean CS --landbcs NL3 --emission OPS --nooserver

This sets up the experiment in the directory /path/to/experiment/directory/test-c1440. The other options say:

  • --horz c1440: use c1440 horizontal resolution
  • --vert 181: use 181 vertical levels
  • --heartbeat 300: set the heartbeat to 300 seconds
  • --ocean CS: use the cubed-sphere ocean
  • --landbcs NL3: use the NL3 land boundary conditions
  • --emission OPS: use the OPS emissions
  • --nooserver: do not use the output server (oserver), as we will not be outputting any data

Setting up the experiment

Next change to the experiment directory:

cd /path/to/experiment/directory/test-c1440

And now we will use the makeoneday.bash script to set up the experiment:

/path/to/HugeBCs-GitV12/scripts/makeoneday.bash link v12bench 10dy

This tells the experiment to:

  • link: link restarts into the scratch/ directory
  • v12bench: use the v12 benchmark settings
  • 10dy: run for 10 days

Running the experiment

When all is done, you will have a script called gcm_run.j that runs the experiment. With SLURM, you would submit it with:

sbatch gcm_run.j

However, you will need to edit the top of the script so that it knows about the Spack environment, and possibly to set the correct #SBATCH directives for your system.

Updates for Spack (required)

By default, the gcm_run.j script does not know about the Spack environment. Edit the top of the script so that it contains the following lines:

limit stacksize unlimited
limit coredumpsize 0

source /path/to/spack/share/spack/setup-env.csh
spack load geosgcm %gcc@14.2.0

where the last two lines are the ones you need to add; the two limit lines should already be present. Note: gcm_run.j is a csh script, so you need to source the setup-env.csh file (even if your working shell is bash).

This will ensure that the geosgcm package is loaded and the libraries, etc. are found.

Changing the number of cores used

By default, a c1440 experiment uses 38400 cores (NX × NY = 80 × 480), which corresponds to these settings in AGCM.rc:

  • NX: 80
  • NY: 480

and in your gcm_run.j script, you will see:

#SBATCH --ntasks=38400

On NCCS machines this will be:

#SBATCH --nodes=320 --ntasks-per-node=120

You can change the number of cores used by changing the NX and NY values in AGCM.rc and the --ntasks value in the gcm_run.j script. However, keep in mind the rules of thumb given in AGCM.rc:

# Some rules of thumb when changing processor counts:
#
# 1) Start from the NX/NY that come from gcm_setup
# 2) The gcm will not run with AGCM_IM/NX or AGCM_JM/NY/6 less than or equal to 3
# 3) if you want more cores, double NY (for 2x), then double NX(for 4x) and so on...
# 4) if you want fewer cores, half NX (for 0.5), then half NY for (0.25) and so on...
# 5) when you hit an odd number we need to get more creative
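The numbers above can be cross-checked with simple shell arithmetic before editing AGCM.rc and gcm_run.j. A minimal sketch: `total_cores` and `nodes_needed` are hypothetical helper names, and 120 tasks per node is the NCCS value quoted above, which may differ on your system.

```shell
# Total MPI tasks for a given layout: NX * NY.
total_cores() {
    echo $(( $1 * $2 ))
}

# Nodes needed for a task count ($1) at a given tasks-per-node ($2),
# rounded up.
nodes_needed() {
    echo $(( ($1 + $2 - 1) / $2 ))
}

NX=80
NY=480
CORES=$(total_cores "$NX" "$NY")
echo "NX=$NX NY=$NY -> --ntasks=$CORES"
echo "nodes at 120 tasks per node: $(nodes_needed "$CORES" 120)"

# Rule of thumb 3: doubling NY doubles the core count.
echo "doubled NY: NX=$NX NY=$((NY * 2)) -> --ntasks=$(total_cores "$NX" $((NY * 2)))"
```

For the default layout this reports 38400 tasks on 320 nodes, matching the #SBATCH lines shown above. Whatever layout you choose, the --ntasks value in gcm_run.j has to equal NX × NY.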