# Getting Started
The following are required to compile and run Hybrid-GODAS:

- rocoto workflow manager
- cmake
- climate data operators (cdo)
- NetCDF operators (nco)
- NetCDF4 (with Fortran library)
- OpenMPI or Intel MPI
- a Fortran compiler
Additionally, the following may be required for some of the preparation scripts (observation and forcing downloading and prep):

- Python 3 with the netCDF4 and pygrib modules
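If these modules are missing on your system, one way to get them is via pip; this is a sketch assuming the PyPI package names `netCDF4` and `pygrib`, and that your site doesn't instead provide them through environment modules:

```
pip install --user netCDF4 pygrib
```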
The following instructions will enable you to download and compile the code needed, assuming you are running on the Gaea supercomputer.

The following commands will download the hybrid-GODAS repository as well as any other child repositories that it requires:
```
git clone https://github.com/UMD-AOSC/hybrid-godas.git
cd hybrid-godas
git submodule update --init --recursive
```
Set up the environment by creating or linking a `./config/env` file. For example, on Gaea, use the ready-made configuration file by running:

```
ln -s env.gaea config/env
```
From within the root directory, compile the MOM6 model and the data assimilation code. The various components (fms, mom, gsw, datetime, obsop, util, 3dvar, letkf) can be compiled separately, but for convenience use the following:

```
make model
make da
```
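If you only need to rebuild one piece, per-component targets may also exist; this is an assumption based on the component names listed above, so check the top-level Makefile before relying on it:

```
# Hypothetical per-component builds (verify target names in the Makefile)
make 3dvar
make letkf
```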
Download or link the static files required for MOM6. On Gaea these are already available and can be linked to with:

```
ln -s /lustre/f1/pdata/gfdl_O/datasets src/MOM6/.datasets
```
There are several other preprocessed historical datasets (surface forcing, observations, ...) needed by the system that can be linked from my personal directory:

```
mkdir DATA
ln -s /lustre/f1/unswept/ncep/Travis.Sluka/hybrid-godas-DATA/* DATA/
```
The following will get you running a short cycling experiment with data assimilation using the default settings. Note: the default settings use initial conditions from 2004-01-01 generated by a 1-year, 20-member ensemble run. If you do not want to use these initial conditions, and instead want to start from Levitus T/S climatology, simply delete the `DATA/exp1/cycle` directory.
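For example, from the root directory:

```
# Start from Levitus T/S climatology instead of the provided initial conditions
rm -r DATA/exp1/cycle
```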
Initialize a new experiment directory by running, from the root directory:

```
./run/init_cycle.sh DATA/exp1
cd DATA/exp1
```
This is the directory where all the experiment-specific configuration, logs, and results are stored. You should see several files/directories of importance, which have been initialized with default configuration values:
- `config/` - All the configuration files needed for running the model (`config/mom`) and doing data assimilation (`config/da`), as well as the master configuration file that controls everything (`config/hybridgodas.config`).
- `cycle/` - All the files required for starting the next cycle are placed here. This includes the rocoto configuration files, the restart files from the previous cycle, and the analysis increment from the previous data assimilation step.
- `hybridgodas.rocoto` - A convenience wrapper for running any of the rocoto commands.
- `hybridgodas.run` - The main run script used to cycle rocoto for the running of your experiment. This generates an XML file for rocoto based on your `config/hybridgodas.config` configuration file and submits it to rocoto. Rocoto then submits jobs to the system's job management system.
- `hybridgodas.status` - A convenience script to view the status of the rocoto cycles.
- `version` - The version of the source code repositories as they were when you initialized your experiment.
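For orientation, the initialized experiment directory therefore looks roughly like this (a sketch based on the list above, not verbatim `ls` output):

```
DATA/exp1/
├── config/               # mom, da, and hybridgodas.config
├── cycle/                # files needed to start the next cycle
├── hybridgodas.rocoto    # wrapper for rocoto commands
├── hybridgodas.run       # main run script
├── hybridgodas.status    # status viewer
└── version               # source code version information
```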
Edit the `config/hybridgodas.config` file, paying attention to the following:
- `SCHED_ACCT` should be changed to the computing account you have access to on Gaea.
- `CYCLE_END` should be set to something shorter, say `2003010500`, if you want to just run a single 5-day cycle. These dates are in `YYYYMMDDHH` format.
- Several directory locations were automatically filled out; make sure they are correct (`ROOT_DIR`, `EXP_DIR`, `WORK_DIR`, `FORC_MEAN_FILE`, etc.).
- `ENS_SIZE` is the number of ensemble members. Set this to something small, perhaps `5`.
- `DA_MODE` is the data assimilation mode the system will run in. Set this to `"hyb"` for a full hybrid-DA run.
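Putting these together, a single-cycle test configuration might contain entries like the following. These are illustrative values only: `<account>` is a placeholder, and the exact key/value syntax should be checked against your generated `hybridgodas.config`:

```
SCHED_ACCT=<account>     # your Gaea computing account
CYCLE_END=2003010500     # run a single 5-day cycle (YYYYMMDDHH)
ENS_SIZE=5               # small ensemble for a quick test
DA_MODE="hyb"            # full hybrid-DA run
```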
Start the experiment. From the main experiment directory run:

```
./hybridgodas.run --cycle 2
```
This will cause several things to happen: 1) an XML file is generated and placed at `cycle/rocoto/hybridgodas.rocoto.xml`; the contents of this file depend on exactly how your experiment was configured. 2) `rocotorun` is called to submit jobs to the system's job manager. 3) This is repeated every 2 minutes, or whatever argument you pass after `--cycle`, until all the cycles up to `CYCLE_END` have finished.
I find calling `./hybridgodas.run --cycle 2` easy for running experiments, but it does require you to leave your window open (or have a window open in something like `tmux` or `screen`). Alternatively, you could set up a cron job to run `hybridgodas.run` every couple of minutes, as sketched below.
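A minimal crontab entry might look like the following. This is a sketch only: the experiment path is a placeholder, and it assumes that running `hybridgodas.run` with no arguments performs a single rocoto pass rather than looping:

```
# Every 5 minutes, run one cycle pass and append output to a log
*/5 * * * * cd /path/to/DATA/exp1 && ./hybridgodas.run >> cron.log 2>&1
```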
In a separate window you can run `./hybridgodas.status` to look at the latest report of the experiment's status.
Assuming everything above went well, you'll see several new directories in your experiment directory.
- `log/` - All logs from the job steps are placed here.
- `output/` - All of the final output from the experiment is placed here. By default, compression is done before saving the files here to dramatically shrink their file sizes, but this can be controlled in the `hybridgodas.config` file.
The exact contents of `output/` depend on what type of data assimilation is run, but for a hybrid-DA run you'll see:
- `ana/{mean,sprd}` - The mean and spread of the analysis at the end of the data assimilation cycle.
- `ana/mean_letkf` - The intermediate analysis mean after the LETKF step. The workflow is set up so that the LETKF is performed first, and then the 3DVar uses this analysis mean as its background before it produces the final analysis mean. This is here for diagnostic purposes only and is not what is actually used to restart the models.
- `bkg/{mean,sprd}` - The background ensemble mean and spread.
- `bkg_diag/` - The full diagnostics from the control forecast (initialized from the analysis mean). The previously mentioned `ana` and `bkg` files currently only save the state variables required by the data assimilation (U, V, T, S), but the diagnostic fields here contain everything output by the model. Currently these are the pentad average files.
- `omf/` - The observation-minus-forecast (O-F) files that are used by the LETKF.
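Roughly, then, a hybrid-DA run produces an `output/` tree like this (a sketch based on the list above; the exact subdirectory layout may differ):

```
output/
├── ana/
│   ├── mean/         # final analysis mean
│   ├── sprd/         # analysis spread
│   └── mean_letkf/   # intermediate LETKF analysis mean (diagnostic only)
├── bkg/
│   ├── mean/         # background ensemble mean
│   └── sprd/         # background ensemble spread
├── bkg_diag/         # full control-forecast diagnostics (pentad averages)
└── omf/              # observation-minus-forecast files
```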
Take note that all filenames that contain dates are (for the most part) using the analysis time. So the 5-day pentad average files from Jan 1 - Jan 5 will be listed using 2003010600, since the very end of the period is the analysis time: 2003 Jan 6, 00Z.