ACCESS_CM2_AMIPconfigTests_Implementation
The original request was to help implement #132 by:
- Setting up multiple test configurations
- Adding several code changes to the coupling interface
- Re-running the test cases so that the effects of the code changes can be assessed
Jhan and Mark have added some of the test configurations at wiki:ACCESS-CM2/AMIPconfigTests
These configurations are related to #176, a bug in soil properties
It appears to be generally desirable to have a testing setup for CABLE that can run different configurations of the model and compare the results against control runs in an automated fashion.
Mark has some analysis scripts that will be used as an initial implementation. The UKMO Afterburner tool was considered, but it does not perform the calculations the CABLE team needs.
We need:
- A way to run multiple configurations of the model
- A post-processing stage to convert the model output to NetCDF and move it to a location for further processing
- Integration with Mark's scripts, so that they can read the files generated by the post-processing stage
I am assuming that we are mainly interested in the ACCESS-CM2 AMIP configuration for the moment.
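The pieces described below could all sit in a single Rose suite; a rough sketch of the layout (directory and app names are illustrative, assuming the post-processing app is called 'nci_archive'):
suite.rc                          # Cylc graph, with parameterised tasks per test case
app/um/rose-app.conf              # base (control) model configuration
app/um/opt/rose-app-test_*.conf   # per-test override ('opt') configs
app/nci_archive/                  # post-processing app (UM output -> NetCDF)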
The configurations at wiki:ACCESS-CM2/AMIPconfigTests are presented as variations of a default configuration. This immediately brings to mind Rose's 'opt configs', which act as override files, adding modifications on top of a common base configuration.
In the CABLE case, this could work by using an existing CABLE Rose app as the base configuration, then creating opt configs for each test case in the directory 'app/um/opt/', for example:
# app/um/opt/rose-app-test_alpha.conf
[namelist:cable]
soil_struc=default
fwsoil=Haverd2013
gs_switch=medlyn
litter=.true.
rev_corr=.true.
# app/um/opt/rose-app-test_beta.conf
[namelist:cable]
soil_struc=default
gw_model=true
or_evap=true
gs_switch=medlyn
fwsoil=Haverd2013
litter=.true.
rev_corr=.true.
We now have configurations for each test case, and since these are small (everything not in the 'opt' file remains the same as it is in the base configuration) it is easy to see what each test is changing.
To make use of these 'opt configs' we can use Cylc's 'parameterised tasks', which allow a task to be run multiple times with minor changes to the suite setup.
Say we define a parameter 'test_case' like so:
[cylc]
[[parameters]]
test_case = test_alpha, test_beta
[scheduling]
[[dependencies]]
[[[R1]]]
graph = "fcm_make => recon => atmos_main<test_case>"
[[[ {{RESUB}} ]]]
graph = "atmos_main<test_case>[-{{RESUB}}] => atmos_main<test_case>"
There are now two separate 'atmos' tasks with identical graphs. The advantage is that more tests can be added just by modifying the parameter definition, without having to touch the dependency graph.
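As a quick sanity check (Cylc 7 commands; '<suite>' stands for the registered suite name), validating and listing the suite should show one 'atmos_main' task per parameter value:
cylc validate <suite>
cylc list <suite> | grep atmos_main
# Expected output, given the parameter values above:
#   atmos_main_test_alpha
#   atmos_main_test_beta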
In order to get the parameterised tasks to load the correct model configuration, we add the parameter value to the environment variable '$ROSE_APP_OPT_CONF_KEYS', like so:
[runtime]
[[atmos_main<test_case>]]
[[[environment]]]
ROSE_APP_OPT_CONF_KEYS = ${CYLC_TASK_PARAM_test_case} ${ROSE_APP_OPT_CONF_KEYS}
The environment variable '${CYLC_TASK_PARAM_test_case}' gets expanded to the current parameter value, so the 'test_alpha' variant of 'atmos' will load the 'test_alpha' Rose opt config.
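For illustration only (the exact contents depend on what the suite already sets), the job environment for the 'test_alpha' variant would end up looking something like:
# In the atmos_main_test_alpha job script:
#   CYLC_TASK_PARAM_test_case=test_alpha
#   ROSE_APP_OPT_CONF_KEYS="test_alpha ..."    # plus any keys the suite already sets
# so 'rose task-run' merges app/um/opt/rose-app-test_alpha.conf over the
# base app/um/rose-app.conf before launching the model.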
- Create 'opt config' files which specify the configuration changes to test
- Add these test configs to Cylc using parameterised tasks
To post-process the model outputs after each model CRUN we can create a new app that just runs a simple script. If the model configuration option 'l_postp' is set to 'True', then once the model has finished writing to an output file (which may take multiple CRUNs in the case of, e.g., seasonal averages) it will create an empty file '$OUTPUT.arch' indicating that the file is safe to post-process (see UMDP Y01 Sec. 2.2).
A sample script that looks for these files, then converts them to NetCDF using the Iris library, is:
#!/bin/bash
# app/nci_archive/bin/nci_archive.sh
# Post-processing script using iris2netcdf
set -eu
module use /g/data3/hh5/public/modules
module load parallel
module load conda/analysis3
# Path to '.arch' files
ARCH_PATH="${SUITE_WORK_DIR}/${CYCLE}/${TASK}"
# Names of files we need to archive
ARCH_FILES=$(find "${ARCH_PATH}" -size 0 -type f -name '*.arch' -printf '%f\n' | sed 's/\.arch$//')
mkdir -p "${OUTPUT_DIR}"
parallel --verbose --jobs ${PBS_NCPUS:-2} iris2netcdf "${INPUT_DIR}/{}" -o "${OUTPUT_DIR}/{}.nc" ::: ${ARCH_FILES}
Note the use of GNU parallel to run multiple conversions simultaneously. Some mucking about with paths is required because the '.arch' files are kept in the Cylc work directory, while the actual output files are kept under the data directory.
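A sketch of the assumed layout for one file of the 'test_alpha' run (the file name 'atmosa.pa1988jan' is purely illustrative):
${CYLC_SUITE_WORK_DIR}/<cycle>/atmos_main_test_alpha/atmosa.pa1988jan.arch   # empty '.arch' marker
${DATAM}/atmosa.pa1988jan                                                    # actual UM output file
${ROSE_DATA}/archive/test_alpha/atmosa.pa1988jan.nc                          # converted NetCDF output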
The app config for this job is:
# app/nci_archive/rose-app.conf
[command]
default=nci_archive.sh
[env]
SUITE_WORK_DIR = ${CYLC_SUITE_WORK_DIR}
TASK = atmos_main_${CYLC_TASK_PARAM_test_case}
CYCLE = ${CYLC_TASK_CYCLE_POINT}
INPUT_DIR = ${DATAM}
OUTPUT_DIR = ${ROSE_DATA}/archive/${CYLC_TASK_PARAM_test_case}
The exact conversion script, as well as the correct output location for the processed files, will depend on what the analysis scripts need.
The Cylc implementation is pretty simple; it also makes use of the parameterised tasks:
[scheduling]
[[dependencies]]
[[[ {{RESUB}} ]]]
graph = "atmos_main<test_case> => nci_archive<test_case>"
[runtime]
[[nci_archive<test_case>]]
inherit = HPC
[[[directives]]]
-l ncpus = 4
-l mem = 8gb
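Putting the two fragments together (and assuming '{{RESUB}}' is a Jinja2 variable holding the suite's resubmission period), the scheduling section would look roughly like:
[scheduling]
[[dependencies]]
[[[R1]]]
graph = "fcm_make => recon => atmos_main<test_case>"
[[[ {{RESUB}} ]]]
graph = """
atmos_main<test_case>[-{{RESUB}}] => atmos_main<test_case>
atmos_main<test_case> => nci_archive<test_case>
"""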
- The exact conversion will depend on the requirements of the analysis pipeline
- Files to process can be spotted using the '*.arch' files
- Need a base configuration to apply these to. Possible jobs are:
  - u-ap747: Jhan's CABLE AMIP run
  - u-ar180, u-ar560: Mark's CABLE AMIP runs, based on u-ap747 with unspecified modifications and currently not working
- Need the analysis scripts to run
  - Mark has scripts to evaluate annual and seasonal mean precipitation, terrestrial ET, long- and short-wave radiation at the surface (land), cloud fractions at low, mid, and high levels, 2m temperature, maximum 2m temperature, and minimum 2m temperature
  - It is currently unknown where these are
- The configuration settings given by Mark and Ian have different names from the namelist variables used in Jhan's run - have the variable names been changed over time?
- We will start from Jhan's AMIP run u-ap747
  - There was a question about the vegetation parameters being different from those used in the coupled run - will this need to be corrected?
- Martin will add the STASH settings so that the CABLE variables are output, and add a post-processing step to convert the output to NetCDF
- Mark will send his analysis scripts to Scott
- Scott will integrate the output from the model with the analysis scripts
- We will worry about the different test scenarios once the base configuration is working
- The output from the analysis will go into /g/data/p66/accessdev-web, so that it can be seen from the web server
  - This will include the namelist parameters etc. that were used for that run