ACCESS_CM2_AMIPconfigTests_Implementation
The original request was to help implement #132 by:
- Setting up multiple test configurations
- Adding several code changes to the coupling interface
- Re-running the test cases so that the effects of the code changes can be assessed
Jhan and Mark have added some of the test configurations at wiki:ACCESS-CM2/AMIPconfigTests
These configurations are related to #176, a bug in soil properties
It appears to be generally desirable to have a testing setup for CABLE that can run different configurations of the model and compare the results against control runs in an automated fashion.
Mark has some analysis scripts that will be used as an initial implementation. The UKMO Afterburner tool was considered, but it does not perform the calculations the CABLE team needs.
We need:
- A way to run multiple configurations of the model
- A post-processing stage to convert the model output to NetCDF and move it to a location for further processing
- Integration with Mark's scripts, so that they can read the files generated by the post-processing stage
I am assuming that we are mainly interested in the ACCESS-CM2 AMIP configuration for the moment.
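The pieces described below could all sit in a single Rose suite; a rough sketch of the layout (directory and app names are illustrative, assuming the post-processing app is called 'nci_archive'):
suite.rc                          # Cylc graph, with parameterised tasks per test case
app/um/rose-app.conf              # base (control) model configuration
app/um/opt/rose-app-test_*.conf   # per-test override ('opt') configs
app/nci_archive/                  # post-processing app (UM output -> NetCDF)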
The configurations at wiki:ACCESS-CM2/AMIPconfigTests are presented as variations of a default configuration. This immediately brings to mind Rose's 'opt configs', which act as override files, adding modifications on top of a common base configuration.
In the CABLE case, this could work by using an existing CABLE Rose app as the base configuration, then creating opt configs for each test case in the directory 'app/um/opt/', for example:
# app/um/opt/rose-app-test_alpha.conf
[namelist:cable]
soil_struc=default
fwsoil=Haverd2013
gs_switch=medlyn
litter=.true.
rev_corr=.true.
# app/um/opt/rose-app-test_beta.conf
[namelist:cable]
soil_struc=default
gw_model=true
or_evap=true
gs_switch=medlyn
fwsoil=Haverd2013
litter=.true.
rev_corr=.true.
We now have configurations for each test case, and since these are small (everything not in the 'opt' file remains the same as it is in the base configuration) it is easy to see what each test is changing.
To make use of these 'opt configs' we can use Cylc's 'parameterised tasks', which allow a task to be run multiple times with minor changes to the suite setup.
Say we define a parameter 'test_case' like so:
[cylc]
[[parameters]]
test_case = test_alpha, test_beta
[scheduling]
[[dependencies]]
[[[R1]]]
graph = "fcm_make => recon => atmos_main<test_case>"
[[[ {{RESUB}} ]]]
graph = "atmos_main<test_case>[-{{RESUB}}] => atmos_main<test_case>"
There are now two separate 'atmos' tasks with identical graphs. The advantage is that more tests can be added just by modifying the parameter definition, without having to touch the dependency graph.
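As a quick sanity check (Cylc 7 commands; '<suite>' stands for the registered suite name), validating and listing the suite should show one 'atmos_main' task per parameter value:
cylc validate <suite>
cylc list <suite> | grep atmos_main
# Expected output, given the parameter values above:
#   atmos_main_test_alpha
#   atmos_main_test_beta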
In order to get the parameterised tasks to load the correct model configuration, we add the parameter value to the environment variable '$ROSE_APP_OPT_CONF_KEYS', like so:
[runtime]
[[atmos_main<test_case>]]
[[[environment]]]
ROSE_APP_OPT_CONF_KEYS = ${CYLC_TASK_PARAM_test_case} ${ROSE_APP_OPT_CONF_KEYS}
The environment variable '${CYLC_TASK_PARAM_test_case}' gets expanded to the current parameter value, so the 'test_alpha' variant of 'atmos' will load the 'test_alpha' Rose opt config.
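For illustration only (the exact contents depend on what the suite already sets), the job environment for the 'test_alpha' variant would end up looking something like:
# In the atmos_main_test_alpha job script:
#   CYLC_TASK_PARAM_test_case=test_alpha
#   ROSE_APP_OPT_CONF_KEYS="test_alpha ..."    # plus any keys the suite already sets
# so 'rose task-run' merges app/um/opt/rose-app-test_alpha.conf over the
# base app/um/rose-app.conf before launching the model.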
- Create 'opt config' files which specify the configuration changes to test
- Add these test configs to Cylc using parameterised tasks
To post-process the model outputs after each model CRUN we can create a new app that just runs a simple script. If the model configuration option 'l_postp' is set to 'True', then once the model has finished writing to an output file (which may take multiple CRUNs in the case of, e.g., seasonal averages) it will create an empty file '$OUTPUT.arch' indicating that the file is safe to post-process (see UMDP Y01 Sec. 2.2).
A sample script that looks for these files, then converts them to NetCDF using the Iris library, is:
#!/bin/bash
# app/nci_archive/bin/nci_archive.sh
# Post-processing script using iris2netcdf
set -eu
module use /g/data3/hh5/public/modules
module load parallel
module load conda/analysis3
# Path to '.arch' files
ARCH_PATH="${SUITE_WORK_DIR}/${CYCLE}/${TASK}"
# Names of files we need to archive
ARCH_FILES=$(find "${ARCH_PATH}" -size 0 -type f -name '*.arch' -printf '%f\n' | sed 's/\.arch$//')
mkdir -p "${OUTPUT_DIR}"
parallel --verbose --jobs ${PBS_NCPUS:-2} iris2netcdf "${INPUT_DIR}/{}" -o "${OUTPUT_DIR}/{}.nc" ::: ${ARCH_FILES}
Note the use of GNU parallel to run multiple conversions simultaneously. Some mucking about with paths is required because the '.arch' files are kept in the Cylc work directory, while the actual output files are kept under the data directory.
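A sketch of the assumed layout for one file of the 'test_alpha' run (the file name 'atmosa.pa1988jan' is purely illustrative):
${CYLC_SUITE_WORK_DIR}/<cycle>/atmos_main_test_alpha/atmosa.pa1988jan.arch   # empty '.arch' marker
${DATAM}/atmosa.pa1988jan                                                    # actual UM output file
${ROSE_DATA}/archive/test_alpha/atmosa.pa1988jan.nc                          # converted NetCDF output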
The app config for this job is:
# app/nci_archive/rose-app.conf
[command]
default=nci_archive.sh
[env]
SUITE_WORK_DIR = ${CYLC_SUITE_WORK_DIR}
TASK = atmos_main_${CYLC_TASK_PARAM_test_case}
CYCLE = ${CYLC_TASK_CYCLE_POINT}
INPUT_DIR = ${DATAM}
OUTPUT_DIR = ${ROSE_DATA}/archive/${CYLC_TASK_PARAM_test_case}
The exact conversion script, as well as the correct output location for the processed files, will depend on what the analysis scripts need.
The Cylc implementation is pretty simple; it also makes use of the parameterised tasks:
[scheduling]
[[dependencies]]
[[[ {{RESUB}} ]]]
graph = "atmos_main<test_case> => nci_archive<test_case>"
[runtime]
[[nci_archive<test_case>]]
inherit = HPC
[[[directives]]]
-l ncpus = 4
-l mem = 8gb
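Putting the two fragments together (and assuming '{{RESUB}}' is a Jinja2 variable holding the suite's resubmission period), the scheduling section would look roughly like:
[scheduling]
[[dependencies]]
[[[R1]]]
graph = "fcm_make => recon => atmos_main<test_case>"
[[[ {{RESUB}} ]]]
graph = """
atmos_main<test_case>[-{{RESUB}}] => atmos_main<test_case>
atmos_main<test_case> => nci_archive<test_case>
"""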
- The exact conversion will depend on the requirements of the analysis pipeline
- Files to process can be spotted using the '*.arch' files
- Need a base configuration to apply these to. Possible jobs are:
  - u-ap747: Jhan's CABLE AMIP run
  - u-ar180, u-ar560: Mark's CABLE AMIP runs, based on u-ap747 with unspecified modifications and currently not working
- Need the analysis scripts to run
  - Mark has scripts to evaluate annual and seasonal mean precipitation, terrestrial ET, long- and short-wave radiation at the surface (land), cloud fractions at low, mid, and high levels, 2m temperature, maximum 2m temperature, and minimum 2m temperature
  - It is currently unknown where these are
- The configuration settings given by Mark and Ian have different names from the namelist variables used in Jhan's run - have the variable names been changed over time?
- We will start from Jhan's AMIP run u-ap747
  - There was a question about the vegetation parameters being different from those used in the coupled run - will this need to be corrected?
- Martin will add the STASH settings so that the CABLE variables are output, and add a post-processing step to convert the output to NetCDF
- Mark will send his analysis scripts to Scott
- Scott will integrate the output from the model with the analysis scripts
- We will worry about the different test scenarios once the base configuration is working
- The output from the analysis will go into /g/data/p66/accessdev-web, so that it can be seen from the web server
  - This will include the namelist parameters etc. that were used for that run