The Connectivity Modeling System (CMS) was developed to study complex larval migrations and to give probability estimates of population connectivity. The CMS can also track virtual particles that are passively advected by the velocity fields.
Since 2014, I have been using CMS to quantify Agulhas leakage by seeding particles in the Agulhas Current jet, then tracking how many particles end up on the other side of the GoodHope line and when they cross. Summing the crossing particles at each time step gives a time series of Agulhas leakage. More details can be found in my recent paper.
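The summation step can be sketched in a few lines of Python. This is only an illustration, not the code used in the repo: the function name `leakage_time_series`, the `-1` marker for particles that never cross, and the per-particle volume weights are all assumptions made for the example.

```python
import numpy as np

def leakage_time_series(crossing_step, volume, n_steps):
    """Sum the volume of the particles that cross the section at each time step.

    crossing_step : time-step index at which each particle crosses the section,
                    or -1 if it never crosses (hypothetical convention).
    volume        : volume tagged to each particle (assumed per-particle weight).
    n_steps       : length of the output time series.
    """
    crossing_step = np.asarray(crossing_step)
    volume = np.asarray(volume, dtype=float)
    # Keep only particles that actually cross within the tracked period
    crossed = (crossing_step >= 0) & (crossing_step < n_steps)
    # bincount with weights adds up the volume falling into each time step
    return np.bincount(crossing_step[crossed],
                       weights=volume[crossed],
                       minlength=n_steps)

# Toy example: five particles, four time steps; the fourth particle never crosses
steps = [0, 2, 2, -1, 3]
vols = [1.0, 0.5, 0.25, 2.0, 1.5]
series = leakage_time_series(steps, vols, n_steps=4)
```

Passing an array of ones as `volume` turns the result into a plain count of crossing particles per time step.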
As the coupled model keeps generating more output, running CMS as before becomes less practical. The CMS was not designed to track particles at this scale over such a long period (multiple decades). I also ran into issues when submitting CMS jobs to the UM cluster: a single continuous job cannot complete within the wall-time and memory limits. So I came up with a workaround that divides the multi-decade job into several smaller chunks, each of which can complete successfully. A side benefit is that I can easily extend the leakage time series without rerunning the previous years.
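As a rough illustration of the chunking idea, the sketch below splits a range of model years into consecutive chunks. The function name `make_chunks` and the year numbering are hypothetical and do not come from the scripts in this repo.

```python
def make_chunks(start_year, end_year, chunk_years=5):
    """Split [start_year, end_year] (inclusive) into consecutive chunks.

    Returns a list of (first_year, last_year) tuples; the final chunk is
    shorter when the range is not a multiple of chunk_years.
    """
    chunks = []
    year = start_year
    while year <= end_year:
        last = min(year + chunk_years - 1, end_year)
        chunks.append((year, last))
        year = last + 1
    return chunks

# Model years 1 to 23 in five-year chunks
chunks = make_chunks(1, 23)
# Extending the series later only requires the new years
extension = make_chunks(24, 28)
```

Each tuple can then drive one release file and one job submission, so a failed chunk can be rerun without touching the others.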
One day, folks from the Center for Computational Science (CCS) told me that I was suspended from submitting more CMS jobs because they drained the system memory and significantly degraded the performance of the cluster. Some staff members helped me test-run CMS on an isolated filesystem, and we eventually identified that most of the memory usage came from writing output as NetCDF files. They therefore advised all CMS users on the cluster to set the output format to ASCII.
Switching the output to ASCII reduces the run time for a 5-year chunk tracking 600 thousand particles from 12 hours to less than 30 minutes. However, it also renders my old post-processing scripts useless. This repo documents some of the changes I made.
- `gen_hrc07_release_chunks.py` and `multiple_gen.py` are used to generate release files and their corresponding volume_tag files. `multiple_gen.py` calls `gen_hrc07_release_chunks.py` as a function to generate release files in five-year chunks.
- `changeending.py` adds `.txt` to the `traj_file_xx` files in the `expt_name/output` folder. Alternatively, one can modify the CMS source code by adding `//".txt"` to `output.f90`, lines 55-57. This change allows the MATLAB function `tabularTextDatastore` (available after 2016a) to detect these ASCII files.
- `traj_proc_update.m` is the main program; it calls two functions, `cms_ascii_postproc` and `dailyload_core_voltag`.
- `proc_sub` is a sample LSF job submission script.
- `jul2greg.m` is used in `cms_ascii_postproc` to convert the original release date to Gregorian dates (the function was found on the internet).
- `chunk_traj_proc.py` copies, renames, and edits `traj_proc_update.m` and `proc_sub` for several chunks at once.
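
For readers who want to see what the renaming step amounts to, here is a minimal sketch of appending `.txt` to the trajectory files. It is not the actual contents of `changeending.py`; the function name `add_txt_suffix` and the `traj_file_` prefix match are assumptions based on the description above.

```python
from pathlib import Path

def add_txt_suffix(output_dir, prefix="traj_file_"):
    """Append '.txt' to every file in output_dir whose name starts with prefix.

    Files that already end in '.txt' are left alone, so the function can be
    run more than once. Returns the list of new paths.
    """
    renamed = []
    # sorted() materializes the listing before any file is renamed
    for path in sorted(Path(output_dir).iterdir()):
        if (path.is_file()
                and path.name.startswith(prefix)
                and path.suffix != ".txt"):
            target = path.with_name(path.name + ".txt")
            path.rename(target)
            renamed.append(target)
    return renamed

# Example call (path is illustrative):
# add_txt_suffix("expt_name/output")
```

Renaming after the run leaves the CMS source untouched, at the cost of one extra step per chunk; patching `output.f90` removes that step but has to be redone whenever CMS is rebuilt from a fresh copy.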