Set up PBS workflow #22

@bschroeter

Description

meorg_client needs to operate in an internet-accessible environment, which means running it on the copyq; benchcab itself, however, runs on a compute node without internet access. As such, we need to chain a series of PBS jobs to achieve the desired functionality.

The proposed workflow is as follows:

  1. [JOB 1, compute] benchcab runs, writes its output files, and triggers a meorg_client job on the copyq.
  2. [JOB 2, copyq] meorg_client uploads the files to the server, noting the JOB_ID of each file, which is used to query the transfer to the object store. It then submits a subsequent job (Job 3) to start after a computed delay: 5 minutes, plus the expected transfer time for the total data volume at 150 Mbit/s, plus a 10% margin (see the sketch after this list).
  3. [JOB 3, copyq] meorg_client queries the JOB_IDs to get the true FILE_ID of each file, which is then used to attach the files to the model outputs. Once successful, meorg_client triggers the analysis.
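
To make the delay in step 2 concrete, here is a minimal sketch of the computation. It assumes the 10% margin applies only to the transfer time; whether it should also cover the 5-minute base is open.

```python
def job3_delay_seconds(total_bytes: int) -> float:
    """Estimate how long Job 3 should wait before querying the JOB_IDs.

    Assumes a 5-minute base delay, an effective 150 Mbit/s transfer rate
    to the object store, and a 10% safety margin on the transfer time.
    """
    base_delay = 5 * 60            # 5 minutes, in seconds
    transfer_rate = 150e6 / 8      # 150 Mbit/s expressed as bytes/s
    transfer_time = total_bytes / transfer_rate
    return base_delay + transfer_time * 1.1


# Example: ~10 GiB of model output -> roughly 930 s (300 s base + ~630 s transfer)
print(job3_delay_seconds(10 * 1024**3))
```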

Depending on the notification capability of the server, there may be an optional fourth job to query the status of the analysis, alert the user to the outcome, and/or email a link to the plots.

A minimum of three PBS jobs is required (one compute + two copyq), unless we allow the copyq job to run for longer and combine the meorg steps into a single job, which may not be an acceptable use of resources.

This may be a good time to work on the Python implementation for handling PBS jobs, as the logic may become cumbersome in vanilla shell.
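
As a starting point, here is a rough sketch of chaining the three jobs from Python via qsub. The script names and the normal queue are placeholders, and in practice Job 2 would likely submit Job 3 itself once the upload size (and hence the delay) is known; only the qsub flags (-q, -W depend=afterok, -a) are standard PBS.

```python
import subprocess
from datetime import datetime, timedelta


def qsub(script: str, queue: str, depend_on: str | None = None,
         start_after: datetime | None = None) -> str:
    """Submit a PBS script and return its job ID."""
    cmd = ["qsub", "-q", queue]
    if depend_on:
        # Start only once the upstream job has finished successfully.
        cmd += ["-W", f"depend=afterok:{depend_on}"]
    if start_after:
        # Defer execution until the given time (PBS -a flag).
        cmd += ["-a", start_after.strftime("%Y%m%d%H%M")]
    cmd.append(script)
    result = subprocess.run(cmd, check=True, capture_output=True, text=True)
    return result.stdout.strip()


# Job 1 (benchcab) on a compute queue; Job 2 (upload) on the copyq once
# Job 1 succeeds; Job 3 (attach + analysis) deferred by the computed delay.
# Note: -a is relative to submission time here, which is a simplification.
job1 = qsub("run_benchcab.sh", queue="normal")
job2 = qsub("meorg_upload.sh", queue="copyq", depend_on=job1)
job3 = qsub("meorg_attach.sh", queue="copyq", depend_on=job2,
            start_after=datetime.now() + timedelta(seconds=930))
```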
