|
| 1 | +================= |
| 2 | +Process generator |
| 3 | +================= |
| 4 | + |
| 5 | +This is an experimental feature and an unofficial extension to the CWL standards. |
| 6 | + |
| 7 | +A process generator is a CWL Process type that executes a concrete CWL |
| 8 | +process (CommandLineTool, Workflow or ExpressionTool) which produces |
| 9 | +CWL files as output, then executes the CWL that was generated. |
| 10 | + |
| 11 | +The intention is to have a formalized way to express a pre-processing |
| 12 | +or bootstrapping step in which a CWL description is generated by |
| 13 | +another program (such as from a template, or conversion from another |
| 14 | +workflow language). |
| 15 | + |
| 16 | +The ProcessGenerator is a subtype of CWL process, so it must define |
| 17 | +its inputs and outputs. The "run" field is similar to the "run" field |
| 18 | +of a workflow step -- it specifies a tool to run that will create new |
| 19 | +CWL as output. |
| 20 | + |
| 21 | +.. code:: yaml |
| 22 | +
|
| 23 | + - name: ProcessGenerator |
| 24 | + type: record |
| 25 | + inVocab: true |
| 26 | + extends: cwl:Process |
| 27 | + documentRoot: true |
| 28 | + fields: |
| 29 | + - name: class |
| 30 | + jsonldPredicate: |
| 31 | + "_id": "@type" |
| 32 | + "_type": "@vocab" |
| 33 | + type: string |
| 34 | + - name: run |
| 35 | + type: [string, cwl:Process] |
| 36 | + jsonldPredicate: |
| 37 | + _id: "cwl:run" |
| 38 | + _type: "@id" |
| 39 | + subscope: run |
| 40 | + doc: | |
| 41 | + Specifies the process to run. |
| 42 | +
|
| 43 | +
|
| 44 | +Process generator example (pytoolgen.cwl): |
| 45 | + |
| 46 | +.. code:: yaml |
| 47 | +
|
| 48 | + #!/usr/bin/env cwl-runner |
| 49 | + cwlVersion: v1.0 |
| 50 | + $namespaces: |
| 51 | + cwltool: "http://commonwl.org/cwltool#" |
| 52 | + class: cwltool:ProcessGenerator |
| 53 | + inputs: |
| 54 | + script: string |
| 55 | + dir: Directory |
| 56 | + outputs: {} |
| 57 | + run: |
| 58 | + class: CommandLineTool |
| 59 | + inputs: |
| 60 | + script: string |
| 61 | + dir: Directory |
| 62 | + outputs: |
| 63 | + runProcess: |
| 64 | + type: File |
| 65 | + outputBinding: |
| 66 | + glob: main.cwl |
| 67 | + requirements: |
| 68 | + InlineJavascriptRequirement: {} |
| 69 | + cwltool:LoadListingRequirement: |
| 70 | + loadListing: shallow_listing |
| 71 | + InitialWorkDirRequirement: |
| 72 | + listing: | |
| 73 | + ${ |
| 74 | + var v = inputs.dir.listing; |
| 75 | + v.push({entryname: "inp.py", entry: inputs.script}); |
| 76 | + return v; |
| 77 | + } |
| 78 | + arguments: [python, inp.py] |
| 79 | + stdout: main.cwl |
| 80 | +
|
| 81 | +
|
| 82 | +The process generator has two required inputs: "script" and "dir". It |
| 83 | +runs the command line tool listed inline in "run" with the input |
| 84 | +object, which is required to have those parameters. Note: the input |
| 85 | +object may contain additional parameters which are intended for the |
| 86 | +generated CWL when it is executed. |
| 87 | + |
| 88 | +The command line tool populates the working directory using |
| 89 | +InitialWorkDirRequirement. It uses the listing from "dir" and adds a |
| 90 | +new file literal called "inp.py" which contains the text from the |
| 91 | +input parameter "script". Then it runs "python inp.py". |
| 92 | + |
| 93 | +The output of this command line tool is the File parameter |
| 94 | +"runProcess". In this example, the "inp.py" script, when run, is |
| 95 | +expected to print the CWL description to standard output, which will |
| 96 | +be captured in the "runProcess" output parameter. |
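The staging and capture performed by the inline command line tool can be approximated in plain Python. This is only a sketch of the idea, not cwltool's implementation: the function name ``run_generator_tool`` is hypothetical, and it copies only top-level files, which is a simplification of what a shallow listing stages.

```python
import shutil
import subprocess
import sys
import tempfile
from pathlib import Path


def run_generator_tool(script: str, src_dir: Path) -> Path:
    """Stage src_dir plus inp.py in a scratch directory, run it, return main.cwl."""
    workdir = Path(tempfile.mkdtemp())
    # Copy the top-level files of "dir" (a simplification: subdirectories are skipped).
    for entry in src_dir.iterdir():
        if entry.is_file():
            shutil.copy(entry, workdir / entry.name)
    # The file literal: "inp.py" holds the text of the "script" input.
    (workdir / "inp.py").write_text(script)
    # Rough equivalent of `python inp.py > main.cwl` in the working directory.
    with open(workdir / "main.cwl", "w") as out:
        subprocess.run([sys.executable, "inp.py"], cwd=workdir, stdout=out, check=True)
    return workdir / "main.cwl"
```

The returned path plays the role of the "runProcess" output: it points at whatever the script printed to standard output.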
| 97 | + |
| 98 | +Next, the ProcessGenerator will load the file in the "runProcess" |
| 99 | +parameter, which in this example is "main.cwl". Finally, it will |
| 100 | +execute the process with the input object that was originally provided to |
| 101 | +the process generator. |
| 102 | + |
| 103 | +The output of the generated process is used as the output for |
| 104 | +ProcessGenerator as a whole. |
| 105 | + |
| 106 | + |
| 107 | +Here's an example (zing.cwl) that uses pytoolgen.cwl. |
| 108 | + |
| 109 | +.. code:: yaml |
| 110 | +
|
| 111 | + #!/usr/bin/env cwltool |
| 112 | + {cwl:tool: pytoolgen.cwl, script: {$include: "#attachment-1"}, dir: {class: Directory, location: .}} |
| 113 | + --- | |
| 114 | + import os |
| 115 | + import sys |
| 116 | + print(""" |
| 117 | + cwlVersion: v1.0 |
| 118 | + class: CommandLineTool |
| 119 | + inputs: |
| 120 | + zing: string |
| 121 | + outputs: {} |
| 122 | + arguments: [echo, $(inputs.zing)] |
| 123 | + """) |
| 124 | +
|
| 125 | +The first line ``#!/usr/bin/env cwltool`` means that this file can be |
| 126 | +given the executable bit (+x) and then run directly. |
| 127 | +
|
| 128 | +This is a multi-part YAML file. The first section is a CWL input |
| 129 | +object. |
| 130 | +
|
| 131 | +The input object uses "cwl:tool" to indicate that this input object |
| 132 | +should be used as input to execute "pytoolgen.cwl". |
| 133 | + |
| 134 | +The parameter ``script: {$include: "#attachment-1"}`` takes the text |
| 135 | +from the second part of the file (following the YAML division marker |
| 136 | +``--- |``) and assigns it as a string value to "script". |
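To make the relationship between the parts concrete, here is a plain-string approximation of how the text after ``--- |`` can be looked up by a reference such as ``#attachment-1``. The helper names ``split_attachments`` and ``resolve_include`` are hypothetical, and a real loader would rely on a YAML parser rather than line matching, so treat this as an illustration only.

```python
from typing import List, Tuple


def split_attachments(text: str) -> Tuple[str, List[str]]:
    """Split a multi-part file into its first document and the text attachments."""
    parts: List[List[str]] = [[]]
    for line in text.splitlines():
        if line.rstrip() == "--- |":
            parts.append([])  # a division marker starts a new part
        else:
            parts[-1].append(line)
    docs = ["\n".join(p) + "\n" for p in parts]
    return docs[0], docs[1:]


def resolve_include(ref: str, attachments: List[str]) -> str:
    """Map '#attachment-N' (1-based) to the N-th attachment."""
    prefix = "#attachment-"
    if not ref.startswith(prefix):
        raise ValueError(f"unsupported reference: {ref}")
    return attachments[int(ref[len(prefix):]) - 1]


example = """#!/usr/bin/env cwltool
{cwl:tool: pytoolgen.cwl, script: {$include: "#attachment-1"}}
--- |
import sys
print("hello")
"""
head, attachments = split_attachments(example)
script = resolve_include("#attachment-1", attachments)
```

Here ``script`` ends up holding the two Python lines that follow the marker, which is the string value assigned to the "script" parameter.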
| 137 | + |
| 138 | +The "dir" parameter is not doing much in this example, but by |
| 139 | +capturing the whole directory it allows the Python script to refer to |
| 140 | +files in the current directory. |
| 141 | + |
| 142 | +In this example the script is trivially printing CWL as a string, but |
| 143 | +of course could do something much more complex: generate code from a |
| 144 | +template, select among several possible workflows based on the input, |
| 145 | +convert from another workflow language, etc. |
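As a small step beyond printing a fixed string, a generator script could build the description as a data structure and serialize it. The sketch below assumes a hypothetical helper ``make_tool``; it emits JSON, which CWL runners accept as a document format alongside YAML.

```python
import json


def make_tool(command: str, param: str) -> dict:
    """Build a minimal CommandLineTool that passes one string input to `command`."""
    return {
        "cwlVersion": "v1.0",
        "class": "CommandLineTool",
        "inputs": {param: "string"},
        "outputs": {},
        "arguments": [command, f"$(inputs.{param})"],
    }


if __name__ == "__main__":
    # Print the description to standard output, where the generator tool captures it.
    print(json.dumps(make_tool("echo", "zing"), indent=2))
```

Calling ``make_tool("echo", "zing")`` yields a description equivalent to the one printed by the example script; other commands or parameter names could be chosen based on the input.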
| 146 | + |
| 147 | +When this is executed, the following steps happen: |
| 148 | + |
| 149 | +#. pytoolgen.cwl is loaded and executed with the first part of the file as the input object. |
| 150 | + |
| 151 | +#. The "script" parameter contains the contents of the second part. |
| 152 | + The inline command line tool creates a file called "inp.py" with |
| 153 | + the contents of "script". |
| 154 | + |
| 155 | +#. The inline command line tool runs python on "inp.py" and collects |
| 156 | + the output, which is a CWL description for a trivial "echo" tool. |
| 157 | + |
| 158 | +#. The process generator loads the generated CWL description and executes it with any additional |
| 159 | + parameters declared in the input object or command line. |
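The steps above amount to a two-phase execution, which can be modelled with ordinary callables. This is a conceptual toy, not cwltool code: ``run_process_generator`` and ``toy_generator`` are hypothetical names, and the toy tool returns a value only so that the data flow is visible.

```python
from typing import Any, Callable, Dict

Job = Dict[str, Any]
Process = Callable[[Job], Job]


def run_process_generator(generator: Callable[[Job], Process], job: Job) -> Job:
    """Phase 1: run the generator to obtain a process. Phase 2: run that process."""
    generated = generator(job)  # stands in for running inp.py and loading main.cwl
    return generated(job)       # the same input object, extra parameters included


def toy_generator(job: Job) -> Process:
    """Hypothetical stand-in for the generator phase of the zing.cwl example."""
    if "script" not in job or "dir" not in job:
        raise ValueError("the generator requires 'script' and 'dir'")

    def echo_tool(inner: Job) -> Job:
        # Uses the extra "zing" parameter that the generator itself ignores.
        return {"echoed": inner["zing"]}

    return echo_tool


result = run_process_generator(
    toy_generator, {"script": "print('...')", "dir": ".", "zing": "blurf"}
)
```

The point of the model is that one input object serves both phases: the generator reads "script" and "dir", while the generated process reads "zing".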
| 160 | + |
| 161 | + |
| 162 | +Example runs |
| 163 | +------------ |
| 164 | + |
| 165 | +Note: this feature requires the ``cwltool`` flags ``--enable-ext`` and ``--enable-dev``. |
| 166 | + |
| 167 | +You can set these with the environment variable ``CWLTOOL_OPTIONS``: |
| 168 | + |
| 169 | +.. code:: |
| 170 | +
|
| 171 | + $ export CWLTOOL_OPTIONS="--enable-dev --enable-ext" |
| 172 | +
|
| 173 | + $ ./zing.cwl |
| 174 | + INFO /home/peter/work/cwltool/venv3/bin/cwltool 3.1.20211112163758 |
| 175 | + INFO Resolved './zing.cwl' to 'file:///home/peter/work/cwltool/tests/wf/generator/zing.cwl' |
| 176 | + INFO [job d3626216-d7d8-4322-bc21-4d469634cc9a] /tmp/8sez90gb$ python \ |
| 177 | + inp.py > /tmp/8sez90gb/main.cwl |
| 178 | + INFO [job d3626216-d7d8-4322-bc21-4d469634cc9a] completed success |
| 179 | + usage: ./zing.cwl [-h] --zing ZING [job_order] |
| 180 | + ./zing.cwl: error: the following arguments are required: --zing |
| 181 | +
|
| 182 | +
|
| 183 | +.. code:: |
| 184 | +
|
| 185 | + $ ./zing.cwl --zing blurf |
| 186 | + INFO /home/peter/work/cwltool/venv3/bin/cwltool 3.1.20211112163758 |
| 187 | + INFO Resolved './zing.cwl' to 'file:///home/peter/work/cwltool/tests/wf/generator/zing.cwl' |
| 188 | + INFO [job a580b69d-2b88-4268-904e-ed105ba7c85e] /tmp/ujff239o$ python \ |
| 189 | + inp.py > /tmp/ujff239o/main.cwl |
| 190 | + INFO [job a580b69d-2b88-4268-904e-ed105ba7c85e] completed success |
| 191 | + INFO [job main.cwl] /tmp/f_7bxncq$ echo \ |
| 192 | + blurf |
| 193 | + blurf |
| 194 | + INFO [job main.cwl] completed success |
| 195 | + { |
| 196 | + "runProcess": { |
| 197 | + "location": "file:///home/peter/work/cwltool/tests/wf/generator/main.cwl", |
| 198 | + "basename": "main.cwl", |
| 199 | + "class": "File", |
| 200 | + "checksum": "sha1$8c160b680fb2cededef3228a53425e595b8cdf48", |
| 201 | + "size": 111, |
| 202 | + "path": "/home/peter/work/cwltool/tests/wf/generator/main.cwl" |
| 203 | + } |
| 204 | + } |
| 205 | + INFO Final process status is success |
| 206 | +
|
| 207 | +.. code:: |
| 208 | +
|
| 209 | + $ echo "zing: zoop" > job.yml |
| 210 | + $ ./zing.cwl job.yml |
| 211 | + INFO /home/peter/work/cwltool/venv3/bin/cwltool 3.1.20211112163758 |
| 212 | + INFO Resolved './zing.cwl' to 'file:///home/peter/work/cwltool/tests/wf/generator/zing.cwl' |
| 213 | + INFO [job 9073a083-dc79-4719-8762-1c024480605c] /tmp/meeo3d19$ python \ |
| 214 | + inp.py > /tmp/meeo3d19/main.cwl |
| 215 | + INFO [job 9073a083-dc79-4719-8762-1c024480605c] completed success |
| 216 | + INFO [job main.cwl] /tmp/2pqdz5nq$ echo \ |
| 217 | + zoop |
| 218 | + zoop |
| 219 | + INFO [job main.cwl] completed success |
| 220 | + { |
| 221 | + "runProcess": { |
| 222 | + "location": "file:///home/peter/work/cwltool/tests/wf/generator/main.cwl", |
| 223 | + "basename": "main.cwl", |
| 224 | + "class": "File", |
| 225 | + "checksum": "sha1$8c160b680fb2cededef3228a53425e595b8cdf48", |
| 226 | + "size": 111, |
| 227 | + "path": "/home/peter/work/cwltool/tests/wf/generator/main.cwl" |
| 228 | + } |
| 229 | + } |
| 230 | + INFO Final process status is success |