Commit 318876b

tetron and mr-c authored
Process generator docs (#1599)
* add install-doc-dep target

Co-authored-by: Michael R. Crusoe <[email protected]>
1 parent fb7aa41 commit 318876b

3 files changed: +235 -0 lines changed

Makefile

Lines changed: 3 additions & 0 deletions
@@ -50,6 +50,9 @@ install-dependencies: FORCE
 	pip install --upgrade $(DEVPKGS)
 	pip install -r requirements.txt
 
+install-doc-dep:
+	pip install -r docs/requirements.txt
+
 ## install-deb-dep: install most of the dev dependencies via apt-get
 install-deb-dep:
 	sudo apt-get install $(DEBDEVPKGS)

docs/index.rst

Lines changed: 2 additions & 0 deletions
@@ -19,6 +19,8 @@ Modules
    :maxdepth: 2
    :caption: Contents:
 
+   processgen
+
 Indices and tables
 ==================

docs/processgen.rst

Lines changed: 230 additions & 0 deletions
@@ -0,0 +1,230 @@

=================
Process generator
=================

Experimental feature and unofficial extension to the CWL standards.

A process generator is a CWL Process type that executes a concrete CWL
process (CommandLineTool, Workflow or ExpressionTool) which produces
CWL files as output, then executes the CWL that was generated.

The intention is to have a formalized way to express a pre-processing
or bootstrapping step in which a CWL description is generated by
another program (such as from a template, or conversion from another
workflow language).

The ProcessGenerator is a subtype of CWL process, so it must define
its inputs and outputs. The "run" field is similar to the "run" field
of a workflow step -- it specifies a tool to run that will create new
CWL as output.

.. code:: yaml

  - name: ProcessGenerator
    type: record
    inVocab: true
    extends: cwl:Process
    documentRoot: true
    fields:
      - name: class
        jsonldPredicate:
          "_id": "@type"
          "_type": "@vocab"
        type: string
      - name: run
        type: [string, cwl:Process]
        jsonldPredicate:
          _id: "cwl:run"
          _type: "@id"
          subscope: run
        doc: |
          Specifies the process to run.


Process generator example (pytoolgen.cwl):

.. code:: yaml

  #!/usr/bin/env cwl-runner
  cwlVersion: v1.0
  $namespaces:
    cwltool: "http://commonwl.org/cwltool#"
  class: cwltool:ProcessGenerator
  inputs:
    script: string
    dir: Directory
  outputs: {}
  run:
    class: CommandLineTool
    inputs:
      script: string
      dir: Directory
    outputs:
      runProcess:
        type: File
        outputBinding:
          glob: main.cwl
    requirements:
      InlineJavascriptRequirement: {}
      cwltool:LoadListingRequirement:
        loadListing: shallow_listing
      InitialWorkDirRequirement:
        listing: |
          ${
            var v = inputs.dir.listing;
            v.push({entryname: "inp.py", entry: inputs.script});
            return v;
          }
    arguments: [python, inp.py]
    stdout: main.cwl


The process generator has two required inputs: "script" and "dir". It
runs the command line tool listed inline in "run" with the input
object, which is required to have those parameters. Note: the input
object may contain additional parameters which are intended for the
generated CWL when it is executed.

The command line tool populates the working directory using
InitialWorkDirRequirement. It uses the listing from "dir" and adds a
new file literal called "inp.py" which contains the text from the
input parameter "script". Then it runs "python inp.py".

The output of this command line tool is the File parameter
"runProcess". In this example, the "inp.py" script, when run, is
expected to print the CWL description to standard output, which will
be captured in the "runProcess" output parameter.

Next, the ProcessGenerator loads the file in the "runProcess"
parameter, which in this example is "main.cwl". Finally, it executes
the loaded process with the input object that was originally provided
to the process generator.

The output of the generated CWL is used as the output of the
ProcessGenerator as a whole.


Here's an example (zing.cwl) that uses pytoolgen.cwl.

.. code:: yaml

  #!/usr/bin/env cwltool
  {cwl:tool: pytoolgen.cwl, script: {$include: "#attachment-1"}, dir: {class: Directory, location: .}}
  --- |
  import os
  import sys
  print("""
  cwlVersion: v1.0
  class: CommandLineTool
  inputs:
    zing: string
  outputs: {}
  arguments: [echo, $(inputs.zing)]
  """)

The first line ``#!/usr/bin/env cwltool`` means that this file can be
given the executable bit (+x) and then run directly.

This is a multi-part YAML file. The first section is a CWL input
object.

The input object uses "cwl:tool" to indicate that this input object
should be used as input to execute "pytoolgen.cwl".

The parameter ``script: {$include: "#attachment-1"}`` takes the text
from the second part of the file (following the YAML division marker
``--- |``) and assigns it as a string value to "script".
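
As a rough illustration of how the two parts of such a file relate, the
following Python sketch splits a multi-part file at ``--- |`` markers
and resolves an ``#attachment-N`` reference. It is a simplified,
line-based stand-in written for this page, not the loader that cwltool
actually uses; the helper names ``split_attachments`` and
``resolve_include`` are hypothetical, and it assumes attachments are
numbered from 1 in file order, as the ``#attachment-1`` reference to
the second part suggests.

.. code:: python

  # Simplified model of a multi-part file; NOT cwltool's real loader.
  ZING = '''#!/usr/bin/env cwltool
  {cwl:tool: pytoolgen.cwl, script: {$include: "#attachment-1"}, dir: {class: Directory, location: .}}
  --- |
  import os
  import sys
  print("""
  cwlVersion: v1.0
  class: CommandLineTool
  inputs:
    zing: string
  outputs: {}
  arguments: [echo, $(inputs.zing)]
  """)
  '''


  def split_attachments(text):
      """Return (first_part, [attachment_text, ...]) split at '--- |' lines."""
      first_part = []
      attachments = []
      current = first_part
      for line in text.splitlines():
          if line.strip() == "--- |":
              # Each marker starts a new attachment.
              attachments.append([])
              current = attachments[-1]
          else:
              current.append(line)
      return "\n".join(first_part), ["\n".join(a) for a in attachments]


  def resolve_include(ref, attachments):
      """Map '#attachment-N' (assumed 1-based) to the N-th attachment text."""
      prefix = "#attachment-"
      if not ref.startswith(prefix):
          raise ValueError(f"unsupported reference: {ref}")
      number = int(ref[len(prefix):])
      if not 1 <= number <= len(attachments):
          raise ValueError(f"no such attachment: {ref}")
      return attachments[number - 1]


  main_part, attachments = split_attachments(ZING)
  script = resolve_include("#attachment-1", attachments)
  print(script)

Here ``script`` holds the Python source that follows the marker, which
is the string the "script" input parameter receives in the example.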

The "dir" parameter is not doing much in this example, but by
capturing the whole directory it allows the Python script to refer to
files in the current directory.

In this example the script trivially prints CWL as a string, but it
could of course do something much more complex: generate code from a
template, select among several possible workflows based on the input,
convert from another workflow language, etc.
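
For instance, a generator script might build the tool description as a
data structure and print it, rather than printing a fixed string. The
sketch below is one possible shape for such a script, not part of
cwltool; the function name ``make_echo_tool`` and its parameter are
hypothetical. It emits JSON, which relies on CWL documents being
accepted in JSON as well as YAML form.

.. code:: python

  import json


  def make_echo_tool(param_name):
      """Build a trivial 'echo' CommandLineTool for one string input."""
      return {
          "cwlVersion": "v1.0",
          "class": "CommandLineTool",
          "inputs": {param_name: "string"},
          "outputs": {},
          # CWL parameter reference, e.g. $(inputs.zing)
          "arguments": ["echo", f"$(inputs.{param_name})"],
      }


  tool = make_echo_tool("zing")
  # Print the description to standard output, as the generator expects.
  print(json.dumps(tool, indent=2))

Changing the argument to ``make_echo_tool`` changes the name of the
input that the generated tool requires.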

When this is executed, the following steps happen:

#. pytoolgen.cwl is loaded and executed with the first part of the
   file as the input object.

#. The "script" parameter contains the contents of the second part.
   The inline command line tool creates a file called "inp.py" with
   the contents of "script".

#. The inline command line tool runs python on "inp.py" and collects
   the output, which is a CWL description for a trivial "echo" tool.

#. It loads the CWL description and executes it with any additional
   parameters declared in the input object or command line.
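
The middle two steps can be imitated outside of cwltool with a short
Python sketch. This is only an approximation of what the inline command
line tool does (write "inp.py", run it, capture standard output as
"main.cwl"); it does not stage the "dir" listing, and the function name
``generate_cwl`` is hypothetical. Loading and executing the generated
description is left to cwltool and is not shown.

.. code:: python

  import subprocess
  import sys
  import tempfile
  from pathlib import Path

  # Same generator script as the second part of zing.cwl.
  SCRIPT = '''print("""
  cwlVersion: v1.0
  class: CommandLineTool
  inputs:
    zing: string
  outputs: {}
  arguments: [echo, $(inputs.zing)]
  """)
  '''


  def generate_cwl(script):
      """Write the script to inp.py, run it, and return the captured output."""
      with tempfile.TemporaryDirectory() as workdir:
          work = Path(workdir)
          (work / "inp.py").write_text(script)
          main_cwl = work / "main.cwl"
          # Equivalent of: python inp.py > main.cwl
          with main_cwl.open("w") as stdout:
              subprocess.run(
                  [sys.executable, "inp.py"],
                  cwd=work,
                  stdout=stdout,
                  check=True,
              )
          return main_cwl.read_text()


  generated = generate_cwl(SCRIPT)
  print(generated)
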


Example runs
------------

Note: this requires the ``cwltool`` flags ``--enable-ext`` and ``--enable-dev``.

You can set these with the environment variable ``CWLTOOL_OPTIONS``:

.. code::

  $ export CWLTOOL_OPTIONS="--enable-dev --enable-ext"

  $ ./zing.cwl
  INFO /home/peter/work/cwltool/venv3/bin/cwltool 3.1.20211112163758
  INFO Resolved './zing.cwl' to 'file:///home/peter/work/cwltool/tests/wf/generator/zing.cwl'
  INFO [job d3626216-d7d8-4322-bc21-4d469634cc9a] /tmp/8sez90gb$ python \
      inp.py > /tmp/8sez90gb/main.cwl
  INFO [job d3626216-d7d8-4322-bc21-4d469634cc9a] completed success
  usage: ./zing.cwl [-h] --zing ZING [job_order]
  ./zing.cwl: error: the following arguments are required: --zing


.. code::

  $ ./zing.cwl --zing blurf
  INFO /home/peter/work/cwltool/venv3/bin/cwltool 3.1.20211112163758
  INFO Resolved './zing.cwl' to 'file:///home/peter/work/cwltool/tests/wf/generator/zing.cwl'
  INFO [job a580b69d-2b88-4268-904e-ed105ba7c85e] /tmp/ujff239o$ python \
      inp.py > /tmp/ujff239o/main.cwl
  INFO [job a580b69d-2b88-4268-904e-ed105ba7c85e] completed success
  INFO [job main.cwl] /tmp/f_7bxncq$ echo \
      blurf
  blurf
  INFO [job main.cwl] completed success
  {
      "runProcess": {
          "location": "file:///home/peter/work/cwltool/tests/wf/generator/main.cwl",
          "basename": "main.cwl",
          "class": "File",
          "checksum": "sha1$8c160b680fb2cededef3228a53425e595b8cdf48",
          "size": 111,
          "path": "/home/peter/work/cwltool/tests/wf/generator/main.cwl"
      }
  }
  INFO Final process status is success

.. code::

  $ echo "zing: zoop" > job.yml
  $ ./zing.cwl job.yml
  INFO /home/peter/work/cwltool/venv3/bin/cwltool 3.1.20211112163758
  INFO Resolved './zing.cwl' to 'file:///home/peter/work/cwltool/tests/wf/generator/zing.cwl'
  INFO [job 9073a083-dc79-4719-8762-1c024480605c] /tmp/meeo3d19$ python \
      inp.py > /tmp/meeo3d19/main.cwl
  INFO [job 9073a083-dc79-4719-8762-1c024480605c] completed success
  INFO [job main.cwl] /tmp/2pqdz5nq$ echo \
      zoop
  zoop
  INFO [job main.cwl] completed success
  {
      "runProcess": {
          "location": "file:///home/peter/work/cwltool/tests/wf/generator/main.cwl",
          "basename": "main.cwl",
          "class": "File",
          "checksum": "sha1$8c160b680fb2cededef3228a53425e595b8cdf48",
          "size": 111,
          "path": "/home/peter/work/cwltool/tests/wf/generator/main.cwl"
      }
  }
  INFO Final process status is success
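
If the example is launched from a Python program instead of a shell,
the same options can be passed through the environment. The sketch
below assumes an executable ``./zing.cwl`` in the current directory and
a working cwltool installation; the helper names ``cwltool_env`` and
``run_zing`` are hypothetical, and ``run_zing`` is defined but not
called here.

.. code:: python

  import os
  import subprocess

  OPTIONS = "--enable-dev --enable-ext"


  def cwltool_env(options=OPTIONS):
      """Copy the current environment and add CWLTOOL_OPTIONS."""
      env = dict(os.environ)
      env["CWLTOOL_OPTIONS"] = options
      return env


  def run_zing(zing_value, path="./zing.cwl"):
      """Run the example with --zing; assumes the file is executable."""
      command = [path, "--zing", zing_value]
      return subprocess.run(command, env=cwltool_env(), check=False)


  env = cwltool_env()
  print(env["CWLTOOL_OPTIONS"])
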
