Commit cc3b260

tetron and mr-c authored
Experimental fast validator code path, using cwl-utils (#1720)
On a very large workflow I was testing with, the validation time went from 120 seconds to 20 seconds.

* conformance testing: use CWLTOOL_OPTIONS
* CWLTOOL_OPTIONS: ignore an empty string
* Add loadingContext.skip_resolve_all with note
* conformance tests: report CWLTOOL_OPTIONS
* CI: include extra options in coverage classname
* conformance coverage: normalize paths
* Bump cwl-utils version requirement

Co-authored-by: Michael R. Crusoe <[email protected]>
1 parent 1d23218 commit cc3b260

11 files changed: +234 -32 lines changed

Diff for: .github/workflows/ci-tests.yml

+6
@@ -121,6 +121,11 @@ jobs:
       matrix:
         cwl-version: [v1.0, v1.1, v1.2]
         container: [docker, singularity, podman]
+        extras: [""]
+        include:
+          - cwl-version: v1.2
+            container: docker
+            extras: "--fast-parser"

     steps:
       - uses: actions/checkout@v3
@@ -141,6 +146,7 @@ jobs:
           version: ${{ matrix.cwl-version }}
           container: ${{ matrix.container }}
           spec_branch: main
+          CWLTOOL_OPTIONS: ${{ matrix.extras }}
         run: ./conformance-test.sh

   release_test:
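
The `include` entry in this hunk sets `extras` to a value that differs from the original matrix value (`""`). As I understand GitHub's documented matrix rules, an `include` entry that would overwrite an original matrix value is not merged into an existing combination but becomes an additional job, so this matrix should yield ten conformance jobs rather than nine. A minimal Python sketch of that rule follows; `expand_matrix` is a hypothetical helper written for illustration, not part of GitHub Actions or cwltool, and it models only the part of the rule that matters here.

```python
from itertools import product
from typing import Dict, List


def expand_matrix(
    base: Dict[str, List[str]], include: List[Dict[str, str]]
) -> List[Dict[str, str]]:
    """Expand a job matrix, then apply `include` entries (simplified model)."""
    keys = list(base)
    # Cartesian product of the original matrix values.
    originals = [dict(zip(keys, values)) for values in product(*base.values())]
    jobs = [dict(job) for job in originals]
    extra_jobs = []
    for entry in include:
        merged = False
        for job, original in zip(jobs, originals):
            # An entry may be merged only if it leaves every original value unchanged.
            if all(original[k] == v for k, v in entry.items() if k in original):
                job.update(entry)
                merged = True
        if not merged:
            # Otherwise the entry becomes a job of its own.
            extra_jobs.append(dict(entry))
    return jobs + extra_jobs


base = {
    "cwl-version": ["v1.0", "v1.1", "v1.2"],
    "container": ["docker", "singularity", "podman"],
    "extras": [""],
}
include = [{"cwl-version": "v1.2", "container": "docker", "extras": "--fast-parser"}]

jobs = expand_matrix(base, include)
print(len(jobs))  # 10
```

Under this reading, the plain v1.2/docker job still runs with empty `extras`, and the `--fast-parser` run is added alongside it.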

Diff for: README.rst

+13
@@ -667,6 +667,19 @@ given in the following table; all are optional.
 +----------------+------------------+----------+------------------------------+


+Enabling Fast Parser (experimental)
+-----------------------------------
+
+For very large workflows, `cwltool` can spend a lot of time in
+initialization, before the first step runs. There is an experimental
+flag ``--fast-parser`` which can dramatically reduce the
+initialization overhead, however as of this writing it has several limitations:
+
+- Error reporting in general is worse than the standard parser, you will want to use it with workflows that you know are already correct.
+
+- It does not check for dangling links (these will become runtime errors instead of loading errors)
+
+- Several other cases fail, as documented in https://github.com/common-workflow-language/cwltool/pull/1720

 ===========
 Development

Diff for: conformance-test.sh

+11 -10
@@ -56,6 +56,7 @@ pip3 install -U setuptools wheel pip
 pip3 uninstall -y cwltool
 pip3 install -e .
 pip3 install codecov cwltest>=2.1
+root_folder=${PWD}
 pushd "${repo}-${spec_branch}" || exit 1

 # shellcheck disable=SC2043
@@ -71,6 +72,7 @@ cat > "${COVERAGE_RC}" <<EOF
 [run]
 branch = True
 source_pkgs = cwltool
+source = ${root_folder}

 [report]
 exclude_lines =
@@ -92,15 +94,15 @@ chmod a+x "${CWLTOOL_WITH_COV}"
 unset exclusions
 declare -a exclusions

-EXTRA="--parallel"
+CWLTOOL_OPTIONS+=" --parallel"
 # shellcheck disable=SC2154
 if [[ "$version" = *dev* ]]
 then
-  EXTRA+=" --enable-dev"
+  CWLTOOL_OPTIONS+=" --enable-dev"
 fi

 if [[ "$container" = "singularity" ]]; then
-  EXTRA+=" --singularity"
+  CWLTOOL_OPTIONS+=" --singularity"
   # This test fails because Singularity and Docker have
   # different views on how to deal with this.
   exclusions+=(docker_entrypoint)
@@ -113,13 +115,9 @@ if [[ "$container" = "singularity" ]]; then
     exclusions+=(stdin_shorcut)
   fi
 elif [[ "$container" = "podman" ]]; then
-  EXTRA+=" --podman"
+  CWLTOOL_OPTIONS+=" --podman"
 fi

-if [ -n "$EXTRA" ]
-then
-  EXTRA="EXTRA=${EXTRA}"
-fi
 if [ "$GIT_BRANCH" = "origin/main" ] && [[ "$version" = "v1.0" ]] && [[ "$container" = "docker" ]]
 then
   rm -Rf conformance
@@ -133,6 +131,7 @@ Conformance test of cwltool ${tool_ver} for CWL ${version}
 Commit: ${GIT_COMMIT}
 Python version: 3
 Container: ${container}
+Extra options: ${CWLTOOL_OPTIONS}
 EOM
 )

@@ -148,11 +147,13 @@ if (( "${#exclusions[*]}" > 0 )); then
 else
   EXCLUDE=""
 fi
+export CWLTOOL_OPTIONS
+echo CWLTOOL_OPTIONS="${CWLTOOL_OPTIONS}"
 # shellcheck disable=SC2086
 LC_ALL=C.UTF-8 ./run_test.sh --junit-xml=result3.xml ${EXCLUDE} \
   RUNNER=${CWLTOOL_WITH_COV} "-j$(nproc)" ${BADGE} \
-  ${DRAFT} "${EXTRA}" \
-  "--classname=py3_${container}"
+  ${DRAFT} \
+  "--classname=py3_${container}_$(echo ${CWLTOOL_OPTIONS} | tr "[:blank:]-" _)"
 # LC_ALL=C is to work around junit-xml ASCII only bug

 # capture return code of ./run_test.sh
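
The new `--classname` argument folds the extra options into the JUnit class name by piping them through `tr "[:blank:]-" _`. The sketch below mirrors that transformation in Python to show what names result; `coverage_classname` is a hypothetical helper for illustration only, and it approximates the shell's word splitting of the unquoted `${CWLTOOL_OPTIONS}` with `str.split()` (it ignores globbing and `echo` option handling).

```python
import re


def coverage_classname(container: str, cwltool_options: str) -> str:
    """Approximate py3_${container}_$(echo ${CWLTOOL_OPTIONS} | tr "[:blank:]-" _)."""
    # Unquoted expansion is word-split, so runs of whitespace collapse to
    # single spaces and leading/trailing whitespace disappears.
    collapsed = " ".join(cwltool_options.split())
    # tr "[:blank:]-" _ maps spaces, tabs, and hyphens to underscores.
    suffix = re.sub(r"[ \t-]", "_", collapsed)
    return f"py3_{container}_{suffix}"


print(coverage_classname("docker", "--fast-parser --parallel"))
# py3_docker___fast_parser___parallel
print(coverage_classname("podman", " --parallel --podman"))
# py3_podman___parallel___podman
```

Because each distinct option string produces a distinct class name, the fast-parser run and the default run for the same container are reported separately.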

Diff for: cwltool/argparser.py

+7
@@ -576,6 +576,13 @@ def arg_parser() -> argparse.ArgumentParser:
         default=True,
         help=argparse.SUPPRESS,
     )
+    parser.add_argument(
+        "--fast-parser",
+        dest="fast_parser",
+        action="store_true",
+        default=False,
+        help=argparse.SUPPRESS,
+    )

     reggroup = parser.add_mutually_exclusive_group()
     reggroup.add_argument(
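
The added option combines `action="store_true"` with `help=argparse.SUPPRESS`, so the flag is accepted on the command line but does not appear in `--help` output. A small standalone sketch of that behaviour, using only the standard library; the `build_parser` function, the `sketch` program name, and the positional `workflow` argument are assumptions made for the example and are not cwltool's real parser.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Toy parser that registers the flag the same way as the diff above."""
    parser = argparse.ArgumentParser(prog="sketch")
    parser.add_argument(
        "--fast-parser",
        dest="fast_parser",
        action="store_true",
        default=False,
        help=argparse.SUPPRESS,  # hide the experimental flag from --help
    )
    parser.add_argument("workflow", nargs="?")  # hypothetical positional
    return parser


parser = build_parser()
print(parser.parse_args(["wf.cwl"]).fast_parser)                   # False
print(parser.parse_args(["--fast-parser", "wf.cwl"]).fast_parser)  # True
print("--fast-parser" in parser.format_help())                     # False
```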

Diff for: cwltool/context.py

+18 -2
@@ -4,7 +4,18 @@
 import shutil
 import tempfile
 import threading
-from typing import IO, Any, Callable, Dict, Iterable, List, Optional, TextIO, Union
+from typing import (
+    IO,
+    Any,
+    Callable,
+    Dict,
+    Iterable,
+    List,
+    Optional,
+    TextIO,
+    Tuple,
+    Union,
+)

 # move to a regular typing import when Python 3.3-3.6 is no longer supported
 from ruamel.yaml.comments import CommentedMap
@@ -23,6 +34,8 @@
 from .utils import DEFAULT_TMP_PREFIX, CWLObjectType, HasReqsHints, ResolverType

 if TYPE_CHECKING:
+    from cwl_utils.parser.cwl_v1_2 import LoadingOptions
+
     from .process import Process
     from .provenance import ResearchObject  # pylint: disable=unused-import
     from .provenance_profile import ProvenanceProfile
@@ -102,7 +115,10 @@ def __init__(self, kwargs: Optional[Dict[str, Any]] = None) -> None:
         self.relax_path_checks = False  # type: bool
         self.singularity = False  # type: bool
         self.podman = False  # type: bool
-        self.eval_timeout = 60  # type: float
+        self.eval_timeout: float = 60
+        self.codegen_idx: Dict[str, Tuple[Any, "LoadingOptions"]] = {}
+        self.fast_parser = False
+        self.skip_resolve_all = False

         super().__init__(kwargs)
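
The new attributes are given defaults before `super().__init__(kwargs)` is called. Assuming the base-class constructor copies only those `kwargs` entries that match an attribute already present on the instance (an assumption about code not shown in this diff), that ordering is what would let a parsed `fast_parser` option override its default. A minimal sketch with hypothetical class names, not the real cwltool classes:

```python
from typing import Any, Dict, Optional, Tuple


class ContextBaseSketch:
    """Hypothetical base: copy only kwargs that match existing attributes."""

    def __init__(self, kwargs: Optional[Dict[str, Any]] = None) -> None:
        if kwargs:
            for key, value in kwargs.items():
                if hasattr(self, key):
                    setattr(self, key, value)


class LoadingContextSketch(ContextBaseSketch):
    """Hypothetical stand-in holding only the attributes touched by the diff."""

    def __init__(self, kwargs: Optional[Dict[str, Any]] = None) -> None:
        # Defaults must exist before the base constructor runs,
        # otherwise matching kwargs would be silently dropped.
        self.eval_timeout: float = 60
        self.codegen_idx: Dict[str, Tuple[Any, Any]] = {}
        self.fast_parser = False
        self.skip_resolve_all = False
        super().__init__(kwargs)


ctx = LoadingContextSketch({"fast_parser": True, "not_an_option": 1})
print(ctx.fast_parser, ctx.skip_resolve_all, hasattr(ctx, "not_an_option"))
# True False False
```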
