Provide a DockerRequirement in the hints section.
When appending text to a parameter value, write $(inputs.file.basename).ext instead of $(inputs.file.basename + '.ext'). The first form is evaluated as a simple text substitution; the second form (using the + operator) is evaluated as an arbitrary JavaScript expression and requires that you declare InlineJavascriptRequirement.
Avoid declaring InlineJavascriptRequirement or ShellCommandRequirement unless you specifically need them. Don’t include them “just in case”, because they change the default behavior and may add overhead.
[...] cwltool or crunch v2.
Use secondaryFiles for scripts that consist of multiple files. For example:
cwlVersion: v1.0
class: CommandLineTool
baseCommand: python
inputs:
  script:
    type: File
    inputBinding: {position: 1}
    default:
      class: File
      location: bclfastq.py
      secondaryFiles:
        - class: File
          location: helper1.py
        - class: File
          location: helper2.py
  inputfile:
    type: File
    inputBinding: {position: 2}
outputs:
  out:
    type: File
    outputBinding:
      glob: "*.fastq"
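The text-substitution advice above can be illustrated with a minimal sketch. The tool below is hypothetical (the gzip command and the output name compressed are assumptions chosen for illustration); it builds its output filename with a plain parameter reference, so it needs no InlineJavascriptRequirement.

```yaml
cwlVersion: v1.0
class: CommandLineTool
# Hypothetical example: compress the input and name the result <basename>.gz
baseCommand: [gzip, -c]
inputs:
  file:
    type: File
    inputBinding: {position: 1}
# Simple text substitution: no "+" operator, so no JavaScript is evaluated
stdout: $(inputs.file.basename).gz
outputs:
  compressed:
    type: stdout
```

Writing the same filename as $(inputs.file.basename + '.gz') would produce the same result but would require declaring InlineJavascriptRequirement.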
Get the designated temporary directory from $(runtime.tmpdir) in your CWL file, or from the $TMPDIR environment variable in your script.
Similarly, get the designated output directory from $(runtime.outdir) in your CWL file, or from the HOME environment variable in your script.
Use ExpressionTool to efficiently rearrange input files between steps of a Workflow. For example, the following expression accepts a directory containing files paired by _R1_ and _R2_ and produces an array of Directories, one per pair.
class: ExpressionTool
cwlVersion: v1.0
inputs:
  inputdir: Directory
outputs:
  out: Directory[]
requirements:
  InlineJavascriptRequirement: {}
expression: |
  ${
    // Group files by the part of the name before _R1_ or _R2_.
    var samples = {};
    for (var i = 0; i < inputs.inputdir.listing.length; i++) {
      var file = inputs.inputdir.listing[i];
      var groups = file.basename.match(/^(.+)(_R[12]_)(.+)$/);
      if (groups) {
        if (!samples[groups[1]]) {
          samples[groups[1]] = [];
        }
        samples[groups[1]].push(file);
      }
    }
    // Emit one Directory per sample; samples[key] is already an array of Files.
    var dirs = [];
    for (var key in samples) {
      dirs.push({"class": "Directory",
                 "basename": key,
                 "listing": samples[key]});
    }
    return {"out": dirs};
  }
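As a sketch of how such an ExpressionTool might sit between workflow steps, the workflow below runs it and then scatters a downstream tool over the resulting Directories. The filenames group-pairs.cwl and process-pair.cwl, the step names, and the downstream input name pair are assumptions made for illustration only.

```yaml
cwlVersion: v1.0
class: Workflow
requirements:
  ScatterFeatureRequirement: {}
inputs:
  inputdir: Directory
outputs:
  results:
    type: File[]
    outputSource: process/out
steps:
  group:
    # Hypothetical filename for the ExpressionTool shown above
    run: group-pairs.cwl
    in: {inputdir: inputdir}
    out: [out]
  process:
    # Hypothetical tool that takes one Directory input named "pair"
    # and produces one File output named "out"
    run: process-pair.cwl
    in: {pair: group/out}
    scatter: pair
    out: [out]
```

Because the ExpressionTool only rearranges file references, no data is copied between the two steps.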
Provide a default resource requirement in the top-level hints section of the workflow; individual steps can override it with their own resource requirement.
cwlVersion: v1.0
class: Workflow
inputs:
  inp: File
outputs:
  out:
    type: File
    outputSource: step2/out
hints:
  ResourceRequirement:
    ramMin: 1000
    coresMin: 1
    tmpdirMin: 45000
steps:
  step1:
    in: {inp: inp}
    out: [out]
    run: tool1.cwl
  step2:
    in: {inp: step1/out}
    out: [out]
    run: tool2.cwl
    hints:
      ResourceRequirement:
        ramMin: 2000
        coresMin: 2
        tmpdirMin: 90000
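A tool run under these hints can read the granted resources from the runtime object instead of hard-coding them. The following is a minimal sketch of what a file such as tool1.cwl might look like; the choice of GNU sort and the output filename sorted.txt are assumptions made for illustration.

```yaml
cwlVersion: v1.0
class: CommandLineTool
# Hypothetical tool: sort the input using the cores and scratch space
# that the workflow-level ResourceRequirement made available
baseCommand: sort
arguments:
  - --parallel=$(runtime.cores)
  - --temporary-directory=$(runtime.tmpdir)
inputs:
  inp:
    type: File
    inputBinding: {position: 1}
stdout: sorted.txt
outputs:
  out:
    type: stdout
```

Keeping resource figures out of the tool itself lets the same wrapper be reused in workflows with different requirements.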
With the following pattern, step1 must finish on all samples before step2 can start computing on any sample. This means a single long-running sample can prevent the rest of the workflow from moving on:
cwlVersion: v1.0
class: Workflow
requirements:
  ScatterFeatureRequirement: {}
inputs:
  inp: File[]
outputs:
  out:
    type: File[]
    outputSource: step3/out
steps:
  step1:
    in: {inp: inp}
    scatter: inp
    out: [out]
    run: tool1.cwl
  step2:
    in: {inp: step1/out}
    scatter: inp
    out: [out]
    run: tool2.cwl
  step3:
    in: {inp: step2/out}
    scatter: inp
    out: [out]
    run: tool3.cwl
Instead, scatter over a subworkflow. In this pattern, a sample can proceed to step2 as soon as step1 is done, independently of any other samples.
Example (the subworkflow can also be put in a separate file):
cwlVersion: v1.0
class: Workflow
requirements:
  ScatterFeatureRequirement: {}
  SubworkflowFeatureRequirement: {}
inputs:
  inp: File[]
outputs:
  out:
    type: File[]
    outputSource: step1/out
steps:
  step1:
    in: {inp: inp}
    scatter: inp
    out: [out]
    run:
      class: Workflow
      inputs:
        inp: File
      outputs:
        out:
          type: File
          outputSource: step3/out
      steps:
        step1:
          in: {inp: inp}
          out: [out]
          run: tool1.cwl
        step2:
          in: {inp: step1/out}
          out: [out]
          run: tool2.cwl
        step3:
          in: {inp: step2/out}
          out: [out]
          run: tool3.cwl
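When the subworkflow lives in its own file, the outer workflow shrinks to a single scattered step. This is a sketch only: the filename per-sample.cwl and the step name per_sample are hypothetical, and the file is assumed to contain the three-step inner Workflow from the example above with its own cwlVersion line.

```yaml
cwlVersion: v1.0
class: Workflow
requirements:
  ScatterFeatureRequirement: {}
  SubworkflowFeatureRequirement: {}
inputs:
  inp: File[]
outputs:
  out:
    type: File[]
    outputSource: per_sample/out
steps:
  per_sample:
    # Hypothetical file holding the inner Workflow (step1 -> step2 -> step3)
    run: per-sample.cwl
    in: {inp: inp}
    scatter: inp
    out: [out]
```

Splitting the file this way also makes the per-sample workflow runnable and testable on its own with a single input.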
When migrating from the jobs API (--api=jobs) (sometimes referred to as “crunch v1”) to the containers API (--api=containers) (“crunch v2”), there are a few differences in behavior:
Given an input file /dir/subdir/file1.txt, a tool will not be allowed to implicitly access a file in the parent directory /dir/file2.txt. Use secondaryFiles or a Directory for files that need to be grouped together.
Files listed in InitialWorkDirRequirement appear in the output directory as normal files (not symlinks) but cannot be moved, renamed or deleted unless marked as “writable” in CWL. These files will be added to the output collection but without any additional copies of the underlying data.
[...] arv:APIRequirement: {} to the requirements section.
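To illustrate the “writable” marking mentioned above, here is a minimal sketch of a tool that edits a staged file in place. The input name config, the sed expression, and the output name edited are assumptions chosen for illustration.

```yaml
cwlVersion: v1.0
class: CommandLineTool
requirements:
  InitialWorkDirRequirement:
    listing:
      - entry: $(inputs.config)
        # Without writable: true the staged file could not be modified in place
        writable: true
inputs:
  config: File
# Hypothetical edit: flip a setting inside the staged copy
baseCommand: [sed, -i, "s/debug=false/debug=true/"]
arguments: [$(inputs.config.basename)]
outputs:
  edited:
    type: File
    outputBinding:
      glob: $(inputs.config.basename)
```

Leave writable unset for files that are only read, so that they are added to the output collection without extra copies of the data.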
The content of this documentation is licensed under the Creative Commons Attribution-Share Alike 3.0 United States licence.
Code samples in this documentation are licensed under the Apache License, Version 2.0.