DockerRequirement
in the hints
section.$(inputs.file.basename).ext
instead of $(inputs.file.basename + 'ext')
. The first form is evaluated as a simple text substitution, the second form (using the +
operator) is evaluated as an arbitrary Javascript expression and requires that you declare InlineJavascriptRequirement
.InlineJavascriptRequirement
or ShellCommandRequirement
unless you specifically need them. Don’t include them “just in case” because they change the default behavior and may imply extra overhead.cwltool
or crunch v2.secondaryFiles
for scripts that consist of multiple files. For example:cwlVersion: v1.0 class: CommandLineTool baseCommand: python inputs: script: type: File inputBinding: {position: 1} default: class: File location: bclfastq.py secondaryFiles: - class: File location: helper1.py - class: File location: helper2.py inputfile: type: File inputBinding: {position: 2} outputs: out: type: File outputBinding: glob: "*.fastq"
$(runtime.tmpdir)
in your CWL file, or from the $TMPDIR
environment variable in your script.HOME
environment variable in your script.ExpressionTool
to efficiently rearrange input files between steps of a Workflow. For example, the following expression accepts a directory containing files paired by _R1_
and _R2_
and produces an array of Directories containing each pair.class: ExpressionTool cwlVersion: v1.0 inputs: inputdir: Directory outputs: out: Directory[] requirements: InlineJavascriptRequirement: {} expression: | ${ var samples = {}; for (var i = 0; i < inputs.inputdir.listing.length; i++) { var file = inputs.inputdir.listing[i]; var groups = file.basename.match(/^(.+)(_R[12]_)(.+)$/); if (groups) { if (!samples[groups[1]]) { samples[groups[1]] = []; } samples[groups[1]].push(file); } } var dirs = []; for (var key in samples) { dirs.push({"class": "Directory", "basename": key, "listing": [samples[key]]}); } return {"out": dirs}; }
hints
section, and individual steps can override it with their own resource requirement.cwlVersion: v1.0 class: Workflow inputs: inp: File hints: ResourceRequirement: ramMin: 1000 coresMin: 1 tmpdirMin: 45000 steps: step1: in: {inp: inp} out: [out] run: tool1.cwl step2: in: {inp: step1/inp} out: [out] run: tool2.cwl hints: ResourceRequirement: ramMin: 2000 coresMin: 2 tmpdirMin: 90000
With the following pattern, step1
has to wait for all samples to complete before step2
can start computing on any samples. This means a single long-running sample can prevent the rest of the workflow from moving on:
cwlVersion: v1.0 class: Workflow inputs: inp: File steps: step1: in: {inp: inp} scatter: inp out: [out] run: tool1.cwl step2: in: {inp: step1/inp} scatter: inp out: [out] run: tool2.cwl step3: in: {inp: step2/inp} scatter: inp out: [out] run: tool3.cwl
Instead, scatter over a subworkflow. In this pattern, a sample can proceed to step2
as soon as step1
is done, independently of any other samples.
Example: (note, the subworkflow can also be put in a separate file)
cwlVersion: v1.0 class: Workflow steps: step1: in: {inp: inp} scatter: inp out: [out] run: class: Workflow inputs: inp: File outputs: out: type: File outputSource: step3/out steps: step1: in: {inp: inp} out: [out] run: tool1.cwl step2: in: {inp: step1/inp} out: [out] run: tool2.cwl step3: in: {inp: step2/inp} out: [out] run: tool3.cwl
/dir/subdir/file1.txt
, a tool will not be able to implicitly access the file /dir/file2.txt
. Use secondaryFiles
or a Directory
input to describe trees of files.InitialWorkDirRequirement
appear in the output directory as normal files (not symlinks) but cannot be moved, renamed or deleted. These files will be added to the output collection but without any additional copies of the underlying data.arv:APIRequirement: {}
in their requirements
section.
The content of this documentation is licensed under the
Creative
Commons Attribution-Share Alike 3.0 United States licence.
Code samples in this documentation are licensed under the
Apache License, Version 2.0.