Arvados pipeline templates are deprecated. The recommended way to develop new workflows for Arvados is using the Common Workflow Language.
Several crunch scripts are included with Arvados in the /crunch_scripts directory. They are intended to provide examples and starting points for writing your own scripts.
Run the bwa aligner on a set of paired-end fastq files, producing a BAM file for each pair. View source.
Parameter | Description | Example |
bwa_tbz | Collection with the bwa source distribution. | 8b6e2c4916133e1d859c9e812861ce13+70 |
samtools_tgz | Collection with the samtools source distribution. | c777e23cf13e5d5906abfdc08d84bfdb+74 |
input | Collection with fastq reads (pairs of *_1.fastq.gz and *_2.fastq.gz). | d0136bc494c21f79fc1b6a390561e6cb+2778 |
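The pairing convention described for the input parameter can be sketched in a few lines of Python. This is only an illustration, not the script's actual implementation: the helper name `pair_fastq_files` and the sample filenames are hypothetical, and it assumes that two mates share a prefix and differ only in the `_1` / `_2` suffix.

```python
import re

# Matches names such as "sample_1.fastq.gz" and "sample_2.fastq.gz".
_MATE_RE = re.compile(r"^(?P<prefix>.+)_(?P<mate>[12])\.fastq\.gz$")


def pair_fastq_files(filenames):
    """Group filenames into (read1, read2) pairs keyed by shared prefix.

    Hypothetical helper: files that do not follow the naming convention,
    or whose mate is missing, are left out of the result.
    """
    mates = {}
    for name in filenames:
        match = _MATE_RE.match(name)
        if match:
            mates.setdefault(match.group("prefix"), {})[match.group("mate")] = name
    return {
        prefix: (found["1"], found["2"])
        for prefix, found in sorted(mates.items())
        if "1" in found and "2" in found
    }


if __name__ == "__main__":
    example = ["a_1.fastq.gz", "a_2.fastq.gz", "b_1.fastq.gz", "notes.txt"]
    print(pair_fastq_files(example))
```

In this sketch an unpaired file such as `b_1.fastq.gz` is silently skipped; a real script might prefer to report it as an error.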
Generate an index of a fasta reference genome suitable for use by bwa-aln. View source.
Parameter | Description | Example |
bwa_tbz | Collection with the bwa source distribution. | 8b6e2c4916133e1d859c9e812861ce13+70 |
input | Collection with reference data (*.fasta.gz, *.fasta.fai.gz, *.dict.gz). | c361dbf46ee3397b0958802b346e9b5a+925 |
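Before submitting a job, it can be useful to check that a reference collection contains the kinds of files listed in the description. The following is a minimal sketch under stated assumptions: the function name `missing_reference_patterns` is hypothetical, the file list is passed in as plain strings rather than read from a collection, and whether the script itself requires all three file types is not stated here.

```python
from fnmatch import fnmatchcase

# Patterns taken from the parameter description above.
REFERENCE_PATTERNS = ("*.fasta.gz", "*.fasta.fai.gz", "*.dict.gz")


def missing_reference_patterns(filenames):
    """Return the patterns that no filename in the list matches."""
    return [
        pattern
        for pattern in REFERENCE_PATTERNS
        if not any(fnmatchcase(name, pattern) for name in filenames)
    ]


if __name__ == "__main__":
    files = ["human_g1k_v37.fasta.gz", "human_g1k_v37.fasta.fai.gz"]
    print(missing_reference_patterns(files))
```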
Using the FixMateInformation, SortSam, ReorderSam, AddOrReplaceReadGroups, and BuildBamIndex modules from picard, prepare a BAM file for use with the GATK2 tools. Additionally, run picard’s CollectAlignmentSummaryMetrics module to produce a *.casm.tsv statistics file for each BAM file. View source.
Parameter | Description | Example |
input | Collection containing aligned BAM files. | |
picard_zip | Collection with the picard binary distribution. | 687f74675c6a0e925dec619cc2bec25f+77 |
reference | Collection with reference data (*.fasta.gz, *.fasta.fai.gz, *.dict.gz). | c361dbf46ee3397b0958802b346e9b5a+925 |
Run GATK’s RealignerTargetCreator and IndelRealigner modules on a set of BAM files. View source.
Parameter | Description | Example |
input | Collection containing aligned BAM files. | |
picard_zip | Collection with the picard binary distribution. | 687f74675c6a0e925dec619cc2bec25f+77 |
gatk_tbz | Collection with the GATK2 binary distribution. | 7e0a277d6d2353678a11f56bab3b13f2+87 |
gatk_bundle | Collection with the GATK data bundle. | d237a90bae3870b3b033aea1e99de4a9+10820 |
known_sites | List of files in the data bundle to use as GATK -known arguments. Optional. | ["dbsnp_137.b37.vcf","Mills_and_1000G_gold_standard.indels.b37.vcf"] (this is the default value) |
regions | Collection with .bed files indicating sequencing target regions. Optional. | |
region_padding | Corresponds to GATK --interval_padding argument. Required if a regions parameter is given. | 10 |
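The interaction between the optional parameters (the default for known_sites, and region_padding being required whenever regions is given) can be sketched as follows. This is an illustration under stated assumptions, not the script's own code: the function name `realign_extra_args` is hypothetical, `regions` is modelled as a plain list of .bed paths rather than a collection, and passing each .bed file through GATK's `--intervals` argument is an assumption about how the regions are applied.

```python
# Default taken from the known_sites row of the table above.
DEFAULT_KNOWN_SITES = [
    "dbsnp_137.b37.vcf",
    "Mills_and_1000G_gold_standard.indels.b37.vcf",
]


def realign_extra_args(known_sites=None, regions=None, region_padding=None):
    """Build illustrative GATK arguments from the optional parameters.

    Hypothetical helper: `regions` is a list of .bed file paths, and
    data bundle files are referred to by bare filename.
    """
    if known_sites is None:
        known_sites = DEFAULT_KNOWN_SITES
    if regions and region_padding is None:
        raise ValueError("region_padding is required when regions is given")

    args = []
    for site in known_sites:
        args += ["-known", site]
    for bed in regions or []:
        # Assumption: each target file becomes one --intervals argument.
        args += ["--intervals", bed]
    if regions:
        args += ["--interval_padding", str(region_padding)]
    return args


if __name__ == "__main__":
    print(realign_extra_args(regions=["targets.bed"], region_padding=10))
```

Raising an error early when region_padding is missing mirrors the "Required if a regions parameter is given" rule, so the problem surfaces before any GATK command runs.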
Run GATK’s BaseQualityScoreRecalibration module on a set of BAM files. View source.
Parameter | Description | Example |
input | Collection containing BAM files. | |
gatk_tbz | Collection with the GATK2 binary distribution. | 7e0a277d6d2353678a11f56bab3b13f2+87 |
gatk_bundle | Collection with the GATK data bundle. | d237a90bae3870b3b033aea1e99de4a9+10820 |
Merge a set of BAM files using picard, and run GATK’s UnifiedGenotyper module on the merged set to produce a VCF file. View source.
Parameter | Description | Example |
input | Collection containing BAM files. | |
picard_zip | Collection with the picard binary distribution. | 687f74675c6a0e925dec619cc2bec25f+77 |
gatk_tbz | Collection with the GATK2 binary distribution. | 7e0a277d6d2353678a11f56bab3b13f2+87 |
gatk_bundle | Collection with the GATK data bundle. | d237a90bae3870b3b033aea1e99de4a9+10820 |
regions | Collection with .bed files indicating sequencing target regions. Optional. | |
region_padding | Corresponds to GATK --interval_padding argument. Required if a regions parameter is given. | 10 |
Pass through the named files from input to output collection, and ignore the rest. View source.
Parameter | Description | Example |
names | List of filenames to include in the output. | ["human_g1k_v37.fasta.gz","human_g1k_v37.fasta.fai.gz"] |
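The pass-through behaviour can be illustrated with a short sketch. It assumes the input collection is represented as a plain list of filenames; the function name `select_files` is hypothetical and the real script works on collections rather than lists.

```python
def select_files(input_files, names):
    """Keep only the entries of input_files whose name appears in names.

    The order of input_files is preserved; names that do not occur in
    the input are simply ignored.
    """
    wanted = set(names)
    return [name for name in input_files if name in wanted]


if __name__ == "__main__":
    collection = [
        "human_g1k_v37.dict.gz",
        "human_g1k_v37.fasta.fai.gz",
        "human_g1k_v37.fasta.gz",
    ]
    names = ["human_g1k_v37.fasta.gz", "human_g1k_v37.fasta.fai.gz"]
    print(select_files(collection, names))
```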
The content of this documentation is licensed under the Creative Commons Attribution-Share Alike 3.0 United States licence.
Code samples in this documentation are licensed under the Apache License, Version 2.0.