Getting Started with CWL

This training will walk you through the development of a best-practices Common Workflow Language (CWL) workflow. At the conclusion of this training, you should have a grasp of the essential components of a workflow, and have a basis for learning more.

Prerequisites

This training assumes some basic familiarity with editing text files, the Unix command line, and Unix shell scripts.

Specific knowledge of the biology of RNA-seq is not a prerequisite for these lessons. Although originally developed to solve big data problems in genomics, CWL is not specific to bioinformatics and is used in a number of other fields, including medical imaging, astronomy, geospatial imaging, and machine learning. We hope that you will find this training useful regardless of your area of research.

These lessons are based on the Introduction to RNA-seq using high-performance computing (HPC) lessons developed by members of the teaching team at the Harvard Chan Bioinformatics Core (HBC). The original training, which includes additional lectures about the biology of RNA-seq, can be found at that link.

Schedule

Setup   Download files required for the lesson
00:00   1. Introduction
        What is CWL?
        What is the goal of this training?
00:10   2. Create a Workflow by Composing Tools
        What is the syntax of CWL?
        What are the key components of a workflow?
00:50   3. Running and Debugging a Workflow
        How do I provide input to run a workflow?
        What should I do if the workflow fails?
01:25   4. Writing a Tool Wrapper
        What are the key components of a tool wrapper?
        How do I use software containers to supply the software I want to run?
02:15   5. Analyzing Multiple Samples
        How can you run the same workflow over multiple samples?
03:15   6. Dynamic Workflow Behavior
        What kind of custom logic can happen between steps?
03:45   7. Resources for further learning
        Where should I go to learn more?
03:55   8. Supplement: Creating Docker Images for Workflows
        How do I create Docker images from scratch?
        What are some best practices for Docker images?
04:06   Finish

The actual schedule may vary slightly depending on the topics and exercises chosen by the instructor.