pipeline_templates

Note:

This is a legacy API. This endpoint is deprecated, disabled by default in new installations, and slated to be removed entirely in a future major release of Arvados. It is replaced by registered workflows.

API endpoint base: https://pirca.arvadosapi.com/arvados/v1/pipeline_templates

Object type: p5p6p

Example UUID: zzzzz-p5p6p-0123456789abcde

Resource

Deprecated. A pipeline template is a collection of jobs that can be instantiated as a pipeline_instance.

Each PipelineTemplate has, in addition to the Common resource fields:

Attribute Type Description Example
name string
components hash

The pipeline template consists of “name” and “components”.

Attribute Type Accepted values Required Description
name string any yes The human-readable name of the pipeline template.
components object JSON object containing job submission objects yes The component jobs that make up the pipeline, with the component name as the key.

Components

The components field of the pipeline template is a JSON object which describes the individual steps that make up the pipeline. Each component is an Arvados job submission. Parameters for job submissions are described on the job method page. In addition, a component can have the following parameters:

Attribute Type Accepted values Required Description
output_name string or boolean string or false no If a string is provided, use this name for the output collection of this component. If the value is false, do not create a permanent output collection (an temporary intermediate collection will still be created). If not provided, a default name will be assigned to the output.

Script parameters

When used in a pipeline, each parameter in the ‘script_parameters’ attribute of a component job can specify that the input parameter must be supplied by the user, or the input parameter should be linked to the output of another component. To do this, the value of the parameter should be JSON object containing one of the following attributes:

Attribute Type Accepted values Description
default any any The default value for this parameter.
required boolean true or false Specifies whether the parameter is required to have a value or not.
dataclass string One of ‘Collection’, ‘File’ 1, ‘number’, or ‘text’ Data type of this parameter.
search_for string any string Substring to use as a default search string when choosing inputs.
output_of string the name of another component in the pipeline Specifies that the value of this parameter should be set to the ‘output’ attribute of the job that corresponds to the specified component.
title string any string User friendly title to display when choosing parameter values
description string any string Extended text description for describing expected/valid values for the script parameter
link_name string any string User friendly name to display for the parameter value instead of the actual parameter value

The ‘output_of’ parameter is especially important, as this is how components are actually linked together to form a pipeline. Component jobs that depend on the output of other components do not run until the parent job completes and has produced output. If the parent job fails, the entire pipeline fails.

1 The ‘File’ type refers to a specific file within a Keep collection in the form ‘collection_hash/filename’, for example ‘887cd41e9c613463eab2f0d885c6dd96+83/bob.txt’.

The ‘search_for’ parameter is meaningful only when input dataclass of type Collection or File is used. If a value is provided, this will be preloaded into the input data chooser dialog in Workbench. For example, if your input dataclass is a File and you are interested in a certain filename extention, you can preconfigure it in this attribute.

Examples

This is a pipeline named “Filter MD5 hash values” with two components, “do_hash” and “filter”. The “input” script parameter of the “do_hash” component is required to be filled in by the user, and the expected data type is “Collection”. This also specifies that the “input” script parameter of the “filter” component is the output of “do_hash”, so “filter” will not run until “do_hash” completes successfully. When the pipeline runs, past jobs that meet the criteria described above may be substituted for either or both components to avoid redundant computation.

{
  "name": "Filter MD5 hash values",
  "components": {
    "do_hash": {
      "script": "hash.py",
      "repository": "you/you",
      "script_version": "main",
      "script_parameters": {
        "input": {
          "required": true,
          "dataclass": "Collection",
          "search_for": ".fastq.gz",
          "title":"Please select a fastq file"
        }
      },
    },
    "filter": {
      "script": "0-filter.py",
      "repository": "you/you",
      "script_version": "main",
      "script_parameters": {
        "input": {
          "output_of": "do_hash"
        }
      },
    }
  }
}

This pipeline consists of three components. The components “thing1” and “thing2” both depend on “cat_in_the_hat”. Once the “cat_in_the_hat” job is complete, both “thing1” and “thing2” can run in parallel, because they do not depend on each other.

{
  "name": "Wreck the house",
  "components": {
    "cat_in_the_hat": {
      "script": "cat.py",
      "repository": "you/you",
      "script_version": "main",
      "script_parameters": { }
    },
    "thing1": {
      "script": "thing1.py",
      "repository": "you/you",
      "script_version": "main",
      "script_parameters": {
        "input": {
          "output_of": "cat_in_the_hat"
        }
      },
    },
    "thing2": {
      "script": "thing2.py",
      "repository": "you/you",
      "script_version": "main",
      "script_parameters": {
        "input": {
          "output_of": "cat_in_the_hat"
        }
      },
    },
  }
}

This pipeline consists of three components. The component “cleanup” depends on “thing1” and “thing2”. Both “thing1” and “thing2” are started immediately and can run in parallel, because they do not depend on each other, but “cleanup” cannot begin until both “thing1” and “thing2” have completed.

{
  "name": "Clean the house",
  "components": {
    "thing1": {
      "script": "thing1.py",
      "repository": "you/you",
      "script_version": "main",
      "script_parameters": { }
    },
    "thing2": {
      "script": "thing2.py",
      "repository": "you/you",
      "script_version": "main",
      "script_parameters": { }
    },
    "cleanup": {
      "script": "cleanup.py",
      "repository": "you/you",
      "script_version": "main",
      "script_parameters": {
        "mess1": {
          "output_of": "thing1"
        },
        "mess2": {
          "output_of": "thing2"
        }
      }
    }
  }
}

Methods

See Common resource methods for more information about create, delete, get, list, and update.

Required arguments are displayed in green.

create

Create a new PipelineTemplate.

Arguments:

Argument Type Description Location Example
pipeline_template object query

delete

Delete an existing PipelineTemplate.

Arguments:

Argument Type Description Location Example
uuid string The UUID of the PipelineTemplate in question. path

get

Gets a PipelineTemplate’s metadata by UUID.

Arguments:

Argument Type Description Location Example
uuid string The UUID of the PipelineTemplate in question. path

list

List pipeline_templates.

See common resource list method.

update

Update attributes of an existing PipelineTemplate.

Arguments:

Argument Type Description Location Example
uuid string The UUID of the PipelineTemplate in question. path
pipeline_template object query

Previous: pipeline_instances Next: nodes

The content of this documentation is licensed under the Creative Commons Attribution-Share Alike 3.0 United States licence.
Code samples in this documentation are licensed under the Apache License, Version 2.0.