Metadata properties

Arvados allows you to attach arbitrary properties to collection, container_request, link and group records that have a properties field. These are key-value pairs, where the value is a valid JSON type (string, number, null, boolean, array, object).

Searching for records using properties is described in Filtering on subproperties .

Controlling user-supplied properties

Arvados can be configured with a vocabulary file that lists valid properties and the range of valid values for those properties. This is described in Metadata vocabulary .

Arvados offers options to set properties automatically and/or prevent certain properties, once set, from being changed by non-admin users. This is described in Configuring collection’s managed properties .

The admin can require that certain properties must be non-empty before freezing a project .

Reserved properties

Components that ship with Arvados may automatically set properties on objects. These usually help track provenance or provide additional link metadata. These properties usually have a key that starts with arv:, and can always be set even when the system is configured with a strict vocabulary.

Property name Appears on Value type Description
arv:gitBranch container request, collection of type=workflow string When arvados-cwl-runner is run from a Git checkout, this property is set with the name of the branch checked out (the output of git rev-parse --abbrev-ref HEAD)
arv:gitCommitter container request, collection of type=workflow string When arvados-cwl-runner is run from a Git checkout, this property is set with the name and email address of the committer of the most recent commit (the output of git log --format='%cn <%ce>' -n1 HEAD)
arv:gitCommit container request, collection of type=workflow string When arvados-cwl-runner is run from a Git checkout, this property is set with the full checksum of the most recent commit (the output of git log --format='%H' -n1 HEAD)
arv:gitDate container request, collection of type=workflow string When arvados-cwl-runner is run from a Git checkout, this property is set with the commit date of the most recent commit in RFC 2822 format (the output of git log --format='%cD' -n1 HEAD)
arv:gitDescribe container request, collection of type=workflow string When arvados-cwl-runner is run from a Git checkout, this property is set with the name of the most recent tag that is reachable from the most recent commit (the output of git describe --always --tags)
arv:gitOrigin container request, collection of type=workflow string When arvados-cwl-runner is run from a Git checkout, this property is set with the URL of the remote named origin, if set (the output of git remote get-url origin)
arv:gitPath container request, collection of type=workflow string When arvados-cwl-runner is run from a Git checkout, this property is set with the absolute path of the checkout on the filesystem
arv:gitStatus container request, collection of type=workflow string When arvados-cwl-runner is run from a Git checkout, this property is set with a machine-readable summary of files modified in the checkout since the most recent commit (the output of git status --untracked-files=no --porcelain)
arv:workflowMain collection of type=workflow string Set on a collection containing a workflow created by arvados-cwl-runner --create-workflow, this is a relative reference inside the collection to the entry point of the workflow.
arv:failed_container_resubmitted container request uuid Set on container requests that were automatically resubmitted by the workflow runner with modified run options, such as when using the PreemptionBehavior or OutOfMemoryRetry CWL extensions. Set to the uuid of the new, resubmitted container request.

The following system properties predate the arv: key prefix, but are still reserved and can always be set.

Property name Appears on Value type Description
type collection string Appears on collections to indicates the contents or usage. See Collection type values below for details.
container_request collection string The UUID of the container request that produced an output or log collection.
docker-image-repo-tag collection string For collections containing a Docker image, the repo/name:tag identifier
container_uuid collection string The UUID of the container that produced a collection (set on collections with type=log)
container collection string (legacy) The UUID of the container that produced a collection. Set on intermediate collections created by arvados-cwl-runner. Starting with Arvados 2.6.0 arvados-cwl-runner uses container_uuid instead, but older versions may still set the container property.
cwl_input container_request object On an intermediate container request, the CWL workflow-level input parameters used to generate the container request
cwl_output container_request object On an intermediate container request, the CWL workflow-level output parameters collected from the container request
template_uuid container_request string For a workflow runner container request, the workflow record that was used to launch it.
workflowName container_request string For a workflow runner container request, the “name” of the workflow record in template_uuid at the time of launch (used for display only).
username link string For a can_login permission link, the unix username on the VM that the user will have.
groups link array of string For a can_login permission link, the unix groups on the VM that the user will be added to.
image_timestamp link string When resolving a Docker image name and multiple links are found with link_class=docker_image_repo+tag and same link_name, the image_timestamp is used to determine precedence (most recent wins).
filters group array of array of string Used to define filter groups

Collection “type” values

Meaningful values of the type property. These are recognized by Workbench when filtering on types of collections from the project content listing.

Type Description
log The collection contains log files from a container run.
output The collection contains the output of a top-level container run (this is a container request where requesting_container_uuid is null).
intermediate The collection contains the output of a child container run (this is a container request where requesting_container_uuid is non-empty).
workflow A collection created by arvados-cwl-runner --create-workflow containing a workflow definition.

Previous: Projects and filter groups Next: collections

The content of this documentation is licensed under the Creative Commons Attribution-Share Alike 3.0 United States licence.
Code samples in this documentation are licensed under the Apache License, Version 2.0.