Using storage classes

Storage classes (sometimes called “storage tiers”) allow you to control which back-end storage volumes are used to store the data blocks of a particular collection. This can be used to implement data storage policies such as assigning data collections to “fast”, “robust” or “archival” storage.

Names of storage classes are internal to the cluster and decided by the administrator. Aside from “default”, Arvados currently does not define any standard storage class names. Consult your cluster administrator for guidance on what storage classes are available to use on your specific Arvados instance.

Note that changing the storage class of an existing collection does not take effect immediately. The blocks are asynchronously copied to the new storage class and removed from the old one. The collection field “storage_classes_confirmed” is updated once the data blocks have been successfully copied.
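Because the move is asynchronous, a script may want to check whether it has finished before relying on the new placement. The following is a minimal Python sketch, assuming the collection record has already been fetched from the API as a dictionary. The helper name storage_classes_settled is hypothetical and not part of any Arvados SDK, and treating "every desired class appears in the confirmed list" as the completion test is an assumption about how to read the two fields.

```python
from typing import Any, Mapping


def storage_classes_settled(collection: Mapping[str, Any]) -> bool:
    """Return True when every desired storage class has been confirmed.

    Hypothetical helper: it only compares two fields of a collection
    record that the caller has already retrieved.
    """
    desired = set(collection.get("storage_classes_desired") or [])
    confirmed = set(collection.get("storage_classes_confirmed") or [])
    # With nothing desired there is nothing to confirm, so report False
    # rather than guess.
    return bool(desired) and desired <= confirmed


# Example records, made up for illustration.
in_progress = {
    "storage_classes_desired": ["archival"],
    "storage_classes_confirmed": ["default"],
}
done = {
    "storage_classes_desired": ["archival"],
    "storage_classes_confirmed": ["archival"],
}

print(storage_classes_settled(in_progress))
print(storage_classes_settled(done))
```

A caller could poll the record at intervals and proceed once the helper returns True.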

arv-put

You may specify one or more desired storage classes for a collection uploaded using arv-put:

$ arv-put --storage-classes=hot,archival myfile.txt

arv-mount

You can ask arv-mount to use specific storage classes when creating new collections:

$ arv-mount --storage-classes=transient --mount-tmp=scratch keep

arvados-cwl-runner

You may specify the desired storage classes for the intermediate and final output collections produced by arvados-cwl-runner, either on the command line or with the arv:OutputStorageClass hint. For example, on the command line:

$ arvados-cwl-runner --intermediate-storage-classes=hot_storage --storage-classes=robust_storage myworkflow.cwl myinput.yml

arv command line

You may change the storage class of an existing collection by updating its “storage_classes_desired” field. For example, at the command line:

$ arv collection update --uuid zzzzz-4zz18-dhhm0ay8k8cqkvg --collection '{"storage_classes_desired": ["archival"]}'

Setting “storage_classes_desired” to “archival” causes the blocks that make up the collection to be preferentially moved to keepstore volumes that are configured with the “archival” storage class.
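When the update has to be issued from a script, quoting the JSON argument by hand is easy to get wrong. The following is a minimal Python sketch that assembles the same arv command shown above for an arbitrary list of storage classes. The function name build_update_command is hypothetical; only the arv flags already shown in this section are used, and the sketch builds the command without running it.

```python
import json
import shlex
from typing import List, Sequence


def build_update_command(uuid: str, storage_classes: Sequence[str]) -> List[str]:
    """Build the argument list for `arv collection update`.

    Hypothetical helper: it sets only the storage_classes_desired field.
    """
    if not storage_classes:
        raise ValueError("at least one storage class name is required")
    # json.dumps handles the quoting of the class names inside the JSON body.
    body = json.dumps({"storage_classes_desired": list(storage_classes)})
    return ["arv", "collection", "update", "--uuid", uuid, "--collection", body]


cmd = build_update_command("zzzzz-4zz18-dhhm0ay8k8cqkvg", ["archival"])
# shlex.join renders the list as a shell-safe command line for display.
print(shlex.join(cmd))
```

On a machine where the arv command line tool is installed and configured, the resulting list could be passed to subprocess.run(cmd, check=True).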

Storage class notes

If no storage class is specified, a collection’s blocks are stored in the cluster’s configured default storage class(es).

Any user with write access to a collection may set any storage class on that collection.



The content of this documentation is licensed under the Creative Commons Attribution-Share Alike 3.0 United States licence.
Code samples in this documentation are licensed under the Apache License, Version 2.0.