Arvados data collections can be downloaded using either the command line tools or Workbench.
You can download Arvados data collections using Workbench.
Visit the Workbench Dashboard. Click on the Projects dropdown menu in the top navigation menu and select your Home project. You will see the Data collections tab, which lists the collections in this project.
You can access the contents of a collection by clicking on the Show button next to the collection. This will take you to the collection’s page. From this page you can see the collection’s contents and download individual files.
You can now download the collection files by clicking on the download button(s).
Collections can be shared with other users on the Arvados cluster by sharing the parent project. Navigate to the parent project using the “breadcrumbs” bar, then click on the Sharing tab. From the Sharing tab, you can choose which users or groups to share with, and their level of access.
To share a collection with users that do not have an account on your Arvados cluster, visit the collection page using Workbench as described in the above section. Once on this page, click on the Create sharing link button.
This will create a sharing link for the collection. You can copy the sharing link from this page and share it with other users.
A user with this URL can download the collection by opening the URL in a browser, which presents a downloadable version of the collection.
This tutorial assumes that you have access to the Arvados command line tools, have set the API token, and have confirmed a working environment.
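As a quick pre-flight check, the sketch below tests whether the two environment variables commonly used to configure the Arvados command line tools, `ARVADOS_API_HOST` and `ARVADOS_API_TOKEN`, are set in the current shell. The function name `check_arv_env` is hypothetical, and the check is only partial: it assumes your settings come from the environment rather than from a settings file, and it does not confirm that the token is valid.

```shell
# Hypothetical helper: report which Arvados environment variables are unset.
# Returns 0 when both are set, 1 otherwise. It does not contact the cluster.
check_arv_env() {
    missing=0
    for var in ARVADOS_API_HOST ARVADOS_API_TOKEN; do
        # Indirectly read the variable whose name is stored in $var.
        eval "value=\${$var:-}"
        if [ -z "$value" ]; then
            echo "$var is not set" >&2
            missing=1
        fi
    done
    return $missing
}

# Usage:
#   check_arv_env && echo "environment variables are set"
```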
You can download Arvados data collections using the command line tools arv-ls and arv-get.
Use arv-ls to view the contents of a collection:
~$ arv-ls ae480c5099b81e17267b7445e35b4bc7+180
./HWI-ST1027_129_D0THKACXX.1_1.fastq
./HWI-ST1027_129_D0THKACXX.1_2.fastq
Use -s to print file sizes, in kilobytes, rounded up:
~$ arv-ls -s ae480c5099b81e17267b7445e35b4bc7+180
12258 ./HWI-ST1027_129_D0THKACXX.1_1.fastq
12258 ./HWI-ST1027_129_D0THKACXX.1_2.fastq
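When a collection holds many files, it can be useful to total the sizes before downloading. The following is a minimal sketch, assuming the two-column output format shown above (size, then path). The name `total_kb` is a hypothetical helper, not part of the Arvados tools.

```shell
# Hypothetical helper: sum the first column of `arv-ls -s` output read from
# standard input. Assumes each line has the form "<size> <path>".
total_kb() {
    awk '{ total += $1 } END { print total + 0 }'
}

# Usage (requires a configured Arvados environment):
#   arv-ls -s ae480c5099b81e17267b7445e35b4bc7+180 | total_kb
```

For the two-file listing above, the sum of the size column is 24516.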
Use arv-get to download the contents of a collection and place it in the directory specified in the second argument (in this example, . for the current directory):
~$ arv-get ae480c5099b81e17267b7445e35b4bc7+180/ .
23 MiB / 23 MiB 100.0%
~$ ls
HWI-ST1027_129_D0THKACXX.1_1.fastq HWI-ST1027_129_D0THKACXX.1_2.fastq
You can also download individual files:
~$ arv-get ae480c5099b81e17267b7445e35b4bc7+180/HWI-ST1027_129_D0THKACXX.1_1.fastq .
11 MiB / 11 MiB 100.0%
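The two tools can be combined to download only some of the files in a collection. The sketch below lists a collection with arv-ls and calls arv-get once for each path that matches a shell pattern. The function `fetch_matching` and the `ARV_LS` and `ARV_GET` override variables are hypothetical names introduced for illustration, and the sketch assumes arv-ls prints one `./`-prefixed path per line, as in the examples above.

```shell
# Commands to run; override these (for example with `echo`) for a dry run.
ARV_LS=${ARV_LS:-arv-ls}
ARV_GET=${ARV_GET:-arv-get}

# Hypothetical helper: download every file in a collection whose path
# matches a shell pattern.
fetch_matching() {
    collection=$1   # portable data hash or UUID
    pattern=$2      # shell pattern, e.g. '*_1.fastq'
    dest=$3         # destination directory
    "$ARV_LS" "$collection" | while IFS= read -r path; do
        name=${path#./}    # strip the leading "./" printed by arv-ls
        case $name in
            $pattern) "$ARV_GET" "$collection/$name" "$dest" ;;
        esac
    done
}

# Usage (requires a configured Arvados environment):
#   fetch_matching ae480c5099b81e17267b7445e35b4bc7+180 '*_1.fastq' .
```

Calling arv-get once per file keeps the sketch simple at the cost of one request per file; for a whole collection, the single arv-get call shown earlier is the more direct route.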
If your cluster is configured to be part of a federation, you can also download collections hosted on other clusters (with appropriate permissions).
If you request a collection by portable data hash, the home cluster is searched first, followed by the federated clusters.
You may also request a collection by UUID. In this case, the request goes to the cluster named in the UUID prefix (in this example, zzzzz).
~$ arv-get zzzzz-4zz18-fw6dnjxtkvzdewt/ .
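Because the lookup differs between the two identifier types, a script may want to tell them apart before calling arv-get. The sketch below is one way to do that. The patterns are assumptions inferred from the examples on this page (32 hexadecimal digits, `+`, and a number for a portable data hash; a five-character cluster prefix, `4zz18`, and a 15-character suffix for a collection UUID), and `collection_id_kind` is a hypothetical name.

```shell
# Hypothetical helper: report whether a bare collection identifier looks like
# a portable data hash or a UUID; for a UUID, also print its cluster prefix.
collection_id_kind() {
    id=${1%/}    # drop a trailing slash, as used in the arv-get examples
    if printf '%s\n' "$id" | grep -Eq '^[0-9a-f]{32}\+[0-9]+$'; then
        echo "pdh"
    elif printf '%s\n' "$id" | grep -Eq '^[a-z0-9]{5}-4zz18-[a-z0-9]{15}$'; then
        echo "uuid ${id%%-*}"
    else
        echo "unknown"
    fi
}

# Usage:
#   collection_id_kind zzzzz-4zz18-fw6dnjxtkvzdewt/
```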
The content of this documentation is licensed under the Creative Commons Attribution-Share Alike 3.0 United States licence.
Code samples in this documentation are licensed under the Apache License, Version 2.0.