Downloading data

Arvados data collections can be downloaded using either Workbench or the command line tools.

  1. Download using Workbench
  2. Creating a special download URL for a collection
  3. Download using command line tools

Download using Workbench

You can download Arvados data collections using Workbench.

When you visit a project in Workbench (for instance, the Home Projects or any projects under it), the collections will show up on the project details page, with “Data collection” in the Type column.

Clicking on a collection will bring you to its details page. There, the lower panel acts like a file manager where you can navigate to or search for files, select them for actions, and download them.

To download a file, simply click on the file, or bring up the context menu using right-click or the triple-dot button on its row, and then select the menu item Download.

Creating a special download URL for a collection

To share a collection with users that do not have an account on your Arvados cluster, locate the collection and then go to the Sharing settings dialog box as described above. There, select the SHARING URLS tab.

You can then generate a new sharing URL using the CREATE SHARING URL button, with the option to set an expiration time for the URL. You can then copy the URL to the clipboard for sharing with others. To revoke (that is, delete) a sharing URL, click on the cross icon beside it.

[Image: The SHARING URLS tab in the Sharing settings dialog box, showing the created URL with an expiration time]

Anyone with the sharing URL can download the collection by opening the URL in a browser, which presents a downloadable version of the collection.

When a collection is being shared by URL, the WITH USERS/GROUPS tab of Sharing settings shows the following message if General access is Private: “Although there aren’t specific permissions set, this is publicly accessible via Sharing URL.”

  • Note: Sharing by URL is specific to collections. Projects or individual files cannot be shared in this way.

Download using command line tools

Note:

This tutorial assumes that you have access to the Arvados command line tools, have configured your API token, and have confirmed a working environment.

You can download Arvados data collections using the command line tools arv-ls and arv-get.

Use arv-ls to view the contents of a collection:

~$ arv-ls ae480c5099b81e17267b7445e35b4bc7+180
./HWI-ST1027_129_D0THKACXX.1_1.fastq
./HWI-ST1027_129_D0THKACXX.1_2.fastq

Use -s to print file sizes, in kilobytes, rounded up:

~$ arv-ls -s ae480c5099b81e17267b7445e35b4bc7+180
     12258 ./HWI-ST1027_129_D0THKACXX.1_1.fastq
     12258 ./HWI-ST1027_129_D0THKACXX.1_2.fastq
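
The size column can be added up to estimate how much data a download will transfer. The following is a minimal sketch, not part of the Arvados tools: total_size is a hypothetical helper name, and it assumes only the two-column output format shown above.

```shell
#!/bin/sh
# total_size (hypothetical helper): sum the first column of `arv-ls -s`
# output read from standard input. The result is in the same unit that
# arv-ls reports (kilobytes, per the text above).
total_size() {
    awk '{ total += $1 } END { print total + 0 }'
}

# Example using the listing shown above as literal input:
printf '%s\n' \
    '     12258 ./HWI-ST1027_129_D0THKACXX.1_1.fastq' \
    '     12258 ./HWI-ST1027_129_D0THKACXX.1_2.fastq' | total_size
# prints 24516
```

With a working environment, the same function can read the tool's output directly, for example `arv-ls -s ae480c5099b81e17267b7445e35b4bc7+180 | total_size`.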

Use arv-get to download the contents of a collection and place it in the directory specified in the second argument (in this example, . for the current directory):

~$ arv-get ae480c5099b81e17267b7445e35b4bc7+180/ .
23 MiB / 23 MiB 100.0%
~$ ls
HWI-ST1027_129_D0THKACXX.1_1.fastq  HWI-ST1027_129_D0THKACXX.1_2.fastq
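
After a download, one way to confirm that every file arrived is to compare the arv-ls listing with the local directory. This is a sketch under assumptions: missing_files is a hypothetical helper name, and it relies only on the `./path` listing format shown earlier.

```shell
#!/bin/sh
# missing_files DIR (hypothetical helper): read an arv-ls style listing
# (one "./path" per line) on standard input and print each listed path
# that is not a regular file under DIR. Returns non-zero if any file is
# missing.
missing_files() {
    dir=$1
    status=0
    while IFS= read -r path; do
        # Skip empty lines in the listing.
        [ -n "$path" ] || continue
        if [ ! -f "$dir/$path" ]; then
            printf '%s\n' "$path"
            status=1
        fi
    done
    return "$status"
}
```

For the example above, the check could be run as `arv-ls ae480c5099b81e17267b7445e35b4bc7+180 | missing_files .`, which prints nothing when both files are present.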

You can also download individual files:

~$ arv-get ae480c5099b81e17267b7445e35b4bc7+180/HWI-ST1027_129_D0THKACXX.1_1.fastq .
11 MiB / 11 MiB 100.0%
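
To download several individual files at once, the listing from arv-ls can drive repeated arv-get calls. The sketch below is illustrative only: fetch_matching is a hypothetical helper name, and it uses arv-ls and arv-get solely in the forms shown on this page.

```shell
#!/bin/sh
# fetch_matching LOCATOR PATTERN DEST (hypothetical helper):
# download only the files whose listed path matches the extended regular
# expression PATTERN, placing them in the directory DEST.
fetch_matching() {
    locator=$1
    pattern=$2
    dest=$3
    mkdir -p "$dest" || return 1
    arv-ls "$locator" | grep -E -- "$pattern" | while IFS= read -r path; do
        # arv-ls prints paths as ./name; strip the leading "./" to build
        # the LOCATOR/file form that arv-get accepts.
        arv-get "$locator/${path#./}" "$dest" </dev/null || exit 1
    done
}
```

For example, `fetch_matching ae480c5099b81e17267b7445e35b4bc7+180 '_1\.fastq$' reads` would fetch only the first file of the pair into a directory named reads.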

Federated downloads

If your cluster is configured to be part of a federation, you can also download collections hosted on other clusters (with appropriate permissions).

If you request a collection by portable data hash, Arvados will first search the home cluster, then search the federated clusters.

You may also request a collection by UUID. In this case, Arvados will contact the cluster named in the UUID prefix (in this example, zzzzz).

~$ arv-get zzzzz-4zz18-fw6dnjxtkvzdewt/ .
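
The two lookup rules above depend on the kind of identifier given. As a rough illustration, the sketch below classifies a locator and extracts the cluster prefix from a UUID. locator_kind is a hypothetical helper name, and the patterns are assumptions inferred from the identifiers shown on this page, not an official validation routine.

```shell
#!/bin/sh
# locator_kind LOCATOR (hypothetical helper): report which lookup rule
# applies to a collection locator. The patterns are assumptions based on
# the identifiers shown on this page.
locator_kind() {
    # Drop an optional trailing slash or file path after the identifier.
    id=${1%%/*}
    if printf '%s\n' "$id" | grep -Eq '^[0-9a-f]{32}\+[0-9]+$'; then
        echo "portable data hash: home cluster first, then federated clusters"
    elif printf '%s\n' "$id" | grep -Eq '^[a-z0-9]{5}-4zz18-[a-z0-9]{15}$'; then
        # The cluster is named by the part of the UUID before the first dash.
        echo "UUID: cluster ${id%%-*}"
    else
        echo "unrecognized locator" >&2
        return 1
    fi
}

locator_kind zzzzz-4zz18-fw6dnjxtkvzdewt/
# prints "UUID: cluster zzzzz"
```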


The content of this documentation is licensed under the Creative Commons Attribution-Share Alike 3.0 United States licence.
Code samples in this documentation are licensed under the Apache License, Version 2.0.