Uploading data

Data can be uploaded to Arvados as collections using either the arv keep put command line tool or Workbench.

  1. Upload using command line tool
  2. Upload using Workbench

Upload using command line tool

Note:

This tutorial assumes that you are logged into an Arvados VM instance (instructions for Webshell or Unix or Windows) or you have installed the Arvados FUSE Driver and Python SDK on your workstation and have a working environment.

To upload a file to Keep using arv keep put:

~$ arv keep put var-GS000016015-ASM.tsv.bz2
216M / 216M 100.0%
Collection saved as ...
qr1hi-4zz18-xxxxxxxxxxxxxxx

The output value qr1hi-4zz18-xxxxxxxxxxxxxxx is the UUID of the newly created Arvados collection.
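When arv keep put is called from a script, it can help to check that a captured value has the shape of a collection UUID before passing it on. The following is a minimal sketch, not part of the Arvados tooling: is_collection_uuid is a hypothetical helper, and the pattern (a five-character prefix, the 4zz18 infix, and a fifteen-character suffix) is inferred from the example output above.

```shell
# Hypothetical helper: succeeds if $1 has the shape of a collection UUID,
# i.e. <5 chars>-4zz18-<15 chars>, as inferred from the example output.
is_collection_uuid() {
  printf '%s\n' "$1" | grep -Eq '^[a-z0-9]{5}-4zz18-[a-z0-9]{15}$'
}

sample="qr1hi-4zz18-xxxxxxxxxxxxxxx"
if is_collection_uuid "$sample"; then
  echo "collection UUID: $sample"
else
  echo "not a collection UUID: $sample" >&2
fi
```

The check only looks at the format of the string; it does not confirm that the collection exists on the cluster.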

Note: The file used in this example is a freely available TSV file containing variant annotations from Personal Genome Project participant hu599905, downloadable here. Alternatively, you can replace var-GS000016015-ASM.tsv.bz2 with the name of any local file, or you can get the TSV file by downloading it from Keep.

It is also possible to upload an entire directory with arv keep put:

~$ mkdir tmp
~$ echo "hello alice" > tmp/alice.txt
~$ echo "hello bob" > tmp/bob.txt
~$ echo "hello carol" > tmp/carol.txt
~$ arv keep put tmp
0M / 0M 100.0%
Collection saved as ...
qr1hi-4zz18-yyyyyyyyyyyyyyy

In both examples, the arv keep put command created a collection. The first collection contains the single uploaded file. The second collection contains the entire uploaded directory.
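The 0M / 0M progress line in the directory example is most likely a rounding effect: the three files together are far smaller than one megabyte. The sketch below recreates the files in a scratch directory and totals their size; it assumes the progress line reports whole megabytes rounded down, which the source does not state. The variable names are illustrative only.

```shell
# Recreate the three small files from the directory example in a scratch location.
workdir=$(mktemp -d)
mkdir "$workdir/tmp"
echo "hello alice" > "$workdir/tmp/alice.txt"
echo "hello bob"   > "$workdir/tmp/bob.txt"
echo "hello carol" > "$workdir/tmp/carol.txt"

# Total size in bytes; each echo appends a trailing newline.
total_bytes=$(( $(cat "$workdir"/tmp/*.txt | wc -c) ))

# Whole megabytes, rounded down (assumed to mirror the progress line).
total_mib=$(( total_bytes / 1048576 ))

echo "total: $total_bytes bytes, ${total_mib}M"   # prints "total: 34 bytes, 0M"

# Clean up the scratch directory.
rm -rf "$workdir"
```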

arv keep put accepts a number of optional command line arguments, which are described on the arv subcommands page.
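As an illustration of such arguments, the sketch below assembles a command line that names the collection and places it in a specific project. It assumes that arv keep put passes the --name and --project-uuid options through to the underlying upload tool; check the arv subcommands page before relying on them. The collection name and project UUID are hypothetical placeholders, and the command is printed for review rather than executed.

```shell
# Sketch only: the name and project UUID below are hypothetical placeholders.
collection_name="hu599905-variants"
project_uuid="qr1hi-j7d0g-zzzzzzzzzzzzzzz"
input_file="var-GS000016015-ASM.tsv.bz2"

# Assemble the command line; --name and --project-uuid are assumed to be
# accepted by arv keep put (verify against the arv subcommands page).
put_command="arv keep put --name $collection_name --project-uuid $project_uuid $input_file"

# Print the command for review instead of running it.
echo "$put_command"
```

Uploading directly into the intended project would avoid having to move the collection in Workbench afterwards.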

Locate your collection in Workbench

Visit the Workbench Dashboard. Click on the Projects dropdown menu in the top navigation menu and select your Home project. Your newly uploaded collection should appear near the top of the Data collections tab. The collection name printed by arv keep put will appear under the name column.

To move the collection to a different project, check the box at the left of the collection row. Pull down the Selection… menu near the top of the tab and select Move selected…. This opens a dialog box where you can choose a destination project for the collection. Click a project, then click the Move button.

Click on the Show button next to the collection’s listing on a project page to go to the Workbench page for your collection. On this page, you can see the collection’s contents, download individual files, and set sharing options.

Upload using Workbench

To upload using Workbench, visit the Workbench Dashboard. Click on the Projects dropdown menu in the top navigation menu and select your Home project or any other project of your choosing. You will see the Data collections tab, which lists the collections in that project.

To upload files into a new collection, click on the Add data dropdown menu and select Upload files from my computer.


This will create a new empty collection in your chosen project and will take you to the Upload tab for that collection.

Click on the Browse… button and select the files you would like to upload. Selected files are added to a list of files to be uploaded. When you have finished selecting files, click on the Start button to begin the upload. Workbench shows a progress bar while the files are uploaded to Arvados, and indicates when the upload is complete.

Note: If you leave the collection page during the upload, the upload process will be aborted and you will need to upload the files again.

Note: You can also use the Upload tab to add additional files to an existing collection.



The content of this documentation is licensed under the Creative Commons Attribution-Share Alike 3.0 United States licence.
Code samples in this documentation are licensed under the Apache License, Version 2.0.