Arvados Data collections can be uploaded using either Workbench or the
arv-put command line tool.
To upload using Workbench, visit the Workbench Dashboard. Click on the Projects dropdown menu in the top navigation menu and select your Home project or any other project of your choosing. You will see the Data collections tab for this project, which lists the collections in this project.
To upload files into a new collection, click on the Add data dropdown menu and select Upload files from my computer.
This will create a new empty collection in your chosen project and will take you to the Upload tab for that collection.
Click on the Browse… button and select the files you would like to upload. Selected files will be added to a list of files to be uploaded. After you are done selecting files, click on the Start button to start the upload. Workbench will begin uploading the files to Arvados and show a progress bar. When the upload is complete, you will see an indication to that effect.
Note: If you leave the collection page during the upload, the upload process will be aborted and you will need to upload the files again.
Note: You can also use the Upload tab to add additional files to an existing collection.
Files are organized into Collections, and Collections are organized by Projects.
Click on Projects → Add a new project to add a top-level project.
To create a subproject, navigate to the parent project, and click on Add a subproject.
See Sharing collections for information about sharing projects and collections with other users.
This tutorial assumes that you have access to the Arvados command line tools, have set the API token, and have confirmed a working environment.
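As a quick sanity check before running the commands below, a sketch like the following can confirm that the API settings are visible to the shell. It assumes the settings are supplied through the ARVADOS_API_HOST and ARVADOS_API_TOKEN environment variables; the Arvados clients can also read them from a configuration file, in which case this check does not apply. The function name check_arvados_env is hypothetical and not part of Arvados.

```shell
#!/bin/sh
# Hypothetical helper (not part of Arvados): report whether the two
# environment variables read by the Arvados client tools are set.
check_arvados_env() {
    missing=0
    for var in ARVADOS_API_HOST ARVADOS_API_TOKEN; do
        # Indirect variable lookup that works in plain POSIX sh.
        eval "value=\${$var:-}"
        if [ -z "$value" ]; then
            echo "missing: $var" >&2
            missing=1
        fi
    done
    return "$missing"
}

if check_arvados_env; then
    echo "Arvados environment variables are set"
else
    echo "Set ARVADOS_API_HOST and ARVADOS_API_TOKEN before running arv-put"
fi
```

This only checks that the variables are non-empty; it does not contact the API server or verify that the token is valid.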
To upload a file to Keep using arv-put:
~$ arv-put var-GS000016015-ASM.tsv.bz2
216M / 216M 100.0%
Collection saved as ...
zzzzz-4zz18-xxxxxxxxxxxxxxx
The output value zzzzz-4zz18-xxxxxxxxxxxxxxx is the UUID of the Arvados collection created.
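In a script it can be handy to capture that UUID and check its shape before using it. The sketch below assumes that arv-put writes the UUID to standard output and its progress to standard error, and that collection UUIDs follow the pattern seen in the example (a 5-character prefix, the type code 4zz18, and a 15-character suffix). The helper is_collection_uuid is hypothetical, and when arv-put or the example file is not available the script falls back to the placeholder value from this tutorial.

```shell
#!/bin/sh
# Hypothetical helper: succeed if $1 looks like a collection UUID, i.e.
# a 5-character prefix, the type code 4zz18, and a 15-character suffix.
is_collection_uuid() {
    printf '%s\n' "$1" | grep -Eq '^[a-z0-9]{5}-4zz18-[a-z0-9]{15}$'
}

file="var-GS000016015-ASM.tsv.bz2"

if command -v arv-put >/dev/null 2>&1 && [ -f "$file" ]; then
    # Assumption: the UUID is the only thing printed on standard output.
    uuid=$(arv-put --no-progress "$file")
else
    # Placeholder copied from the example output in this tutorial.
    uuid="zzzzz-4zz18-xxxxxxxxxxxxxxx"
fi

if is_collection_uuid "$uuid"; then
    echo "looks like a collection UUID: $uuid"
else
    echo "unexpected arv-put output: $uuid" >&2
fi
```

The pattern check is deliberately strict about the 4zz18 type code, so a project or other object UUID is rejected.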
Note: The file used in this example is a freely available TSV file containing variant annotations from Personal Genome Project participant hu599905, downloadable here. Alternatively, you can replace var-GS000016015-ASM.tsv.bz2 with the name of any file you have locally, or you could get the TSV file by downloading it from Keep.
It is also possible to upload an entire directory with arv-put:
~$ mkdir tmp
~$ echo "hello alice" > tmp/alice.txt
~$ echo "hello bob" > tmp/bob.txt
~$ echo "hello carol" > tmp/carol.txt
~$ arv-put tmp
0M / 0M 100.0%
Collection saved as ...
zzzzz-4zz18-yyyyyyyyyyyyyyy
In both examples, the arv-put command created a collection. The first collection contains the single uploaded file. The second collection contains the entire uploaded directory.
arv-put accepts quite a few optional command line arguments, which are described on the arv subcommands page.
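As one illustration of those options, the sketch below uploads the example directory with an explicit collection name and a destination project, using the --name and --project-uuid options. The name "greetings" and the project UUID are placeholders chosen for this example, and the upload is skipped when arv-put is not installed; check the arv subcommands page for the options available in your version.

```shell
#!/bin/sh
# Recreate the small example directory from this tutorial.
mkdir -p tmp
echo "hello alice" > tmp/alice.txt
echo "hello bob" > tmp/bob.txt
echo "hello carol" > tmp/carol.txt

# Placeholder project UUID; replace it with a project you can write to.
project_uuid="zzzzz-j7d0g-xxxxxxxxxxxxxxx"

if command -v arv-put >/dev/null 2>&1; then
    # Name the new collection and store it in the chosen project.
    arv-put --name "greetings" --project-uuid "$project_uuid" tmp
else
    echo "arv-put not found; skipping upload" >&2
fi
```

Choosing the project at upload time avoids having to move the collection in Workbench afterwards.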
Visit the Workbench Dashboard. Click on the Projects dropdown menu in the top navigation menu and select your Home project. Your newly uploaded collection should appear near the top of the Data collections tab. The collection name printed by arv-put will appear under the name column.
To move the collection to a different project, check the box at the left of the collection row. Pull down the Selection… menu near the top of the page tab and select the Move selected… option. This will open a dialog box where you can select a destination project for the collection. Click a project, then click the Move button.
Click on the Show button next to the collection’s listing on a project page to go to the Workbench page for your collection. On this page, you can see the collection’s contents, download individual files, and set sharing options.
The content of this documentation is licensed under the Creative Commons Attribution-Share Alike 3.0 United States licence.
Code samples in this documentation are licensed under the Apache License, Version 2.0.