Arvados can run in a variety of configurations. For compute scheduling, Arvados supports HPC clusters using slurm, as well as elastic cloud computing on AWS, Google and Azure. For storage, Arvados can store blocks on regular file systems such as ext4 or xfs, on network file systems such as GPFS, or on object storage such as Azure Blob Storage, Amazon S3, and other services that support the S3 API, including Google Cloud Storage and Ceph.
This guide assumes you have seven systems available in the same network subnet:
|Function||Number of nodes|
|Arvados API, Crunch dispatcher, Git, Websockets and Workbench||1|
|Arvados Compute node||1|
|Arvados Keepproxy and Keep-web server||1|
|Arvados Keepstore servers||2|
|Arvados Shell server||1|
|Arvados SSO server||1|
The numbers of Keepstore, shell and compute nodes listed above are minimums. In a production installation, you will likely run many more of each. In that case, you would probably also want to give the Workbench server and the Crunch dispatcher a dedicated node each. For performance reasons, you may also want to run the database server on a separate node.
|Distribution||State||Last supported version|
|Debian 8 (“jessie”)||Supported||Latest|
|Debian 9 (“stretch”)||Supported||Latest|
|Ubuntu 14.04 (“trusty”)||Supported||Latest|
|Ubuntu 16.04 (“xenial”)||Supported||Latest|
|Ubuntu 18.04 (“bionic”)||Supported||Latest|
|Ubuntu 12.04 (“precise”)||EOL||8ed7b6dd5d4df93a3f37096afe6d6f81c2a7ef6e (2017-05-03)|
|Debian 7 (“wheezy”)||EOL||997479d1408139e96ecdb42a60b4f727f814f6c9 (2016-12-28)|
|CentOS 6||EOL||997479d1408139e96ecdb42a60b4f727f814f6c9 (2016-12-28)|
On any host where you install Arvados software, you’ll need to set up an Arvados package repository. Repositories are available for several popular distributions.
Packages are available for CentOS 7. To install them with yum, save this configuration block in
[arvados]
name=Arvados
baseurl=http://rpm.arvados.org/CentOS/$releasever/os/$basearch/
gpgcheck=1
gpgkey=http://rpm.arvados.org/CentOS/RPM-GPG-KEY-curoverse
The Curoverse signing key fingerprint is
pub 2048R/1078ECD7 2010-11-15 Curoverse, Inc Automatic Signing Key
Key fingerprint = B2DA 2991 656E B4A5 0314 CA2B 5716 5911 1078 ECD7
sub 2048R/5A8C5A93 2010-11-15
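Comparing a 40-digit fingerprint by eye is error-prone. The following is a minimal sketch of a comparison that ignores spacing and letter case. The helper names `normalize_fpr` and `fingerprint_matches` are hypothetical and not part of Arvados; the candidate value is assumed to be copied by hand from whatever your key tool prints.

```shell
#!/bin/sh
# Expected fingerprint, copied from the documentation above.
EXPECTED_FPR="B2DA 2991 656E B4A5 0314 CA2B 5716 5911 1078 ECD7"

# Strip all whitespace and upper-case the hex digits so that
# formatting differences between tools do not matter.
normalize_fpr() {
    printf '%s' "$1" | tr -d '[:space:]' | tr 'a-f' 'A-F'
}

# Hypothetical helper: succeeds (exit status 0) only when the
# candidate fingerprint matches the expected one.
fingerprint_matches() {
    [ "$(normalize_fpr "$1")" = "$(normalize_fpr "$EXPECTED_FPR")" ]
}

# Usage: pass the fingerprint reported for the imported key.
if fingerprint_matches "b2da2991656eb4a50314ca2b571659111078ecd7"; then
    echo "fingerprint OK"
else
    echo "fingerprint MISMATCH" >&2
fi
```

A mismatch means the imported key should not be trusted until the discrepancy is explained.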
Packages are available for Debian 8 (“jessie”), Debian 9 (“stretch”), Ubuntu 14.04 (“trusty”), Ubuntu 16.04 (“xenial”) and Ubuntu 18.04 (“bionic”).
First, register the Curoverse signing key in apt’s database:
~$ sudo /usr/bin/apt-key adv --keyserver pool.sks-keyservers.net --recv 1078ECD7
Configure apt to retrieve packages from the Arvados package repository. This command depends on your OS vendor and version:
|Debian 8 (“jessie”)||
|Debian 9 (“stretch”)||
|Ubuntu 14.04 (“trusty”) [1]||
|Ubuntu 16.04 (“xenial”) [1]||
|Ubuntu 18.04 (“bionic”) [1]||
[1] Arvados packages for Ubuntu may depend on third-party packages in Ubuntu’s “universe” repository. If you’re installing on Ubuntu, make sure you have the universe sources uncommented in
Retrieve the package list:
~$ sudo apt-get update
Each Arvados installation should have a globally unique identifier: a 5-character string of lowercase letters and digits. For testing purposes, here is one way to generate a random 5-character string:
~$ tr -dc 0-9a-z </dev/urandom | head -c5; echo
You may also use a different method to pick the unique identifier. The unique identifier will be part of the hostname of the services in your Arvados cluster. The rest of this documentation will refer to it as your
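Whichever method you use, it can help to check the result against the format described above before building hostnames from it. This is a sketch only: `is_valid_prefix` is a hypothetical helper, not an Arvados command, and it checks the format (five characters from 0-9 and a-z), not global uniqueness.

```shell
#!/bin/sh
# Hypothetical helper: succeeds only for exactly five characters,
# each a digit or a lowercase ASCII letter. The body runs in a
# subshell so the locale change does not leak into the caller.
is_valid_prefix() (
    LC_ALL=C   # force ASCII ranges so [a-z] cannot match upper-case letters
    case "$1" in
        [0-9a-z][0-9a-z][0-9a-z][0-9a-z][0-9a-z]) exit 0 ;;
        *) exit 1 ;;
    esac
)

# Generate a candidate the same way as the command above.
# LC_ALL=C keeps tr from rejecting random bytes on some systems.
prefix=$(LC_ALL=C tr -dc 0-9a-z </dev/urandom | head -c5)

if is_valid_prefix "$prefix"; then
    echo "using identifier: $prefix"
else
    echo "not a valid identifier: $prefix" >&2
fi
```

The same check can be applied to an identifier chosen by hand, for example `is_valid_prefix "ab1cd"`.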
There are six public-facing services that require an SSL certificate. If you do not have official SSL certificates, you can use self-signed certificates.
Most Arvados clients and services will accept self-signed certificates when the ARVADOS_API_HOST_INSECURE environment variable is set to true. However, web browsers generally do not make it easy for users to accept self-signed certificates from Web sites.
Users who log in through Workbench will visit at least three sites: the SSO server, the API server, and Workbench itself. When a browser visits each of these sites, it will warn the user if the site uses a self-signed certificate, and the user must accept it before continuing. This procedure usually only needs to be done once in a browser.
In sum, Workbench will be much less pleasant to use in a cluster that uses self-signed certificates. You should avoid using self-signed certificates unless you plan to deploy a cluster without Workbench; you are deploying only to evaluate Arvados as an individual system administrator; or you can push configuration to users’ browsers to trust your self-signed certificates.
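For an evaluation cluster, a self-signed certificate can be produced with the `openssl` command-line tool. The sketch below is one possible approach, not a procedure prescribed by Arvados. The hostname `workbench.example.com`, the temporary output directory, the key size and the validity period are all assumptions chosen for illustration. Some clients also expect a subjectAltName entry, which this sketch does not add.

```shell
#!/bin/sh
# Hypothetical hostname; substitute the name of the service you are securing.
host="workbench.example.com"

# Write the key and certificate to a scratch directory (assumption:
# you will move them to wherever your web server expects them).
workdir=$(mktemp -d)

# Create a 2048-bit RSA key without a passphrase and a self-signed
# certificate that is valid for 365 days.
openssl req -x509 -newkey rsa:2048 -nodes \
    -keyout "$workdir/$host.key" \
    -out "$workdir/$host.pem" \
    -days 365 -subj "/CN=$host" 2>/dev/null

# Show the subject and expiry date of the new certificate.
openssl x509 -in "$workdir/$host.pem" -noout -subject -enddate

# Arvados clients that honor this variable will then accept the
# self-signed certificate, as described above.
export ARVADOS_API_HOST_INSECURE=true
```

Repeat the certificate step once per public-facing hostname, or issue a single certificate covering all of them if your tooling supports it.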
By convention, we use the following hostname pattern:
|Arvados Git server||git.
|Arvados Keepproxy server||keep.
|Arvados Keep-web server||download.
|Arvados SSO Server||auth.your.domain|
|Arvados Websockets endpoint||ws.
The content of this documentation is licensed under the Creative Commons Attribution-Share Alike 3.0 United States licence.
Code samples in this documentation are licensed under the Apache License, Version 2.0.