Prerequisites

Supported Cloud and HPC platforms

Arvados can run in a variety of configurations. For compute scheduling, Arvados supports HPC clusters using Slurm and elastic cloud computing on AWS, Google and Azure. For storage, Arvados can store blocks on regular file systems such as ext4 or xfs, on network file systems such as GPFS, or on object storage such as Azure Blob Storage, Amazon S3, and other object stores that support the S3 API, including Google Cloud Storage and Ceph.

Hardware (or virtual machines)

This guide assumes you have seven systems available in the same network subnet:

Function                                                        Number of nodes
Arvados API, Crunch dispatcher, Git, Websockets and Workbench   1
Arvados Compute node                                            1
Arvados Keepproxy and Keep-web server                           1
Arvados Keepstore servers                                       2
Arvados Shell server                                            1
Arvados SSO server                                              1

The numbers of Keepstore, shell and compute nodes listed above are minimums. In a production installation, you will likely run many more of each. In that case, you would probably also want to dedicate one node to the Workbench server and another to the Crunch dispatcher. For performance reasons, you may want to run the database server on a separate node as well.

Supported GNU/Linux distributions

Distribution               State       Last supported version
CentOS 7                   Supported   Latest
Debian 8 (“jessie”)        Supported   Latest
Debian 9 (“stretch”)       Supported   Latest
Ubuntu 14.04 (“trusty”)    Supported   Latest
Ubuntu 16.04 (“xenial”)    Supported   Latest
Ubuntu 18.04 (“bionic”)    Supported   Latest
Ubuntu 12.04 (“precise”)   EOL         8ed7b6dd5d4df93a3f37096afe6d6f81c2a7ef6e (2017-05-03)
Debian 7 (“wheezy”)        EOL         997479d1408139e96ecdb42a60b4f727f814f6c9 (2016-12-28)
CentOS 6                   EOL         997479d1408139e96ecdb42a60b4f727f814f6c9 (2016-12-28)

Arvados package repositories

On any host where you install Arvados software, you’ll need to set up an Arvados package repository. Repositories are available for several popular distributions.

CentOS

Packages are available for CentOS 7. To install them with yum, save this configuration block in /etc/yum.repos.d/arvados.repo:

[arvados]
name=Arvados
baseurl=http://rpm.arvados.org/CentOS/$releasever/os/$basearch/
gpgcheck=1
gpgkey=http://rpm.arvados.org/CentOS/RPM-GPG-KEY-curoverse

The Curoverse signing key fingerprint is:

pub  2048R/1078ECD7 2010-11-15 Curoverse, Inc Automatic Signing Key 
      Key fingerprint = B2DA 2991 656E B4A5 0314  CA2B 5716 5911 1078 ECD7
sub  2048R/5A8C5A93 2010-11-15
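
As a minimal sketch of the CentOS step above, the following shell function writes the same configuration block to a file of your choosing. The function name write_arvados_repo and its path argument are illustrative and not part of Arvados. The quoted here-document delimiter keeps $releasever and $basearch literal, so that yum rather than the shell expands them.

```shell
# Hypothetical helper (not part of Arvados): write the repository
# configuration shown above to the file named by $1.
write_arvados_repo() {
  # Quoting 'EOF' stops the shell from expanding $releasever and $basearch;
  # yum substitutes those variables itself when it reads the file.
  cat >"$1" <<'EOF'
[arvados]
name=Arvados
baseurl=http://rpm.arvados.org/CentOS/$releasever/os/$basearch/
gpgcheck=1
gpgkey=http://rpm.arvados.org/CentOS/RPM-GPG-KEY-curoverse
EOF
}
```

One way to use it is to write to a scratch path first, for example write_arvados_repo /tmp/arvados.repo, review the result, and then copy it into place with sudo cp /tmp/arvados.repo /etc/yum.repos.d/arvados.repo.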

Debian and Ubuntu

Packages are available for Debian 8 (“jessie”), Debian 9 (“stretch”), Ubuntu 14.04 (“trusty”), Ubuntu 16.04 (“xenial”) and Ubuntu 18.04 (“bionic”).

First, register the Curoverse signing key in apt’s database:

~$ sudo /usr/bin/apt-key adv --keyserver pool.sks-keyservers.net --recv 1078ECD7

Configure apt to retrieve packages from the Arvados package repository. This command depends on your OS vendor and version:

OS version                    Command
Debian 8 (“jessie”)           echo "deb http://apt.arvados.org/ jessie main" | sudo tee /etc/apt/sources.list.d/arvados.list
Debian 9 (“stretch”)          echo "deb http://apt.arvados.org/ stretch main" | sudo tee /etc/apt/sources.list.d/arvados.list
Ubuntu 14.04 (“trusty”) [1]   echo "deb http://apt.arvados.org/ trusty main" | sudo tee /etc/apt/sources.list.d/arvados.list
Ubuntu 16.04 (“xenial”) [1]   echo "deb http://apt.arvados.org/ xenial main" | sudo tee /etc/apt/sources.list.d/arvados.list
Ubuntu 18.04 (“bionic”) [1]   echo "deb http://apt.arvados.org/ bionic main" | sudo tee /etc/apt/sources.list.d/arvados.list

Note:

[1] Arvados packages for Ubuntu may depend on third-party packages in Ubuntu’s “universe” repository. If you’re installing on Ubuntu, make sure you have the universe sources uncommented in /etc/apt/sources.list.

Retrieve the package list:

~$ sudo apt-get update
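
The per-release commands in this section differ only in the release codename. As a rough sketch, a small shell function can build the source line from a codename and refuse codenames that are not listed as supported. The function name arvados_apt_line is hypothetical and not part of Arvados.

```shell
# Hypothetical helper (not part of Arvados): print the apt source line for
# a release codename listed in this section, or fail for any other name.
arvados_apt_line() {
  case "$1" in
    jessie|stretch|trusty|xenial|bionic)
      echo "deb http://apt.arvados.org/ $1 main" ;;
    *)
      echo "unsupported release codename: $1" >&2
      return 1 ;;
  esac
}
```

Assuming the lsb_release utility is installed, the output could be piped into place with arvados_apt_line "$(lsb_release -sc)" | sudo tee /etc/apt/sources.list.d/arvados.list, which mirrors the commands shown earlier.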

A unique identifier

Each Arvados installation should have a globally unique identifier: a 5-character string of lowercase letters and digits. For testing purposes, here is one way to generate a random 5-character string:

~$ tr -dc 0-9a-z </dev/urandom | head -c5; echo

You may also use a different method to pick the unique identifier. The unique identifier will be part of the hostname of the services in your Arvados cluster. The rest of this documentation will refer to it as your uuid_prefix.
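
If you pick the identifier by some other method, it may help to check that it has the required shape before using it in hostnames. The following is a minimal sketch; the function name is_valid_uuid_prefix is an assumption made for illustration and is not an Arvados tool. It checks only the format (five lowercase alphanumeric characters), not global uniqueness.

```shell
# Hypothetical helper (not part of Arvados): succeed only if $1 is exactly
# five characters long and every character is a digit or lowercase letter.
is_valid_uuid_prefix() {
  case "$1" in
    # Reject any character outside the explicitly listed set; spelling the
    # set out avoids locale-dependent ranges such as a-z.
    *[!0123456789abcdefghijklmnopqrstuvwxyz]*) return 1 ;;
  esac
  [ "${#1}" -eq 5 ]
}
```

For example, is_valid_uuid_prefix "$candidate" && echo ok prints ok only when the candidate passes both checks.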

SSL certificates

There are six public-facing services that require an SSL certificate. If you do not have official SSL certificates, you can use self-signed certificates.

Note:

Most Arvados clients and services will accept self-signed certificates when the ARVADOS_API_HOST_INSECURE environment variable is set to true. However, web browsers generally do not make it easy for users to accept self-signed certificates from websites.

Users who log in through Workbench will visit at least three sites: the SSO server, the API server, and Workbench itself. When a browser visits each of these sites, it will warn the user if the site uses a self-signed certificate, and the user must accept it before continuing. This procedure usually only needs to be done once in a browser.

Beyond those initial visits, Workbench includes JavaScript clients for other Arvados services. Users are usually not warned when these client connections are refused because the server uses a self-signed certificate, and it is especially difficult to accept those certificates:

  • JavaScript connects to the Websockets server to provide incremental page updates and view logs from running jobs.
  • JavaScript connects to the API and Keepproxy servers to upload local files to collections.
  • JavaScript connects to the Keep-web server to download log files.

In sum, Workbench will be much less pleasant to use in a cluster that uses self-signed certificates. You should avoid using self-signed certificates unless you plan to deploy a cluster without Workbench; you are deploying only to evaluate Arvados as an individual system administrator; or you can push configuration to users’ browsers to trust your self-signed certificates.
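
For command-line clients on an evaluation cluster, the environment variable mentioned in the note above can be set for the current shell session. This is only a sketch of that one setting; it does not change how web browsers treat the certificates.

```shell
# Tell Arvados clients started from this shell to accept a self-signed
# certificate, as described in the note above. Intended for evaluation use.
export ARVADOS_API_HOST_INSECURE=true
```

The setting lasts only for the current shell session and the processes started from it.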

By convention, we use the following hostname pattern:

Function                      Hostname
Arvados API                   uuid_prefix.your.domain
Arvados Git server            git.uuid_prefix.your.domain
Arvados Keepproxy server      keep.uuid_prefix.your.domain
Arvados Keep-web server       download.uuid_prefix.your.domain
                              and
                              *.collections.uuid_prefix.your.domain or
                              *--collections.uuid_prefix.your.domain or
                              collections.uuid_prefix.your.domain (see the keep-web install docs)
Arvados SSO Server            auth.your.domain
Arvados Websockets endpoint   ws.uuid_prefix.your.domain
Arvados Workbench             workbench.uuid_prefix.your.domain
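
To make the hostname convention concrete, the sketch below prints the hostnames for a given uuid_prefix and domain, which can be handy when requesting certificates or creating DNS records. The function name arvados_hostnames is hypothetical, and the sketch arbitrarily picks the plain collections.uuid_prefix.your.domain variant for Keep-web; consult the keep-web install docs before choosing between the three variants.

```shell
# Hypothetical helper (not part of Arvados): print the conventional
# hostnames for a uuid_prefix ($1) and a domain ($2), one per line.
arvados_hostnames() {
  prefix="$1"
  domain="$2"
  echo "$prefix.$domain"              # Arvados API
  echo "git.$prefix.$domain"          # Git server
  echo "keep.$prefix.$domain"         # Keepproxy server
  echo "download.$prefix.$domain"     # Keep-web server (downloads)
  echo "collections.$prefix.$domain"  # Keep-web server (one of three variants)
  echo "auth.$domain"                 # SSO server; no uuid_prefix in this name
  echo "ws.$prefix.$domain"           # Websockets endpoint
  echo "workbench.$prefix.$domain"    # Workbench
}
```

For example, arvados_hostnames abcde example.com lists the names for a cluster whose uuid_prefix is abcde, where both values are placeholders.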


The content of this documentation is licensed under the Creative Commons Attribution-Share Alike 3.0 United States licence.
Code samples in this documentation are licensed under the Apache License, Version 2.0.