Multi-Host Arvados

  1. Introduction
  2. Prerequisites and planning
  3. Download the installer
  4. Initialize the installer
  5. Set up your infrastructure
    1. Create AWS infrastructure with Terraform
    2. Create required infrastructure manually
  6. Edit local.params
  7. Configure Keep storage
  8. Choose the SSL configuration
    1. Using a Let’s Encrypt certificate
    2. Bring your own certificates
  9. Create a compute image
  10. Begin installation
  11. Further customization of the installation
  12. Confirm the cluster is working
    1. Debugging issues
    2. Iterating on config changes
    3. Common problems and solutions
  13. Initial user and login
  14. After the installation


This multi-host installer is the recommended way to set up a production Arvados cluster. These instructions include specific details for installing on Amazon Web Services (AWS), which are marked as “AWS specific”. However, with additional customization the installer can be used as a template for deployment on other cloud providers or HPC systems.

Prerequisites and planning

Cluster ID and base domain

Choose a 5-character cluster identifier that will represent the cluster. Here are guidelines on choosing a cluster identifier. Only lowercase letters and digits 0-9 are allowed. Examples will use xarv1 or ${CLUSTER}; substitute the cluster id you have selected.

Determine the base domain for the cluster. This will be referred to as ${DOMAIN}.

For example, if CLUSTER is xarv1 and DOMAIN is, then controller.${CLUSTER}.${DOMAIN} means

DNS hostnames for each service

You will need a DNS entry for each service. When using the Terraform script to set up your infrastructure, these domains will be created automatically using AWS Route 53.

In the default configuration these are:

  1. controller.${CLUSTER}.${DOMAIN}
  2. ws.${CLUSTER}.${DOMAIN}
  3. keep0.${CLUSTER}.${DOMAIN}
  4. keep1.${CLUSTER}.${DOMAIN}
  5. keep.${CLUSTER}.${DOMAIN}
  6. download.${CLUSTER}.${DOMAIN}
  7. *.collections.${CLUSTER}.${DOMAIN} — important note, this must be a wildcard DNS, resolving to the keepweb service
  8. workbench.${CLUSTER}.${DOMAIN}
  9. workbench2.${CLUSTER}.${DOMAIN}
  10. webshell.${CLUSTER}.${DOMAIN}
  11. shell.${CLUSTER}.${DOMAIN}

For more information, see DNS entries and TLS certificates.
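
As a quick illustration of the naming scheme above, the following shell sketch prints the default hostnames for a given cluster id and base domain. The function name arvados_hostnames and the example.com domain are placeholders chosen for this example; they are not part of the installer.

```shell
# Print the default service hostnames for a cluster.
# Usage: arvados_hostnames CLUSTER DOMAIN
# (arvados_hostnames is a hypothetical helper, not an installer command.)
arvados_hostnames() {
  for svc in controller ws keep0 keep1 keep download '*.collections' \
             workbench workbench2 webshell shell; do
    printf '%s.%s.%s\n' "$svc" "$1" "$2"
  done
}

# example.com is a placeholder base domain.
arvados_hostnames xarv1 example.com
```

The output can serve as a checklist when creating DNS records by hand. The wildcard entry is printed literally and still has to be created as a wildcard record.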

Download the installer

This is a package-based installation method; however, the installation script is currently distributed in source form via git. We recommend checking out the git tree on your local workstation, not directly on the target(s) where you want to install and run Arvados.

git clone
cd arvados
git checkout 2.5-release
cd tools/salt-install

The and scripts will help you deploy Arvados by preparing your environment to be able to run the installer, then running it. The actual installer is located in the arvados-formula git repository and will be cloned during the running of the script. The installer is built using Saltstack and performs the install using masterless mode.

Initialize the installer

Replace “xarv1” with the cluster id you selected earlier.

This creates a git repository in ~/setup-arvados-xarv1. The will record all the configuration changes you make, as well as using git push to synchronize configuration edits if you have multiple nodes.

Important! Once you have initialized the installer directory, all further commands must be run with ~/setup-arvados-${CLUSTER} as the current working directory.

Using Terraform (AWS specific)

If you are going to use Terraform to set up the infrastructure on AWS, you first need to install the Terraform CLI and the AWS CLI tool. Then you can initialize the installer.

./ initialize ~/setup-arvados-${CLUSTER} multiple_hosts multi_host/aws terraform/aws
cd ~/setup-arvados-${CLUSTER}

Without Terraform

./ initialize ~/setup-arvados-${CLUSTER} multiple_hosts multi_host/aws
cd ~/setup-arvados-${CLUSTER}

Set up your infrastructure

  1. Create AWS infrastructure with Terraform
  2. Create required infrastructure manually

Create AWS infrastructure with Terraform (AWS specific)

We provide a set of Terraform code files that you can run to create the necessary infrastructure on Amazon Web Services.

These files are located in the terraform installer directory and are divided into three sections:

  1. The terraform/vpc/ subdirectory controls the network-related infrastructure of your cluster, including firewall rules and split-horizon DNS resolution.
  2. The terraform/data-storage/ subdirectory controls the stateful part of your cluster. Currently it only sets up the S3 bucket that holds the Keep blocks; in the future it will also manage the database service.
  3. The terraform/services/ subdirectory controls the hosts that will run the different services on your cluster, and makes sure that they have the required software for the installer to do its job.

Software requirements & considerations


The Terraform state files (which keep crucial infrastructure information from the cloud) will be saved inside each subdirectory, under the terraform.tfstate name. These will be committed to the git repository used to coordinate deployment. It is very important to keep this git repository secure: only the sysadmins responsible for maintaining your Arvados cluster should have access to it.

Terraform code configuration

Each section described above contains a terraform.tfvars file with some configuration values that you should set before applying each configuration. You should set the cluster prefix and domain name in vpc/terraform.tfvars:

region_name = "us-east-1"
# cluster_name = "xarv1"
# domain_name = ""

If you don’t set the variables in the vpc/terraform.tfvars file, you will be asked to re-enter these parameters every time you run Terraform.

The data-storage/terraform.tfvars and services/terraform.tfvars let you configure the location of your ssh public key (default ~/.ssh/ and the instance type to use (default m5a.large).

Create the infrastructure

Build the infrastructure by running ./ terraform. The last stage will output the information needed to set up the cluster’s domain and continue with the installer. For example:

$ ./ terraform
Apply complete! Resources: 16 added, 0 changed, 0 destroyed.


arvados_sg_id = "sg-02f999a99973999d7"
arvados_subnet_id = "subnet-01234567abc"
cluster_name = "xarv1"
compute_subnet_id = "subnet-abcdef12345"
deploy_user = "admin"
domain_name = ""
letsencrypt_iam_access_key_id = "AKAA43MAAAWAKAADAASD"
private_ip = {
  "controller" = ""
  "keep0" = ""
  "keep1" = ""
  "keepproxy" = ""
  "shell" = ""
  "workbench" = ""
public_ip = {
  "controller" = ""
  "keep0" = ""
  "keep1" = ""
  "keepproxy" = ""
  "shell" = ""
  "workbench" = ""
region_name = "us-east-1"
route53_dns_ns = tolist([
vpc_cidr = ""
vpc_id = "vpc-0999994998399923a"
letsencrypt_iam_secret_access_key = "XXXXXSECRETACCESSKEYXXXX"

Additional DNS configuration

Once Terraform has completed, the infrastructure for your Arvados cluster is up and running. One last piece of DNS configuration is required.

The domain names for your cluster (e.g.: are managed via Route 53 and the TLS certificates will be issued using Let’s Encrypt .

You need to configure the parent domain to delegate to the newly created zone. In other words, you need to configure ${DOMAIN} (e.g. “”) to delegate the subdomain ${CLUSTER}.${DOMAIN} (e.g. “”) to the nameservers for the Arvados hostname records created by Terraform. You do this by creating an NS record on the parent domain that refers to the name servers listed in the Terraform output parameter route53_dns_ns.

If your parent domain is also controlled by Route 53, the process will be like this:

  1. Log in to the AWS Console and navigate to the service page for Route 53
  2. Go to the list of Hosted zones and click on the zone for the parent domain
  3. Click on Create record
  4. For Record name put the cluster id
  5. For Record type choose NS - Name servers for a hosted zone
  6. For Value add the values from Terraform output parameter route53_dns_ns, one hostname per line, with punctuation (quotes and commas) removed.
  7. Click Create records
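
The same delegation can also be created from the command line. The sketch below only builds the JSON change batch for the NS record; ns_change_batch is a hypothetical helper, and the name servers, the example.com domain, and the TTL of 300 seconds are placeholder values assumed for illustration. Use the values from your own route53_dns_ns output.

```shell
# Print a Route 53 change batch that delegates CLUSTER.DOMAIN to the
# given name servers. Usage: ns_change_batch CLUSTER DOMAIN NS...
# (hypothetical helper; the TTL of 300 is an arbitrary example value)
ns_change_batch() (
  cluster="$1"; domain="$2"
  shift 2
  records=""
  for ns in "$@"; do
    # Join the entries with commas, without a trailing comma.
    records="${records:+$records,}{\"Value\":\"$ns\"}"
  done
  printf '{"Changes":[{"Action":"CREATE","ResourceRecordSet":{"Name":"%s.%s","Type":"NS","TTL":300,"ResourceRecords":[%s]}}]}\n' \
    "$cluster" "$domain" "$records"
)

# Placeholder name servers; substitute the ones Terraform printed.
ns_change_batch xarv1 example.com ns-1.example.net ns-2.example.net

# The result could then be passed to the AWS CLI, for instance:
#   aws route53 change-resource-record-sets \
#     --hosted-zone-id PARENT_ZONE_ID \
#     --change-batch "$(ns_change_batch xarv1 example.com ns-1.example.net ns-2.example.net)"
```

PARENT_ZONE_ID stands for the hosted zone id of the parent domain. Review the generated JSON before applying it, since a wrong NS record makes the whole cluster subdomain unresolvable.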

If the parent domain is controlled by some other service, follow the guide for the appropriate service.

Other important output parameters

The certificates will be requested from Let’s Encrypt when you run the installer.

  • vpc_cidr will be used to set CLUSTER_INT_CIDR
  • You’ll also need compute_subnet_id and arvados_sg_id to set DriverParameters.SubnetID and DriverParameters.SecurityGroupIDs in local_config_dir/pillars/arvados.sls and when you create a compute image.

You can now proceed to edit local.params.

Create required infrastructure manually

If you will be setting up infrastructure without using the provided Terraform script, here are the recommendations you will need to consider.

Virtual Private Cloud (AWS specific)

We recommend setting Arvados up in its own Virtual Private Cloud.

When you do so, you need to configure a couple of additional things:

  1. Create a subnet for the compute nodes
  2. You should set up a security group which allows SSH access
  3. Make sure to add a VPC S3 endpoint

S3 Bucket (AWS specific)

We recommend creating an S3 bucket for data storage named ${CLUSTER}-nyw5e-000000000000000-volume. We recommend creating an IAM role called ${CLUSTER}-keepstore-00-iam-role with a policy that can read, write, list and delete objects in the bucket. With the example cluster id xarv1 the bucket would be called xarv1-nyw5e-000000000000000-volume and the role would be called xarv1-keepstore-00-iam-role.

These names are recommended because they are default names used in the configuration template. If you use different names, you will need to edit the configuration template later.

Required hosts

You will need to allocate several hosts (physical or virtual machines) for the fixed infrastructure of the Arvados cluster. These machines should have at least 2 cores and 8 GiB of RAM, running a supported Linux distribution.

Supported Linux Distributions
CentOS 7
Debian 11 (“bullseye”)
Debian 10 (“buster”)
Ubuntu 20.04 (“focal”)
Ubuntu 18.04 (“bionic”)

Arvados packages are published for current Debian releases (until the EOL date), current Ubuntu LTS releases (until the end of standard support), and the latest version of CentOS.

Allocate the following hosts as appropriate for your site. On AWS you may choose to do it manually with the AWS console, or using a DevOps tool such as CloudFormation or Terraform. With the exception of “keep0” and “keep1”, all of these hosts should have external (public) IP addresses if you intend for them to be accessible outside of the private network or VPC.

The installer will set up the Arvados services on your machines. Here is the default assignment of services to machines:

  1. API node
    1. postgresql server
    2. arvados api server
    3. arvados controller (recommended hostname controller.${CLUSTER}.${DOMAIN})
    4. arvados websocket (recommended hostname ws.${CLUSTER}.${DOMAIN})
    5. arvados cloud dispatcher
    6. arvados keepbalance
  2. KEEPSTORE nodes (at least 2)
    1. arvados keepstore (recommended hostnames keep0.${CLUSTER}.${DOMAIN} and keep1.${CLUSTER}.${DOMAIN})
  3. KEEPPROXY node
    1. arvados keepproxy (recommended hostname keep.${CLUSTER}.${DOMAIN})
    2. arvados keepweb (recommended hostnames download.${CLUSTER}.${DOMAIN} and *.collections.${CLUSTER}.${DOMAIN})
  4. WORKBENCH node
    1. arvados workbench (recommended hostname workbench.${CLUSTER}.${DOMAIN})
    2. arvados workbench2 (recommended hostname workbench2.${CLUSTER}.${DOMAIN})
    3. arvados webshell (recommended hostname webshell.${CLUSTER}.${DOMAIN})
  5. SHELL node (optional)
    1. arvados shell (recommended hostname shell.${CLUSTER}.${DOMAIN})

When using the database installed by Arvados (and not an external database), the database is stored under /var/lib/postgresql. Arvados logs are kept in /var/log and /var/www/arvados-api/shared/log. Accordingly, you should ensure that the disk partition containing /var has adequate storage for your planned usage. We suggest starting with 50 GiB of free space on the database host.
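
One way to check the suggested free space on a host is sketched below. has_free_space is a hypothetical helper written for this example; it relies only on the POSIX output format of df (-P) with sizes in KiB (-k).

```shell
# Succeed if the filesystem holding PATH has at least MIN_GIB GiB free.
# Usage: has_free_space PATH MIN_GIB   (hypothetical helper)
has_free_space() (
  # With -P, df prints a header line followed by one line for the
  # filesystem; the fourth column is the available space in KiB.
  avail_kib=$(df -Pk "$1" | awk 'NR == 2 { print $4 }')
  [ "$avail_kib" -ge $(( $2 * 1024 * 1024 )) ]
)

if has_free_space /var 50; then
  echo "/var has at least 50 GiB free"
else
  echo "/var has less than 50 GiB free"
fi
```

Run it on the database host before starting the installer. The 50 GiB threshold is the starting point suggested above, not a hard requirement.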

Additional prerequisites when preparing machines to run the installer

  1. From the account where you are performing the install, passwordless ssh to each machine
    This means the client’s public key should be added to ~/.ssh/authorized_keys on each node.
  2. Passwordless sudo access on the account on each machine you will ssh in to
    This usually means adding the account to the sudo group and having a rule like this in /etc/sudoers.d/arvados_passwordless that allows members of group sudo to execute any command without entering a password.
  3. git installed on each machine
  4. Port 443 reachable by clients

(AWS specific) The machine that runs the arvados cloud dispatcher will need an IAM role that allows it to manage EC2 instances.

If your infrastructure differs from the setup proposed above (i.e., different hostnames), you can still use the installer, but additional customization may be necessary.

Edit local.params

This can be found wherever you choose to initialize the install files (~/setup-arvados-xarv1 in these examples).

  1. Set CLUSTER to the 5-character cluster identifier (e.g. “xarv1”)
  2. Set DOMAIN to the base DNS domain of the environment, e.g. “”
  3. Set the *_INT_IP variables with the internal (private) IP addresses of each host. Since services share hosts, some hosts are the same. See the note about /etc/hosts.
  4. Edit CLUSTER_INT_CIDR. This should be the CIDR of the private network that Arvados is running on, e.g. the VPC.
    CIDR stands for “Classless Inter-Domain Routing” and describes which portion of the IP address refers to the network. For example means that the first 24 bits are the network (192.168.3) and the last 8 bits are a specific host on that network.
    AWS Specific: Go to the AWS console and into the VPC service. The table view of the VPCs has a column (IPv4 CIDR) that gives the CIDR for each VPC.
  5. Set INITIAL_USER_EMAIL to your email address, as you will be the first admin user of the system.
  6. Set each KEY / TOKEN / PASSWORD to a random string. You can use generate-tokens
    $ ./ generate-tokens
  7. Set DATABASE_PASSWORD to a random string (unless you already have a database, in which case you should set it to that database’s password)
    Important! If this contains any non-alphanumeric characters, in particular ampersand (‘&’), it is necessary to add backslash quoting.
    For example, if the password is Lq&MZ<V']d?j
    With backslash quoting the special characters it should appear like this in local.params:
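
The quoting rule can be sketched in shell. This assumes that putting a backslash before every non-alphanumeric ASCII character is acceptable for local.params, which is stricter than strictly necessary; quote_password is a hypothetical helper, not part of the installer, and it is only meant for ASCII passwords.

```shell
# Put a backslash in front of every character that is not an ASCII
# letter or digit. Usage: quote_password PASSWORD
# (hypothetical helper; assumes an ASCII-only password)
quote_password() {
  # In the sed replacement, "\\" is a literal backslash and "&" is the
  # matched character.
  printf '%s' "$1" | LC_ALL=C sed 's/[^[:alnum:]]/\\&/g'
}
```

Calling quote_password with the raw password as a single-quoted argument prints the escaped form, which can then be pasted between the double quotes of the DATABASE_PASSWORD setting.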

Note on /etc/hosts

Because Arvados services are typically accessed by external clients, they are likely to have both a public IP address and an internal IP address.

On cloud providers such as AWS, sending internal traffic to a service’s public IP address can incur egress costs and throttling. Thus it is very important for internal traffic to stay on the internal network. The installer implements this by updating /etc/hosts on each node to associate each service’s hostname with the internal IP address, so that when Arvados services communicate with one another, they always use the internal network address. This is NOT a substitute for DNS; you still need to set up DNS names for all of the services that have public IP addresses (it does, however, avoid a complex “split-horizon” DNS configuration).

It is important to be aware of this because if you mistype the IP address for any of the *_INT_IP variables, hosts may unexpectedly fail to be able to communicate with one another. If this happens, check and edit as necessary the file /etc/hosts on the host that is failing to make an outgoing connection.
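
When tracking down such a problem, a small check like the following can confirm which address a hostname maps to in a hosts file. This is only a sketch: check_hosts_entry is a hypothetical helper, and the IP address and example.com hostname in the usage note are placeholders.

```shell
# Report whether HOSTNAME maps to EXPECTED_IP in a hosts file.
# Usage: check_hosts_entry HOSTNAME EXPECTED_IP [HOSTS_FILE]
# (hypothetical helper; HOSTS_FILE defaults to /etc/hosts)
check_hosts_entry() (
  name="$1"; expected="$2"; file="${3:-/etc/hosts}"
  # Use the first non-comment line that lists the hostname as a whole field.
  actual=$(awk -v n="$name" '
    /^[[:space:]]*#/ { next }
    { for (i = 2; i <= NF; i++) if ($i == n) { print $1; exit } }
  ' "$file")
  if [ "$actual" = "$expected" ]; then
    echo "ok: $name -> $actual"
  else
    echo "MISMATCH: $name -> ${actual:-<missing>} (expected $expected)"
    return 1
  fi
)
```

For example, running check_hosts_entry controller.xarv1.example.com 10.1.1.11 on the failing node compares its /etc/hosts entry against the value you intended for the corresponding *_INT_IP variable.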

Configure Keep storage

The multi_host/aws template uses S3 for storage. Arvados also supports filesystem storage and Azure blob storage. Keep storage configuration can be found in the arvados.cluster.Volumes section of local_config_dir/pillars/arvados.sls.

Object storage in S3 (AWS Specific)

Open local_config_dir/pillars/arvados.sls and edit as follows:

  1. In the arvados.cluster.Volumes.DriverParameters section, set Region to the appropriate AWS region (e.g. ‘us-east-1’)

If you followed the recommended naming scheme for both the bucket and role (or used the provided Terraform script), you’re done.

If you did not follow the recommended naming scheme for either the bucket or role, you’ll need to update these parameters as well:

  1. Set Bucket to the name of the keepstore bucket you created earlier
  2. Set IAMRole to the keepstore role you created earlier

Choose the SSL/TLS configuration (SSL_MODE)

Arvados requires a valid TLS certificate to work correctly. This installer supports these options:

  1. lets-encrypt: automatically obtain and install SSL certificates for your hostnames
  2. bring-your-own: supply your own certificates in the certs directory

Using a Let’s Encrypt certificate

In the default configuration, this installer gets a valid certificate via Let’s Encrypt. If you have the CLUSTER.DOMAIN domain in a Route 53 zone, you can set USE_LETSENCRYPT_ROUTE53 to YES and supply appropriate credentials so that Let’s Encrypt can use dns-01 validation to get the appropriate certificates.


Please note that when using AWS, EC2 instances can have a default hostname that ends with Let’s Encrypt has a blacklist of domain names for which it will not issue certificates, and that blacklist includes the domain, which means the default hostname can not be used to get a certificate from Let’s Encrypt.

Bring your own certificates

To supply your own certificates, change the configuration like this:


You will need certificates for each DNS name and DNS wildcard previously listed in the DNS hostnames for each service.

To simplify certificate management, we recommend creating a single certificate for all of the hostnames, or creating a wildcard certificate that covers all possible hostnames (with the following patterns in subjectAltName):

(Replacing xarv1 with your own ${CLUSTER}.${DOMAIN})

Copy your certificates to the directory specified with the variable CUSTOM_CERTS_DIR in the remote directory where you copied the script. The provision script will find the certificates there.

The script expects cert/key files with these basenames (matching the role except for keepweb, which is split in both download / collections):

  1. controller
  2. websocket — note: corresponds to default domain ws.${CLUSTER}.${DOMAIN}
  3. keepproxy — note: corresponds to default domain keep.${CLUSTER}.${DOMAIN}
  4. download — Part of keepweb
  5. collections — Part of keepweb, must be a wildcard for *.collections.${CLUSTER}.${DOMAIN}
  6. workbench
  7. workbench2
  8. webshell

For example, for the keepproxy service the script will expect to find this certificate:


Make sure that all the FQDNs that you will use for the public-facing applications (API/controller, Workbench, Keepproxy/Keepweb) are reachable.

Note: because the installer currently looks for a different certificate file for each service, if you use a single certificate, we recommend creating a symlink for each certificate and key file to the primary certificate and key, e.g.

ln -s xarv1.crt ${CUSTOM_CERTS_DIR}/controller.crt
ln -s xarv1.key ${CUSTOM_CERTS_DIR}/controller.key
ln -s xarv1.crt ${CUSTOM_CERTS_DIR}/keepproxy.crt
ln -s xarv1.key ${CUSTOM_CERTS_DIR}/keepproxy.key
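
The two pairs above cover only controller and keepproxy. As a sketch, a loop over all eight basenames listed earlier could create the remaining links as well. link_certs is a hypothetical helper; it assumes the primary certificate and key already sit in the certificates directory and that the single certificate also covers the *.collections wildcard name.

```shell
# Create NAME.crt and NAME.key symlinks for every service basename,
# all pointing at one primary certificate and key in the same directory.
# Usage: link_certs CERTS_DIR PRIMARY_BASENAME   (hypothetical helper)
link_certs() (
  certs_dir="$1"; primary="$2"
  for name in controller websocket keepproxy download collections \
              workbench workbench2 webshell; do
    # Relative targets, as in the ln commands above; -f replaces stale links.
    ln -sf "${primary}.crt" "${certs_dir}/${name}.crt"
    ln -sf "${primary}.key" "${certs_dir}/${name}.key"
  done
)
```

It would be called as link_certs "${CUSTOM_CERTS_DIR}" xarv1 after copying xarv1.crt and xarv1.key into that directory.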

All certificate files will be used by nginx. You may need to include intermediate certificates in your certificate files. See the nginx documentation for more details.

Configure your authentication provider (optional, recommended)

By default, the installer will use the “Test” provider, which is a list of usernames and cleartext passwords stored in the Arvados config file. This is a low-security configuration and you are strongly advised to configure one of the other supported authentication methods.

Using an external database (optional)

The standard behavior of the installer is to install and configure PostgreSQL for use by Arvados. You can optionally configure it to use a separately managed database instead.

Arvados requires a database that is compatible with PostgreSQL 9.5 or later. For example, Arvados is known to work with Amazon Aurora (note: even idle, Arvados services will periodically poll the database, so we strongly advise using “provisioned” mode).

  1. In local.params, remove ‘database’ from the list of roles assigned to the controller node:
  2. In local.params, set DATABASE_INT_IP to the database endpoint (can be a hostname, does not have to be an IP address).
  3. In local.params, set DATABASE_PASSWORD to the correct value. See the previous section describing correct quoting.
  4. In local_config_dir/pillars/arvados.sls you may need to adjust the database name and user. This can be found in the section arvados.cluster.database.

Further customization of the installation (optional)

If you are installing on AWS and have followed all of the naming conventions recommended in this guide, you probably don’t need to do any further customization.

If you are installing on a different cloud provider or on HPC, other changes may require editing the Saltstack pillars and states files found in local_config_dir. In particular, local_config_dir/pillars/arvados.sls contains the template (in the arvados.cluster section) used to produce the Arvados configuration file that is distributed to all the nodes. Consult the Configuration reference for a comprehensive list of configuration keys.

Any extra Salt “state” files you add under local_config_dir/states will be added to the Salt run and applied to the hosts.

Configure compute nodes

If you will use fixed compute nodes with an HPC scheduler such as SLURM or LSF, you will need to set up your compute nodes with Docker or set up your compute nodes with Singularity.

On cloud installations, containers are dispatched in Docker daemons running in the compute instances, which need some additional setup.

Build the compute image

Follow the instructions to build a cloud compute node image using the compute image builder script found in arvados/tools/compute-images in your Arvados clone from step 3.

Configure the compute image

Once the image has been created, open local_config_dir/pillars/arvados.sls and edit as follows (AWS specific settings described here, other cloud providers will have similar settings in their respective configuration section):

  1. In the arvados.cluster.Containers.CloudVMs section:
    1. Set ImageID to the AMI produced by Packer
    2. Set DriverParameters.Region to the appropriate AWS region
    3. Set DriverParameters.AdminUsername to the admin user account on the image
    4. Set the DriverParameters.SecurityGroupIDs list to the VPC security group which you set up to allow SSH connections to these nodes
    5. Set DriverParameters.SubnetID to the value of SubnetId of your VPC
  2. Update arvados.cluster.Containers.DispatchPrivateKey and paste the contents of the ~/.ssh/id_dispatcher file you generated in an earlier step.
  3. Update arvados.cluster.InstanceTypes as necessary. The example instance types are for AWS; other cloud providers will have different instance types with different names and specifications.
    (AWS specific) If m5/c5 node types are not available, replace them with m4/c4. You’ll need to double check the values for Price and IncludedScratch/AddedScratch for each type that is changed.

Begin installation

At this point, you are ready to run the installer script in deploy mode, which will carry out the entire Arvados installation.

Run this in the ~/setup-arvados-xarv1 directory:

./ deploy

This will install and configure Arvados on all the nodes. It will take a while and produce a lot of logging. If it runs into an error, it will stop.

Confirm the cluster is working

When everything has finished, you can run the diagnostics.

Depending on where you are running the installer, you need to provide -internal-client or -external-client.

If you are running the diagnostics from one of the Arvados machines inside the private network, you want -internal-client.

You are an “external client” if you are running the diagnostics from your workstation outside of the private network.

./ diagnostics (-internal-client|-external-client)

Debugging issues

The installer records log files for each deployment.

Most service logs go to /var/log/syslog.

The logs for Rails API server and for Workbench can be found in


on the appropriate instances.

Workbench 2 is a client-side Javascript application. If you are having trouble loading Workbench 2, check the browser’s developer console (this can be found in “Tools → Developer Tools”).

Iterating on config changes

You can iterate on the config and maintain the cluster by making changes to local.params and local_config_dir and running deploy again.

If you are debugging a configuration issue on a specific node, you can speed up the cycle a bit by deploying just one node:

./ deploy

However, once you have a final configuration, you should run a full deploy to ensure that the configuration has been synchronized on all the nodes.

Common problems and solutions

PG::UndefinedTable: ERROR: relation \"api_clients\" does not exist

The arvados-api-server package sets up the database as a post-install script. If the database host or password wasn’t set correctly (or quoted correctly) at the time that package is installed, it won’t be able to set up the database.

This will manifest as an error like this:

#<ActiveRecord::StatementInvalid: PG::UndefinedTable: ERROR:  relation \"api_clients\" does not exist

If this happens, you need to

1. Correct the database information
2. Run ./ deploy to update the configuration on the API/controller node
3. Log in to the API/controller server node, then run this command to re-run the post-install script, which will set up the database:

dpkg-reconfigure arvados-api-server

4. Run ./ deploy again to synchronize everything, and so that the install steps that need to contact the API server are run successfully.

Missing ENA support (AWS Specific)

If the AMI wasn’t built with ENA (enhanced networking) support and the instance type requires it, the instance will fail to start. You’ll see an error in syslog on the node that runs arvados-dispatch-cloud. The solution is to build a new AMI with --aws-ena-support true.

Initial user and login

At this point you should be able to log into the Arvados cluster. The initial URL will be


If you did not configure a different authentication provider you will be using the “Test” provider, and the provision script creates an initial user for testing purposes. This user is configured as administrator of the newly created cluster. It uses the values of INITIAL_USER and INITIAL_USER_PASSWORD in the local.params file.

If you did configure a different authentication provider, the first user to log in will automatically be given Arvados admin privileges.

After the installation

As part of the operation of, it automatically creates a git repository with your configuration templates. You should retain this repository but be aware that it contains sensitive information (passwords and tokens used by the Arvados services as well as cloud credentials if you used Terraform to create the infrastructure).

As described in Iterating on config changes you may use deploy to re-run the Salt to deploy configuration changes and upgrades. However, be aware that the configuration templates created for you by are a snapshot which are not automatically kept up to date.

When deploying upgrades, consult the Arvados upgrade notes to see if changes need to be made to the configuration file template in local_config_dir/pillars/arvados.sls. To specify the version to upgrade to, set the VERSION parameter in local.params.

See also Maintenance and upgrading for more information.

The content of this documentation is licensed under the Creative Commons Attribution-Share Alike 3.0 United States licence.
Code samples in this documentation are licensed under the Apache License, Version 2.0.