Planning and prerequisites

  1. Introduction
  2. Provisioning Arvados with Saltstack
  3. The provisioning tool files and directories and directories
  4. Choose an Arvados installation configuration
    1. Further customization of the installation
  5. Dump the configuration files created with the provision script
  6. Add the Arvados formula to your Saltstack infrastructure


To ease the installation of the various Arvados components, we have developed a Saltstack ’s arvados-formula which can help you get an Arvados cluster up and running.

Saltstack is a Python-based, open-source software for event-driven IT automation, remote task execution, and configuration management. It can be used in a master/minion setup (where a master node orchestrates and coordinates the configuration of nodes in an infrastructure) or master-less, where Saltstack is run locally in a node, with no communication with a master node.

Similar to other configuration management tools like Puppet, Ansible or Chef, Saltstack uses files named states to describe the tasks that will be performed on a node to take it to a desired state, and pillars to configure variables passed to the states, adding flexibility to the tool.

You don’t need to be running a Saltstack infrastructure to install Arvados: we wrote a provisioning script that will take care of setting up Saltstack in the node/s where you want to install Arvados and run a master-less installer. Once Arvados is installed, you can either uninstall Saltstack and its files or you can keep them, to modify/maintain your Arvados installation in the future.

This is a package-based installation method.

Provisioning Arvados with Saltstack

The tools/salt-install directory in the Arvados git repository contains a script that you can run in the node/s where you want to install Arvados’ components (the script) and a few configuration examples for different setups, that you can use to customize your installation.

The script will help you deploy Arvados by preparing your environment to be able to run the installer, then running it. The actual installer is located at arvados-formula and will be cloned during the running of the script. The installer is built using Saltstack and performs the install using master-less mode.

After setting up a few variables in a config file and copying a directory from the examples (see below), you’ll be ready to run it and get Arvados deployed.

h2(#provisioning_tool_files and directories). The provisioning tool files and directories

The tools/salt-install directory contains the following elements:

  • The script itself. You don’t need to modify it.
  • A few local.params.* example files. You will need to copy one of these files to a file named local.params, which is the main configuration file for the script.
  • A few config_examples/* directories, with pillars and states templates. You need to copy one of these to a local_config_dir directory, which will be used by the script to setup your nodes.
  • A tests directory, with a simple workflow and arvados CLI commands you can run to tests your cluster is capable of running a CWL workflow, upload files and create a user.

Once you decide on an Arvados architecture you want to apply, you need to copy one of the example configuration files and directory, and edit them to suit your needs.

Ie., for a multi-hosts / multi-hostnames in AWS, you need to do this:

cp local.params.example.multiple_hosts local.params
cp -r config_examples/multi_host/aws local_config_dir

These local files will be preserved if you upgrade the repository.

Choose an Arvados installation configuration

The configuration examples provided with this installer are suitable to install Arvados with the following distribution of hosts/roles:

  • All roles on a single host, which can be done in two fashions:
    • Using a single hostname, assigning a different port (other than 443) for each user-facing service: This choice is easier to setup, but the user will need to know the port/s for the different services she wants to connect to. See Single host install using the script for more details.
    • Using multiple hostnames on the same IP: this setup involves a few extra steps but each service will have a meaningful hostname so it will make easier to access them later. See Single host install using the script for more details.
  • Roles distributed over multiple AWS instances, using multiple hostnames. This example can be adapted to use on-prem too. See Multiple hosts installation for more details.

Once you decide which of these choices you prefer, copy one of the example configuration files and directory, and edit them to suit your needs.

Ie, if you decide to install Arvados on a single host using multiple hostnames:

cp local.params.example.single_host_multiple_hostnames local.params
cp -r config_examples/single_host/multiple_hostnames local_config_dir

Edit the variables in the local.params file.

Further customization of the installation (modifying the salt pillars and states)

If you want or need further customization, you can edit the Saltstack pillars and states files. Pay particular attention to the pillars/arvados.sls one. Any extra state file you add under local_config_dir/states will be added to the salt run and applied to the host.

Dump the configuration files created with the provision script

As mentioned above, the script helps you create a set of configuration files to be used by the Saltstack arvados-formula and other helper formulas.

Is it possible you want to inspect these files before deploying them or use them within your existing Saltstack environment. In order to get a rendered version of these files, the script has a option, --dump-config, which takes a directory as mandatory parameter. When this option it used, the script will create the specified directory and write the pillars, states and tests files so you can inspect them.


./ --dump-config ./config_dump --role workbench

will dump the configuration files used to install a workbench node under the config_dump directory.

These files are also suitable to be used in your existing Saltstack environment (see below).

h2.(#add_formula_to_saltstack). Add the Arvados formula to your Saltstack infrastructure

If you already have a Saltstack environment you can add the arvados-formula to your Saltstack master and apply the corresponding states and pillars to the nodes on your infrastructure that will be used to run Arvados.

The --dump-config option described above writes a pillars/top.sls and salt/top.sls files that you can use as a guide to configure your infrastructure.

Previous: Arvados-in-a-box Next: Arvados in a VM with Vagrant

The content of this documentation is licensed under the Creative Commons Attribution-Share Alike 3.0 United States licence.
Code samples in this documentation are licensed under the Apache License, Version 2.0.