Single host Arvados

  1. Limitations of the single host install
  2. Prerequisites
  3. Download the installer
  4. Copy the configuration files
  5. Choose the SSL configuration
    1. Using a self-signed certificate
    2. Using a Let’s Encrypt certificate
    3. Bring your own certificate
  6. Further customization of the installation
  7. Run the provision.sh script
  8. Install the CA root certificate
  9. Initial user and login
  10. Test the installed cluster running a simple workflow
  11. After the installation

Limitations of the single host install

NOTE: The single host installation is a good choice for evaluating Arvados, but it is not recommended for production use.

Using the default configuration, this installation method has a number of limitations:

  • all services run on the same machine, and they will compete for resources. This includes any compute jobs.
  • it uses the local machine disk for Keep storage (under the /tmp directory). There may not be a lot of space available.
  • it installs the crunch-dispatch-local dispatcher, which has a limit of eight concurrent jobs. These jobs will be executed on the same machine that runs all the Arvados services and may well starve them of resources.

It is possible to start with the single host installation method and modify the Arvados configuration file later to address these limitations. For example, you can switch to a different storage volume setup for Keep, and switch to the cloud dispatcher to provision compute resources dynamically.

Prerequisites

  • git
  • a dedicated (virtual) machine for your Arvados server with at least 2 cores and 8 GiB of RAM, running a Linux distribution supported by Arvados
  • a DNS hostname that resolves to the IP address of your Arvados server
  • ports 443 and 8800-8805 must be reachable from your client (configurable in local.params, see below)
  • port 80 needs to be reachable from everywhere on the internet (only when using Let’s Encrypt)
  • an SSL certificate matching the hostname in use (only when using bring your own certificate)
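
The hardware minimums above can be checked with a small script. This is a minimal sketch, not part of the installer: the function names are hypothetical, and the memory threshold of 7864320 KiB (7.5 GiB) is an assumed allowance, because the total reported by Linux is usually a little below the nominal RAM size.

```shell
# Hypothetical helper: succeed if the given core count and memory (in KiB)
# meet the minimums listed above (2 cores, roughly 8 GiB of RAM).
meets_minimum() {
    cores="$1"
    mem_kib="$2"
    # 7864320 KiB = 7.5 GiB; assumed allowance for memory reserved by the kernel.
    [ "$cores" -ge 2 ] && [ "$mem_kib" -ge 7864320 ]
}

# Hypothetical helper: apply the check to the current Linux host.
check_this_host() {
    meets_minimum "$(nproc)" "$(awk '/^MemTotal:/ {print $2}' /proc/meminfo)"
}
```

On the target machine, `check_this_host && echo "resources look sufficient"` prints the message only when both minimums are met.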

Download the installer

This procedure will install all the main Arvados components to get you up and running on a single host.

This is a package-based installation method. However, the installation script is currently distributed in source form via git. We recommend checking out the git tree on your local workstation, not directly on the target(s) where you want to install and run Arvados.

git clone https://git.arvados.org/arvados.git
cd arvados
git checkout 2.4-release
cd tools/salt-install

The provision.sh script will help you deploy Arvados by preparing your environment to be able to run the installer, then running it. The actual installer is located in the arvados-formula git repository and is cloned while provision.sh runs. The installer is built using Saltstack, and provision.sh performs the install in masterless mode.

Copy the configuration files

cp local.params.example.single_host_single_hostname local.params
cp -r config_examples/single_host/single_hostname local_config_dir

Edit the variables in the local.params file. Pay attention to the *_PORT, *_TOKEN and *_KEY variables. The SSL_MODE variable is discussed in the next section.
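
Values for the *_TOKEN and *_KEY variables should be hard to guess. One way to produce them is sketched below; it assumes that random alphanumeric strings are acceptable and that 50 characters is long enough, so check the comments in local.params for the actual requirements. The function name gen_secret is hypothetical.

```shell
# Hypothetical helper: print a random alphanumeric string.
# The default length of 50 is an assumption, not a documented requirement.
gen_secret() {
    length="${1:-50}"
    # 512 random bytes yield several hundred base64 characters; after the
    # non-alphanumeric ones are dropped, enough remain for lengths up to a few hundred.
    head -c 512 /dev/urandom | base64 | tr -dc 'A-Za-z0-9' | head -c "$length"
    echo
}
```

Run `gen_secret` once per variable and paste each result into local.params, so that no two variables share a value.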

Choose the SSL configuration (SSL_MODE)

Arvados requires an SSL certificate to work correctly. This installer supports these options:

  • self-signed: let the installer create a self-signed certificate
  • lets-encrypt: automatically obtain and install an SSL certificate for your hostname
  • bring-your-own: supply your own certificate in the `certs` directory
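
A typo in SSL_MODE is easy to make, so it can help to validate the value before running the installer. The following is a minimal sketch; the function name valid_ssl_mode is hypothetical and the check is not part of provision.sh.

```shell
# Hypothetical helper: succeed only for the three SSL_MODE values listed above.
valid_ssl_mode() {
    case "$1" in
        self-signed|lets-encrypt|bring-your-own) return 0 ;;
        *) return 1 ;;
    esac
}
```

Assuming local.params can be sourced as a shell file, `( . ./local.params; valid_ssl_mode "$SSL_MODE" ) || echo "unexpected SSL_MODE"` reports a bad value without changing the current shell's environment.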

Using a self-signed certificate

In the default configuration, this installer uses self-signed certificate(s):

SSL_MODE="self-signed"

When connecting to the Arvados web interface for the first time, you will need to accept the self-signed certificate as trusted to bypass the browser warnings. This can be a little tricky to do. Alternatively, you can install the self-signed root certificate in your browser, see below.

Using a Let’s Encrypt certificate

To automatically get a valid certificate via Let’s Encrypt, change the configuration like this:

SSL_MODE="lets-encrypt"

The hostname for your Arvados cluster must be defined in HOSTNAME_EXT and resolve to the public IP address of your Arvados instance, so that Let’s Encrypt can validate the domain name ownership and issue the certificate.

When using AWS, EC2 instances can have a default hostname that ends with amazonaws.com. Let’s Encrypt has a blacklist of domain names for which it will not issue certificates, and that blacklist includes the amazonaws.com domain, which means the default hostname cannot be used to get a certificate from Let’s Encrypt.
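
The amazonaws.com restriction can be caught early with a simple pattern match on the hostname. This sketch only covers the one domain named above, not the full Let’s Encrypt blacklist, and the function name le_hostname_ok is hypothetical.

```shell
# Hypothetical helper: fail for hostnames under amazonaws.com, which
# Let's Encrypt will not issue certificates for. Other blocked domains
# are not checked here.
le_hostname_ok() {
    case "$1" in
        amazonaws.com|*.amazonaws.com) return 1 ;;
        *) return 0 ;;
    esac
}
```

For example, `le_hostname_ok my-arvados.example.net` succeeds, while a default EC2 hostname ending in amazonaws.com fails.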

Bring your own certificate

To supply your own certificate, change the configuration like this:

SSL_MODE="bring-your-own"

Copy your certificate files to the directory specified with the variable CUSTOM_CERTS_DIR. The provision script will find them there. The certificate and its key need to be copied to files named after HOSTNAME_EXT. For example, if HOSTNAME_EXT is defined as my-arvados.example.net, the script will look for

${CUSTOM_CERTS_DIR}/my-arvados.example.net.crt
${CUSTOM_CERTS_DIR}/my-arvados.example.net.key

All certificate files will be used by nginx. You may need to include intermediate certificates in your certificate file. See the nginx documentation for more details.
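
Before running the installer, it can be useful to confirm that both files are in place and that the certificate belongs to the key. The sketch below compares the public key embedded in each file using openssl; the function name check_custom_certs is hypothetical, and it assumes an unencrypted private key and a certificate file whose first entry is the server certificate.

```shell
# Hypothetical helper: check that <dir>/<hostname>.crt and <dir>/<hostname>.key
# exist and contain the same public key.
check_custom_certs() {
    certs_dir="$1"
    hostname_ext="$2"
    crt="$certs_dir/$hostname_ext.crt"
    key="$certs_dir/$hostname_ext.key"
    for f in "$crt" "$key"; do
        if [ ! -r "$f" ]; then
            echo "missing or unreadable: $f" >&2
            return 1
        fi
    done
    # The certificate matches the key when both yield the same public key.
    crt_pub=$(openssl x509 -noout -pubkey -in "$crt") || return 1
    key_pub=$(openssl pkey -pubout -in "$key") || return 1
    [ "$crt_pub" = "$key_pub" ]
}
```

A call such as `check_custom_certs "$CUSTOM_CERTS_DIR" my-arvados.example.net` returns a non-zero status if either file is missing or the two do not match.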

Further customization of the installation (modifying the salt pillars and states)

If you want or need further customization, you can edit the Saltstack pillars and states files. Pay particular attention to the pillars/arvados.sls file. Any extra state file you add under local_config_dir/states will be added to the salt run and applied to the host.

Run the provision.sh script

When you have finished customizing the configuration, you are ready to copy the files to the target host where Arvados will be installed, and run the provision.sh script there:

scp -r provision.sh local* tests user@host:
ssh user@host sudo ./provision.sh

Wait for it to finish. The script will need 5 to 10 minutes to install and configure everything.

If everything goes OK, you’ll get final output that looks similar to this:

arvados: Succeeded: 151 (changed=36)
arvados: Failed:      0
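
If the installer output is saved to a file, the summary can be checked automatically. This is a sketch that assumes the summary lines keep the format shown above; the function name provision_succeeded and the log file name are hypothetical.

```shell
# Hypothetical helper: read provision.sh output on stdin and succeed only if
# at least one "Failed:" summary line is present and every such line reports 0.
provision_succeeded() {
    awk '/Failed:/ { seen = 1; if ($NF != 0) bad = 1 }
         END { if (seen && !bad) exit 0; exit 1 }'
}

# Example use (log file name is an assumption):
#   ssh user@host sudo ./provision.sh | tee provision.log
#   provision_succeeded < provision.log && echo "install OK"
```

Requiring a "Failed:" line to be present means that an installer run that was cut short is not mistaken for a successful one.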

Install the CA root certificate (SSL_MODE=self-signed only)

Arvados uses SSL to encrypt communications. The web interface uses AJAX which will silently fail if the certificate is not valid or signed by an unknown Certification Authority.

For this reason, the arvados-formula has a helper state to create a root certificate to authorize Arvados services. The provision.sh script will leave a copy of the generated CA’s certificate (arvados-snakeoil-ca.pem) in the script’s directory so you can add it to your workstation.

Installing the root certificate into your web browser will prevent security errors when accessing Arvados services.

  1. Go to the certificate manager in your browser.
    • In Chrome, this can be found under “Settings → Advanced → Manage Certificates” or by entering chrome://settings/certificates in the URL bar.
    • In Firefox, this can be found under “Preferences → Privacy & Security” or by entering about:preferences#privacy in the URL bar and then choosing “View Certificates…”.
  2. Select the “Authorities” tab, then press the “Import” button. Choose arvados-snakeoil-ca.pem.

The certificate will be added under the “Arvados Formula”.

To access your Arvados instance using command line clients (such as arv-get and arv-put) without security errors, install the certificate into the OS certificate storage.

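  • On Debian/Ubuntu (update-ca-certificates only picks up files ending in .crt, so rename the file while copying):
cp arvados-snakeoil-ca.pem /usr/local/share/ca-certificates/arvados-snakeoil-ca.crt
/usr/sbin/update-ca-certificates
  • On CentOS:
cp arvados-snakeoil-ca.pem /etc/pki/ca-trust/source/anchors/
/usr/bin/update-ca-trust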
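
When the same steps have to run on several workstations, the target directory can be chosen from the distribution ID. The sketch below maps the ID field of /etc/os-release to the directories used above; the function name ca_anchor_dir is hypothetical, and only the three distributions mentioned here are covered.

```shell
# Hypothetical helper: print the CA certificate directory for a distribution ID
# as found in /etc/os-release. Fails for distributions not covered above.
ca_anchor_dir() {
    case "$1" in
        debian|ubuntu) echo /usr/local/share/ca-certificates ;;
        centos) echo /etc/pki/ca-trust/source/anchors ;;
        *) return 1 ;;
    esac
}
```

For example, `( . /etc/os-release; ca_anchor_dir "$ID" )` prints the directory for the current machine, or fails if the distribution is not one of those listed.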

Initial user and login

At this point you should be able to log on to your new Arvados cluster. The workbench URL will be

  • https://HOSTNAME_EXT

By default, the provision script creates an initial user for testing purposes. This user is configured as administrator of the newly created cluster. The username, password and e-mail address for the initial user are configured in the local.params file. Log in with the e-mail address and password.

Test the installed cluster running a simple workflow

The provision.sh script saves a simple example test workflow in the /tmp/cluster_tests directory on the node. If you want to run it, just ssh to the node, change to that directory and run:

cd /tmp/cluster_tests
sudo ./run-test.sh

It will create a test user (by default, the same one as the admin user), upload a small workflow and run it. If everything goes OK, the output should look similar to this (some output was shortened for clarity):

Creating Arvados Standard Docker Images project
Arvados project uuid is 'arva2-j7d0g-0prd8cjlk6kfl7y'
{
 ...
 "uuid":"arva2-o0j2j-n4zu4cak5iifq2a",
 "owner_uuid":"arva2-tpzed-000000000000000",
 ...
}
Creating initial user ('admin')
Setting up user ('admin')
{
 "items":[
  {
   ...
   "owner_uuid":"arva2-tpzed-000000000000000",
   ...
   "uuid":"arva2-o0j2j-1ownrdne0ok9iox"
  },
  {
   ...
   "owner_uuid":"arva2-tpzed-000000000000000",
   ...
   "uuid":"arva2-o0j2j-1zbeyhcwxc1tvb7"
  },
  {
   ...
   "email":"admin@arva2.arv.local",
   ...
   "owner_uuid":"arva2-tpzed-000000000000000",
   ...
   "username":"admin",
   "uuid":"arva2-tpzed-3wrm93zmzpshrq2",
   ...
  }
 ],
 "kind":"arvados#HashList"
}
Activating user 'admin'
{
 ...
 "email":"admin@arva2.arv.local",
 ...
 "username":"admin",
 "uuid":"arva2-tpzed-3wrm93zmzpshrq2",
 ...
}
Running test CWL workflow
INFO /usr/bin/cwl-runner 2.1.1, arvados-python-client 2.1.1, cwltool 3.0.20200807132242
INFO Resolved 'hasher-workflow.cwl' to 'file:///tmp/cluster_tests/hasher-workflow.cwl'
...
INFO Using cluster arva2 (https://arva2.arv.local:8443/)
INFO Upload local files: "test.txt"
INFO Uploaded to ea34d971b71d5536b4f6b7d6c69dc7f6+50 (arva2-4zz18-c8uvwqdry4r8jao)
INFO Using collection cache size 256 MiB
INFO [container hasher-workflow.cwl] submitted container_request arva2-xvhdp-v1bkywd58gyocwm
INFO [container hasher-workflow.cwl] arva2-xvhdp-v1bkywd58gyocwm is Final
INFO Overall process status is success
INFO Final output collection d6c69a88147dde9d52a418d50ef788df+123
{
    "hasher_out": {
        "basename": "hasher3.md5sum.txt",
        "class": "File",
        "location": "keep:d6c69a88147dde9d52a418d50ef788df+123/hasher3.md5sum.txt",
        "size": 95
    }
}
INFO Final process status is success
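
The last line of this output is the one that matters for a quick pass/fail check. A minimal sketch, assuming the final status line keeps the wording shown above (the function name workflow_succeeded and the log file name are hypothetical):

```shell
# Hypothetical helper: read the run-test.sh output on stdin and succeed only
# if it contains the final success line shown in the sample output.
workflow_succeeded() {
    grep -q 'Final process status is success'
}

# Example use (log file name is an assumption):
#   sudo ./run-test.sh 2>&1 | tee run-test.log
#   workflow_succeeded < run-test.log && echo "test workflow OK"
```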

After the installation

Once the installation is complete, it is recommended to keep a copy of your local configuration files. Committing them to version control is a good idea.
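
One way to do this is sketched below. It assumes the configuration lives in local.params and local_config_dir as set up earlier; the function name snapshot_config and the commit identity are placeholders. Because local.params contains tokens and keys, the resulting repository should be kept private.

```shell
# Hypothetical helper: commit local.params and local_config_dir, found in the
# given directory, to a git repository in that directory.
snapshot_config() {
    config_root="$1"
    (
        cd "$config_root" || exit 1
        # Create the repository on first use.
        [ -d .git ] || git init -q .
        git add local.params local_config_dir
        # Placeholder identity so the commit works without global git settings.
        git -c user.name="Arvados admin" -c user.email="admin@example.net" \
            commit -q -m "Arvados single host configuration"
    )
}
```

Calling `snapshot_config .` from the directory that holds the configuration records its current state; the call returns a non-zero status when nothing has changed since the last commit.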

Re-running the Salt-based installer is not recommended for maintaining and upgrading Arvados. Please see Maintenance and upgrading for more information.



The content of this documentation is licensed under the Creative Commons Attribution-Share Alike 3.0 United States licence.
Code samples in this documentation are licensed under the Apache License, Version 2.0.