This multi-host installer is an AWS-specific example that is generally useful, but will likely need to be adapted for your environment. The installer is highly configurable.
Prerequisites: a customized configuration file (local.params, see below).

Planning:
We suggest distributing the Arvados components in the following way, creating at least 6 hosts:

1. a database server
2. an API node, running the api, controller, websocket, dispatcher and keepbalance components
3. one or more keepstore servers
4. a node running workbench, workbench2 and webshell
5. a node running keepproxy and keepweb
6. a shell node
If your infrastructure differs from the setup proposed above (i.e., using RDS or an existing DB server), remember that you will need to edit the configuration files for the scripts so they work with your infrastructure.
This procedure will install all the main Arvados components to get you up and running in a multi-host installation.
This is a package-based installation method; however, the installation script is currently distributed in source form via git. We recommend checking out the git tree on your local workstation, not directly on the target(s) where you want to install and run Arvados.
git clone https://git.arvados.org/arvados.git
git checkout 2.4-release
cd arvados/tools/salt-install
The provision.sh script will help you deploy Arvados by preparing your environment to run the installer, then running it. The actual installer is located in the arvados-formula git repository and will be cloned while provision.sh runs. The installer is built using Saltstack, and provision.sh performs the install using masterless mode.
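In masterless mode there is no Salt master: the states are applied locally on each target host. Roughly speaking, this corresponds to the command below, shown only for orientation; provision.sh sets up the formula and pillars and runs this step for you:

# Applies the configured highstate locally, without a Salt master
salt-call --local state.apply -l info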
cp local.params.example.multiple_hosts local.params
cp -r config_examples/multi_host/aws local_config_dir
Edit the variables in the local.params file. Pay attention to the *_INT_IP, *_TOKEN and *_KEY variables. The SSL_MODE variable is discussed in the next section.
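For reference, here is an illustrative excerpt of the kind of entries to review. All values are placeholders, and the variable names should be checked against the local.params.example.multiple_hosts file you copied:

CLUSTER="xarv1"
DOMAIN="example.com"
# Internal IP of the host running each component (one *_INT_IP variable per host)
CONTROLLER_INT_IP="10.1.0.10"
# Tokens and keys must be long random strings, e.g. generated with:
#   tr -dc A-Za-z0-9 </dev/urandom | head -c 32; echo
MANAGEMENT_TOKEN="fillthiswitharandomstring"
BLOB_SIGNING_KEY="fillthiswitharandomstring"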
Arvados requires an SSL certificate to work correctly. This installer supports these options:

self-signed: let the installer create self-signed certificates
lets-encrypt: automatically obtain and install SSL certificates for your hostnames
bring-your-own: supply your own certificates in the `certs` directory

To make the installer use self-signed certificates, change the configuration like this:
SSL_MODE="self-signed"
When connecting to the Arvados web interface for the first time, you will need to accept the self-signed certificates as trusted to bypass the browser warnings. This can be a little tricky to do. Alternatively, you can install the self-signed root certificate in your browser; see below.
In the default configuration, this installer gets a valid certificate via Let’s Encrypt. If you have the CLUSTER.DOMAIN domain in a route53 zone, you can set USE_LETSENCRYPT_ROUTE53 to YES and supply appropriate credentials so that Let’s Encrypt can use dns-01 validation to get the appropriate certificates.
SSL_MODE="lets-encrypt"
USE_LETSENCRYPT_ROUTE53="yes"
LE_AWS_REGION="us-east-1"
LE_AWS_ACCESS_KEY_ID="AKIABCDEFGHIJKLMNOPQ"
LE_AWS_SECRET_ACCESS_KEY="thisistherandomstringthatisyoursecretkey"
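If you need to create dedicated AWS credentials for this, the sketch below shows the kind of minimal IAM policy that dns-01 validation against Route 53 typically requires. The zone ID and policy name are placeholders, and the exact permissions should be confirmed against the Let’s Encrypt client used by the installer:

# Hypothetical helper: create a minimal IAM policy for dns-01 validation
cat > letsencrypt-route53-policy.json <<'EOF'
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": ["route53:ListHostedZones", "route53:GetChange"],
      "Resource": ["*"]
    },
    {
      "Effect": "Allow",
      "Action": ["route53:ChangeResourceRecordSets"],
      "Resource": ["arn:aws:route53:::hostedzone/ZONEID"]
    }
  ]
}
EOF
aws iam create-policy --policy-name arvados-letsencrypt-route53 \
  --policy-document file://letsencrypt-route53-policy.json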
Please note that on AWS, EC2 instances can have a default hostname that ends with amazonaws.com. Let’s Encrypt has a blacklist of domain names for which it will not issue certificates, and that blacklist includes the amazonaws.com domain, which means the default hostname cannot be used to get a certificate from Let’s Encrypt.
To supply your own certificates, change the configuration like this:
SSL_MODE="bring-your-own"
Copy your certificates to the directory specified with the variable CUSTOM_CERTS_DIR in the remote directory where you copied the provision.sh script. The provision script will find the certificates there.
The script expects cert/key files with these basenames (matching the role, except for keepweb, which is split into both download / collections):
E.g. for ‘keepproxy’, the script will look for
${CUSTOM_CERTS_DIR}/keepproxy.crt
${CUSTOM_CERTS_DIR}/keepproxy.key
Make sure that all the FQDNs that you will use for the public-facing applications (API/controller, Workbench, Keepproxy/Keepweb) are reachable.
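A quick way to sanity-check name resolution from your workstation is sketched below; the hostnames are placeholders following a <service>.<cluster>.<domain> pattern, so substitute the FQDNs you actually configured:

for h in controller ws workbench workbench2 keep download collections webshell; do
  getent hosts "${h}.xarv1.example.com" >/dev/null || echo "no DNS entry for ${h}.xarv1.example.com"
done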
All certificate files will be used by nginx. You may need to include intermediate certificates in your certificate files. See the nginx documentation for more details.
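If your CA delivers the server certificate and the intermediate chain as separate files, one way to assemble what nginx expects is sketched below (file names are placeholders; keepproxy is just the example basename used above):

# nginx wants the server certificate first, followed by the intermediate chain
cat your_server_cert.pem your_intermediate_chain.pem > ${CUSTOM_CERTS_DIR}/keepproxy.crt
cp your_private_key.pem ${CUSTOM_CERTS_DIR}/keepproxy.key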
In a multi-host installation, containers are dispatched to Docker daemons running on the compute instances, which need some special setup. We provide a compute image builder script that you can use to build a template image following these instructions. Once you have that image created, you will need to update the pillars/arvados.sls file with the AMI ID and the private ssh key for the dispatcher.
You will need further customization to suit your environment, which can be done by editing the Saltstack pillars and states files. Pay particular attention to the pillars/arvados.sls file, where you will need to provide some information that describes your environment.
Any extra state file you add under local_config_dir/states will be added to the salt run and applied to the hosts.
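As an illustration, the sketch below adds a hypothetical extra state; the file name and the package it installs are arbitrary examples, not part of the standard installer.

# Any .sls file placed under local_config_dir/states is picked up on the next run
cat > local_config_dir/states/custom_ntp.sls <<'EOF'
# minimal example Salt state: make sure chrony is installed on every host
chrony:
  pkg.installed
EOF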
A few Arvados nodes need to be installed in a certain order. The required order is reflected in the sequence of per-host commands shown below.
When you have finished customizing the configuration, you are ready to copy the files to the hosts and run the provision.sh script. The script allows you to specify the role(s) a node will have, and it will install only the Arvados components required for those roles. The general format of the command is:
scp -r provision.sh local* user@host:
ssh user@host sudo ./provision.sh --roles comma,separated,list,of,roles,to,apply
and wait for it to finish.
If everything goes OK, you’ll get some final lines stating something like:
arvados: Succeeded: 109 (changed=9)
arvados: Failed: 0
The distribution of roles described above can be applied by running these commands:
scp -r provision.sh local* user@host:
ssh user@host sudo ./provision.sh --config local.params --roles database
scp -r provision.sh local* user@host:
ssh user@host sudo ./provision.sh --config local.params --roles api,controller,websocket,dispatcher,keepbalance
scp -r provision.sh local* user@host:
ssh user@host sudo ./provision.sh --config local.params --roles keepstore
scp -r provision.sh local* user@host:
ssh user@host sudo ./provision.sh --config local.params --roles workbench,workbench2,webshell
scp -r provision.sh local* user@host:
ssh user@host sudo ./provision.sh --config local.params --roles keepproxy,keepweb
scp -r provision.sh local* tests user@host:
ssh user@host sudo ./provision.sh --config local.params --roles shell
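If you have several hosts, a small helper loop can reduce repetition. The sketch below is only an illustration: the hostnames and the host-to-roles mapping are placeholders you must replace with your own.

# The tests directory is only needed by the shell node, but copying it
# everywhere is harmless. ssh -n keeps ssh from consuming the host list.
while read -r host roles; do
  scp -r provision.sh local* tests "user@${host}:"
  ssh -n "user@${host}" sudo ./provision.sh --config local.params --roles "${roles}"
done <<'EOF'
db.xarv1.example.com database
controller.xarv1.example.com api,controller,websocket,dispatcher,keepbalance
keep0.xarv1.example.com keepstore
EOF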
Arvados uses SSL to encrypt communications. The web interface uses AJAX, which will silently fail if the certificate is not valid or is signed by an unknown Certificate Authority.
For this reason, the arvados-formula has a helper state to create a root certificate to authorize Arvados services. The provision.sh script will leave a copy of the generated CA’s certificate (arvados-snakeoil-ca.pem) in the script’s directory so you can add it to your workstation.
Installing this root certificate into your web browser will prevent security errors when accessing Arvados services.
In Chrome, enter chrome://settings/certificates in the URL bar.
In Firefox, enter about:preferences#privacy in the URL bar and then choose “View Certificates…”.
In either case, import the arvados-snakeoil-ca.pem file as a trusted certificate authority.
The certificate will be added under the “Arvados Formula”.
To access your Arvados instance using command line clients (such as arv-get and arv-put) without security errors, install the certificate into the OS certificate storage.
On Debian/Ubuntu based systems:

cp arvados-root-cert.pem /usr/local/share/ca-certificates/
/usr/sbin/update-ca-certificates

On Red Hat/CentOS based systems:

cp arvados-root-cert.pem /etc/pki/ca-trust/source/anchors/
/usr/bin/update-ca-trust
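As a rough check that the certificate is now trusted system-wide, a plain HTTPS request to your controller should complete without certificate errors (the hostname below is a placeholder; use your controller’s FQDN):

# Exits non-zero with a TLS error if the CA is not trusted by the OS store
curl -sSI https://xarv1.example.com/ >/dev/null && echo "certificate OK"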
At this point you should be able to log into the Arvados cluster. The initial URL will follow this general format:

<cluster>.<domain>
By default, the provision script creates an initial user for testing purposes. This user is configured as administrator of the newly created cluster.
Assuming you didn’t change these values in the local.params file, the initial credentials are:
If you followed the instructions above, the provision.sh script saves a simple example test workflow in the /tmp/cluster_tests directory on the shell node. If you want to run it, just ssh to the node, change to that directory and run:
cd /tmp/cluster_tests
sudo ./run-test.sh
It will create a test user (by default, the same one as the admin user), upload a small workflow and run it. If everything goes OK, the output should be similar to this (some output was shortened for clarity):
Creating Arvados Standard Docker Images project
Arvados project uuid is 'arva2-j7d0g-0prd8cjlk6kfl7y'
{
...
"uuid":"arva2-o0j2j-n4zu4cak5iifq2a",
"owner_uuid":"arva2-tpzed-000000000000000",
...
}
Uploading arvados/jobs' docker image to the project
2.1.1: Pulling from arvados/jobs
8559a31e96f4: Pulling fs layer
...
Status: Downloaded newer image for arvados/jobs:2.1.1
docker.io/arvados/jobs:2.1.1
2020-11-23 21:43:39 arvados.arv_put[32678] INFO: Creating new cache file at /home/vagrant/.cache/arvados/arv-put/c59256eda1829281424c80f588c7cc4d
2020-11-23 21:43:46 arvados.arv_put[32678] INFO: Collection saved as 'Docker image arvados jobs:2.1.1 sha256:0dd50'
arva2-4zz18-1u5pvbld7cvxuy2
Creating initial user ('admin')
Setting up user ('admin')
{
"items":[
{
...
"owner_uuid":"arva2-tpzed-000000000000000",
...
"uuid":"arva2-o0j2j-1ownrdne0ok9iox"
},
{
...
"owner_uuid":"arva2-tpzed-000000000000000",
...
"uuid":"arva2-o0j2j-1zbeyhcwxc1tvb7"
},
{
...
"email":"admin@arva2.arv.local",
...
"owner_uuid":"arva2-tpzed-000000000000000",
...
"username":"admin",
"uuid":"arva2-tpzed-3wrm93zmzpshrq2",
...
}
],
"kind":"arvados#HashList"
}
Activating user 'admin'
{
...
"email":"admin@arva2.arv.local",
...
"username":"admin",
"uuid":"arva2-tpzed-3wrm93zmzpshrq2",
...
}
Running test CWL workflow
INFO /usr/bin/cwl-runner 2.1.1, arvados-python-client 2.1.1, cwltool 3.0.20200807132242
INFO Resolved 'hasher-workflow.cwl' to 'file:///tmp/cluster_tests/hasher-workflow.cwl'
...
INFO Using cluster arva2 (https://arva2.arv.local:8443/)
INFO Upload local files: "test.txt"
INFO Uploaded to ea34d971b71d5536b4f6b7d6c69dc7f6+50 (arva2-4zz18-c8uvwqdry4r8jao)
INFO Using collection cache size 256 MiB
INFO [container hasher-workflow.cwl] submitted container_request arva2-xvhdp-v1bkywd58gyocwm
INFO [container hasher-workflow.cwl] arva2-xvhdp-v1bkywd58gyocwm is Final
INFO Overall process status is success
INFO Final output collection d6c69a88147dde9d52a418d50ef788df+123
{
"hasher_out": {
"basename": "hasher3.md5sum.txt",
"class": "File",
"location": "keep:d6c69a88147dde9d52a418d50ef788df+123/hasher3.md5sum.txt",
"size": 95
}
}
INFO Final process status is success
Once the installation is complete, it is recommended to keep a copy of your local configuration files. Committing them to version control is a good idea.
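For example, one simple way to do this (the repository name and layout are arbitrary, and the repository should be private since local.params contains tokens and keys):

# From arvados/tools/salt-install, keep the customized files under version control
mkdir my-arvados-deploy-config
cp -r local.params local_config_dir my-arvados-deploy-config/
cd my-arvados-deploy-config
git init
git add .
git commit -m "Arvados deployment configuration"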
Re-running the Salt-based installer is not recommended for maintaining and upgrading Arvados, please see Maintenance and upgrading for more information.