Arvados-in-a-box

Arvbox is a Docker-based self-contained development, demonstration and testing environment for Arvados. It is not intended for production use.

Requirements

  • Linux 3.x+ and Docker 1.10+
  • Minimum of 4 GiB of RAM + additional memory to run jobs
  • Minimum of 4 GiB of disk + storage for actual data
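
The requirements above can be checked before the first start. The following is a minimal sketch, not part of arvbox: the function names `version_ge`, `kib_to_gib` and `check_host` are hypothetical, and it assumes GNU `sort -V`, a Linux `/proc/meminfo`, and a reachable Docker daemon.

```shell
#!/bin/sh
# Hypothetical pre-flight check for the requirements listed above.
# Not part of arvbox; all function names are made up for illustration.

# version_ge A B: succeed if dotted version A >= B (relies on `sort -V`).
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# kib_to_gib KIB: convert KiB to whole GiB, rounding down.
kib_to_gib() {
    echo $(( $1 / 1024 / 1024 ))
}

# check_host: print a warning or a measurement for each requirement.
check_host() {
    kernel=$(uname -r | cut -d- -f1)
    version_ge "$kernel" 3.0 || echo "kernel $kernel is older than 3.x"

    # Empty if docker is missing or the daemon cannot be reached.
    docker_ver=$(docker version --format '{{.Server.Version}}' 2>/dev/null)
    if [ -z "$docker_ver" ]; then
        echo "could not query the Docker daemon"
    elif ! version_ge "$docker_ver" 1.10; then
        echo "Docker $docker_ver is older than 1.10"
    fi

    # MemTotal is reported in KiB; df -Pk reports available space in KiB.
    mem_kib=$(awk '/^MemTotal:/ { print $2 }' /proc/meminfo)
    disk_kib=$(df -Pk "$HOME" | awk 'NR == 2 { print $4 }')
    echo "RAM: $(kib_to_gib "$mem_kib") GiB, free disk under \$HOME: $(kib_to_gib "$disk_kib") GiB"
}
```

Running `check_host` after sourcing the file prints any version warnings followed by the memory and disk figures. The figures are rounded down, so a machine with exactly 4 GiB of RAM may report 3 GiB.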

Quick start

$ curl -O https://git.arvados.org/arvados.git/blob_plain/refs/heads/main:/tools/arvbox/bin/arvbox
$ chmod +x arvbox
$ ./arvbox start localdemo

Arvados-in-a-box starting

Waiting for workbench2 websockets workbench webshell keep-web controller keepproxy api keepstore1 arv-git-httpd keepstore0 sdk vm ...

...

Your Arvados-in-a-box is ready!

$ ./arvbox adduser demouser demo@example.com
Password for demouser:
Added demouser

You will then need to install the arvbox root certificate. After that, you can log in to Workbench as demouser with the password you selected.

Install root certificate

Arvbox creates a root certificate to authorize Arvbox services. Installing the root certificate into your web browser will prevent security errors when accessing Arvbox services with your web browser. Every Arvbox instance generates a new root signing key.

Export the root certificate with this command:

$ ./arvbox root-cert
Certificate copied to /home/ubuntu/arvbox-root-cert.crt

Web Browser

Installing the root certificate into your web browser will prevent security errors when accessing Arvados services with your web browser.

Chrome

  1. Go to “Settings → Privacy and Security → Security → Manage Certificates” or enter chrome://settings/certificates in the URL bar.
  2. Click on the “Authorities” tab (it is not selected by default)
  3. Click on the “Import” button
  4. Choose arvbox-root-cert.crt
  5. Tick the checkbox next to “Trust this certificate for identifying websites”
  6. Hit OK
  7. The certificate should appear in the list of Authorities under “Arvados”

Firefox

  1. Go to “Preferences → Privacy & Security” or enter about:preferences#privacy in the URL bar
  2. Scroll down to the Certificates section
  3. Click on the button “View Certificates…”.
  4. Make sure the “Authorities” tab is selected
  5. Press the “Import…” button.
  6. Choose arvbox-root-cert.crt
  7. Tick the checkbox next to “Trust this CA to identify websites”
  8. Hit OK
  9. The certificate should appear in the list of Authorities under “Arvados”

Other browsers (Safari, etc)

The process will be similar to that of Chrome and Firefox, but the exact user interface will be different. If you can’t figure it out, try searching for “how do I install a custom certificate authority in (my browser)”.

Installation on Linux OS certificate storage

To access your Arvados instance using command line clients (such as arv-get and arv-put) without security errors, install the certificate into the OS certificate storage.

Debian/Ubuntu

Important: the certificate file added to ca-certificates must have the extension .crt or it won’t be recognized.

cp arvbox-root-cert.crt /usr/local/share/ca-certificates/arvados-snakeoil-ca.crt
/usr/sbin/update-ca-certificates

Alma/CentOS/Red Hat/Rocky

cp arvbox-root-cert.crt /etc/pki/ca-trust/source/anchors/
/usr/bin/update-ca-trust
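
The two variants above differ only in the target path and the update command, so the choice can be scripted. The sketch below is an illustration, not something arvbox ships: `ca_install_plan` and `detect_distro` are hypothetical helper names, and the distribution IDs are assumed to match the `ID` field of `/etc/os-release`. It prints the commands instead of running them, so they can be reviewed and then executed with root privileges.

```shell
#!/bin/sh
# Hypothetical helpers; they only print the commands given on this page.

# ca_install_plan DISTRO_ID CERT: print the install commands for a distro.
ca_install_plan() {
    case "$1" in
        debian|ubuntu)
            echo "cp $2 /usr/local/share/ca-certificates/arvados-snakeoil-ca.crt"
            echo "/usr/sbin/update-ca-certificates"
            ;;
        almalinux|centos|rhel|rocky)
            echo "cp $2 /etc/pki/ca-trust/source/anchors/"
            echo "/usr/bin/update-ca-trust"
            ;;
        *)
            echo "unsupported distribution: $1" >&2
            return 1
            ;;
    esac
}

# detect_distro [FILE]: print the ID field of an os-release style file.
# The file is sourced in a subshell so its variables do not leak out.
detect_distro() {
    ( . "${1:-/etc/os-release}" && echo "$ID" )
}
```

For example, `ca_install_plan "$(detect_distro)" arvbox-root-cert.crt` prints the two commands that apply to the current host.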

Usage

$ arvbox
Arvados-in-a-box             https://doc.arvados.org/install/arvbox.html

start|run <config> [tag]   start arvbox container
stop               stop arvbox container
restart <config>   stop, then run again
status             print some information about current arvbox
ip                 print arvbox docker container ip address
host               print arvbox published host
shell              enter shell as root
ashell             enter shell as 'arvbox'
psql               enter postgres console
open               open arvbox workbench in a web browser
root-cert          get copy of root certificate
update  <config>   stop, pull latest image, run
build   <config>   build arvbox Docker image
reboot  <config>   stop, build arvbox Docker image, run
rebuild <config>   build arvbox Docker image, no layer cache
checkpoint         create database backup
restore            restore checkpoint
hotreset           reset database and restart API without restarting container
reset              delete arvbox arvados data (be careful!)
destroy            delete all arvbox code and data (be careful!)
log <service>      tail log of specified service
ls <options>       list directories inside arvbox
cat <files>        get contents of files inside arvbox
pipe               run a bash script piped in from stdin
sv <start|stop|restart> <service>
                   change state of service inside arvbox
clone <from> <to>  clone dev arvbox
adduser <username> <email> [password]
                   add a user login
removeuser <username>
                   remove user login
listusers          list user logins

Configs

dev

Development configuration. Boots a complete Arvados environment inside the container. The “arvados” and “arvados-dev” code directories, along with the data directories “postgres”, “var”, “passenger” and “gems”, are bind mounted from the host file system for easy access and persistence across container rebuilds. Services are bound to the Docker container’s network IP address and can only be accessed on the local host.

In “dev” mode, you can override the default autogenerated settings of Rails projects by adding “application.yml.override” to any Rails project (api, workbench). This can be used to test out API server settings or point Workbench at an alternate API server.

localdemo

Demo configuration. Boots a complete Arvados environment inside the container. Unlike the development configuration, code directories are included in the demo image, and data directories are stored in a separate data volume container. Services are bound to the Docker container’s network IP address and can only be accessed on the local host.

test

Starts postgres and initializes the API server, then runs the Arvados test suite. Command line arguments are passed to the test runner. The test runner’s interactive mode is supported.

devenv

Starts a minimal container with no services and the host’s $HOME bind mounted inside the container, then enters an interactive login shell. Intended to make it convenient to use tools installed in arvbox that don’t require services.

publicdev

Publicly accessible development configuration. Similar to ‘dev’ except that service ports are published to the host’s IP address and can be accessed by anyone who can connect to the host system. See below for more information. WARNING! The public arvbox configuration is NOT SECURE and must not be placed on a public IP address or used for production work.

publicdemo

Publicly accessible demo configuration. Similar to ‘localdemo’ except that service ports are published to the host’s IP address and can be accessed by anyone who can connect to the host system. See below for more information. WARNING! The public arvbox configuration is NOT SECURE and must not be placed on a public IP address or used for production work.

Environment variables

ARVBOX_DOCKER

The location of Dockerfile.base and associated files used by “arvbox build”.
default: result of $(readlink -f $(dirname $0)/../lib/arvbox/docker)

ARVBOX_CONTAINER

The name of the Docker container to manipulate.
default: arvbox

ARVBOX_BASE

The base directory to store persistent data for arvbox containers.
default: $HOME/.arvbox

ARVBOX_DATA

The base directory to store persistent data for the current container.
default: $ARVBOX_BASE/$ARVBOX_CONTAINER

ARVADOS_ROOT

The root directory of the Arvados source tree.
default: $ARVBOX_DATA/arvados

ARVADOS_DEV_ROOT

The root directory of the Arvados-dev source tree.
default: $ARVBOX_DATA/arvados-dev

ARVBOX_PUBLISH_IP

The IP address on which to publish services when running in public configuration. Overrides default detection of the host’s IP address.
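
Several of the defaults above are defined in terms of other variables, so overriding one changes the rest. The sketch below only mirrors the defaults listed in this section, using `${VAR:-default}` expansion. It is not taken from the arvbox script, and `resolve_arvbox_env` is a hypothetical name.

```shell
#!/bin/sh
# Illustration only: reproduces the documented defaults, in dependency order.
# resolve_arvbox_env is a hypothetical helper, not an arvbox command.
resolve_arvbox_env() {
    ARVBOX_CONTAINER="${ARVBOX_CONTAINER:-arvbox}"
    ARVBOX_BASE="${ARVBOX_BASE:-$HOME/.arvbox}"
    # The next three defaults are built from the values resolved so far.
    ARVBOX_DATA="${ARVBOX_DATA:-$ARVBOX_BASE/$ARVBOX_CONTAINER}"
    ARVADOS_ROOT="${ARVADOS_ROOT:-$ARVBOX_DATA/arvados}"
    ARVADOS_DEV_ROOT="${ARVADOS_DEV_ROOT:-$ARVBOX_DATA/arvados-dev}"
}

# show_arvbox_env: print the resolved values, one per line.
show_arvbox_env() {
    resolve_arvbox_env
    printf 'ARVBOX_CONTAINER=%s\n' "$ARVBOX_CONTAINER"
    printf 'ARVBOX_DATA=%s\n' "$ARVBOX_DATA"
    printf 'ARVADOS_ROOT=%s\n' "$ARVADOS_ROOT"
    printf 'ARVADOS_DEV_ROOT=%s\n' "$ARVADOS_DEV_ROOT"
}
```

With nothing set and `HOME=/home/u`, `ARVADOS_ROOT` resolves to `/home/u/.arvbox/arvbox/arvados`. Setting only `ARVBOX_CONTAINER=test2` moves it to `/home/u/.arvbox/test2/arvados`.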

Using Arvbox for Arvados development

The Arvbox section of Hacking Arvados has information about using Arvbox for Arvados development.

Making Arvbox accessible from other hosts

In “dev” and “localdemo” mode, Arvbox can only be accessed from the host it is running on. To publish Arvbox service ports to the host’s service ports and advertise the host’s IP address for services, use publicdev or publicdemo:

$ arvbox start publicdemo

This attempts to auto-detect the correct IP address to use by taking the IP address of the default route device. If the auto-detection is wrong, you want to publish a hostname instead of a raw address, or you need to access it through a different device (such as a router or firewall), set ARVBOX_PUBLISH_IP to the desired hostname or IP address.

$ export ARVBOX_PUBLISH_IP=example.com
$ arvbox start publicdemo

Note: this expects to bind the host’s port 80 (http) for workbench, so you cannot have a conflicting web server already running on the host. It does not attempt to bind the host’s port 22 (ssh); as a result, the arvbox ssh port is not published.
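
To preview which address the auto-detection is likely to pick, the "IP address of the default route device" rule can be approximated with the iproute2 `ip` command. This is a sketch under assumptions: the helper names are hypothetical, the parsing assumes the usual `ip -4 route show default` and `ip -4 -o addr show` output layout, and arvbox itself may detect the address differently.

```shell
#!/bin/sh
# Hypothetical helpers approximating "IP address of the default route device".

# default_route_dev: read `ip -4 route show default` output on stdin and
# print the device named after "dev" on the first default route.
default_route_dev() {
    awk '$1 == "default" {
        for (i = 2; i < NF; i++) if ($i == "dev") { print $(i + 1); exit }
    }'
}

# first_ipv4: read `ip -4 -o addr show dev DEV` output on stdin and print
# the first address, with the /prefix length removed.
first_ipv4() {
    awk '{
        for (i = 1; i < NF; i++) if ($i == "inet") {
            split($(i + 1), a, "/"); print a[1]; exit
        }
    }'
}

# detect_publish_ip: combine the two steps on the live system.
detect_publish_ip() {
    dev=$(ip -4 route show default | default_route_dev)
    [ -n "$dev" ] || return 1
    ip -4 -o addr show dev "$dev" | first_ipv4
}
```

If `detect_publish_ip` prints an address that other hosts cannot reach, that is a sign ARVBOX_PUBLISH_IP should be set explicitly before starting a public configuration.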

Notes

Services are designed to install and auto-configure on start or restart. For example, the service script for keepstore always compiles keepstore from source and registers the daemon with the API server.

Services are run with process supervision, so a service which exits will be restarted. Dependencies between services are handled by repeatedly running the service script, which fails until its dependencies are fulfilled (by other service scripts), at which point the service script can complete.
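
The retry-until-dependencies-exist pattern can be sketched in a few lines of shell. This is a simplified illustration, not arvbox's supervision code: `retry_until_ok` and `flaky_service` are hypothetical names, and the attempt limit exists only to keep the example finite, whereas a real supervisor keeps restarting the service.

```shell
#!/bin/sh
# Simplified illustration of "keep rerunning the service script until its
# dependencies exist". All names here are hypothetical.

# retry_until_ok MAX_TRIES DELAY CMD...: rerun CMD until it exits 0,
# giving up with status 1 after MAX_TRIES failed attempts.
retry_until_ok() {
    max=$1
    delay=$2
    shift 2
    n=1
    while ! "$@"; do
        [ "$n" -ge "$max" ] && return 1
        n=$((n + 1))
        sleep "$delay"
    done
    echo "succeeded after $n attempt(s)"
}

# flaky_service FILE: stand-in for a service script whose dependency is not
# ready yet. It counts its runs in FILE and fails on the first two.
flaky_service() {
    count=$(cat "$1" 2>/dev/null || echo 0)
    count=$((count + 1))
    echo "$count" > "$1"
    [ "$count" -ge 3 ]
}
```

Calling `retry_until_ok 5 1 flaky_service /tmp/attempts` with no existing `/tmp/attempts` file fails twice, then reports success on the third attempt. No explicit start order between services is needed, because each script simply fails and is retried until whatever it depends on is available.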



The content of this documentation is licensed under the Creative Commons Attribution-Share Alike 3.0 United States licence.
Code samples in this documentation are licensed under the Apache License, Version 2.0.