Set up a compute node with Docker

Note:

This page describes the requirements for a compute node in a Slurm or LSF cluster that will run containers dispatched by crunch-dispatch-slurm or arvados-dispatch-lsf. If you are installing a cloud cluster, refer to Build a cloud compute node image.

Note:

These instructions apply when Containers.RuntimeEngine is set to docker. If your cluster uses Singularity instead, refer to Set up a compute node with Singularity.

  1. Introduction
  2. Set up Docker
  3. Update fuse.conf
  4. Update docker-cleaner.json
  5. Install python-arvados-fuse and crunch-run and arvados-docker-cleaner

Introduction

This page describes how to configure a compute node so that it can be used to run containers dispatched by Arvados on a static cluster. These steps must be performed on every compute node.

Set up Docker

See Set up Docker

Install NVIDIA CUDA Toolkit (optional)

If you want to use NVIDIA GPUs, install the CUDA toolkit.

You must also install the NVIDIA Container Toolkit. The following commands are for Debian and Ubuntu:

DIST=$(. /etc/os-release; echo $ID$VERSION_ID)
curl -s -L https://nvidia.github.io/libnvidia-container/gpgkey | \
  sudo apt-key add -
curl -s -L https://nvidia.github.io/libnvidia-container/$DIST/libnvidia-container.list | \
  sudo tee /etc/apt/sources.list.d/libnvidia-container.list
sudo apt-get update
sudo apt-get install libnvidia-container1 libnvidia-container-tools nvidia-container-toolkit
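
After installation, a quick check along the following lines may help confirm that the toolkit is usable. This is only a sketch: it assumes the nvidia-container-cli command (shipped in libnvidia-container-tools) is on the PATH, and the function name check_nvidia_toolkit is hypothetical.

```shell
# Hypothetical helper (not part of Arvados): report whether the NVIDIA
# container CLI is installed and, if so, ask it to print the driver and
# GPU details it can see.
check_nvidia_toolkit() {
    if command -v nvidia-container-cli >/dev/null 2>&1; then
        nvidia-container-cli info
    else
        echo "nvidia-container-cli not found; is libnvidia-container-tools installed?"
        return 1
    fi
}
```

Running check_nvidia_toolkit on the compute node should either list the detected GPUs or report that the CLI is missing.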

Update fuse.conf

FUSE must be configured with the user_allow_other option enabled for Crunch to set up Keep mounts that are readable by containers. Install this file as /etc/fuse.conf:

# Allow non-root users to specify the 'allow_other' or 'allow_root'
# mount options.
user_allow_other
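
To check that the option is active on a node, a test like the one below can be used. It is a minimal sketch; the function name fuse_allows_other is hypothetical, and it only looks for an uncommented user_allow_other line in the file it is given.

```shell
# Hypothetical helper: succeeds only if the given file (default
# /etc/fuse.conf) contains an active, uncommented user_allow_other line.
fuse_allows_other() {
    grep -Eq '^[[:space:]]*user_allow_other[[:space:]]*$' "${1:-/etc/fuse.conf}" 2>/dev/null
}

if fuse_allows_other /etc/fuse.conf; then
    echo "user_allow_other is enabled"
else
    echo "user_allow_other is not enabled in /etc/fuse.conf"
fi
```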

Update docker-cleaner.json

The arvados-docker-cleaner program removes least recently used Docker images as needed to keep disk usage below a configured limit.

Create the file /etc/arvados/docker-cleaner/docker-cleaner.json with the following contents:

{
    "Quota": "10G",
    "RemoveStoppedContainers": "always"
}
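
A syntax error in this file is easy to introduce when editing by hand, so it can be worth validating it before starting the service. The sketch below assumes python3 is available on the node and uses Python's standard json.tool module; the function name valid_json is hypothetical. It only checks that the file is well-formed JSON, not that the keys are ones arvados-docker-cleaner understands.

```shell
# Hypothetical helper: succeeds if the given file parses as JSON.
# Uses Python's built-in json.tool module; its output is discarded.
valid_json() {
    python3 -m json.tool "$1" >/dev/null 2>&1
}

conf=/etc/arvados/docker-cleaner/docker-cleaner.json
if [ -f "$conf" ]; then
    if valid_json "$conf"; then
        echo "$conf is well-formed JSON"
    else
        echo "$conf has a JSON syntax error"
    fi
fi
```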

Choosing a quota: Most deployments will want a quota of at least 10G. A larger quota can reduce compute overhead, because the same Docker image does not have to be reloaded repeatedly, but it leaves less space for other files on the same storage (usually Docker volumes). Make sure the quota is less than the total space available for Docker images.
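
One way to sanity-check a candidate quota against the available storage is sketched below. It rests on two assumptions: that Docker stores images under /var/lib/docker (Docker's default data directory, which your installation may override), and that the quota is expressed as a whole number of GiB. The function name quota_fits is hypothetical.

```shell
# Hypothetical helper: succeeds if a quota of $1 GiB is smaller than the
# total size of the filesystem that holds directory $2.
quota_fits() {
    quota_kib=$(( $1 * 1024 * 1024 ))
    # df -Pk prints one header line, then one line per filesystem;
    # the second column is the total size in KiB.
    total_kib=$(df -Pk "$2" | awk 'NR == 2 { print $2 }')
    [ "$quota_kib" -lt "$total_kib" ]
}

# Assumption: /var/lib/docker is where Docker keeps its images.
if [ -d /var/lib/docker ]; then
    if quota_fits 10 /var/lib/docker; then
        echo "A 10G quota fits on the Docker filesystem"
    else
        echo "A 10G quota is too large for the Docker filesystem"
    fi
fi
```

This compares the quota against the total size of the filesystem, not its current free space, so it is a lower bar than the guidance above and leaves the trade-off with Docker volumes for you to judge.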

Note:

This configuration also removes all containers as soon as they exit, as if they were run with docker run --rm. If you need to debug or inspect containers after they stop, temporarily stop arvados-docker-cleaner or configure it with "RemoveStoppedContainers":"never".

Install python-arvados-fuse and crunch-run and arvados-docker-cleaner

Alma/CentOS/Red Hat/Rocky

# dnf install python-arvados-fuse crunch-run arvados-docker-cleaner

Debian and Ubuntu

# apt-get install python-arvados-fuse crunch-run arvados-docker-cleaner

Start the service

# systemctl enable --now arvados-docker-cleaner
# systemctl status arvados-docker-cleaner
[...]

If systemctl status indicates it is not running, use journalctl to check logs for errors:

# journalctl -n12 --unit arvados-docker-cleaner


The content of this documentation is licensed under the Creative Commons Attribution-Share Alike 3.0 United States licence.
Code samples in this documentation are licensed under the Apache License, Version 2.0.