Choosing which components to install

Arvados consists of many components, some of which may be omitted at the cost of reduced functionality. It may also be helpful to review the Arvados Architecture to understand how these components interact.

Core
Postgres database: Stores data for the API server. Required.
API server: Core Arvados logic for managing users, groups, collections, and containers, and for enforcing permissions. Required.

Keep (storage)
Keepstore: Stores content-addressed blocks in a variety of backends (local filesystem, cloud object storage). Required.
Keepproxy: Gateway service to access Keep servers from external networks. Required to use arv-put, arv-get, or arv-mount from outside the private Arvados network.
Keep-web: Gateway service providing read/write HTTP and WebDAV support on top of Keep. Required to download files from Keep over plain HTTP in Workbench.
Keep-balance: Storage cluster maintenance daemon responsible for moving blocks to their optimal server location, adjusting block replication levels, and trashing unreferenced blocks. Required to free deleted data from underlying storage, and to ensure proper replication and block distribution (including support for storage classes).

User interface
Single Sign On server: Login server. Required for web-based login to Workbench.
Workbench: Primary graphical user interface for working with file collections and running containers. Optional. Depends on API server, SSO server, keep-web, websockets server.
Workflow Composer: Graphical user interface for editing Common Workflow Language workflows. Optional. Depends on git server (arv-git-httpd).

Additional services
Websockets server: Event distribution server. Required to view streaming container logs in Workbench.
Shell server: Synchronizes (creates, deletes, configures) Unix shell accounts with Arvados users. Optional.
Git server: Arvados-hosted git repositories, with Arvados token-based authentication. Optional, but required by Workflow Composer.

Crunch (running containers)
crunch-dispatch-slurm: Runs analysis workflows using Docker containers distributed across a SLURM cluster. Optional if you wish to use Arvados for data management only.
Node Manager: Allocates and frees cloud VM instances on demand based on workload. Optional; not needed for a static SLURM cluster (such as on-premises HPC).
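
The dependencies in the component list above can be turned into a small planning aid. The following is a minimal sketch, not an Arvados tool: the names REQUIRED, DEPENDS_ON, and components_to_install are hypothetical, and the data is transcribed only from the entries above that are marked unconditionally "Required" or that state a "Depends on" relationship. Conditional requirements (for example, Keepproxy for access from external networks) are not modeled.

```python
# Illustrative only: this encodes the dependency notes from the component
# list. The structure and names below are assumptions, not part of Arvados.

# Components the list marks as unconditionally "Required."
REQUIRED = {"Postgres database", "API server", "Keepstore"}

# Optional components and the components they are stated to depend on.
DEPENDS_ON = {
    "Workbench": {
        "API server",
        "Single Sign On server",
        "Keep-web",
        "Websockets server",
    },
    "Workflow Composer": {"Git server"},
}


def components_to_install(wanted):
    """Return the required core plus `wanted` and everything it depends on."""
    selected = set(REQUIRED)
    pending = list(wanted)
    while pending:
        name = pending.pop()
        if name in selected:
            continue
        selected.add(name)
        # Follow stated dependencies; components without any add nothing.
        pending.extend(DEPENDS_ON.get(name, ()))
    return selected


if __name__ == "__main__":
    for name in sorted(components_to_install(["Workbench"])):
        print(name)
```

Calling components_to_install with an empty list returns only the required core, which corresponds to a data-management-only starting point before adding gateways or Crunch.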


The content of this documentation is licensed under the Creative Commons Attribution-Share Alike 3.0 United States licence.
Code samples in this documentation are licensed under the Apache License, Version 2.0.