arvados-dispatch-lsf is only relevant for on-premises clusters that will spool jobs to LSF. Skip this section if you use Slurm or if you are installing a cloud cluster.
Containers can be dispatched to an LSF cluster. The dispatcher sends work to the cluster using LSF's bsub command, so it works in a variety of LSF configurations.
In order to run containers, you must choose a user that has permission to set up FUSE mounts and run Singularity/Docker containers on each compute node. This install guide refers to this user as the crunch user. We recommend you create this user on each compute node with the same UID and GID, and add it to the fuse and docker system groups to grant it the necessary permissions. However, you can run the dispatcher under any account with sufficient permissions across the cluster.
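As a rough sketch, the account could be created like this on each compute node (the UID/GID value 1101 is an arbitrary assumption; pick an unused pair you can reproduce on every node):
# Run on each compute node. Use the same UID/GID everywhere so file
# ownership is consistent across the cluster.
groupadd --gid 1101 crunch
useradd --uid 1101 --gid 1101 --create-home crunch
# Grant FUSE and Docker access. The fuse group may not exist on every
# distribution; create it or adjust to your site's conventions.
usermod -aG fuse,docker crunch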
Set up all of your compute nodes with Docker or Singularity.
Current limitations:
Arvados container priority is not propagated to LSF job priority. This can cause inefficient use of compute resources, and even deadlock if there are fewer compute nodes than concurrent Arvados workflows.
arvados-dispatch-lsf reads the common configuration file at /etc/arvados/config.yml.
Add a DispatchLSF entry to the Services section, using the hostname where arvados-dispatch-lsf will run, and an available port:
Services:
  DispatchLSF:
    InternalURLs:
      "http://hostname.zzzzz.arvadosapi.com:9007": {}
Review the following configuration parameters and adjust as needed.
Each Arvados container that runs on your HPC cluster will bring up a long-lived connection to the Arvados controller and keep it open for the entire duration of the container. This connection is used to access real-time container logs from Workbench, and to enable the container shell feature.
Set the MaxGatewayTunnels config entry high enough to accommodate the maximum number of containers you expect to run concurrently on your HPC cluster, plus incoming container shell sessions.
API:
  MaxGatewayTunnels: 2000
Also, configure Nginx (and any other HTTP proxies or load balancers running between the HPC and Arvados controller) to allow the expected number of connections, i.e., MaxConcurrentRequests + MaxQueuedRequests + MaxGatewayTunnels.
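For example, an Nginx sketch might reserve connection capacity like this (the numbers are assumptions; substitute your cluster's configured values plus headroom for other traffic):
events {
    # Assumption: MaxConcurrentRequests (64) + MaxQueuedRequests (128)
    # + MaxGatewayTunnels (2000), rounded up for other connections.
    worker_connections 4096;
}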
arvados-dispatch-lsf uses sudo to execute bsub, for example sudo -E -u crunch bsub [...]. This means the crunch account must exist on the hosts where LSF jobs run ("execution hosts"), as well as on the host where you are installing the Arvados LSF dispatcher (the "submission host"). To use a user account other than crunch, configure BsubSudoUser:
Containers:
  LSF:
    BsubSudoUser: lsfuser
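If the dispatcher runs as a non-root account, that account also needs a sudoers rule permitting it to run bsub as the configured BsubSudoUser. A minimal sketch, assuming the dispatcher runs as a user named arvados and bsub is installed at /opt/lsf/bin/bsub (both assumptions; adjust for your site):
# /etc/sudoers.d/arvados-dispatch-lsf (illustrative only)
# Allow "arvados" to run bsub as "crunch" without a password. The SETENV
# tag permits the -E flag, which preserves the dispatcher's environment.
Defaults:arvados !requiretty
arvados ALL=(crunch) NOPASSWD:SETENV: /opt/lsf/bin/bsub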
Alternatively, you can arrange for the arvados-dispatch-lsf process to run as an unprivileged user that has a corresponding account on all compute nodes, and disable the use of sudo by specifying an empty string:
Containers:
  LSF:
    # Don't use sudo
    BsubSudoUser: ""
When arvados-dispatch-lsf invokes bsub, you can add arguments to the command by specifying BsubArgumentsList. You can use this to send the jobs to specific cluster partitions or add resource requests. Set BsubArgumentsList to an array of strings.
Template variables starting with % will be substituted as follows:
%U uuid
%C number of VCPUs
%M memory in MB
%T tmp in MB
%G number of GPU devices (runtime_constraints.cuda.device_count)
%W maximum job run time in minutes, suitable for use with -W or -We flags (see MaxRunTimeOverhead and MaxRunTimeDefault below)
Use %% to express a literal %. The %%J in the default will be changed to %J, which is interpreted by bsub itself.
For example:
Containers:
  LSF:
    BsubArgumentsList: ["-o", "/tmp/crunch-run.%%J.out", "-e", "/tmp/crunch-run.%%J.err", "-J", "%U", "-n", "%C", "-D", "%MMB", "-R", "rusage[mem=%MMB:tmp=%TMB] span[hosts=1]", "-R", "select[mem>=%MMB]", "-R", "select[tmp>=%TMB]", "-R", "select[ncpus>=%C]", "-We", "%W"]
Note that the default value for BsubArgumentsList uses the -o and -e arguments to write stdout/stderr data to files in /tmp on the compute nodes, which is helpful for troubleshooting installation/configuration problems. Ensure you have something in place to delete old files from /tmp, or adjust these arguments accordingly.
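As an illustration (not literal dispatcher output), consider a hypothetical container with UUID zzzzz-dz642-0123456789abcde requesting 4 VCPUs, 8192 MB of RAM, and 16384 MB of tmp space, with a 60-minute max_run_time and a 5-minute MaxRunTimeOverhead (all assumed values). The default arguments above would expand to roughly:
sudo -E -u crunch bsub -o /tmp/crunch-run.%J.out -e /tmp/crunch-run.%J.err \
    -J zzzzz-dz642-0123456789abcde -n 4 -D 8192MB \
    -R "rusage[mem=8192MB:tmp=16384MB] span[hosts=1]" \
    -R "select[mem>=8192MB]" -R "select[tmp>=16384MB]" -R "select[ncpus>=4]" \
    -We 65 [...]
Note that %%J has become %J, which bsub itself expands to the LSF job ID.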
If the container requests access to GPUs (runtime_constraints.cuda.device_count of the container request is greater than zero), the command line arguments in BsubCUDAArguments will be added to the command line after BsubArgumentsList. This should consist of the additional bsub flags your site requires to schedule the job on a node with GPU support. Set BsubCUDAArguments to an array of strings. For example:
Containers:
  LSF:
    BsubCUDAArguments: ["-gpu", "num=%G"]
MaxRunTimeOverhead is extra time to add to each container's scheduling_parameters.max_run_time value when substituting for %W in BsubArgumentsList, to account for time spent setting up the container image, copying output files, etc.
MaxRunTimeDefault is the default max_run_time value to use for containers that do not specify one in scheduling_parameters.max_run_time. If this is zero, and BsubArgumentsList contains "-W", "%W" or "-We", "%W", those arguments will be dropped when submitting containers that do not specify scheduling_parameters.max_run_time.
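A minimal sketch combining the two settings (the values are illustrative, not recommendations):
Containers:
  LSF:
    # Add 5 minutes to each container's max_run_time before substituting %W
    MaxRunTimeOverhead: 5m
    # Treat containers that don't set max_run_time as limited to 24 hours
    MaxRunTimeDefault: 24h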
arvados-dispatch-lsf polls the API server periodically for new containers to run. The PollInterval option controls how often this poll happens. Set this to a string of numbers suffixed with one of the time units s, m, or h. For example:
Containers:
  PollInterval: 10s
ReserveExtraRAM is extra RAM to reserve (in bytes) on each LSF job submitted by Arvados, which is added to the amount specified in the container's runtime_constraints. If not provided, the default value is zero.
Supports suffixes KB, KiB, MB, MiB, GB, GiB, TB, TiB, PB, PiB, EB, EiB (where KB is 10³, KiB is 2¹⁰, MB is 10⁶, MiB is 2²⁰, and so forth).
Containers:
  ReserveExtraRAM: 256MiB
Older Linux kernels (prior to 3.18) have bugs in network namespace handling which can lead to compute node lockups. This is indicated by blocked kernel tasks in "Workqueue: netns cleanup_net". If you are experiencing this problem, as a workaround you can disable use of network namespaces by Docker across the cluster. Be aware this reduces container isolation, which may be a security risk.
Containers:
  CrunchRunArgumentsList:
    - "-container-enable-networking=always"
    - "-container-network-mode=host"
LSF does not provide feedback when a submitted job’s RAM, CPU, or disk space constraints cannot be satisfied by any node: the job will wait in the queue indefinitely with “pending” status, reported by Arvados as “queued”.
As a workaround, you can configure InstanceTypes with your LSF cluster's compute node sizes. Arvados will use these sizes to determine when a container is impossible to run, and cancel it instead of submitting an LSF job.
Apart from detecting non-runnable containers, the configured instance types will not have any effect on scheduling.
InstanceTypes:
  most-ram:
    VCPUs: 8
    RAM: 640GiB
    IncludedScratch: 640GB
  most-cpus:
    VCPUs: 32
    RAM: 256GiB
    IncludedScratch: 640GB
  gpu:
    VCPUs: 8
    RAM: 256GiB
    IncludedScratch: 640GB
    CUDA:
      DriverVersion: "11.4"
      HardwareCapability: "7.5"
      DeviceCount: 1
On Red Hat, AlmaLinux, and Rocky Linux:
# dnf install arvados-dispatch-lsf
On Debian and Ubuntu:
# apt install arvados-dispatch-lsf
Enable and start the service, then confirm it is running:
# systemctl enable --now arvados-dispatch-lsf
# systemctl status arvados-dispatch-lsf
[...]
If systemctl status indicates it is not running, use journalctl to check logs for errors:
# journalctl -n12 --unit arvados-dispatch-lsf
Make sure the cluster config file is up to date on the API server host, then restart the API server and controller processes to ensure the configuration changes are visible to the whole cluster.
# systemctl restart nginx arvados-controller
# arvados-server check
On the dispatch node, start monitoring the arvados-dispatch-lsf logs:
# journalctl -o cat -fu arvados-dispatch-lsf.service
In another terminal window, use the diagnostics tool to run a simple container.
# arvados-client sudo diagnostics
INFO 5: running health check (same as `arvados-server check`)
INFO 10: getting discovery document from https://zzzzz.arvadosapi.com/discovery/v1/apis/arvados/v1/rest
...
INFO 160: running a container
INFO ... container request submitted, waiting up to 10m for container to run
After performing a number of other quick tests, this will submit a new container request and wait for it to finish.
While the diagnostics tool is waiting, the arvados-dispatch-lsf logs will show details about submitting an LSF job to run the container.
The content of this documentation is licensed under the Creative Commons Attribution-Share Alike 3.0 United States licence. Code samples in this documentation are licensed under the Apache License, Version 2.0.