Linux can report what compute resources are used by processes in a specific cgroup or Docker container. Crunch can use these reports to share that information with users running compute work. This can help workflow authors debug and optimize their workflows.
To enable cgroups accounting, you must boot Linux with the command line parameters cgroup_enable=memory swapaccount=1.
Arvados is not currently compatible with the new cgroups accounting, also known as cgroups v2. None of the currently supported GNU/Linux distributions use cgroups v2 by default.
If your compute nodes run a distribution that ships with cgroups v2 enabled, make sure to disable it by booting Linux with the command line parameter systemd.unified_cgroup_hierarchy=0.
After making changes, reboot the system for them to take effect.
On Red Hat and CentOS, which provide grubby, run:
~$ sudo grubby --update-kernel=ALL --args='cgroup_enable=memory swapaccount=1 systemd.unified_cgroup_hierarchy=0'
On Debian and Ubuntu, which provide update-grub, open the file /etc/default/grub in an editor. Find where the string GRUB_CMDLINE_LINUX is set. Add cgroup_enable=memory swapaccount=1 systemd.unified_cgroup_hierarchy=0 to that string. Save the file and exit the editor. Then run:
~$ sudo update-grub
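After the reboot, one way to confirm that the parameters took effect is to inspect the running kernel's command line. The following is a minimal sketch, assuming all three parameters from the commands above are wanted. The names `REQUIRED_PARAMS` and `missing_boot_params` are hypothetical helpers introduced for illustration and are not part of Arvados.

```shell
# Parameters named in this guide. Drop systemd.unified_cgroup_hierarchy=0
# if your distribution does not ship with cgroups v2 enabled.
REQUIRED_PARAMS="cgroup_enable=memory swapaccount=1 systemd.unified_cgroup_hierarchy=0"

# Print the required parameters that are absent from a kernel command line.
# $1: command line to check (defaults to the running kernel's /proc/cmdline).
missing_boot_params() {
    cmdline=${1:-$(cat /proc/cmdline)}
    missing=""
    for param in $REQUIRED_PARAMS; do
        # Pad with spaces so only whole parameters match.
        case " $cmdline " in
            *" $param "*) ;;
            *) missing="$missing $param" ;;
        esac
    done
    # Strip the leading space left by the loop.
    echo "${missing# }"
}

# Report on the running kernel when /proc/cmdline is available.
if [ -r /proc/cmdline ]; then
    missing=$(missing_boot_params)
    if [ -n "$missing" ]; then
        echo "Missing kernel parameters: $missing"
    else
        echo "All required kernel parameters are present."
    fi
fi
```

As a complementary check, `stat -fc %T /sys/fs/cgroup/` reports the filesystem type mounted there: `cgroup2fs` indicates cgroups v2 is in use, while `tmpfs` indicates the v1 hierarchy.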
Compute nodes must have Docker installed to run containers. This requires a relatively recent version of Linux (at least upstream version 3.10, or a distribution version with the appropriate patches backported). Follow the Docker Engine installation documentation for your distribution.
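The kernel version requirement can be checked roughly before installing Docker. The sketch below compares the upstream version reported by `uname -r` against 3.10 using GNU `sort -V`. The function name `kernel_at_least` is a hypothetical helper, and the check is only a heuristic: a distribution kernel with backported patches may work even if this comparison fails.

```shell
# Succeed if a kernel release is at least the required upstream version.
# $1: required version, e.g. 3.10
# $2: kernel release to test (defaults to the running kernel's `uname -r`)
kernel_at_least() {
    required=$1
    actual=${2:-$(uname -r)}
    # Drop the distribution suffix: 5.15.0-91-generic -> 5.15.0
    actual_base=${actual%%-*}
    # Version-sort both values; the requirement is met if it sorts first.
    lowest=$(printf '%s\n%s\n' "$required" "$actual_base" | sort -V | head -n 1)
    [ "$lowest" = "$required" ]
}

if kernel_at_least 3.10; then
    echo "Kernel $(uname -r) meets the 3.10 minimum."
else
    echo "Kernel $(uname -r) is older than 3.10; check for backported patches."
fi
```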
Make sure Docker is enabled to start on boot:
# systemctl enable --now docker
Depending on your anticipated workload or cluster configuration, you may need to tweak Docker options.
For information about how to set configuration options for the Docker daemon, see https://docs.docker.com/config/daemon/systemd/
Docker containers inherit ulimits from the Docker daemon. However, the ulimits for a single Unix daemon may not accommodate a long-running Crunch job. You may want to increase default limits for compute containers by passing --default-ulimit options to the Docker daemon. For example, to allow containers to open 10,000 files, set --default-ulimit nofile=10000:10000.
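As an alternative to the command line flag, the Docker daemon also reads a `default-ulimits` setting from its JSON configuration file, usually /etc/docker/daemon.json. The sketch below only prints such a fragment for a given open-file limit. The function name `docker_ulimit_config` is a hypothetical helper, and the file location is an assumption that may differ on your system.

```shell
# Print a daemon.json fragment that sets the default open-file limit
# (soft and hard) for all containers.
# $1: the limit to apply, e.g. 10000
docker_ulimit_config() {
    limit=$1
    cat <<EOF
{
  "default-ulimits": {
    "nofile": {
      "Name": "nofile",
      "Soft": $limit,
      "Hard": $limit
    }
  }
}
EOF
}

# Preview the fragment equivalent to --default-ulimit nofile=10000:10000
docker_ulimit_config 10000
```

Merge the printed fragment into any existing daemon.json by hand rather than overwriting the file, then restart the Docker daemon. Set the option in only one place: the daemon reports a conflict if the same option is given both as a flag and in the configuration file.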
Symptom: workflows fail with an error such as ValidationException: Not found: '/var/lib/cwl/workflow.json#main'
A possible configuration error is having Docker installed as a snap package rather than a deb package. This is a problem because snap packages are partially containerized and may have a different view of the filesystem than crunch-run. This produces confusing problems: for example, directory bind mounts sent to Docker may be empty (instead of containing the intended files), resulting in unexpected “file not found” errors.
To check for this situation, run snap list and look for docker. If found, run snap remove docker and follow the instructions above to install Docker Engine.
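The snap check can also be scripted. This is a minimal sketch, assuming `snap list` prints a header row followed by one package per line with the package name in the first column. The function name `has_docker_snap` is a hypothetical helper.

```shell
# Succeed if `snap list` output read from stdin contains a package named
# exactly "docker". The first row is skipped because it is the header.
has_docker_snap() {
    awk 'NR > 1 && $1 == "docker" { found = 1 } END { exit !found }'
}

# Only query snap if it is installed on this node.
if command -v snap >/dev/null 2>&1 && snap list 2>/dev/null | has_docker_snap; then
    echo "Docker is installed as a snap; remove it and install Docker Engine instead."
fi
```

Matching on the exact name avoids false positives from other snaps whose names merely begin with "docker".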
The content of this documentation is licensed under the Creative Commons Attribution-Share Alike 3.0 United States licence.
Code samples in this documentation are licensed under the Apache License, Version 2.0.