containers

API endpoint base: https://pirca.arvadosapi.com/arvados/v1/containers

Object type: dz642

Example UUID: zzzzz-dz642-0123456789abcde
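
As a quick illustration of the object type above, the following sketch checks whether a string looks like a container UUID. The layout (a 5-character cluster prefix, the object type, and a 15-character identifier, all lowercase alphanumeric) is an assumption inferred from the example UUID, and the helper name is hypothetical.

```python
import re

# Assumed layout, inferred from the example UUID on this page:
# <5-char cluster prefix>-<object type>-<15-char identifier>
CONTAINER_UUID_RE = re.compile(r"[0-9a-z]{5}-dz642-[0-9a-z]{15}")


def is_container_uuid(value: str) -> bool:
    """Return True if value looks like a container UUID (object type dz642)."""
    return CONTAINER_UUID_RE.fullmatch(value) is not None


print(is_container_uuid("zzzzz-dz642-0123456789abcde"))  # True
print(is_container_uuid("zzzzz-s0uqq-xxxxxxxxxxxxxxx"))  # False
```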

Resource

A container is a work order to be dispatched to an Arvados cluster to perform some computational work. A container is created in response to a container request. See computing with Crunch for details.

Each Container offers the following attributes, in addition to the Common resource fields:

Attribute Type Description Notes
state string The allowed states are “Queued”, “Locked”, “Running”, “Cancelled” and “Complete”. See Container states for more details.
started_at datetime When this container started running. Null if container has not yet started.
finished_at datetime When this container finished. Null if container has not yet finished.
log string UUID or portable data hash of a collection containing the log messages produced when executing the container. PDH after the container is finished, otherwise UUID or null.
environment hash Environment variables and values that should be set in the container environment (docker run --env). This augments and (when conflicts exist) overrides environment variables given in the image’s Dockerfile. Must be equal to a ContainerRequest’s environment in order to satisfy the ContainerRequest.
cwd string Initial working directory. Must be equal to a ContainerRequest’s cwd in order to satisfy the ContainerRequest.
command array of strings Command to execute. Must be equal to a ContainerRequest’s command in order to satisfy the ContainerRequest.
output_path string Path to a directory or file inside the container that should be preserved as this container’s output when it finishes. Must be equal to a ContainerRequest’s output_path in order to satisfy the ContainerRequest.
mounts hash Must contain the same keys as the ContainerRequest being satisfied. Each value must be within the range of values described in the ContainerRequest at the time the Container is assigned to the ContainerRequest. See Mount types for more details.
secret_mounts hash Must contain the same keys as the ContainerRequest being satisfied. Each value must be within the range of values described in the ContainerRequest at the time the Container is assigned to the ContainerRequest. Not returned in API responses. Reset to empty when state is “Complete” or “Cancelled”.
runtime_constraints hash Compute resources, and access to the outside world, that are / were available to the container.
Generally this will contain additional keys that are not present in any corresponding ContainerRequests: for example, even if no ContainerRequests specified constraints on the number of CPU cores, the number of cores actually used will be recorded here.
e.g.,
{
  "ram":12000000000,
  "vcpus":2,
  "API":true
}
See Runtime constraints for more details.
runtime_status hash Information related to the container’s run, including its steps. Some keys have specific meaning and are described later in this page. e.g.,
{
  "error": "This container won't be successful because at least one step has already failed."
}
See Runtime status for more details.
scheduling_parameters hash Parameters to be passed to the container scheduler when running this container. e.g.,
{
  "partitions":["fastcpu","vfastcpu"]
}
See Scheduling parameters for more details.
output string Portable data hash of the output collection. Null if the container is not yet finished.
container_image string Portable data hash of a collection containing the docker image used to run the container.
progress number A number between 0.0 and 1.0 describing the fraction of work done.
priority integer Range 0-1000. Indicates scheduling order preference. Currently assigned by the system as the max() of the priorities of all associated ContainerRequests. See container request priority.
exit_code integer Process exit code. Null if state != “Complete”.
auth_uuid string UUID of a token to be passed into the container itself, used to access Keep-backed mounts, etc. Automatically assigned. Null if state∉{"Locked","Running"} or if runtime_token was provided.
locked_by_uuid string UUID of a token, indicating which dispatch process changed state to Locked. If null, any token can be used to lock. If not null, only the indicated token can modify this container. Null if state∉{"Locked","Running"}.
runtime_token string A v2 token to be passed into the container itself, used to access Keep-backed mounts, etc. Not returned in API responses. Reset to null when state is “Complete” or “Cancelled”.
gateway_address string Address (host:port) of gateway server. Internal use only.
interactive_session_started boolean Indicates whether arvados-client shell has been used to run commands in the container, which may have altered the container’s behavior and output.
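
Several attributes above must match a ContainerRequest for the container to satisfy it. The sketch below expresses those rules as a plain dictionary comparison. It is a simplification, not the server's actual logic: the function name is hypothetical, and the mounts check only compares key sets, not whether each value is within the range the request describes.

```python
def satisfies(container: dict, request: dict) -> bool:
    """Check the equality and key-set rules from the attribute table.

    Simplified: mount values are not compared, only the sets of mount keys.
    """
    for field in ("environment", "cwd", "command", "output_path"):
        if container.get(field) != request.get(field):
            return False
    return set(container.get("mounts", {})) == set(request.get("mounts", {}))


request = {
    "environment": {},
    "cwd": "/",
    "command": ["echo", "hello"],
    "output_path": "/tmp",
    "mounts": {"/tmp": {"kind": "tmp", "capacity": 1000000000}},
}
container = dict(request)
print(satisfies(container, request))                      # True
print(satisfies({**container, "cwd": "/home"}, request))  # False
```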

Container states

State Significance Allowed next
Queued Waiting for a dispatcher to lock it and try to run the container. Locked, Cancelled
Locked A dispatcher has “taken” the container and is allocating resources for it. The container has not started yet. Queued, Running, Cancelled
Running Resources have been allocated and the contained process has been started (or is about to start). Crunch-run must set state to Running before there is any possibility that user code will run in the container. Complete, Cancelled
Complete Container was running, and the contained process/command has exited. Cancelled
Cancelled The container did not run long enough to produce an exit code. This includes cases where the container didn’t even start, cases where the container was interrupted/killed before it exited by itself (e.g., priority changed to 0), and cases where some problem prevented the system from capturing the contained process’s exit status (exit code and output). -

See Controlling container reuse for details about changing state from Complete to Cancelled.
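
The state table above can be read as a small transition map. The following sketch encodes the "Allowed next" column and checks a proposed state change. The names are illustrative, and the API server may enforce further conditions that are not modeled here.

```python
# "Allowed next" column of the container states table.
ALLOWED_NEXT = {
    "Queued": {"Locked", "Cancelled"},
    "Locked": {"Queued", "Running", "Cancelled"},
    "Running": {"Complete", "Cancelled"},
    "Complete": {"Cancelled"},
    "Cancelled": set(),
}


def can_transition(current: str, new: str) -> bool:
    """Return True if the table allows moving from current to new."""
    return new in ALLOWED_NEXT.get(current, set())


print(can_transition("Queued", "Locked"))     # True
print(can_transition("Queued", "Running"))    # False
print(can_transition("Cancelled", "Queued"))  # False
```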

Mount types

The “mounts” hash is the primary mechanism for adding data to the container at runtime (beyond what is already in the container image).

Each value of the “mounts” hash is itself a hash, whose “kind” key determines the handler used to attach data to the container.

Mount type Kind Description Examples
Arvados data collection collection "portable_data_hash" or "uuid" may be provided. If not provided, a new collection will be created. This is useful when "writable":true and the container’s output_path is (or is a subdirectory of) this mount target.
"writable" may be provided with a true or false to indicate the path must (or must not) be writable. If not specified, the system can choose.
"path" may be provided, and defaults to "/".
At container startup, the target path will have the same directory structure as the given path within the collection. Even if the files/directories are writable in the container, modifications will not be saved back to the original collections when the container ends.
{
 "kind":"collection",
 "uuid":"...",
 "path":"/foo.txt"
}
{
 "kind":"collection",
 "uuid":"..."
}
Git tree git_tree "uuid" must be the UUID of an Arvados-hosted git repository.
"commit" must be a full 40-character commit hash.
"path", if provided, must be “/”.
At container startup, the target path will have the source tree indicated by the given commit. The .git metadata directory will not be available.
{
 "kind":"git_tree",
 "uuid":"zzzzz-s0uqq-xxxxxxxxxxxxxxx",
 "commit":"f315c59f90934cccae6381e72bba59d27ba42099"
}
Temporary directory tmp "capacity": capacity (in bytes) of the storage device.
"device_type" (optional, default “network”): one of {"ram", "ssd", "disk", "network"} indicating the acceptable level of performance. (note: not yet implemented as of v1.5)
At container startup, the target path will be empty. When the container finishes, the content will be discarded. This will be backed by a storage mechanism no slower than the specified type.
{
 "kind":"tmp",
 "capacity":100000000000
}
{
 "kind":"tmp",
 "capacity":1000000000,
 "device_type":"ram"
}
Keep keep Expose all readable collections via arv-mount.
Requires suitable runtime constraints.
{
 "kind":"keep"
}
Mounted file or directory file "path": absolute path (inside the container) of a file or directory that is (or is inside) another mount target.
Can be used for “stdin” and “stdout” targets.
{
 "kind":"file",
 "path":"/mounted_tmp/a.out"
}
JSON document json A JSON-encoded string, array, or object.
{
 "kind":"json",
 "content":{"foo":"bar"}
}
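
To make the per-kind rules above concrete, here is a minimal validation sketch. The function name and messages are hypothetical, it covers only the constraints stated in the table, and it assumes a commit hash consists of lowercase hexadecimal digits.

```python
import re

DEVICE_TYPES = {"ram", "ssd", "disk", "network"}
KNOWN_KINDS = {"collection", "git_tree", "tmp", "keep", "file", "json"}


def mount_problems(mount: dict) -> list:
    """Return a list of human-readable problems found in one mount value."""
    problems = []
    kind = mount.get("kind")
    if kind not in KNOWN_KINDS:
        return [f"unknown kind: {kind!r}"]
    if kind == "git_tree":
        # Assumption: "full 40-character commit hash" means 40 hex digits.
        if not re.fullmatch(r"[0-9a-f]{40}", str(mount.get("commit", ""))):
            problems.append("commit must be a full 40-character hash")
        if mount.get("path", "/") != "/":
            problems.append('path, if provided, must be "/"')
    elif kind == "tmp":
        if not isinstance(mount.get("capacity"), int):
            problems.append("capacity (in bytes) is required")
        if mount.get("device_type", "network") not in DEVICE_TYPES:
            problems.append("device_type must be ram, ssd, disk or network")
    elif kind == "file":
        if not str(mount.get("path", "")).startswith("/"):
            problems.append("path must be an absolute path")
    return problems


print(mount_problems({"kind": "tmp", "capacity": 1000000000, "device_type": "ram"}))  # []
print(mount_problems({"kind": "git_tree", "commit": "f315c59"}))
```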

Pre-populate output using Mount points

When a container’s output_path is a tmp mount backed by local disk, this output directory can be pre-populated with content from existing collections. This content can be specified by mounting collections at mount points that are subdirectories of output_path. Certain restrictions apply:

1. Only mount points of kind collection are supported.

2. Mount points underneath output_path which have "writable":true are copied into output_path during container initialization and may be updated, renamed, or deleted by the running container. The original collection is not modified. On container completion, files remaining in the output are saved to the output collection. The mount at output_path must be big enough to accommodate copies of the inner writable mounts.

3. If any such mount points are configured as "exclude_from_output":true, they will be excluded from the output.

If any process in the container tries to modify, remove, or rename these mount points or anything underneath them, the operation will fail and the container output and the underlying collections used to pre-populate are unaffected.
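
The restrictions above can be sketched as a classification of the mounts that sit underneath output_path. This is an illustrative reading of the rules, not the dispatcher's implementation: the function name and category labels are hypothetical, and letting exclude_from_output take precedence over writable is an assumption.

```python
import posixpath


def plan_output_mounts(mounts: dict, output_path: str) -> dict:
    """Classify mount points located strictly underneath output_path."""
    plan = {"copied": [], "excluded": [], "read_only": []}
    prefix = output_path.rstrip("/") + "/"
    for target, mount in sorted(mounts.items()):
        if not posixpath.normpath(target).startswith(prefix):
            continue  # not underneath output_path (e.g. output_path itself)
        if mount.get("kind") != "collection":
            raise ValueError(f"{target}: only collection mounts are supported")
        if mount.get("exclude_from_output"):
            plan["excluded"].append(target)   # left out of the output
        elif mount.get("writable"):
            plan["copied"].append(target)     # copied into output_path
        else:
            plan["read_only"].append(target)  # modifications will fail
    return plan


example_mounts = {
    "/tmp": {"kind": "tmp", "capacity": 1000000000},
    "/tmp/foo": {"kind": "collection", "uuid": "...", "writable": True},
    "/tmp/bar": {"kind": "collection", "uuid": "...", "exclude_from_output": True},
    "/tmp/baz": {"kind": "collection", "uuid": "..."},
}
print(plan_output_mounts(example_mounts, "/tmp"))
```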

Example mount point configurations

All the below examples are based on this collection:


portable_data_hash cdfbe2e823222d26483d52e5089d553c+175

manifest_text: ./alice 03032680d3fa0561ef4f85071140861e+13+A04e9d06459cda00aa997565bd78001061cf5bffb@58ab593d 0:13:hello.txt\n./bob d820b9df970e1b498e7723c50b107e1b+11+A42d162a60210479d1cfaf9fbb98d494ac6322ae6@58ab593d 0:11:hello.txt\n./carol cf72b172ff969250ae14a893a6745440+13+A476a2fd39e14e9c03af3076bd17e3612c075ff66@58ab593d 0:13:hello.txt\n

Mount point Description Resulting collection manifest text
"mounts": {
  "/tmp/foo": {
    "kind": "collection",
    "portable_data_hash": "cdfbe2...+175"
  }
},
"output_path": "/tmp"
No path specified and hence the entire collection will be mounted. ./foo/alice 030326… 0:13:hello.txt\n
./foo/bob d820b9… 0:11:hello.txt\n
./foo/carol cf72b1… 0:13:hello.txt\n
Note: Here the “.” in streams is replaced with foo.
"mounts": {
  "/tmp/foo/bar": {
    "kind": "collection",
    "portable_data_hash": "cdfbe2...+175",
    "path": "alice"
  }
},
"output_path": "/tmp"
Specified path refers to the subdirectory alice in the collection. ./foo/bar 030326… 0:13:hello.txt\n
Note: only the manifest text segment for the subdirectory alice is included after replacing the subdirectory alice with foo/bar.
"mounts": {
  "/tmp/foo/bar": {
    "kind": "collection",
    "portable_data_hash": "cdfbe2...+175",
    "path": "alice/hello.txt"
  }
},
"output_path": "/tmp"
Specified path refers to the file hello.txt in the alice subdirectory. ./foo 030326… 0:13:bar\n
Note: Here the subdirectory alice is replaced with foo and the filename hello.txt from this subdirectory is replaced with bar.
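
The stream renaming in the first two examples can be sketched as a prefix rewrite on each manifest line. This is a simplified illustration with a hypothetical function name. It handles only directory paths (not the single-file case in the third example), and the permission hints on the block locators are dropped for brevity.

```python
def remap_streams(manifest_text: str, source_dir: str, dest_dir: str) -> str:
    """Keep streams at or below source_dir and rename that prefix to dest_dir.

    source_dir and dest_dir are stream names such as "." or "./alice".
    """
    out = []
    for line in manifest_text.splitlines():
        stream, _, rest = line.partition(" ")
        if stream == source_dir:
            new_stream = dest_dir
        elif stream.startswith(source_dir + "/"):
            new_stream = dest_dir + stream[len(source_dir):]
        else:
            continue  # stream is outside the mounted path
        out.append(f"{new_stream} {rest}\n")
    return "".join(out)


MANIFEST = (
    "./alice 03032680d3fa0561ef4f85071140861e+13 0:13:hello.txt\n"
    "./bob d820b9df970e1b498e7723c50b107e1b+11 0:11:hello.txt\n"
    "./carol cf72b172ff969250ae14a893a6745440+13 0:13:hello.txt\n"
)

# Whole collection mounted at /tmp/foo, with output_path /tmp.
print(remap_streams(MANIFEST, ".", "./foo"), end="")
# Subdirectory alice mounted at /tmp/foo/bar, with output_path /tmp.
print(remap_streams(MANIFEST, "./alice", "./foo/bar"), end="")
```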

When a container’s output_path is a tmp mount backed by local disk, this output directory can contain symlinks to other files in the output directory, or to collection mount points. If a symlink leads to a collection mount, the collection is efficiently copied into the output collection. Symlinks leading to files or directories are expanded and created as regular files in the output collection. Whether symlinks are relative or absolute, every symlink target (even a target that is itself a symlink) must point to a path in either the output directory or a collection mount.
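
The symlink target rule can be approximated with a purely lexical path check, as in the sketch below. The function names are hypothetical, and the check does not touch the filesystem, so it does not follow chains of symlinks the way a real implementation would need to.

```python
import posixpath


def _is_within(path: str, root: str) -> bool:
    """Return True if path equals root or lies underneath it."""
    root = posixpath.normpath(root)
    return path == root or path.startswith(root.rstrip("/") + "/")


def symlink_target_allowed(link_path, target, output_path, collection_mounts):
    """Check that a symlink target stays in the output dir or a collection mount."""
    # A relative target is resolved against the directory holding the link;
    # an absolute target replaces it entirely (posixpath.join semantics).
    resolved = posixpath.normpath(
        posixpath.join(posixpath.dirname(link_path), target)
    )
    roots = [output_path, *collection_mounts]
    return any(_is_within(resolved, root) for root in roots)


print(symlink_target_allowed("/tmp/out/link", "../data.txt", "/tmp", ["/keep/in"]))  # True
print(symlink_target_allowed("/tmp/link", "/etc/passwd", "/tmp", ["/keep/in"]))      # False
```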

Runtime constraints

Runtime constraints restrict the container’s access to compute resources and the outside world (in addition to its explicitly stated inputs and output).

Key Type Description Notes
ram integer Number of bytes of RAM to be used to run this process. Optional. However, a ContainerRequest that is in “Committed” state must provide this.
vcpus integer Number of cores to be used to run this process. Optional. However, a ContainerRequest that is in “Committed” state must provide this.
keep_cache_ram integer Number of bytes of Keep cache to be used to run this process. Optional.
API boolean When set, ARVADOS_API_HOST and ARVADOS_API_TOKEN will be set, and container will have networking enabled to access the Arvados API server. Optional.
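
As a small illustration of the notes above, this sketch reports which keys a ContainerRequest would still need before it could be in “Committed” state. The function name is hypothetical, and only the presence of the keys is checked.

```python
REQUIRED_WHEN_COMMITTED = ("ram", "vcpus")


def missing_for_commit(runtime_constraints: dict) -> list:
    """List the runtime constraint keys a committed request must still provide."""
    return [key for key in REQUIRED_WHEN_COMMITTED if key not in runtime_constraints]


print(missing_for_commit({"ram": 12000000000, "vcpus": 2, "API": True}))  # []
print(missing_for_commit({"API": True}))  # ['ram', 'vcpus']
```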

Runtime status

Runtime status reports relevant information about a container’s progress, even while it is still in the Running state. It is used to avoid reusing containers that have not yet failed but definitely will, and to make workflow debugging easier.

The following keys have well known meanings:

Key Type Description Notes
error string The existence of this key indicates the container will definitely fail, or has already failed. Optional.
warning string Indicates something unusual happened or is currently happening, but isn’t considered fatal. Optional.
activity string A message for the end user about what state the container is currently in. Optional.
errorDetails string Additional structured error details. Optional.
warningDetails string Additional structured warning details. Optional.
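
A client that wants to skip containers that are bound to fail could test for the error key, as sketched below. The function name is hypothetical, and this shows only the client-side reading of the key, not how the server decides on reuse.

```python
def will_definitely_fail(runtime_status) -> bool:
    """Return True if runtime_status carries the well-known "error" key."""
    return "error" in (runtime_status or {})


status = {
    "error": "This container won't be successful because at least one step has already failed."
}
print(will_definitely_fail(status))                       # True
print(will_definitely_fail({"warning": "disk is slow"}))  # False
print(will_definitely_fail(None))                         # False
```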

Scheduling parameters

Parameters to be passed to the container scheduler (e.g., SLURM) when running a container.

Key Type Description Notes
partitions array of strings The names of one or more compute partitions that may run this container. If not provided, the system will choose where to run the container. Optional.
preemptible boolean If true, the dispatcher will ask for a preemptible cloud node instance (e.g., AWS Spot Instance) to run this container. Optional. Default is false.
max_run_time integer Maximum running time (in seconds) that this container will be allowed to run before being cancelled. Optional. Default is 0 (no limit).
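
The defaults in the table above can be applied on the client side as follows. This is a minimal sketch with hypothetical helper names. No default is filled in for partitions, because the table says the system chooses where to run the container when it is not provided.

```python
# Defaults taken from the scheduling parameters table.
DEFAULTS = {"preemptible": False, "max_run_time": 0}


def effective_scheduling_parameters(params=None) -> dict:
    """Merge user-supplied parameters over the documented defaults."""
    return {**DEFAULTS, **(params or {})}


def run_time_limit(params=None):
    """Return the limit in seconds, or None when max_run_time is 0 (no limit)."""
    limit = effective_scheduling_parameters(params)["max_run_time"]
    return None if limit == 0 else limit


print(effective_scheduling_parameters({"partitions": ["fastcpu", "vfastcpu"]}))
print(run_time_limit({"max_run_time": 3600}))  # 3600
print(run_time_limit({}))                      # None
```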

Methods

See Common resource methods for more information about create, delete, get, list, and update.

Required arguments are displayed in green.

Supports federated get and list.

create

Create a new Container.

Arguments:

Argument Type Description Location Example
container object Container resource request body

delete

Delete a Container.

This API requires admin privileges. In normal operation, it should not be used at all. API clients like Workbench might not work correctly when a container request references a container that has been deleted.

Arguments:

Argument Type Description Location Example
uuid string The UUID of the Container in question. path

get

Get a Container’s metadata by UUID.

Arguments:

Argument Type Description Location Example
uuid string The UUID of the Container in question. path
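
The sketch below builds (without sending) an HTTP request for this method using only the Python standard library. The URL follows the endpoint base and the path location of the uuid argument given on this page. The Bearer authorization scheme and the token value are assumptions, so check your cluster's client documentation before relying on them.

```python
import urllib.request

API_BASE = "https://pirca.arvadosapi.com/arvados/v1/containers"


def build_get_request(uuid: str, token: str) -> urllib.request.Request:
    """Build a GET request for one container; the caller decides when to send it."""
    return urllib.request.Request(
        f"{API_BASE}/{uuid}",
        # Assumption: the API token is passed as a Bearer token.
        headers={"Authorization": f"Bearer {token}"},
        method="GET",
    )


req = build_get_request("zzzzz-dz642-0123456789abcde", "example-token")
print(req.get_method(), req.full_url)
```

Sending the request is then a matter of passing it to urllib.request.urlopen and decoding the JSON response body.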

list

List containers.

See common resource list method.

update

Update attributes of an existing Container.

Arguments:

Argument Type Description Location Example
uuid string The UUID of the Container in question. path
container object query

auth

Get the api_client_authorization record indicated by this container’s auth_uuid, which belongs to the container’s locked_by_uuid.

Argument Type Description Location Example
uuid string path


The content of this documentation is licensed under the Creative Commons Attribution-Share Alike 3.0 United States licence.
Code samples in this documentation are licensed under the Apache License, Version 2.0.