container_requests

API endpoint base: https://qr1hi.arvadosapi.com/arvados/v1/container_requests

Object type: xvhdp

Example UUID: zzzzz-xvhdp-0123456789abcde

Resource

A container request is a request for the Arvados cluster to perform some computational work. See computing with Crunch for details.

Each ContainerRequest offers the following attributes, in addition to the Common resource fields:

All attributes are optional, unless otherwise marked as required.

Attribute Type Description Notes
name string The name of the container_request.
description string The description of the container_request.
properties hash Client-defined structured data that does not affect how the container is run.
state string The allowed states are “Uncommitted”, “Committed”, and “Final”. Once a request is Committed, the only attributes that can be modified are priority, container_uuid, and container_count_max. A request in the “Final” state cannot have any of its functional parts modified (i.e., only name, description, and properties fields can be modified).
requesting_container_uuid string The uuid of the parent container that created this container_request, if any. Represents a process tree. The priority of this container_request is inherited from the parent container. If the parent container is cancelled, this container_request will be cancelled as well.
container_uuid string The uuid of the container that satisfies this container_request. The system may return a preexisting Container that matches the container request criteria. See Container reuse for more details. Container reuse is the default behavior, but may be disabled with use_existing: false to always create a new container.
container_count_max integer Maximum number of containers to start, i.e., the maximum number of “attempts” to be made.
mounts hash Objects to attach to the container’s filesystem and stdin/stdout. See Mount types for more details.
runtime_constraints hash Restrict the container’s access to compute resources and the outside world. Required when in “Committed” state. e.g.,
{
  "ram":12000000000,
  "vcpus":2,
  "API":true
}
See Runtime constraints for more details.
scheduling_parameters hash Parameters to be passed to the container scheduler when running this container. e.g.,
{
"partitions":["fastcpu","vfastcpu"]
}
See Scheduling parameters for more details.
container_image string Portable data hash of a collection containing the docker image to run the container. Required.
environment hash Environment variables and values that should be set in the container environment (docker run --env). This augments and (when conflicts exist) overrides environment variables given in the image’s Dockerfile.
cwd string Initial working directory, given as an absolute path (in the container) or a path relative to the WORKDIR given in the image’s Dockerfile. Required.
command array of strings Command to execute in the container. Required. e.g., ["echo","hello"]
output_path string Path to a directory or file inside the container that should be preserved as container’s output when it finishes. This path must be, or be inside, one of the mount targets. For best performance, point output_path to a writable collection mount. Also, see Pre-populate output using Mount points for details regarding optional output pre-population using mount points. Required.
priority integer A higher value means more resources are spent on this container_request, i.e., it goes ahead of other queued containers, more nodes are brought up, etc. Priority 0 means a container should not be run on behalf of this request. Clients are expected to submit container requests with zero priority in order to preview the container that will be used to satisfy them. Priority can be null if and only if state!=“Committed”.
expires_at datetime After this time, priority is considered to be zero. Not yet implemented.
use_existing boolean If possible, use an existing (non-failed) container to satisfy the request instead of creating a new one. Default is true.
log_uuid string Log collection containing log messages provided by the scheduler and crunch processes. Null if the container has not yet completed.
output_uuid string Output collection created when the container finished successfully. Null if the container has failed or not yet completed.
filters string Additional constraints for satisfying the container_request, given in the same form as the filters parameter accepted by the container_requests.list API.
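
The rules given for the state attribute can be read as a lookup from state to the attributes that may still be changed. The following is a minimal sketch of that reading, not the API server's implementation; MODIFIABLE and rejected_changes are hypothetical names introduced here for illustration.

```python
# Hypothetical lookup: attributes that may still be modified in each state,
# taken literally from the description of the "state" attribute.
MODIFIABLE = {
    "Committed": {"priority", "container_uuid", "container_count_max"},
    "Final": {"name", "description", "properties"},
}


def rejected_changes(state, changes):
    """Return the attribute names in `changes` that `state` does not allow."""
    if state == "Uncommitted":
        return set()  # nothing is locked before the request is committed
    return set(changes) - MODIFIABLE[state]
```

For example, rejected_changes("Committed", {"priority": 0, "command": []}) reports command, because a committed request no longer accepts a new command.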

Mount types

The “mounts” hash is the primary mechanism for adding data to the container at runtime (beyond what is already in the container image).

Each value of the “mounts” hash is itself a hash, whose “kind” key determines the handler used to attach data to the container.

Mount type Kind Description Examples
Arvados data collection collection "portable_data_hash" or "uuid" may be provided. If not provided, a new collection will be created. This is useful when "writable":true and the container’s output_path is (or is a subdirectory of) this mount target.
"writable" may be provided with a true or false to indicate the path must (or must not) be writable. If not specified, the system can choose.
"path" may be provided, and defaults to "/".
At container startup, the target path will have the same directory structure as the given path within the collection. Even if the files/directories are writable in the container, modifications will not be saved back to the original collections when the container ends.
{
 "kind":"collection",
 "uuid":"...",
 "path":"/foo.txt"
}
{
 "kind":"collection",
 "uuid":"..."
}
Git tree git_tree One of { "git-url", "repository_name", "uuid" } must be provided.
One of { "commit", "revisions" } must be provided.
“path” may be provided. The default path is “/”.
At container startup, the target path will have the source tree indicated by the given revision. The .git metadata directory will not be available: typically the system will use git-archive rather than git-checkout to prepare the target directory.
- If a value is given for "revisions", it will be resolved to a set of commits (as described in the “ranges” section of git-revisions(1)) and the container request will be satisfiable by any commit in that set.
- If a value is given for "commit", it will be resolved to a single commit, and the tree resulting from that commit will be used.
- "path" can be used to select a subdirectory or a single file from the tree indicated by the selected commit.
- Multiple commits can resolve to the same tree: for example, the file/directory given in "path" might not have changed between commits A and B.
- The resolved mount (found in the Container record) will have only the “kind” key and a “blob” or “tree” key indicating the 40-character hash of the git tree/blob used.
{
 "kind":"git_tree",
 "uuid":"zzzzz-s0uqq-xxxxxxxxxxxxxxx",
 "commit":"master"
}
{
 "kind":"git_tree",
 "uuid":"zzzzz-s0uqq-xxxxxxxxxxxxxxx",
 "revisions":"bugfix^..master",
 "path":"/crunch_scripts/grep"
}
Temporary directory tmp "capacity": capacity (in bytes) of the storage device.
"device_type" (optional, default “network”): one of {"ram", "ssd", "disk", "network"} indicating the acceptable level of performance.
At container startup, the target path will be empty. When the container finishes, the content will be discarded. This will be backed by a storage mechanism no slower than the specified type.
{
 "kind":"tmp",
 "capacity":100000000000
}
{
 "kind":"tmp",
 "capacity":1000000000,
 "device_type":"ram"
}
Keep keep Expose all readable collections via arv-mount.
Requires suitable runtime constraints.
{
 "kind":"keep"
}
Mounted file or directory file "path": absolute path (inside the container) of a file or directory that is (or is inside) another mount target.
Can be used for “stdin” and “stdout” targets.
{
 "kind":"file",
 "path":"/mounted_tmp/a.out"
}
JSON document json A JSON-encoded string, array, or object.
{
 "kind":"json",
 "content":{"foo":"bar"}
}
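
The per-kind requirements in the table above can be checked before a request is submitted. This is a sketch under stated assumptions: mount_problems is a hypothetical client-side helper, "one of" is read as "at least one", and "capacity" is treated as required for tmp mounts because only "device_type" is marked optional. The API server may apply different or additional checks.

```python
KNOWN_KINDS = {"collection", "git_tree", "tmp", "keep", "file", "json"}


def mount_problems(mount):
    """Return a list of problems found in one value of the "mounts" hash."""
    kind = mount.get("kind")
    if kind not in KNOWN_KINDS:
        return ["unknown kind: %r" % (kind,)]
    problems = []
    if kind == "git_tree":
        # A repository key and a revision key are both needed.
        if not any(k in mount for k in ("git-url", "repository_name", "uuid")):
            problems.append("git_tree needs git-url, repository_name, or uuid")
        if not any(k in mount for k in ("commit", "revisions")):
            problems.append("git_tree needs commit or revisions")
    elif kind == "tmp" and "capacity" not in mount:
        # Assumption: capacity is mandatory, since it is not marked optional.
        problems.append("tmp needs capacity")
    elif kind == "file" and not str(mount.get("path", "")).startswith("/"):
        problems.append("file needs an absolute path")
    return problems
```

An empty list means the mount passed these checks only; it does not guarantee that the cluster will accept it.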

Pre-populate output using Mount points

When a container’s output_path is a tmp mount backed by local disk, this output directory can be pre-populated with content from existing collections. This content can be specified by mounting collections at mount points that are subdirectories of output_path. Certain restrictions apply:

1. Only mount points of kind collection are supported.

2. Mount points underneath output_path must not use "writable":true. If any of them are set as writable, the API will refuse to create/update the container request, and crunch-run will fail the container.

3. If any such mount points are configured as "exclude_from_output":true, they will be excluded from the output.

If any process in the container tries to modify, remove, or rename these mount points or anything underneath them, the operation will fail and the container output and the underlying collections used to pre-populate are unaffected.
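
Restrictions 1 and 2 can be checked on the client before the request is created. The helper below is a hypothetical sketch, not part of any Arvados SDK; it only inspects mount targets that are strictly underneath output_path.

```python
def prepopulate_problems(mounts, output_path):
    """Check mounts located underneath output_path against restrictions 1 and 2."""
    problems = []
    prefix = output_path.rstrip("/") + "/"
    for target, mount in sorted(mounts.items()):
        if not target.startswith(prefix):
            continue  # output_path itself, or a mount outside of it
        if mount.get("kind") != "collection":
            problems.append(target + ": only kind collection is supported")
        elif mount.get("writable"):
            problems.append(target + ": must not be writable")
    return problems
```

The API performs its own validation as described above, so this check is only a convenience for catching mistakes early.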

Example mount point configurations

All the below examples are based on this collection:


portable_data_hash cdfbe2e823222d26483d52e5089d553c+175

manifest_text: ./alice 03032680d3fa0561ef4f85071140861e+13+A04e9d06459cda00aa997565bd78001061cf5bffb@58ab593d 0:13:hello.txt\n./bob d820b9df970e1b498e7723c50b107e1b+11+A42d162a60210479d1cfaf9fbb98d494ac6322ae6@58ab593d 0:11:hello.txt\n./carol cf72b172ff969250ae14a893a6745440+13+A476a2fd39e14e9c03af3076bd17e3612c075ff66@58ab593d 0:13:hello.txt\n

Mount point Description Resulting collection manifest text
"mounts": {
  "/tmp/foo": {
    "kind": "collection",
    "portable_data_hash": "cdfbe2...+175"
  },
},
"output_path": "/tmp"
No path specified and hence the entire collection will be mounted. ./foo/alice 030326… 0:13:hello.txt\n
./foo/bob d820b9… 0:11:hello.txt\n
./foo/carol cf72b1… 0:13:hello.txt\n
Note: Here the “.” in streams is replaced with foo.
"mounts": {
  "/tmp/foo/bar": {
    "kind": "collection",
    "portable_data_hash": "cdfbe2...+175",
    "path": "alice"
  },
},
"output_path": "/tmp"
Specified path refers to the subdirectory alice in the collection. ./foo/bar 030326… 0:13:hello.txt\n
Note: only the manifest text segment for the subdirectory alice is included after replacing the subdirectory alice with foo/bar.
"mounts": {
  "/tmp/foo/bar": {
    "kind": "collection",
    "portable_data_hash": "cdfbe2...+175",
    "path": "alice/hello.txt"
  },
},
"output_path": "/tmp"
Specified path refers to the file hello.txt in the alice subdirectory. ./foo 030326… 0:13:bar\n
Note: Here the subdirectory alice is replaced with foo and the filename hello.txt from this subdirectory is replaced with bar.
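
The renaming shown in the three examples can be reproduced with a short script. This is a simplified sketch, not how crunch-run builds the output manifest: prepopulated_manifest is a hypothetical helper, and in the single-file case it keeps all of the stream's block locators, which is exact only when the stream holds just that file (as in the example collection).

```python
import posixpath


def prepopulated_manifest(manifest_text, mount_point, output_path, path=""):
    """Rename streams and files the way the example table describes (simplified)."""
    rel = posixpath.relpath(mount_point, output_path)  # e.g. "foo" or "foo/bar"
    path = path.strip("/")
    streams = []
    for line in manifest_text.splitlines():
        tokens = line.split(" ")
        streams.append((tokens[0][2:], tokens[1:]))  # drop "." or "./" from the name

    def inside(name):
        return name == path or name.startswith(path + "/")

    is_dir = path != "" and any(inside(name) for name, _ in streams)
    out = []
    for name, rest in streams:
        if path == "":
            # Whole collection: "." is replaced with the mount point name.
            new_name = rel + "/" + name if name else rel
        elif is_dir:
            # Subdirectory: keep only its streams, renamed to the mount point.
            if not inside(name):
                continue
            new_name = rel + name[len(path):]
        else:
            # Single file: its directory and file name are both replaced.
            if name != posixpath.dirname(path):
                continue
            wanted = posixpath.basename(path)
            locators = [t for t in rest if ":" not in t]
            files = [t.split(":", 2) for t in rest if ":" in t]
            files = [f for f in files if f[2] == wanted]
            if not files:
                continue
            new_file = posixpath.basename(rel)
            rest = locators + ["%s:%s:%s" % (f[0], f[1], new_file) for f in files]
            new_name = posixpath.dirname(rel)
        stream = "./" + new_name if new_name else "."
        out.append(" ".join([stream] + rest) + "\n")
    return "".join(out)
```

Calling it with the example manifest, mount point "/tmp/foo/bar", output path "/tmp", and path "alice/hello.txt" yields a single stream named ./foo containing the file bar, which matches the third row of the table.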

Runtime constraints

Runtime constraints restrict the container’s access to compute resources and the outside world (in addition to its explicitly stated inputs and output).

Key Type Description Notes
ram integer Number of bytes of RAM to be used to run this process. Optional. However, a ContainerRequest that is in “Committed” state must provide this.
vcpus integer Number of cores to be used to run this process. Optional. However, a ContainerRequest that is in “Committed” state must provide this.
keep_cache_ram integer Number of bytes of Keep cache to be used to run this process. Optional.
API boolean When set, ARVADOS_API_HOST and ARVADOS_API_TOKEN will be set, and container will have networking enabled to access the Arvados API server. Optional.
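
Because ram and vcpus become mandatory once a request is committed, a client can check for them before changing the state. The helper name below is hypothetical and the check covers only the two keys named in the table.

```python
def missing_runtime_constraints(request):
    """List the runtime constraints a "Committed" request still has to provide."""
    if request.get("state") != "Committed":
        return []  # both keys are optional in any other state
    constraints = request.get("runtime_constraints") or {}
    return [key for key in ("ram", "vcpus") if key not in constraints]
```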

Scheduling parameters

Parameters to be passed to the container scheduler (e.g., SLURM) when running a container.

Key Type Description Notes
partitions array of strings The names of one or more compute partitions that may run this container. If not provided, the system will choose where to run the container. Optional.

Container reuse

When a container request is “Committed”, the system will try to find and reuse an existing Container with the same command, cwd, environment, output_path, container_image, mounts, and runtime_constraints being requested. (Hashes in the serialized fields environment, mounts and runtime_constraints are compared without regard to key order.)

In order of preference, the system will use:

  • The first matching container to have finished successfully (i.e., reached state “Complete” with an exit_code of 0) whose log and output collections are still available.
  • The oldest matching “Running” container with the highest progress, i.e., the container that is most likely to finish first.
  • The oldest matching “Locked” container with the highest priority, i.e., the container that is most likely to start first.
  • The oldest matching “Queued” container with the highest priority, i.e., the container that is most likely to start first.
  • A new container.
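
The matching rule and the order of preference can be sketched as follows. This illustrates the description above and is not the scheduler's actual code. The container fields used for ordering (finished_at, created_at, progress) and the test for "still available" (non-empty log and output values) are assumptions made for the example.

```python
import json

REUSE_ATTRS = ("command", "cwd", "environment", "output_path",
               "container_image", "mounts", "runtime_constraints")


def reuse_key(record):
    """Canonical form of the compared attributes; hash key order is ignored."""
    return json.dumps({a: record.get(a) for a in REUSE_ATTRS}, sort_keys=True)


def pick_container(request, containers):
    """Return the preferred existing container, or None to create a new one."""
    wanted = reuse_key(request)
    matching = [c for c in containers if reuse_key(c) == wanted]
    # Each rule is (state, extra condition, sort key); earlier rules win.
    rules = [
        ("Complete",
         lambda c: c.get("exit_code") == 0 and c.get("log") and c.get("output"),
         lambda c: c["finished_at"]),
        ("Running", lambda c: True, lambda c: (-c["progress"], c["created_at"])),
        ("Locked", lambda c: True, lambda c: (-c["priority"], c["created_at"])),
        ("Queued", lambda c: True, lambda c: (-c["priority"], c["created_at"])),
    ]
    for state, ok, order in rules:
        found = [c for c in matching if c["state"] == state and ok(c)]
        if found:
            return min(found, key=order)
    return None
```

Serializing with sort_keys=True makes two hashes that differ only in key order compare equal, while the order of the command array is preserved.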

Canceling a container request

A container request may be canceled by setting its priority to 0, using an update call.

When a container request is canceled, it will still reflect the state of the Container it is associated with via the container_uuid attribute. If that Container is being reused by any other container_requests that are still active, i.e., not yet canceled, that Container may continue to run or be scheduled to run by the system in the future. However, if no other container_requests are using that Container, then the Container will get canceled as well.
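
The rule for when the Container itself is canceled can be expressed as a small predicate. This is a sketch with a hypothetical helper name; it treats a request as active when it is “Committed” with a priority above 0, which is a simplification of the behavior described above.

```python
def container_gets_canceled(container_uuid, container_requests):
    """True if no active request still uses the given container."""
    return not any(
        r.get("container_uuid") == container_uuid
        and r.get("state") == "Committed"
        and (r.get("priority") or 0) > 0
        for r in container_requests
    )
```

With two requests sharing one container, setting the priority of only one of them to 0 leaves the container running, because the other request is still active.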

Methods

See Common resource methods for more information about create, delete, get, list, and update.

Required arguments are displayed in green.

create

Create a new container request.

Arguments:

Argument Type Description Location Example
container_request object Container request resource. request body

The request body must include the required attributes command, container_image, cwd, and output_path. It can also include other attributes such as environment, mounts, and runtime_constraints.
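
A request body can be assembled and checked for the required attributes before it is sent. The sketch below only builds the JSON text and sends nothing; create_body is a hypothetical helper, and wrapping the attributes under a container_request key is an assumption based on the argument name in the table above.

```python
import json

REQUIRED = ("command", "container_image", "cwd", "output_path")


def create_body(**attrs):
    """Serialize a container request, refusing to omit a required attribute."""
    missing = [name for name in REQUIRED if name not in attrs]
    if missing:
        raise ValueError("missing required attributes: " + ", ".join(missing))
    # Assumption: the attributes are nested under the argument name.
    return json.dumps({"container_request": attrs}, sort_keys=True)
```

Optional attributes such as environment or mounts are passed as additional keyword arguments in the same way.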

delete

Delete an existing container request.

Arguments:

Argument Type Description Location Example
uuid string The UUID of the container request in question. path

get

Get a container request’s metadata by UUID.

Arguments:

Argument Type Description Location Example
uuid string The UUID of the container request in question. path

list

List container_requests.

See common resource list method.

See the create method documentation for more information about container request-specific filters.

update

Update attributes of an existing container request.

Arguments:

Argument Type Description Location Example
uuid string The UUID of the container request in question. path
container_request object query

Note:

Setting the priority of a committed container_request to 0 may cancel a running container assigned for it.
See Canceling a container request for further details.



The content of this documentation is licensed under the Creative Commons Attribution-Share Alike 3.0 United States licence.
Code samples in this documentation are licensed under the Apache License, Version 2.0.