API endpoint base: https://pirca.arvadosapi.com/arvados/v1/container_requests
Object type: xvhdp
Example UUID: zzzzz-xvhdp-0123456789abcde
A container request is a request for the Arvados cluster to perform some computational work. See computing with Crunch for details.
Each ContainerRequest offers the following attributes, in addition to the Common resource fields:
All attributes are optional, unless otherwise marked as required.
Attribute | Type | Description | Notes |
---|---|---|---|
name | string | The name of the container_request. | |
description | string | The description of the container_request. | |
properties | hash | User-defined metadata that does not affect how the container is run. May be used in queries using subproperty filters | |
state | string | The allowed states are “Uncommitted”, “Committed”, and “Final”. | Once a request is Committed, the only attributes that can be modified are priority, container_uuid, and container_count_max. A request in the “Final” state cannot have any of its functional parts modified (i.e., only name, description, and properties fields can be modified). |
requesting_container_uuid | string | The uuid of the parent container that created this container_request, if any. Represents a process tree. | The priority of this container_request is inherited from the parent container. If the parent container is cancelled, this container_request will be cancelled as well. |
container_uuid | string | The uuid of the container that satisfies this container_request. The system may return a preexisting Container that matches the container request criteria. See Container reuse for more details. | Container reuse is the default behavior, but may be disabled with use_existing: false to always create a new container. |
container_count_max | integer | Maximum number of containers to start, i.e., the maximum number of “attempts” to be made. | |
mounts | hash | Objects to attach to the container’s filesystem and stdin/stdout. | See Mount types for more details. |
secret_mounts | hash | Objects to attach to the container’s filesystem. Only “json” or “text” mount types allowed. | Not returned in API responses. Reset to empty when state is “Complete” or “Cancelled”. |
runtime_constraints | hash | Restrict the container’s access to compute resources and the outside world. | Required when in “Committed” state. See Runtime constraints for more details. |
scheduling_parameters | hash | Parameters to be passed to the container scheduler when running this container. | See Scheduling parameters for more details. |
container_image | string | Portable data hash of a collection containing the docker image to run the container. | Required. |
environment | hash | Environment variables and values that should be set in the container environment (docker run --env). This augments and (when conflicts exist) overrides environment variables given in the image’s Dockerfile. | |
cwd | string | Initial working directory, given as an absolute path (in the container) or a path relative to the WORKDIR given in the image’s Dockerfile. | Required. |
command | array of strings | Command to execute in the container. | Required. e.g., ["echo","hello"] |
output_path | string | Path to a directory or file inside the container that should be preserved as container’s output when it finishes. This path must be one of the mount targets. For best performance, point output_path to a writable collection mount. See Pre-populate output using Mount points for details regarding optional output pre-population using mount points and Symlinks in output for additional details. | Required. |
output_name | string | Desired name for the output collection. If null or empty, a name will be assigned automatically. | |
output_ttl | integer | Desired lifetime for the output collection, in seconds. If zero, the output collection will not be deleted automatically. | |
priority | integer | Range 0-1000. Indicates scheduling order preference. | Clients are expected to submit container requests with zero priority in order to preview the container that will be used to satisfy it. Priority can be null if and only if state!=“Committed”. See below for more details. |
expires_at | datetime | After this time, priority is considered to be zero. | Not yet implemented. |
use_existing | boolean | If possible, use an existing (non-failed) container to satisfy the request instead of creating a new one. | Default is true |
log_uuid | string | Log collection containing log messages provided by the scheduler and crunch processes. | Null if the container has not yet started running. To retrieve logs in real time while the container is running, use the log API (see below). |
output_uuid | string | Output collection created when the container finished successfully. | Null if the container has failed or not yet completed. |
filters | string | Additional constraints for satisfying the container_request, given in the same form as the filters parameter accepted by the container_requests.list API. | |
runtime_token | string | A v2 token to be passed into the container itself, used to access Keep-backed mounts, etc. | Not returned in API responses. Reset to null when state is “Complete” or “Cancelled”. |
runtime_user_uuid | string | The user permission that will be granted to this container. | |
runtime_auth_scopes | array of strings | The scopes associated with the auth token used to run this container. | |
output_storage_classes | array of strings | The storage classes that will be used for the log and output collections of this container request. | Default is [“default”] |
output_properties | hash | User metadata properties to set on the output collection. The output collection will also have default properties “type” (“intermediate” or “output”) and “container_request” (the uuid of the container request that produced the collection). | |
cumulative_cost | number | Estimated cost of the cloud VMs used to satisfy the request, including retried attempts and completed subrequests, but not including reused containers. | 0 if container was reused or VM price information was not available. |
A container request may be created in the Committed state, or created in the Uncommitted state and then moved into the Committed state.
Once a request is in the Committed state, Arvados locates a suitable existing container or schedules a new one. When the assigned container finishes, the request state changes to Final.
A client may cancel a committed request early (before the assigned container finishes) by setting the request priority to zero.
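The modification rules from the state row of the attribute table can be written down as a small lookup. This is only a sketch of the rules as stated in this document; the helper name `rejected_updates` is hypothetical, and the API server remains the authority on what it accepts.

```python
# Attributes that may still be changed in each state, as described in the
# notes for the "state" attribute. None means this rule imposes no restriction.
MODIFIABLE = {
    "Uncommitted": None,
    "Committed": {"priority", "container_uuid", "container_count_max"},
    "Final": {"name", "description", "properties"},
}


def rejected_updates(state, updates):
    """Return the attribute names in `updates` that `state` does not allow changing."""
    allowed = MODIFIABLE[state]
    if allowed is None:
        return []
    return sorted(set(updates) - allowed)


# Hypothetical update against a Committed request.
attempt = {"priority": 0, "command": ["true"]}
rejected = rejected_updates("Committed", attempt)
```

A client could run such a check before sending an update, so that a forbidden change is caught locally rather than by the server.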
The priority field has a range of 0-1000.
Priority 0 means no container should run on behalf of this request, and containers already running will be terminated (setting container priority to 0 is the cancel operation.)
Priority 1 is the lowest priority.
Priority 1000 is the highest priority.
The actual order that containers execute is determined by the underlying scheduling software (e.g. Slurm) and may be based on a combination of container priority, submission time, available resources, and other factors.
In the current implementation, the magnitude of difference in priority between two containers affects the weight of priority vs age in determining scheduling order. If two containers have only a small difference in priority (for example, 500 and 501) and the lower priority container has a longer queue time, the lower priority container may be scheduled before the higher priority container. Use a greater magnitude difference (for example, 500 and 600) to give higher weight to priority over queue time.
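The priority rules above (range 0-1000, zero meaning cancel, null allowed only outside the Committed state) can be checked client-side. The following is a minimal sketch; the function names are hypothetical and the error strings are illustrative.

```python
def check_priority(priority, state):
    """Return None if `priority` is acceptable for `state`, else a message."""
    if priority is None:
        # Priority can be null if and only if state != "Committed".
        if state == "Committed":
            return "priority must be set when state is Committed"
        return None
    # bool is a subclass of int in Python, so exclude it explicitly.
    if isinstance(priority, bool) or not isinstance(priority, int):
        return "priority must be an integer"
    if not 0 <= priority <= 1000:
        return "priority must be in the range 0-1000"
    return None


def is_cancel(priority):
    """Priority 0 means no container should run on behalf of the request."""
    return priority == 0
```

Since small priority differences may be outweighed by queue time, a client that needs a clear ordering might space its priorities widely (for example 500 and 600) rather than by one.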
The “mounts” hash is the primary mechanism for adding data to the container at runtime (beyond what is already in the container image).
Each value of the “mounts” hash is itself a hash, whose “kind” key determines the handler used to attach data to the container.
Mount type | Kind | Description | Examples |
---|---|---|---|
Arvados data collection | collection | "portable_data_hash" or "uuid" may be provided. If not provided, a new collection will be created. This is useful when "writable":true and the container’s output_path is (or is a subdirectory of) this mount target. "writable" may be provided with a true or false to indicate the path must (or must not) be writable. If not specified, the system can choose. "path" may be provided, and defaults to "/". At container startup, the target path will have the same directory structure as the given path within the collection. Even if the files/directories are writable in the container, modifications will not be saved back to the original collections when the container ends. | |
Git tree | git_tree | "uuid" must be the UUID of an Arvados-hosted git repository. "commit" must be a full 40-character commit hash. "path", if provided, must be “/”. At container startup, the target path will have the source tree indicated by the given commit. The .git metadata directory will not be available. | |
Temporary directory | tmp | "capacity": capacity (in bytes) of the storage device. "device_type" (optional, default “network”): one of {"ram", "ssd", "disk", "network"} indicating the acceptable level of performance (note: not yet implemented as of v1.5). At container startup, the target path will be empty. When the container finishes, the content will be discarded. This will be backed by a storage mechanism no slower than the specified type. | |
Keep | keep | Expose all readable collections via arv-mount. Requires suitable runtime constraints. | |
Mounted file or directory | file | "path": absolute path (inside the container) of a file or directory that is (or is inside) another mount target. Can be used for “stdin” and “stdout” targets. | |
JSON document | json | A JSON-encoded string, array, or object. | |
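To make the table concrete, here is a sketch of a "mounts" hash that combines several kinds. The mount targets (`/mnt/input`, `/mnt/out`, `/mnt/config.json`), the capacity, and the JSON payload are illustrative assumptions. The `"content"` key on the json mount is also an assumption, since the table does not name the key that carries the document. The helper `unknown_kinds` is hypothetical.

```python
import json

# The mount kinds listed in the table above.
KNOWN_KINDS = {"collection", "git_tree", "tmp", "keep", "file", "json"}

mounts = {
    # Read-only view of the "alice" subdirectory of an existing collection.
    "/mnt/input": {
        "kind": "collection",
        "portable_data_hash": "cdfbe2e823222d26483d52e5089d553c+175",
        "path": "/alice",
        "writable": False,
    },
    # Scratch/output directory; capacity is given in bytes (1 GiB here).
    "/mnt/out": {"kind": "tmp", "capacity": 1 << 30},
    # Inline JSON document (the "content" key is an assumption).
    "/mnt/config.json": {"kind": "json", "content": {"threshold": 0.5}},
    # Redirect stdout to a file inside another mount target.
    "stdout": {"kind": "file", "path": "/mnt/out/stdout.txt"},
}


def unknown_kinds(mounts):
    """Return {target: kind} for mounts whose kind is not in the table."""
    return {t: m.get("kind") for t, m in mounts.items()
            if m.get("kind") not in KNOWN_KINDS}


mounts_json = json.dumps(mounts, indent=2)
```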
When a container’s output_path is a tmp mount backed by local disk, this output directory can be pre-populated with content from existing collections. This content can be specified by mounting collections at mount points that are subdirectories of output_path. Certain restrictions apply:
1. Only mount points of kind collection are supported.
2. Mount points underneath output_path which have "writable":true are copied into output_path during container initialization and may be updated, renamed, or deleted by the running container. The original collection is not modified. On container completion, files remaining in the output are saved to the output collection. The mount at output_path must be big enough to accommodate copies of the inner writable mounts.
3. If any such mount points are configured as "exclude_from_output":true, they will be excluded from the output.
If any process in the container tries to modify, remove, or rename these mount points or anything underneath them, the operation will fail and the container output and the underlying collections used to pre-populate are unaffected.
All the below examples are based on this collection:
portable_data_hash cdfbe2e823222d26483d52e5089d553c+175
manifest_text: ./alice 03032680d3fa0561ef4f85071140861e+13+A04e9d06459cda00aa997565bd78001061cf5bffb@58ab593d 0:13:hello.txt\n./bob d820b9df970e1b498e7723c50b107e1b+11+A42d162a60210479d1cfaf9fbb98d494ac6322ae6@58ab593d 0:11:hello.txt\n./carol cf72b172ff969250ae14a893a6745440+13+A476a2fd39e14e9c03af3076bd17e3612c075ff66@58ab593d 0:13:hello.txt\n
Mount point | Description | Resulting collection manifest text |
---|---|---|
 | No path specified and hence the entire collection will be mounted. | ./foo/alice 030326… 0:13:hello.txt\n ./foo/bob d820b9… 0:11:hello.txt\n ./foo/carol cf72b1… 0:13:hello.txt\n Note: Here the “.” in streams is replaced with foo. |
 | Specified path refers to the subdirectory alice in the collection. | ./foo/bar 030326… 0:13:hello.txt\n Note: only the manifest text segment for the subdirectory alice is included after replacing the subdirectory alice with foo/bar. |
 | Specified path refers to the file hello.txt in the alice subdirectory. | ./foo 030326… 0:13:bar\n Note: Here the subdirectory alice is replaced with foo and the filename hello.txt from this subdirectory is replaced with bar. |
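The three rows above follow a simple renaming rule on manifest stream and file names, which the sketch below reproduces for simple manifests. It is not the Arvados implementation. The function `remap` is hypothetical, the mount targets `foo` and `foo/bar` (relative to output_path) are inferred from the resulting manifests, and block locators are shown without their signature suffixes.

```python
import re

# File tokens in a manifest look like "position:size:name".
FILE_TOKEN = re.compile(r"^(\d+):(\d+):(.+)$")

MANIFEST = (
    "./alice 03032680d3fa0561ef4f85071140861e+13 0:13:hello.txt\n"
    "./bob d820b9df970e1b498e7723c50b107e1b+11 0:11:hello.txt\n"
    "./carol cf72b172ff969250ae14a893a6745440+13 0:13:hello.txt\n"
)


def remap(manifest_text, target, path="/"):
    """Manifest text that results from mounting `path` of a collection at
    `target`, where `target` is relative to output_path."""
    src = "." + path.rstrip("/")  # "/" -> ".", "/alice" -> "./alice"
    lines = [l.split(" ") for l in manifest_text.splitlines() if l]
    # Case 1: `path` is a directory, so rename every stream at or below it.
    out = [" ".join(["./" + target + t[0][len(src):]] + t[1:])
           for t in lines if t[0] == src or t[0].startswith(src + "/")]
    if out:
        return "\n".join(out) + "\n"
    # Case 2: `path` is a single file, renamed to the last part of `target`.
    src_dir, _, src_name = src.rpartition("/")
    dst_dir, _, dst_name = ("./" + target).rpartition("/")
    for t in lines:
        if t[0] != src_dir:
            continue
        # Simplification: keep all of the stream's blocks, not only those
        # the selected file actually spans.
        blocks = [tok for tok in t[1:] if not FILE_TOKEN.match(tok)]
        for tok in t[1:]:
            m = FILE_TOKEN.match(tok)
            if m and m.group(3) == src_name:
                renamed = f"{m.group(1)}:{m.group(2)}:{dst_name}"
                return " ".join([dst_dir] + blocks + [renamed]) + "\n"
    raise ValueError(f"{path} not found in manifest")
```

Under these assumptions, `remap(MANIFEST, "foo")`, `remap(MANIFEST, "foo/bar", "/alice")`, and `remap(MANIFEST, "foo/bar", "/alice/hello.txt")` correspond to the three rows of the table.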
When a container’s output_path is a tmp mount backed by local disk, this output directory can contain symlinks to other files in the output directory, or to collection mount points. If a symlink leads to a collection mount, the collection is copied efficiently into the output collection. Symlinks leading to files or directories are expanded and created as regular files in the output collection. Further, whether symlinks are relative or absolute, every symlink target (even targets that are symlinks themselves) must point to a path in either the output directory or a collection mount.
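The constraint on symlink targets can be approximated with a purely lexical path check. This sketch does not follow symlink chains or touch the filesystem, and the function name and example paths are hypothetical.

```python
import posixpath


def symlink_target_allowed(link_path, target, output_path, collection_mounts):
    """Check that a symlink at absolute `link_path` pointing to `target`
    resolves (lexically) inside output_path or a collection mount."""
    if not posixpath.isabs(target):
        # Relative targets are interpreted relative to the link's directory.
        target = posixpath.join(posixpath.dirname(link_path), target)
    resolved = posixpath.normpath(target)
    roots = [output_path, *collection_mounts]
    return any(resolved == r.rstrip("/") or
               resolved.startswith(r.rstrip("/") + "/")
               for r in roots)
```

For example, with output_path `/mnt/out` and a collection mounted at `/mnt/input`, a link `/mnt/out/a/link` pointing to `../../input/x` is accepted, while one pointing to `/etc/passwd` is not.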
Runtime constraints restrict the container’s access to compute resources and the outside world (in addition to its explicitly stated inputs and output).
Key | Type | Description | Notes |
---|---|---|---|
ram | integer | Number of bytes of RAM to be used to run this process. | Optional. However, a ContainerRequest that is in “Committed” state must provide this. |
vcpus | integer | Number of cores to be used to run this process. | Optional. However, a ContainerRequest that is in “Committed” state must provide this. |
keep_cache_disk | integer | When the container process accesses data from Keep via the filesystem, that data will be cached on disk, up to this amount in bytes. | Optional. If your cluster is configured to use a disk cache by default, the default size will match your ram constraint, bounded between 2GiB and 32GiB. |
keep_cache_ram | integer | When the container process accesses data from Keep via the filesystem, that data will be cached in memory, up to this amount in bytes. | Optional. If your cluster is configured to use a RAM cache by default, the administrator sets a default cache size. |
API | boolean | When set, ARVADOS_API_HOST and ARVADOS_API_TOKEN will be set, and container will have networking enabled to access the Arvados API server. | Optional. |
cuda | object | Request CUDA GPU support, see below | Optional. |
device_count | integer | Number of GPUs to request. | Count greater than 0 enables CUDA GPU support. |
driver_version | string | Minimum CUDA driver version, in “X.Y” format. | Required when device_count > 0 |
hardware_capability | string | Minimum CUDA hardware capability, in “X.Y” format. | Required when device_count > 0 |
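The requirements in the two tables above (ram and vcpus for a Committed request, and the CUDA version fields when device_count is greater than 0) can be sketched as a client-side check. The function name and all values in the example are illustrative assumptions.

```python
def check_runtime_constraints(rc, committed=True):
    """Return a list of problems found in a runtime_constraints hash."""
    problems = []
    if committed:
        for key in ("ram", "vcpus"):
            if key not in rc:
                problems.append(f"{key} is required in Committed state")
    cuda = rc.get("cuda") or {}
    if cuda.get("device_count", 0) > 0:
        for key in ("driver_version", "hardware_capability"):
            if not cuda.get(key):
                problems.append(f"cuda.{key} is required when device_count > 0")
    return problems


# Illustrative values: 4 GiB of RAM, 2 cores, API access, one GPU.
constraints = {
    "ram": 4 * 1024**3,
    "vcpus": 2,
    "API": True,
    "cuda": {
        "device_count": 1,
        "driver_version": "11.0",
        "hardware_capability": "8.0",
    },
}
```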
Parameters to be passed to the container scheduler (e.g., Slurm) when running a container.
Key | Type | Description | Notes |
---|---|---|---|
partitions | array of strings | The names of one or more compute partitions that may run this container. If not provided, the system will choose where to run the container. | Optional. |
preemptible | boolean | If true, the dispatcher should use a preemptible cloud node instance (e.g., AWS Spot Instance) to run this container. Whether a preemptible instance is actually used depends on cluster configuration. | Optional. Default is false. |
max_run_time | integer | Maximum running time (in seconds) that this container will be allowed to run before being cancelled. | Optional. Default is 0 (no limit). |
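The defaults in the table can be applied to a partial scheduling_parameters hash as in the sketch below. The helper names and the partition name `gpu` are hypothetical.

```python
# Defaults taken from the table above. "partitions" has no default:
# when it is absent, the system chooses where to run the container.
SCHEDULING_DEFAULTS = {"preemptible": False, "max_run_time": 0}


def effective_scheduling(params):
    """Overlay the caller's scheduling parameters on the documented defaults."""
    return {**SCHEDULING_DEFAULTS, **params}


def run_time_limit(params):
    """Return the run time limit in seconds, or None when there is no limit."""
    limit = effective_scheduling(params)["max_run_time"]
    return None if limit == 0 else limit


# Hypothetical request: run on a partition named "gpu" for at most one hour.
scheduling = effective_scheduling({"partitions": ["gpu"], "max_run_time": 3600})
```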
When a container request is “Committed”, the system will try to find and reuse an existing Container with the same command, cwd, environment, output_path, container_image, mounts, secret_mounts, runtime_constraints, runtime_user_uuid, and runtime_auth_scopes being requested.
In order of preference, the system will use:
A container request may be canceled by setting its priority to 0, using an update call.
When a container request is canceled, it will still reflect the state of the Container it is associated with via the container_uuid attribute. If that Container is being reused by any other container_requests that are still active, i.e., not yet canceled, that Container may continue to run or be scheduled to run by the system in the future. However, if no other container_requests are using that Container, then the Container will get canceled as well.
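A cancel is therefore just an update that sets priority to 0. The sketch below builds such a request with the Python standard library, without sending it. The use of the PUT method, the Bearer authorization header, and the token value are assumptions not stated in this document; the API base comes from the endpoint shown at the top.

```python
import json
import urllib.request

API_BASE = "https://pirca.arvadosapi.com/arvados/v1"


def cancel_request(uuid, token, api_base=API_BASE):
    """Build (but do not send) an update call that sets priority to 0."""
    body = json.dumps({"container_request": {"priority": 0}}).encode()
    return urllib.request.Request(
        f"{api_base}/container_requests/{uuid}",
        data=body,
        method="PUT",  # assumption: update is exposed as PUT
        headers={
            "Authorization": f"Bearer {token}",  # assumption: bearer token auth
            "Content-Type": "application/json",
        },
    )


# Placeholder UUID and token, for illustration only.
req = cancel_request("zzzzz-xvhdp-0123456789abcde", "example-token")
```

Passing the result to `urllib.request.urlopen(req)` would send it. Whether the container itself stops depends on whether other active requests still use it, as described above.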
See Common resource methods for more information about create, delete, get, list, and update.
Supports federated create, delete, get, list, and update.
Create a new container request.
Arguments:
Argument | Type | Description | Location | Example |
---|---|---|---|---|
container_request | object | Container request resource. | request body | |
cluster_id | string | The federated cluster to which the container request is submitted. | query | |
The request body must include the required attributes command, container_image, cwd, and output_path. It can also include other attributes such as environment, mounts, and runtime_constraints.
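As an illustration of the required attributes, the sketch below assembles a request body and checks it before submission. The attribute values, the mount target `/mnt/out`, and the helper `missing_required` are illustrative assumptions. The container_image value is a placeholder rather than a real portable data hash.

```python
import json

REQUIRED = ("command", "container_image", "cwd", "output_path")


def missing_required(attrs):
    """Return the required attributes that are absent or empty."""
    return [k for k in REQUIRED if attrs.get(k) in (None, "", [])]


new_request = {
    "name": "hello",
    "state": "Committed",
    "priority": 500,
    "command": ["echo", "hello"],
    # Placeholder: portable data hash of the collection holding the image.
    "container_image": "<portable data hash of the image collection>",
    "cwd": "/mnt/out",
    "output_path": "/mnt/out",
    # output_path must be one of the mount targets.
    "mounts": {"/mnt/out": {"kind": "tmp", "capacity": 1 << 30}},
    # ram and vcpus are required once the request is Committed.
    "runtime_constraints": {"ram": 1 << 30, "vcpus": 1},
}

# The resource is wrapped in a "container_request" object in the body.
body = json.dumps({"container_request": new_request})
```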
Delete an existing container request.
Arguments:
Argument | Type | Description | Location | Example |
---|---|---|---|---|
uuid | string | The UUID of the container request in question. | path | |
Get a container request’s metadata by UUID.
Arguments:
Argument | Type | Description | Location | Example |
---|---|---|---|---|
uuid | string | The UUID of the container request in question. | path | |
List container requests.
See common resource list method.
The filters argument can also filter on attributes of the container referenced by container_uuid. For example, [["container.state", "=", "Running"]] will match any container request whose container is running now.
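The filter above has to be JSON-encoded and URL-escaped when it is passed as a query parameter. This sketch builds such a list URL with the standard library. The helper name `list_url` is hypothetical; the endpoint is the one given at the top of this page.

```python
import json
from urllib.parse import urlencode

ENDPOINT = "https://pirca.arvadosapi.com/arvados/v1/container_requests"


def list_url(filters, endpoint=ENDPOINT):
    """Return a list URL with `filters` JSON-encoded in the query string."""
    return f"{endpoint}?{urlencode({'filters': json.dumps(filters)})}"


# Container requests whose assigned container is currently running.
url = list_url([["container.state", "=", "Running"]])
```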
Update attributes of an existing container request.
Arguments:
Argument | Type | Description | Location | Example |
---|---|---|---|---|
uuid | string | The UUID of the container request in question. | path | |
container_request | object | | query | |
Setting the priority of a committed container_request to 0 may cancel a running container assigned to it.
See Canceling a container request for further details.
Get container log data using WebDAV methods.
This API retrieves data from the container request’s log collection. It can be used at any time in the container request lifecycle.
Before a container has been assigned (the request is Uncommitted), it returns an empty directory.
While the container is Queued or Locked, it returns an empty directory.
While the container is Running, .../log/{container_uuid}/ returns real-time logging data.
While the container is Complete or Cancelled, .../log/{container_uuid}/ returns the final log collection.
If a request results in multiple containers being run (see container_count_max above), the logs from prior attempts remain available at .../log/{old_container_uuid}/.
Currently, this API has a limitation that a directory listing at the top level /arvados/v1/container_requests/{uuid}/log/ does not reveal the per-container subdirectories. Instead, clients should look up the container request record and use the container_uuid attribute to request files and directory listings under the per-container directory, as in the examples below.
This API supports the Range request header, so it can be used to poll for and retrieve logs incrementally while the container is running.
Arguments:
Argument | Type | Description | Location | Example |
---|---|---|---|---|
method | string | Read-only WebDAV method | HTTP method | GET , OPTIONS , PROPFIND |
uuid | string | The UUID of the container request. | path | zzzzz-xvhdp-0123456789abcde |
path | string | Path to a file in the log collection. | path | /zzzzz-dz642-0123456789abcde/stderr.txt |
Examples:
GET /arvados/v1/container_requests/zzzzz-xvhdp-0123456789abcde/log/zzzzz-dz642-0123456789abcde/stderr.txt
PROPFIND /arvados/v1/container_requests/zzzzz-xvhdp-0123456789abcde/log/zzzzz-dz642-0123456789abcde/
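Incremental polling with the Range header might look like the sketch below, which builds a GET request for the bytes of a log file starting at a given offset. The request is built but not sent. The Bearer authorization header, the token value, and the function name are assumptions.

```python
import urllib.request

API_BASE = "https://pirca.arvadosapi.com/arvados/v1"


def log_request(cr_uuid, container_uuid, filename, token, offset=0):
    """Build a GET for a log file, starting at byte `offset`."""
    url = (f"{API_BASE}/container_requests/{cr_uuid}"
           f"/log/{container_uuid}/{filename}")
    headers = {"Authorization": f"Bearer {token}"}  # assumption: bearer auth
    if offset > 0:
        # Ask only for the bytes not yet seen by this client.
        headers["Range"] = f"bytes={offset}-"
    return urllib.request.Request(url, headers=headers, method="GET")


# Placeholder UUIDs and token; 1024 bytes have already been retrieved.
req = log_request("zzzzz-xvhdp-0123456789abcde",
                  "zzzzz-dz642-0123456789abcde",
                  "stderr.txt", "example-token", offset=1024)
```

A polling client would add the length of each response body to `offset` before building the next request.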
The content of this documentation is licensed under the Creative Commons Attribution-Share Alike 3.0 United States licence.
Code samples in this documentation are licensed under the Apache License, Version 2.0.