Arvados 3.2.0 introduces service containers. If your container runs a web service, you can have Arvados expose that service at a dynamically assigned hostname+port combination. After you run a workflow in Arvados, you can run interactive data analysis and visualization tools as service containers to explore the results.
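Services can have different access levels. The examples on this page use public access; a private service additionally requires an Arvados API token in the URL, as described later on this page.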
Unfortunately, not every interactive container can be run as an Arvados service container. Here are some limitations you should be aware of before you start. These limitations may be lifted in a future release of Arvados.
Service containers require the Arvados administrator to configure the hostname(s) and port(s) to use for service containers. If you’re not sure whether your Arvados cluster supports service containers, check with your administrator.
Arvados can only expose services that use plain HTTP in the container. Other protocols are currently not supported. For example, Jupyter notebooks require websockets to retrieve the results of running code. If you try to run a Jupyter notebook as an Arvados service container, you’ll be able to connect and start a notebook, but you won’t be able to run any code inside of it.
If you run a service container as part of a larger workflow, the workflow supervisor will wait for the service container to finish before it considers the workflow finished. This is okay if the service reports progress or provides debugging for a main process that takes a limited time, but you probably don’t want this if your service container waits for user input to quit or runs indefinitely. To run dedicated interactive services, consider starting them separately from your main workflow, and launching them with arvados-cwl-runner --local to avoid the overhead of a separate supervisor container.
This page demonstrates different ways to launch and access a service container using the nginx web server as an example. This page assumes you are already familiar with basic data and workflow management in Arvados.
Users who want to start service containers on demand can do so using CWL. If you’re writing automation for Arvados, you can also submit service container requests directly through the Arvados API.
This CWL runs nginx, exposes port 80 through Arvados, and takes one input with the directory of files to serve. Save this content as nginx.cwl:
#!/usr/bin/env cwl-runner
# nginx.cwl
cwlVersion: v1.2
$namespaces:
  arv: "http://arvados.org/cwl#"
  cwltool: "http://commonwl.org/cwltool#"
class: CommandLineTool
inputs:
  siteFiles:
    type: Directory
requirements:
  DockerRequirement:
    dockerPull: library/nginx:1.29
  InitialWorkDirRequirement:
    listing:
      - entry: "$(inputs.siteFiles)"
        entryname: /usr/share/nginx/html
hints:
  ResourceRequirement:
    # You may adjust these values as desired.
    coresMin: 2
    coresMax: 8
    ramMin: 128
    ramMax: 1024
  arv:PublishPorts:
    publishPorts:
      "80":
        serviceAccess: public
        label: Web Server
baseCommand: nginx
arguments:
  - "-g"
  - "daemon off;"
outputs: {}
Next, you’ll need a directory on your system that contains some HTML and supporting content you want to serve from the container. You can use any web content you like. If you don’t have anything handy, create a new directory and save this web page as index.html inside it:
<!-- Example index.html -->
<!doctype html>
<html lang=en>
  <head>
    <meta charset=utf-8>
    <title>Test Page</title>
  </head>
  <body>
    <p>Hello from inside an Arvados service workflow!</p>
  </body>
</html>
Write an input file nginx-in.yml that points siteFiles to the directory with your HTML content:
# nginx-in.yml
siteFiles:
  class: Directory
  path: "/home/you/arvados-nginx-site"
Launch this CWL with your input using arvados-cwl-runner:
$ arvados-cwl-runner --local nginx.cwl nginx-in.yml
The tool will upload all the dependencies to Arvados, then report:
INFO [container nginx.cwl] zzzzz-xvhdp-abcde12345fghij state is Committed
After Arvados launches the container, you’ll be able to connect to your web server.
If you are comfortable writing your own Arvados tools, you can submit service container requests directly to the Arvados API. Before you start, make sure the container image and the data you want to use are already in Arvados.
For this example, save the official nginx Docker image to Arvados using arv-keepdocker, then get the portable data hash of the resulting collection:
$ arv-keepdocker library/nginx 1.29
[…]
zzzzz-4zz18-wyfcjuvi7ankgp2
$ arv collection get --select='["portable_data_hash"]' --uuid=zzzzz-4zz18-wyfcjuvi7ankgp2
{
  "etag":"",
  "kind":"arvados#collection",
  "portable_data_hash":"23a275c1761b645edb84355a91702cee+219"
}
Next, create a collection that contains some HTML and supporting content you want to serve from the container. You can use any web content you like. If you don’t have anything handy, create a new directory and save this web page as index.html inside it:
<!-- Example index.html -->
<!doctype html>
<html lang=en>
  <head>
    <meta charset=utf-8>
    <title>Test Page</title>
  </head>
  <body>
    <p>Hello from inside an Arvados service container!</p>
  </body>
</html>
Create a collection from your content directory:
$ arv-put /home/you/arvados-nginx-site
[…]
zzzzz-4zz18-5jh66sx5f2xo6kd
Now you are ready to submit your container request. Below is a body you could use with either the command-line tool arv container_request create --container-request=… or an SDK like:
arv_client.container_requests().create(
    body={'container_request': ...}
).execute()
- container_image is the portable data hash of the nginx Docker image you uploaded.
- In the mounts value for /usr/share/nginx/html, the value of uuid is the UUID of your site content collection. You could alternatively specify a portable_data_hash.
- In runtime_constraints, API must be set to true for the container to be accessible over the network. You can adjust the other runtime constraints as you like.
- command, cwd, and output_path are based on the Docker image we’re using.
- From state on, you can modify these fields as desired, or add others like owner_uuid.

{
  "container_image": "23a275c1761b645edb84355a91702cee+219",
  "service": true,
  "use_existing": false,
  "published_ports": {
    "80": {
      "access": "public",
      "label": "Web Server",
      "initial_path": ""
    }
  },
  "mounts": {
    "/usr/share/nginx/html": {
      "kind": "collection",
      "uuid": "zzzzz-4zz18-5jh66sx5f2xo6kd"
    },
    "/run/nginx.out": {
      "kind": "collection",
      "writable": true
    }
  },
  "runtime_constraints": {
    "API": true,
    "ram": 209715200,
    "vcpus": 2
  },
  "command": [
    "nginx",
    "-g",
    "daemon off;"
  ],
  "cwd": ".",
  "output_path": "/run/nginx.out",
  "state": "Committed",
  "name": "nginx server",
  "priority": 500
}
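If you submit more than one service, it may be easier to assemble this body in code than to edit JSON by hand. The following is a minimal Python sketch that builds the same fields as the example body above. The helper name nginx_request_body and its parameters are hypothetical and are not part of any Arvados SDK; the function only builds a dict and does not contact Arvados.

```python
import json


def nginx_request_body(image_pdh, site_collection_uuid, access="public"):
    """Return a container request body mirroring the example in this section.

    Hypothetical helper for illustration only: it builds a plain dict and
    never talks to the Arvados API.
    """
    return {
        "container_image": image_pdh,
        "service": True,
        "use_existing": False,
        "published_ports": {
            "80": {"access": access, "label": "Web Server", "initial_path": ""},
        },
        "mounts": {
            "/usr/share/nginx/html": {
                "kind": "collection",
                "uuid": site_collection_uuid,
            },
            "/run/nginx.out": {"kind": "collection", "writable": True},
        },
        "runtime_constraints": {
            "API": True,  # required for the service to be reachable
            "ram": 200 * 1024 * 1024,  # 209715200 bytes, as in the example
            "vcpus": 2,
        },
        "command": ["nginx", "-g", "daemon off;"],
        "cwd": ".",
        "output_path": "/run/nginx.out",
        "state": "Committed",
        "name": "nginx server",
        "priority": 500,
    }


body = nginx_request_body(
    "23a275c1761b645edb84355a91702cee+219",
    "zzzzz-4zz18-5jh66sx5f2xo6kd",
)
print(json.dumps(body, indent=2))
```

The resulting dict can be passed as the container_request value in the SDK call shown earlier, or serialized with json.dumps for the --container-request option.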
After Arvados starts a service container, the container record includes the URL users should use to access the service(s). You can open the service URL through Workbench or retrieve it from the Arvados API.
When you view the process page for a running service container, a blue button appears next to the process name to connect to that service. If the container runs multiple services, you’ll be able to select the one you want to connect to from a pulldown.
This section illustrates how to get this information from Arvados API records using the CLI tools. You can follow this same process to make analogous API calls with any SDK. After you submit your container request, get its corresponding container_uuid:
$ arv container_request get --select='["container_uuid","state"]' --uuid=zzzzz-xvhdp-abcde12345fghij
{
  "container_uuid":"zzzzz-dz642-y87pdppv4afp4da",
  "etag":"",
  "kind":"arvados#containerRequest",
  "state":"Committed"
}
If state is Committed but container_uuid is null, then your request has not been dispatched yet. Wait a little bit and try again. Once you have a container_uuid, request the published_ports field of that container record:
$ arv container get --select='["published_ports"]' --uuid=zzzzz-dz642-y87pdppv4afp4da
{
  "published_ports":{
    "80":{
      "label":"Web Server",
      "access":"public",
      "base_url":"https://zzzzz.arvados.example:8900/",
      "initial_url":"https://zzzzz.arvados.example:8900/",
      "initial_path":"",
      "external_port":8900
    }
  }
}
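The two lookups above can be combined into a small polling loop. The following Python sketch assumes arv_client is an Arvados API client object like the one used in the earlier SDK snippet, and that it offers container_requests().get(uuid=...) and containers().get(uuid=...) calls analogous to the CLI commands shown here. The function name wait_for_published_ports, the polling interval, and the retry limit are hypothetical choices for illustration.

```python
import time


def wait_for_published_ports(arv_client, request_uuid,
                             poll_seconds=5.0, max_polls=60, sleep=time.sleep):
    """Poll a container request until it has a container, then return that
    container's published_ports mapping.

    Hypothetical helper: arv_client is assumed to behave like the Arvados
    Python SDK client used elsewhere on this page.
    """
    for _ in range(max_polls):
        request = arv_client.container_requests().get(uuid=request_uuid).execute()
        container_uuid = request.get("container_uuid")
        if container_uuid:
            container = arv_client.containers().get(uuid=container_uuid).execute()
            return container.get("published_ports") or {}
        if request.get("state") != "Committed":
            # Without a Committed request there is nothing to wait for.
            raise RuntimeError(
                "request %s is in state %r" % (request_uuid, request.get("state")))
        # Committed but not dispatched yet: wait a little and try again.
        sleep(poll_seconds)
    raise TimeoutError("no container assigned to %s yet" % request_uuid)
```

The sleep parameter exists only so the wait can be replaced in tests; in normal use you would leave it at its default.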
For each service, the initial_url provides the URL where you can connect to the corresponding service over HTTP. If access is private, you should add an arvados_api_token query parameter containing an API token for a user who requested this container. For example, if the container’s published port above had "access":"private", the full URL to connect to it would look like:
https://zzzzz.arvados.example:8900/?arvados_api_token=v2/zzzzz-gj3su-y2tncmjag9gajm8/1234567890
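The URL rule above can be sketched in Python using only the standard library. The function name service_url is hypothetical; it takes one entry from published_ports and, for private services, appends the arvados_api_token query parameter. Slashes in the token are left unescaped here to match the example URL above; whether the service also accepts percent-encoded slashes is not something this page states.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def service_url(port_info, api_token=None):
    """Return the URL to open for one published_ports entry.

    Hypothetical helper: public services use initial_url unchanged, private
    services get an arvados_api_token query parameter appended.
    """
    url = port_info["initial_url"]
    if port_info.get("access") != "private":
        return url
    if not api_token:
        raise ValueError("a private service needs an API token")
    parts = urlsplit(url)
    # Preserve any query parameters already present in initial_url.
    query = parse_qsl(parts.query, keep_blank_values=True)
    query.append(("arvados_api_token", api_token))
    # safe="/" keeps the slashes in v2 tokens readable, as in the example URL.
    return urlunsplit(parts._replace(query=urlencode(query, safe="/")))


private_port = {
    "access": "private",
    "initial_url": "https://zzzzz.arvados.example:8900/",
}
print(service_url(private_port, "v2/zzzzz-gj3su-y2tncmjag9gajm8/1234567890"))
```

Because the token grants access to the service, treat the resulting URL as a secret and avoid pasting it into shared logs or chat.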
Cancel a service container just like any other running container. In Workbench, use the red ⏹ Cancel button that appears next to the process name, or the ⏹ Cancel button in the action toolbar.
Or if you’re using the Arvados API directly, update your container request to set its priority to 0:
$ arv container_request update --container-request='{"priority":0}' --uuid=zzzzz-xvhdp-abcde12345fghij
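The same update could be made from Python. This sketch assumes arv_client is an Arvados API client like the one in the earlier SDK snippet, and that its container_requests().update(...) call accepts a body wrapped in a container_request key in the same way as the create call shown earlier. The function name cancel_container_request is hypothetical.

```python
def cancel_container_request(arv_client, request_uuid):
    """Cancel a container request by setting its priority to 0.

    Hypothetical helper: arv_client is assumed to behave like the Arvados
    Python SDK client used elsewhere on this page.
    """
    return arv_client.container_requests().update(
        uuid=request_uuid,
        body={"container_request": {"priority": 0}},
    ).execute()
```

For example, cancel_container_request(arv_client, "zzzzz-xvhdp-abcde12345fghij") would mirror the CLI command above.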
Services.ContainerWebServices in the configuration reference
The content of this documentation is licensed under the Creative Commons Attribution-Share Alike 3.0 United States licence. Code samples in this documentation are licensed under the Apache License, Version 2.0.