Crunch scripts

Note:

This is a legacy API. This endpoint is deprecated, disabled by default in new installations, and slated to be removed entirely in a future major release of Arvados. It is replaced by container requests.

Crunch scripts

A crunch script is responsible for completing a single JobTask. In doing so, it will:

  • (optionally) read some input from Keep
  • (optionally) store some output in Keep
  • (optionally) create some new JobTasks and add them to the current Job
  • (optionally) update the current JobTask record with the “output” attribute set to a Keep locator or a fragment of a manifest
  • update the current JobTask record with the “success” attribute set to True

A task’s context is provided in environment variables.

Environment variable Description
JOB_UUID UUID of the current Job
TASK_UUID UUID of the current JobTask
ARVADOS_API_HOST Hostname and port number of API server
ARVADOS_API_TOKEN Authentication token to use with API calls made by the current task

The crunch script typically uses the Python SDK (or another suitable client library / SDK) to connect to the Arvados service and retrieve the rest of the details about the current job and task.

The Python SDK has some shortcuts for common operations.

In general, a crunch script can access information about the current job and task like this:

import arvados
import os

job = arvados.api().jobs().get(uuid=os.environ['JOB_UUID']).execute()
$sys.stderr.write("script_parameters['foo'] == %s"
                  % job['script_parameters']['foo'])

task = arvados.api().job_tasks().get(uuid=os.environ['TASK_UUID']).execute()
$sys.stderr.write("current task sequence number is %d"
                  % task['sequence'])

Previous: cloud dispatcher Next: jobs

The content of this documentation is licensed under the Creative Commons Attribution-Share Alike 3.0 United States licence.
Code samples in this documentation are licensed under the Apache License, Version 2.0.