arv subcommands

In order to use the arv command, make sure that you have a working environment.

arv create

arv create can be used to create Arvados objects from the command line. Arv create opens up the editor of your choice (set the EDITOR environment variable) and allows you to type or paste a json or yaml description. When saved the object will be created on the API server, if it passes validation.

$ arv create --help
Options:
  --project-uuid, -p <s>:   Project uuid in which to create the object
              --help, -h:   Show this message

arv get

arv get can be used to get a textual representation of Arvados objects from the command line. The output can be limited to a subset of the object’s fields. This command can be used with only the knowledge of an object’s UUID.

$ arv get --help
Usage: arv [--format json|yaml] get [uuid] [fields...]

Fetch the specified Arvados object, select the specified fields,
and print a text representation.

arv edit

arv edit can be used to edit Arvados objects from the command line. Arv edit opens up the editor of your choice (set the EDITOR environment variable) with the json or yaml description of the object. Saving the file will update the Arvados object on the API server, if it passes validation.

$ arv edit --help
Arvados command line client
Usage: arv edit [uuid] [fields...]

Fetch the specified Arvados object, select the specified fields,
open an interactive text editor on a text representation (json or
yaml, use --format) and then update the object.  Will use 'nano'
by default, customize with the EDITOR or VISUAL environment variable.

arv copy

arv copy can be used to copy a pipeline instance, template or collection from one Arvados instance to another. It takes care of copying the object and all its dependencies.

$ arv copy --help
usage: arv-copy [-h] [--version] [-v] [--progress] [--no-progress] [-f]
                [--src SOURCE_ARVADOS] [--dst DESTINATION_ARVADOS]
                [--recursive] [--no-recursive] [--project-uuid PROJECT_UUID]
                [--replication N] [--storage-classes STORAGE_CLASSES]
                [--varying-url-params VARYING_URL_PARAMS]
                [--prefer-cached-downloads] [--retries RETRIES]
                object_uuid

Copy a workflow, collection or project from one Arvados instance to another.
On success, the uuid of the copied object is printed to stdout.

positional arguments:
  object_uuid           The UUID of the object to be copied.

optional arguments:
  -h, --help            show this help message and exit
  --version             Print version and exit.
  -v, --verbose         Verbose output.
  --progress            Report progress on copying collections. (default)
  --no-progress         Do not report progress on copying collections.
  -f, --force           Perform copy even if the object appears to exist at
                        the remote destination.
  --src SOURCE_ARVADOS  Client configuration location for the source Arvados
                        cluster. May be either a configuration file path, or a
                        plain identifier like `foo` to search for a
                        configuration file `foo.conf` under a systemd or XDG
                        configuration directory. If not provided, will search
                        for a configuration file named after the cluster ID of
                        the source object UUID.
  --dst DESTINATION_ARVADOS
                        Client configuration location for the destination
                        Arvados cluster. May be either a configuration file
                        path, or a plain identifier like `foo` to search for a
                        configuration file `foo.conf` under a systemd or XDG
                        configuration directory. If not provided, will use the
                        default client configuration from the environment or
                        `settings.conf`.
  --recursive           Recursively copy any dependencies for this object, and
                        subprojects. (default)
  --no-recursive        Do not copy any dependencies or subprojects.
  --project-uuid PROJECT_UUID
                        The UUID of the project at the destination to which
                        the collection or workflow should be copied.
  --replication N
                        Number of replicas per storage class for the copied
                        collections at the destination. If not provided (or if
                        provided with invalid value), use the destination's
                        default replication-level setting (if found), or the
                        fallback value 2.
  --storage-classes STORAGE_CLASSES
                        Comma separated list of storage classes to be used
                        when saving data to the destinaton Arvados instance.
  --varying-url-params VARYING_URL_PARAMS
                        A comma separated list of URL query parameters that
                        should be ignored when storing HTTP URLs in Keep.
  --prefer-cached-downloads
                        If a HTTP URL is found in Keep, skip upstream URL
                        freshness check (will not notice if the upstream has
                        changed, but also not error if upstream is
                        unavailable).
  --retries RETRIES     Maximum number of times to retry server requests that
                        encounter temporary failures (e.g., server down).
                        Default 10.

arv tag

arv tag is used to tag Arvados objects.

$ arv tag --help

Usage:
arv tag add tag1 [tag2 ...] --object object_uuid1 [object_uuid2...]
arv tag remove tag1 [tag2 ...] --object object_uuid1 [object_uuid2...]
arv tag remove --all

  --dry-run, -n:   Don't actually do anything
  --verbose, -v:   Print some things on stderr
     --uuid, -u:   Return the UUIDs of the objects in the response, one per
                   line (default)
     --json, -j:   Return the entire response received from the API server, as
                   a JSON object
    --human, -h:   Return the response received from the API server, as a JSON
                   object with whitespace added for human consumption
   --pretty, -p:   Synonym of --human
     --yaml, -y:   Return the response received from the API server, in YAML
                   format
     --help, -e:   Show this message

arv ws

This is a frontend to arv-ws.

arv ws provides access to the websockets event stream.

$ arv ws --help
usage: arv-ws [-h] [-u UUID] [-f FILTERS]
              [--poll-interval POLL_INTERVAL | --no-poll]
              [-p PIPELINE | -j JOB]

optional arguments:
  -h, --help            show this help message and exit
  -u UUID, --uuid UUID  Filter events on object_uuid
  -f FILTERS, --filters FILTERS
                        Arvados query filter to apply to log events (JSON
                        encoded)
  --poll-interval POLL_INTERVAL
                        If websockets is not available, specify the polling
                        interval, default is every 15 seconds
  --no-poll             Do not poll if websockets are not available, just fail
  -p PIPELINE, --pipeline PIPELINE
                        Supply pipeline uuid, print log output from pipeline
                        and its jobs
  -j JOB, --job JOB     Supply job uuid, print log output from jobs

arv keep

arv keep commands for accessing the Keep storage service.

$ arv keep --help
Usage: arv keep [method] [--parameters]
Use 'arv keep [method] --help' to get more information about specific methods.

Available methods: ls, get, put, docker

arv keep ls

This is a frontend to arv-ls.

$ arv keep ls --help
usage: arv-ls [-h] [--retries RETRIES] [-s] locator

List contents of a manifest

positional arguments:
  locator            Collection UUID or locator

optional arguments:
  -h, --help         show this help message and exit
  --retries RETRIES  Maximum number of times to retry server requests that
                     encounter temporary failures (e.g., server down). Default
                     3.
  -s                 List file sizes, in KiB.

arv keep get

This is a frontend to arv-get.

$ arv keep get --help
usage: arv-get [-h] [--retries RETRIES] [--version]
               [--progress | --no-progress | --batch-progress]
               [--hash HASH | --md5sum] [-n] [-r]
               [-f | -v | --skip-existing | --strip-manifest] [--threads N]
               locator [destination]

Copy data from Keep to a local file or pipe.

positional arguments:
  locator            Collection locator, optionally with a file path or
                     prefix.
  destination        Local file or directory where the data is to be written.
                     Default: stdout.

optional arguments:
  -h, --help         show this help message and exit
  --retries RETRIES  Maximum number of times to retry server requests that
                     encounter temporary failures (e.g., server down).
                     Default 3.
  --version          Print version and exit.
  --progress         Display human-readable progress on stderr (bytes and, if
                     possible, percentage of total data size). This is the
                     default behavior when it is not expected to interfere
                     with the output: specifically, stderr is a tty _and_
                     either stdout is not a tty, or output is being written
                     to named files rather than stdout.
  --no-progress      Do not display human-readable progress on stderr.
  --batch-progress   Display machine-readable progress on stderr (bytes and,
                     if known, total data size).
  --hash HASH        Display the hash of each file as it is read from Keep,
                     using the given hash algorithm. Supported algorithms
                     include md5, sha1, sha224, sha256, sha384, and sha512.
  --md5sum           Display the MD5 hash of each file as it is read from
                     Keep.
  -n                 Do not write any data -- just read from Keep, and report
                     md5sums if requested.
  -r                 Retrieve all files in the specified collection/prefix.
                     This is the default behavior if the "locator" argument
                     ends with a forward slash.
  -f                 Overwrite existing files while writing. The default
                     behavior is to refuse to write *anything* if any of the
                     output files already exist. As a special case, -f is not
                     needed to write to stdout.
  -v                 Once for verbose mode, twice for debug mode.
  --skip-existing    Skip files that already exist. The default behavior is
                     to refuse to write *anything* if any files exist that
                     would have to be overwritten. This option causes even
                     devices, sockets, and fifos to be skipped.
  --strip-manifest   When getting a collection manifest, strip its access
                     tokens before writing it.
  --threads N        Set the number of download threads to be used. Take into
                     account that using lots of threads will increase the RAM
                     requirements. Default is to use 4 threads. On high
                     latency installations, using a greater number will
                     improve overall throughput.

arv keep put

This is a frontend to arv-put.

$ arv keep put --help
usage: arv-put [-h] [--max-manifest-depth N | --normalize]
               [--as-stream | --stream | --as-manifest | --in-manifest | --manifest | --as-raw | --raw]
               [--use-filename FILENAME] [--filename FILENAME]
               [--portable-data-hash] [--replication N]
               [--project-uuid UUID] [--name NAME]
               [--progress | --no-progress | --batch-progress]
               [--resume | --no-resume] [--retries RETRIES]
               [path [path ...]]

Copy data from the local filesystem to Keep.

positional arguments:
  path                  Local file or directory. Default: read from standard
                        input.

optional arguments:
  -h, --help            show this help message and exit
  --max-manifest-depth N
                        Maximum depth of directory tree to represent in the
                        manifest structure. A directory structure deeper than
                        this will be represented as a single stream in the
                        manifest. If N=0, the manifest will contain a single
                        stream. Default: -1 (unlimited), i.e., exactly one
                        manifest stream per filesystem directory that contains
                        files.
  --normalize           Normalize the manifest by re-ordering files and
                        streams after writing data.
  --as-stream           Synonym for --stream.
  --stream              Store the file content and display the resulting
                        manifest on stdout. Do not write the manifest to Keep
                        or save a Collection object in Arvados.
  --as-manifest         Synonym for --manifest.
  --in-manifest         Synonym for --manifest.
  --manifest            Store the file data and resulting manifest in Keep,
                        save a Collection object in Arvados, and display the
                        manifest locator (Collection uuid) on stdout. This is
                        the default behavior.
  --as-raw              Synonym for --raw.
  --raw                 Store the file content and display the data block
                        locators on stdout, separated by commas, with a
                        trailing newline. Do not store a manifest.
  --use-filename FILENAME
                        Synonym for --filename.
  --filename FILENAME   Use the given filename in the manifest, instead of the
                        name of the local file. This is useful when "-" or
                        "/dev/stdin" is given as an input file. It can be used
                        only if there is exactly one path given and it is not
                        a directory. Implies --manifest.
  --portable-data-hash  Print the portable data hash instead of the Arvados
                        UUID for the collection created by the upload.
  --replication N       Set the replication level for the new collection: how
                        many different physical storage devices (e.g., disks)
                        should have a copy of each data block. Default is to
                        use the server-provided default (if any) or 2.
  --project-uuid UUID   Store the collection in the specified project, instead
                        of your Home project.
  --name NAME           Save the collection with the specified name.
  --progress            Display human-readable progress on stderr (bytes and,
                        if possible, percentage of total data size). This is
                        the default behavior when stderr is a tty.
  --no-progress         Do not display human-readable progress on stderr, even
                        if stderr is a tty.
  --batch-progress      Display machine-readable progress on stderr (bytes
                        and, if known, total data size).
  --resume              Continue interrupted uploads from cached state
                        (default).
  --no-resume           Do not continue interrupted uploads from cached state.
  --retries RETRIES     Maximum number of times to retry server requests that
                        encounter temporary failures (e.g., server down).
                        Default 3.

Previous: arv reference Next: Installing the FUSE Driver