This tutorial assumes that you have access to the Arvados command line tools, have configured your API token, and have confirmed a working environment.
Cost information is generally only available when Arvados runs in a cloud environment and arvados-dispatch-cloud is used to dispatch containers. The per node-hour price for each defined InstanceType must be supplied in config.yml.
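As a rough illustration of how a per node-hour price turns into a cost figure, the sketch below multiplies a price by a container's runtime in hours. The instance type names, the prices, and the function `estimate_cost` are hypothetical and are not taken from Arvados; the tool's actual calculation may differ (for example, in how it handles preemptible pricing or rounding).

```python
# Hypothetical price table: these instance type names and prices are made up
# for illustration and do not come from any real config.yml.
PRICE_PER_NODE_HOUR = {"small": 0.10, "large": 0.40}


def estimate_cost(instance_type: str, runtime_seconds: float) -> float:
    """Return the per node-hour price multiplied by the runtime in hours."""
    hours = runtime_seconds / 3600.0
    return PRICE_PER_NODE_HOUR[instance_type] * hours


# A container that ran for 90 minutes on the hypothetical "large" type.
print(round(estimate_cost("large", 5400), 4))
```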
The arv-cluster-activity program can be used to analyze cluster usage and cost over a time period.
The arv-cluster-activity tool can be installed from a distribution package or PyPI.
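First, add the appropriate package repository for your distribution. Then install the package with your distribution's package manager. On Red Hat-based distributions:
# dnf install python3-arvados-cluster-activity
On Debian-based distributions:
# apt install python3-arvados-cluster-activity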
To install from PyPI instead, run pip install arvados-cluster-activity[prometheus] in an appropriate installation environment, such as a virtualenv.
Note:
Support for fetching Prometheus metrics depends on Pandas and NumPy. If these dependencies pose a problem, you can install the cluster activity tool without Prometheus support by omitting the [prometheus] extra, that is, by running pip install arvados-cluster-activity.
The Cluster Activity report uses the Arvados Python SDK, which uses pycurl, which depends on the libcurl C library. To build the module you may have to first install additional packages. On Debian-based distributions you can install them by running:
# apt install git build-essential python3-dev libcurl4-openssl-dev libssl-dev
The arv-cluster-activity tool has a number of command line arguments:
~$ arv-cluster-activity --help
usage: arv-cluster-activity [-h] [--start START] [--end END] [--days DAYS] [--cost-report-file COST_REPORT_FILE] [--include-workflow-steps] [--columns COLUMNS] [--exclude EXCLUDE]
[--html-report-file HTML_REPORT_FILE] [--version] [--cluster CLUSTER] [--prometheus-auth PROMETHEUS_AUTH]
options:
-h, --help show this help message and exit
--start START Start date for the report in YYYY-MM-DD format (UTC) (or use --days)
--end END End date for the report in YYYY-MM-DD format (UTC), default "now"
--days DAYS Number of days before "end" to start the report (or use --start)
--cost-report-file COST_REPORT_FILE
Export cost report to specified CSV file
--include-workflow-steps
Include individual workflow steps (optional)
--columns COLUMNS Cost report columns (optional), must be comma separated with no spaces between column names. Available columns are:
Project, ProjectUUID, Workflow,
WorkflowUUID, Step, StepUUID, Sample, SampleUUID, User, UserUUID, Submitted, Started, Runtime, Cost
--exclude EXCLUDE Exclude workflows containing this substring (may be a regular expression)
--html-report-file HTML_REPORT_FILE
Export HTML report to specified file
--version Print version and exit.
--cluster CLUSTER Cluster to query for prometheus stats
--prometheus-auth PROMETHEUS_AUTH
Authorization file with prometheus info
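The relationship between --days, --start, and --end described in the help text can be sketched as follows. This is a minimal illustration that assumes the report window simply starts the given number of days before the end date; the function name `report_window` is hypothetical, and the tool's own date handling is not shown in this document.

```python
from datetime import datetime, timedelta, timezone


def report_window(days, end=None):
    """Return (start, end) as UTC datetimes, with start `days` before end.

    `end` is a YYYY-MM-DD string; if omitted, the current UTC time is used,
    mirroring the documented default of "now".
    """
    if end is None:
        end_dt = datetime.now(timezone.utc)
    else:
        end_dt = datetime.strptime(end, "%Y-%m-%d").replace(tzinfo=timezone.utc)
    return end_dt - timedelta(days=days), end_dt


start, end = report_window(90, "2024-04-01")
print(start.date(), end.date())
```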
To access the Arvados host, the tool will read default credentials from ~/.config/arvados/settings.conf or use the standard ARVADOS_API_HOST and ARVADOS_API_TOKEN environment variables.
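To check which credentials would be picked up, a lookup along these lines can help. It is only a sketch under two assumptions that this document does not state: that the settings file holds one KEY=VALUE pair per line, and that environment variables take precedence over the file. The function name `load_arvados_settings` is hypothetical and is not part of the Arvados SDK.

```python
import os
from pathlib import Path


def load_arvados_settings(environ=None, conf_path=None):
    """Collect ARVADOS_API_HOST / ARVADOS_API_TOKEN from a file and the environment."""
    environ = os.environ if environ is None else environ
    if conf_path is None:
        conf_path = Path.home() / ".config" / "arvados" / "settings.conf"
    conf_path = Path(conf_path)

    settings = {}
    if conf_path.is_file():
        for line in conf_path.read_text().splitlines():
            line = line.strip()
            # Skip blank lines, comments, and anything that is not KEY=VALUE.
            if not line or line.startswith("#") or "=" not in line:
                continue
            key, value = line.split("=", 1)
            settings[key.strip()] = value.strip()

    # Assumption: values from the environment override values from the file.
    for key in ("ARVADOS_API_HOST", "ARVADOS_API_TOKEN"):
        if key in environ:
            settings[key] = environ[key]
    return settings
```

Calling `load_arvados_settings()` with no arguments inspects the current environment and the default file location.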
The cluster report tool will also fetch metrics from Prometheus, if available. The Prometheus connection settings can be passed in an environment file using --prometheus-auth, or set as environment variables.
PROMETHEUS_HOST=https://your.prometheus.server.example.com
PROMETHEUS_USER=admin
PROMETHEUS_PASSWORD=password
PROMETHEUS_USER and PROMETHEUS_PASSWORD will be passed in an Authorization header using HTTP Basic authentication.
Alternatively, instead of PROMETHEUS_USER and PROMETHEUS_PASSWORD you can provide PROMETHEUS_APIKEY. This will be passed in as a Bearer token (Authorization: Bearer <APIKEY>).
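The two authentication modes described above correspond to the following header construction. This sketch follows the standard HTTP Basic scheme (base64 of `user:password`) and the Bearer form quoted above; the function name `prometheus_auth_header` is hypothetical, and preferring the API key when both kinds of credentials are present is an assumption rather than documented behavior.

```python
import base64


def prometheus_auth_header(env):
    """Build an Authorization header dict from PROMETHEUS_* settings."""
    api_key = env.get("PROMETHEUS_APIKEY")
    if api_key:
        # Assumption: the API key wins if both credential styles are set.
        return {"Authorization": f"Bearer {api_key}"}

    user = env.get("PROMETHEUS_USER")
    password = env.get("PROMETHEUS_PASSWORD")
    if user is not None and password is not None:
        # HTTP Basic: base64-encode "user:password".
        raw = f"{user}:{password}".encode("utf-8")
        token = base64.b64encode(raw).decode("ascii")
        return {"Authorization": f"Basic {token}"}

    # No credentials configured: send no Authorization header.
    return {}


print(prometheus_auth_header({"PROMETHEUS_APIKEY": "example-key"}))
```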
~$ arv-cluster-activity \
--days 90 \
--include-workflow-steps \
--prometheus-auth prometheus.env \
--cost-report-file report.csv \
--html-report-file report.html
INFO:root:Exporting workflow runs 0 - 5
INFO:root:Getting workflow steps
INFO:root:Got workflow steps 0 - 2
INFO:root:Getting container hours time series
INFO:root:Getting data usage time series

The content of this documentation is licensed under the Creative Commons Attribution-Share Alike 3.0 United States licence.
Code samples in this documentation are licensed under the Apache License, Version 2.0.