What you need to know and do in order to upgrade your Arvados installation.
Upgrade the installed packages with apt-get upgrade or yum upgrade.
Some versions introduce changes that require special attention when upgrading: e.g., there is a new service to install, or there is a change to the default configuration that you might need to override in order to preserve the old behavior.
This release includes several database migrations, which will be executed automatically as part of the API server upgrade. On large Arvados installations, these migrations will take a while. We’ve seen the upgrade take 30 minutes or more on installations with a lot of collections.
The arvados-controller component now requires the /etc/arvados/config.yml file to be present. See the arvados-controller installation instructions.
Support for the deprecated “jobs” API is broken in this release. Users who rely on it should not upgrade. This will be fixed in an upcoming 1.3.1 patch release. However, users are encouraged to migrate to the “containers” API, as support for the “jobs” API will be dropped in an upcoming release. Users who are already using the “containers” API are not affected.
There are no special upgrade notes for this release.
It is recommended to regenerate the table statistics for Postgres after upgrading to v1.2.0. If autovacuum is enabled on your installation, this script would do the trick:
#!/bin/bash
set -e
set -u
tables=`echo "\dt" | psql arvados_production | grep public|awk -e '{print $3}'`
for t in $tables; do
    echo "echo 'analyze $t' | psql arvados_production"
    time echo "analyze $t" | psql arvados_production
done
If you also need to do the vacuum, you could adapt the script to run ‘vacuum analyze’ instead of ‘analyze’.
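One way to make the statement configurable is sketched below. The helper name maintenance_sql is hypothetical (it is not part of Arvados); it only turns a list of table names into one SQL statement per table, so the same pipeline can run either 'analyze' or 'vacuum analyze'.

```shell
#!/bin/bash
# Sketch only: maintenance_sql is a hypothetical helper, not an Arvados tool.
# Reads table names from stdin (one per line) and prints one SQL statement
# per table, using the verb given as the first argument.
maintenance_sql() {
    local verb="$1"
    local t
    while IFS= read -r t; do
        # Skip blank lines so no empty statement is emitted.
        if [ -n "$t" ]; then
            printf '%s %s;\n' "$verb" "$t"
        fi
    done
}
```

Assuming the same arvados_production database as the script above, the output could be piped to psql, for example: psql -At -c "select tablename from pg_tables where schemaname = 'public'" arvados_production | maintenance_sql 'vacuum analyze' | psql arvados_production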
Commit db5107dca adds a new system service, arvados-controller. More detail is available in story #13496.
To add the Arvados Controller to your system please refer to the installation instructions after upgrading your system to 1.2.0.
Verify your setup by confirming that API calls appear in the controller’s logs (e.g., journalctl -fu arvados-controller) while loading a workbench page.
Secondary files missing from toplevel workflow inputs
This only affects workflows that rely on implicit discovery of secondaryFiles.
If a workflow input does not declare secondaryFiles corresponding to the secondaryFiles of workflow steps which use the input, the workflow would inconsistently succeed or fail depending on whether the input values were specified as local files or referenced an existing collection (and whether the existing collection contained the secondary files or not). To ensure consistent behavior, the workflow is now required to declare in the top level workflow inputs any secondaryFiles that are expected by workflow steps.
As an example, the following workflow will fail because the toplevel_input does not declare the secondaryFiles that are expected by step_input:
class: Workflow
cwlVersion: v1.0
inputs:
  toplevel_input: File
outputs: []
steps:
  step1:
    in:
      step_input: toplevel_input
    out: []
    run:
      id: sub
      class: CommandLineTool
      inputs:
        step_input:
          type: File
          secondaryFiles:
            - .idx
      outputs: []
      baseCommand: echo
When run, this produces an error like this:
cwltool ERROR: [step step1] Cannot make job: Missing required secondary file 'hello.txt.idx' from file object: {
  "basename": "hello.txt",
  "class": "File",
  "location": "keep:ade9d0e032044bd7f58daaecc0d06bc6+51/hello.txt",
  "size": 0,
  "nameroot": "hello",
  "nameext": ".txt",
  "secondaryFiles": []
}
To fix this error, add the appropriate secondaryFiles section to toplevel_input:
class: Workflow
cwlVersion: v1.0
inputs:
  toplevel_input:
    type: File
    secondaryFiles:
      - .idx
outputs: []
steps:
  step1:
    in:
      step_input: toplevel_input
    out: []
    run:
      id: sub
      class: CommandLineTool
      inputs:
        step_input:
          type: File
          secondaryFiles:
            - .idx
      outputs: []
      baseCommand: echo
This bug has been fixed in Arvados release v1.2.0.
Secondary files on default file inputs
File inputs that have default values and also expect secondaryFiles will fail to upload the default secondaryFiles. As an example, the following case will fail:
class: CommandLineTool
inputs:
  step_input:
    type: File
    secondaryFiles:
      - .idx
    default:
      class: File
      location: hello.txt
outputs: []
baseCommand: echo
When run, this produces an error like this:
2018-05-03 10:58:47 cwltool ERROR: Unhandled error, try again with --debug for more information: [Errno 2] File not found: u'hello.txt.idx'
To fix this, manually upload the primary and secondary files to Keep and explicitly declare secondaryFiles on the default primary file:
class: CommandLineTool
inputs:
  step_input:
    type: File
    secondaryFiles:
      - .idx
    default:
      class: File
      location: keep:4d8a70b1e63b2aad6984e40e338e2373+69/hello.txt
      secondaryFiles:
        - class: File
          location: keep:4d8a70b1e63b2aad6984e40e338e2373+69/hello.txt.idx
outputs: []
baseCommand: echo
This bug has been fixed in Arvados release v1.2.0.
There are no special upgrade notes for this release.
As part of story #11908, commit 8f987a9271 introduces a dependency on Postgres 9.4. Previously, Arvados required Postgres 9.3.
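Where an upgrade is required, migrate the contents of your database:
Create a database backup using pg_dump.
Install the rh-postgresql94 backport package from either Software Collections: http://doc.arvados.org/install/install-postgresql.html or the Postgres developers: https://www.postgresql.org/download/linux/redhat/
Restore from the backup using psql.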
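Before planning a pg_dump and restore, it can help to check whether the installed Postgres already satisfies the 9.4 requirement. The sketch below assumes GNU sort (for the -V version ordering); both function names are hypothetical helpers, not part of Arvados or Postgres.

```shell
#!/bin/bash
# Sketch only: both helpers are hypothetical, not part of Arvados or Postgres.

# Succeeds when version $1 is at least version $2 (e.g. 9.4).
# Relies on "sort -V" (version sort) from GNU coreutils: if the required
# version sorts first (or is equal), the installed version is new enough.
version_at_least() {
    [ "$(printf '%s\n%s\n' "$2" "$1" | sort -V | head -n 1)" = "$2" ]
}

# Extracts the first dotted version number from a banner read on stdin,
# such as the line "psql (PostgreSQL) 9.3.24".
pg_version_from_banner() {
    grep -Eo '[0-9]+(\.[0-9]+)+' | head -n 1
}
```

For example: version_at_least "$(psql --version | pg_version_from_banner)" 9.4 || echo "Postgres upgrade needed". Note that psql --version reports the client version, which may differ from the version of the running server.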
There are no special upgrade notes for this release.
As part of story #12032, commit 68bdf4cbb1 introduces a dependency on Postgres 9.3. Previously, Arvados required Postgres 9.1.
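Where an upgrade is required, migrate the contents of your database:
Create a database backup using pg_dump.
Install the rh-postgresql94 backport package from either Software Collections: http://doc.arvados.org/install/install-postgresql.html or the Postgres developers: https://www.postgresql.org/download/linux/redhat/
Restore from the backup using psql.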
As part of story #11807, commit 55aafbb converts old “jobs” database records from YAML to JSON, making the upgrade process slower than usual.
As part of story #9005, commit cb230b0 reduces service discovery overhead in keep-web requests.
As part of story #11349, commit 2c094e2 adds a “management” http server to nodemanager.
To enable it, add a [Manage] section to your configuration file:
[Manage]
address = 127.0.0.1
port = 8989
(see example configuration files in source:services/nodemanager/doc or https://doc.arvados.org/install/install-nodemanager.html for more info)
The server responds to http://{address}:{port}/status.json with a summary of how many nodes are in each state (booting, busy, shutdown, etc.)
As part of story #10766, commit e8cc0d7 replaces puma with arvados-ws as the recommended websocket server.
Remove the old puma server after the upgrade is complete.
Example, with runit:
$ sudo sv down /etc/sv/puma
$ sudo rm -r /etc/sv/puma
Example, with systemd:
$ systemctl disable puma
$ systemctl stop puma
As part of story #11168, commit 660a614 uses JSON instead of YAML to encode hashes and arrays in the database.
As part of story #10969, commit 74a9dec introduces a Docker image format compatibility check: the arv keep docker command prevents users from inadvertently saving docker images that compute nodes won’t be able to run.
The accepted formats are set in the API server configuration (/etc/arvados/api/application.yml), for example: docker_image_formats: ["v1"]
See docker_image_formats in /var/www/arvados-api/current/config/application.default.yml or source:services/api/config/application.default.yml or issue #10969 for more detail.
Several Debian and RPM packages now enable their respective components using systemd: keep-balance (d9eec0b), keep-web (3399e63), keepproxy (6de67b6), and arvados-git-httpd (9e27ddf). These components prefer YAML configuration files over command line flags (3bbe1cd).
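To enable and start a component explicitly, run e.g., "sudo systemctl enable keep-web; sudo systemctl start keep-web".
A service whose configuration file is missing will not start, and logs a message such as: "Sep 26 18:23:55 62751f5bb946 keep-web[74]: 2016/09/26 18:23:55 open /etc/arvados/keep-web/keep-web.yml: no such file or directory"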
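A quick pre-flight check can catch a missing configuration file before a service is started. The sketch below is an illustration only: the function name is hypothetical, and it assumes every component follows the keep-web pattern of /etc/arvados/NAME/NAME.yml, which these notes confirm only for keep-web.

```shell
#!/bin/bash
# Sketch only: missing_configs is a hypothetical helper.
# ASSUMPTION: each component reads <root>/<name>/<name>.yml, as keep-web
# does in the log message above; verify the path for the other components.
missing_configs() {
    local root="${1:-/etc/arvados}"
    local name
    for name in keep-balance keep-web keepproxy arvados-git-httpd; do
        # Print the component name when its config file is absent.
        if [ ! -f "$root/$name/$name.yml" ]; then
            echo "$name"
        fi
    done
}
```

Running missing_configs with no argument checks under /etc/arvados and prints one component name per line for each file it cannot find; an empty result means all four files are present.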
Commits ae72b172c8 and 3aae316c25 change the filesystem location where Python modules and scripts are installed.
Previously, they were installed to a path under /usr/local (or the equivalent location in a Software Collection). Now they get installed to a path under /usr. This improves compatibility with other Python packages provided by the distribution. See #9242 for more background.
Commit eebcb5e requires the crunchrunner package to be installed on compute nodes and shell nodes in order to run CWL workflows.
On Debian-based systems: sudo apt-get install crunchrunner
On Red Hat-based systems: sudo yum install crunchrunner
Commit 3c88abd changes the Keep permission signature algorithm.
Commit e1276d6e disables Workbench’s “Getting Started” popup by default.
To keep showing the popup, set enable_getting_started_popup: true in Workbench’s application.yml configuration.
Commit 5590c9ac makes a Keep-backed writable scratch directory available in crunch jobs (see #7751).
Commit 1e2ace5 changes recommended config for keep-web (see #5824)
The keep-web command line adds -attachment-only-host download.uuid_prefix.arvadosapi.com
The configuration adds keep_web_download_url
Commit 1d1c6de removes stopped containers (see #7444)
This effectively makes docker run default to --rm. If you run arvados-docker-cleaner on a host that does anything other than run crunch-jobs, and you still want to be able to use docker start, read the new doc page to learn how to turn this off before upgrading.
Commit 21006cf adds a new keep-web service (see #5824).
The content of this documentation is licensed under the Creative Commons Attribution-Share Alike 3.0 United States licence.
Code samples in this documentation are licensed under the Apache License, Version 2.0.