Install Keep-balance

Keep-balance deletes unreferenced and overreplicated blocks from Keep servers, makes additional copies of underreplicated blocks, and moves blocks into optimal locations as needed (e.g., after adding new servers).

Note:

If you are installing keep-balance on an existing system with valuable data, you can run keep-balance in “dry run” mode first and review its logs as a precaution. To do this, edit your keep-balance startup script to use the flags -commit-pulls=false -commit-trash=false.
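As a minimal sketch of switching between dry-run and normal operation, a startup script could build its command line with a small helper like the one below. The function name keep_balance_cmd and its "dry-run" argument are hypothetical and exist only for illustration; the flags themselves are the ones described in this guide.

```shell
#!/bin/sh
# Hypothetical helper (not part of Arvados): print the keep-balance
# command line for the requested mode.
keep_balance_cmd() {
    if [ "${1:-}" = "dry-run" ]; then
        # Dry run: compute changes and log them, but send no pull or trash requests.
        echo "keep-balance -commit-pulls=false -commit-trash=false"
    else
        # Normal operation: send both pull and trash requests.
        echo "keep-balance -commit-pulls -commit-trash"
    fi
}
```

A run script might then use something like exec $(keep_balance_cmd dry-run) 2>&1 during the review period and drop the argument once the logs look correct.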

Install keep-balance

Keep-balance can be installed anywhere with network access to Keep services. Typically it runs on the same host as keepproxy.

A cluster should have only one keep-balance process running at a time.

On Debian-based systems:

~$ sudo apt-get install keep-balance

On Red Hat-based systems:

~$ sudo yum install keep-balance

Verify that keep-balance is functional:

~$ keep-balance -h
...
Usage: keep-balance [options]

Options:
  -commit-pulls
        send pull requests (make more replicas of blocks that are underreplicated or are not in optimal rendezvous probe order)
  -commit-trash
        send trash requests (delete unreferenced old blocks, and excess replicas of overreplicated blocks)
...

Create a keep-balance token

Create an Arvados superuser token for use by keep-balance.

On the API server, use the following commands:

~$ cd /var/www/arvados-api/current
/var/www/arvados-api/current$ sudo -u webserver-user RAILS_ENV=production bundle exec script/create_superuser_token.rb
zzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzz

Update keepstore configuration files

On each node that runs keepstore, save the token you generated in the previous step in a text file such as /etc/arvados/keepstore/system-auth-token.txt. Then create or update /etc/arvados/keepstore/keepstore.yml with the following key:

SystemAuthTokenFile: /etc/arvados/keepstore/system-auth-token.txt

Restart all keepstore services to apply the updated configuration.
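Because the token file holds a superuser credential, it is reasonable to keep it readable only by its owner. The sketch below shows one way to write it; write_token_file is a hypothetical helper, not an Arvados tool, and it assumes keepstore runs as the user that owns the file (adjust ownership if your keepstore runs as a different user).

```shell
#!/bin/sh
# Hypothetical helper (not part of Arvados): write TOKEN to PATH with
# owner-only permissions.
# Usage: write_token_file PATH TOKEN
write_token_file() {
    path=$1
    token=$2
    # Create or truncate the file under a restrictive umask, then force
    # mode 600 in case the file already existed with looser permissions.
    ( umask 077 && : > "$path" ) || return 1
    chmod 600 "$path" || return 1
    # Only now write the secret into the protected file.
    printf '%s\n' "$token" > "$path"
}
```

For example, run as root: write_token_file /etc/arvados/keepstore/system-auth-token.txt "$token", where $token holds the value printed by create_superuser_token.rb.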

Create a keep-balance configuration file

On the host running keep-balance, create /etc/arvados/keep-balance/keep-balance.yml using the token you generated above. Follow this YAML format:

Listen: :9005
Client:
  APIHost: uuid_prefix.your.domain:443
  AuthToken: zzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzz
KeepServiceTypes:
  - disk
ManagementToken: xyzzy
RunPeriod: 10m
CollectionBatchSize: 100000
CollectionBuffers: 1000
LostBlocksFile: /tmp/keep-balance-lost-blocks.txt    # If given, this file will be updated atomically during each successful run.

If your API server’s SSL certificate is not signed by a recognized CA, add the Insecure option to the Client section:

Client:
  Insecure: true
  APIHost: ...
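If you prefer to generate the configuration file rather than edit it by hand, a sketch along these lines may help. The function render_keep_balance_config and its three positional parameters are assumptions made for illustration; the keys and default values are copied from the example above, with the optional LostBlocksFile and Insecure settings left out.

```shell
#!/bin/sh
# Hypothetical helper (not part of Arvados): print a keep-balance.yml
# to standard output.
# Usage: render_keep_balance_config API_HOST AUTH_TOKEN MANAGEMENT_TOKEN
render_keep_balance_config() {
    cat <<EOF
Listen: :9005
Client:
  APIHost: $1
  AuthToken: $2
KeepServiceTypes:
  - disk
ManagementToken: $3
RunPeriod: 10m
CollectionBatchSize: 100000
CollectionBuffers: 1000
EOF
}
```

One possible use is render_keep_balance_config uuid_prefix.your.domain:443 "$token" xyzzy | sudo tee /etc/arvados/keep-balance/keep-balance.yml, after which you can review the file and add any optional keys.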

Start the service (option 1: systemd)

If your system does not use systemd, skip this section and follow the runit instructions instead.

If your system uses systemd, the keep-balance service should already be set up. Start it and check its status:

~$ sudo systemctl restart keep-balance
~$ sudo systemctl status keep-balance
● keep-balance.service - Arvados Keep Balance
   Loaded: loaded (/lib/systemd/system/keep-balance.service; enabled)
   Active: active (running) since Tue 2017-02-14 18:46:01 UTC; 3 days ago
     Docs: https://doc.arvados.org/
 Main PID: 541 (keep-balance)
   CGroup: /system.slice/keep-balance.service
           └─541 /usr/bin/keep-balance -commit-pulls -commit-trash

Feb 14 18:46:01 zzzzz.arvadosapi.com keep-balance[541]: 2017/02/14 18:46:01 starting up: will scan every 10m0s and on SIGUSR1
Feb 14 18:56:01 zzzzz.arvadosapi.com keep-balance[541]: 2017/02/14 18:56:01 Run: start
Feb 14 18:56:01 zzzzz.arvadosapi.com keep-balance[541]: 2017/02/14 18:56:01 skipping zzzzz-bi6l4-rbtrws2jxul6i4t with service type "proxy"
Feb 14 18:56:01 zzzzz.arvadosapi.com keep-balance[541]: 2017/02/14 18:56:01 clearing existing trash lists, in case the new rendezvous order differs from previous run
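To confirm that the RunPeriod from your configuration took effect, you can pull the scan interval out of the startup message. The sketch below assumes the message format shown in the sample output above, which may differ between versions; scan_interval is a hypothetical helper name.

```shell
#!/bin/sh
# Hypothetical helper: read keep-balance log lines on standard input and
# print the scan interval from the most recent startup message.
# Prints nothing if no startup message is found.
scan_interval() {
    sed -n 's/.*starting up: will scan every \([^ ]*\) and on SIGUSR1.*/\1/p' | tail -n 1
}
```

For example, sudo journalctl -u keep-balance | scan_interval. Since the startup message says a scan also runs on SIGUSR1, a command such as sudo systemctl kill -s USR1 keep-balance should trigger one without waiting for the next period; treat that as an assumption to verify on your system.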

Start the service (option 2: runit)

Install runit to supervise the keep-balance daemon.

On Debian-based systems:

~$ sudo apt-get install runit

On Red Hat-based systems:

~$ sudo yum install runit

Create a supervised service.

~$ sudo mkdir /etc/service/keep-balance
~$ cd /etc/service/keep-balance
~$ sudo mkdir log log/main
~$ printf '#!/bin/sh\nexec keep-balance -commit-pulls -commit-trash 2>&1\n' | sudo tee run
~$ printf '#!/bin/sh\nexec svlogd main\n' | sudo tee log/run
~$ sudo chmod +x run log/run
~$ sudo sv exit .
~$ cd -

Use sv stat and check the log file to verify the service is running.

~$ sudo sv stat /etc/service/keep-balance
run: /etc/service/keep-balance: (pid 12520) 2s; run: log: (pid 12519) 2s
~$ tail /etc/service/keep-balance/log/main/current
2017/02/14 18:46:01 starting up: will scan every 10m0s and on SIGUSR1
2017/02/14 18:56:01 Run: start
2017/02/14 18:56:01 skipping zzzzz-bi6l4-rbtrws2jxul6i4t with service type "proxy"
2017/02/14 18:56:01 clearing existing trash lists, in case the new rendezvous order differs from previous run
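If you want to script the status check, for instance in a monitoring job, the sv stat line can be tested for its leading state. This is a minimal sketch; service_is_up is a hypothetical name, and it relies on sv stat reporting a running service with a line that starts with "run:", as in the sample output above.

```shell
#!/bin/sh
# Hypothetical helper: succeed if the given `sv stat` output line reports
# the service as running, fail otherwise.
# Usage: service_is_up "STATUS LINE"
service_is_up() {
    case "$1" in
        run:*) return 0 ;;
        *)     return 1 ;;
    esac
}
```

For example: service_is_up "$(sudo sv stat /etc/service/keep-balance)" && echo "keep-balance is running".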

Enable delete operations on keepstore volumes

Ensure your keepstore services have the “delete” operation enabled. If it is disabled (which is the default), keep-balance will still identify unneeded blocks, but they will never be deleted from the underlying storage devices.

Add the -never-delete=false command line flag to your keepstore run script:

keepstore -never-delete=false -volume=...
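With several keepstore nodes it is easy to miss one, so a quick check of each run script can help. The sketch below only searches a file for the flag text; delete_enabled is a hypothetical helper, and the location of your keepstore run script is site-specific.

```shell
#!/bin/sh
# Hypothetical helper: succeed if the given run script mentions
# -never-delete=false, fail otherwise.
# Usage: delete_enabled RUN_SCRIPT
delete_enabled() {
    # -e keeps grep from treating the leading dash as an option.
    grep -q -e '-never-delete=false' "$1"
}
```

For example: delete_enabled /path/to/keepstore/run || echo "delete is still disabled on this node". A match only shows that the flag appears in the file, so restart keepstore and check its logs to confirm the setting is active.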


The content of this documentation is licensed under the Creative Commons Attribution-Share Alike 3.0 United States licence.
Code samples in this documentation are licensed under the Apache License, Version 2.0.