The Keepproxy server is a gateway into your Keep storage. Unlike the Keepstore servers, which are only accessible on the local network, Keepproxy is suitable for clients located elsewhere on the internet. Specifically, in contrast to Keepstore:
By convention, we use the following hostname for the Keepproxy server:
Hostname |
keep.uuid_prefix.your.domain |
This hostname should resolve from anywhere on the internet.
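One way to confirm that the name resolves is to look it up from a machine outside your network. The following is a minimal sketch, assuming a Linux host where getent is available. The helper names keepproxy_hostname and resolves, and the values zzzzz and example.com, are illustrative placeholders and are not part of Arvados.

```shell
#!/bin/sh
# Build the conventional Keepproxy hostname from a cluster prefix and a domain.
# keepproxy_hostname is a hypothetical helper, not an Arvados command.
keepproxy_hostname() {
    printf 'keep.%s.%s\n' "$1" "$2"
}

# Succeed only if the given name resolves on this host. getent consults the
# system resolver (including /etc/hosts), so run this from an outside machine.
resolves() {
    getent hosts "$1" >/dev/null 2>&1
}

# Example (placeholder values):
#   host=$(keepproxy_hostname zzzzz example.com)
#   resolves "$host" && echo "$host resolves" || echo "$host does not resolve"
```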
On Debian-based systems:
~$ sudo apt-get install keepproxy
On Red Hat-based systems:
~$ sudo yum install keepproxy
Verify that Keepproxy is functional:
~$ keepproxy -h
...
Usage: keepproxy [-config path/to/keepproxy.yml]
...
The Keepproxy server needs a token to talk to the API server. On the API server, use the following command to create the token.
Change webserver-user to the user that runs your web server process. If you install Phusion Passenger as we recommend, this is www-data on Debian-based systems, and nginx on Red Hat-based systems.
Using RVM:
apiserver:~$ cd /var/www/arvados-api/current
apiserver:/var/www/arvados-api/current$ sudo -u webserver-user RAILS_ENV=production `which rvm-exec` default bundle exec ./script/get_anonymous_user_token.rb --get
zzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzz
Not using RVM:
apiserver:~$ cd /var/www/arvados-api/current
apiserver:/var/www/arvados-api/current$ sudo -u webserver-user RAILS_ENV=production bundle exec ./script/get_anonymous_user_token.rb --get
zzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzz
Install runit to supervise the keepproxy daemon.
On Debian-based systems:
~$ sudo apt-get install runit
On Red Hat-based systems:
~$ sudo yum install runit
The run script for the keepproxy service should set the environment variables ARVADOS_API_TOKEN (with the token you just generated), ARVADOS_API_HOST, and, if needed, ARVADOS_API_HOST_INSECURE. The core keepproxy command to run is:
ARVADOS_API_TOKEN=zzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzz ARVADOS_API_HOST=uuid_prefix.your.domain exec keepproxy
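As an illustration of how the run script might be put together, the sketch below writes a runit-style run file that exports the variables and then execs keepproxy. The helper name write_run_script, the service directory you pass to it, and the value true for ARVADOS_API_HOST_INSECURE are assumptions made for this example; adapt them to your runit layout.

```shell
#!/bin/sh
# write_run_script DIR TOKEN API_HOST
# Writes a runit "run" script for keepproxy into DIR.
# The helper name and directory layout are assumptions for illustration.
write_run_script() {
    dir=$1
    token=$2
    api_host=$3
    mkdir -p "$dir" || return 1
    cat > "$dir/run" <<EOF
#!/bin/sh
# Send stderr to stdout so a runit log service can capture both.
exec 2>&1
export ARVADOS_API_TOKEN=$token
export ARVADOS_API_HOST=$api_host
# Uncomment only if the API server certificate cannot be verified:
# export ARVADOS_API_HOST_INSECURE=true
exec keepproxy
EOF
    # The file contains a secret token, so restrict it to its owner.
    chmod 700 "$dir/run"
}

# Example (placeholder values; /etc/sv is a common runit service directory
# on Debian-based systems, but check where your installation keeps services):
#   write_run_script /etc/sv/keepproxy "$token" uuid_prefix.your.domain
```

Because the generated file holds the API token, the sketch makes it readable and executable by its owner only.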
Because Keepproxy is intended for access from anywhere on the internet, we recommend using SSL for transport encryption.
This is best achieved by putting a reverse proxy with SSL support in front of Keepproxy. Keepproxy itself runs on port 25107 by default; your reverse proxy can run on port 443 and pass requests to Keepproxy on port 25107.
upstream keepproxy {
  server 127.0.0.1:25107;
}

server {
  listen [your public IP address]:443 ssl;
  server_name keep.uuid_prefix.your.domain;

  proxy_connect_timeout 90s;
  proxy_read_timeout 300s;
  proxy_set_header X-Real-IP $remote_addr;
  proxy_http_version 1.1;
  proxy_request_buffering off;

  ssl on;
  ssl_certificate /etc/nginx/keep.uuid_prefix.your.domain-ssl.crt;
  ssl_certificate_key /etc/nginx/keep.uuid_prefix.your.domain-ssl.key;

  # Clients need to be able to upload blocks of data up to 64MiB in size.
  client_max_body_size 64m;

  location / {
    proxy_pass http://keepproxy;
  }
}
Note: if the Web uploader is failing to upload data and there are no logs from keepproxy, be sure to check the nginx proxy logs. In addition to “GET” and “PUT”, the nginx proxy must pass “OPTIONS” requests to keepproxy, which should respond with appropriate Cross-Origin Resource Sharing (CORS) headers. If the CORS headers are not present, browser security policy will cause the upload request to fail silently. The CORS headers are generated by keepproxy and should not be set in nginx.
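To check whether OPTIONS requests reach keepproxy through the reverse proxy, you can send one with curl and look for a CORS header in the response. This is a rough sketch: the helper names has_cors_header and check_keepproxy_cors are made up for this example, and it only looks for the standard Access-Control-Allow-Origin header rather than the full set keepproxy may send.

```shell
#!/bin/sh
# has_cors_header reads HTTP response headers on stdin and succeeds if an
# Access-Control-Allow-Origin header is present (case-insensitive match).
has_cors_header() {
    grep -qi '^access-control-allow-origin:'
}

# check_keepproxy_cors URL sends an OPTIONS request to URL, discards the
# body, and inspects the response headers. Requires curl.
check_keepproxy_cors() {
    curl -sS -o /dev/null -D - -X OPTIONS "$1" | has_cors_header
}

# Example (placeholder hostname):
#   check_keepproxy_cors https://keep.uuid_prefix.your.domain/ \
#       && echo "CORS header present" || echo "CORS header missing"
```

If the header is missing, the nginx access and error logs are the first place to look, since the request may not be reaching keepproxy at all.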
The API server needs to be informed about the presence of your Keepproxy server.
First, if you don’t already have an admin token, create a superuser token.
On the API server, use the following commands:
~$ cd /var/www/arvados-api/current
$ sudo -u webserver-user RAILS_ENV=production bundle exec script/create_superuser_token.rb
zzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzz
Configure your environment to run arv using the output of create_superuser_token.rb:
export ARVADOS_API_HOST=zzzzz.example.com
export ARVADOS_API_TOKEN=zzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzz
~$ uuid_prefix=`arv --format=uuid user current | cut -d- -f1`
~$ echo "Site prefix is '$uuid_prefix'"
~$ read -rd $'\000' keepservice <<EOF; arv keep_service create --keep-service "$keepservice"
{
"service_host":"keep.$uuid_prefix.your.domain",
"service_port":443,
"service_ssl_flag":true,
"service_type":"proxy"
}
EOF
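The first command above derives the site prefix by keeping everything before the first hyphen of the current user's UUID. The sketch below shows the same extraction on a made-up sample UUID so you can see what value ends up in service_host. The helper names site_prefix and keep_service_host are hypothetical, and the closing comment assumes the arv CLI exposes a list method for keep_service alongside create.

```shell
#!/bin/sh
# site_prefix UUID prints the part of an Arvados UUID before the first "-",
# the same extraction that "cut -d- -f1" performs in the commands above.
# site_prefix is a hypothetical helper name.
site_prefix() {
    printf '%s\n' "$1" | cut -d- -f1
}

# keep_service_host PREFIX DOMAIN prints the value used for "service_host".
keep_service_host() {
    printf 'keep.%s.%s\n' "$1" "$2"
}

# Example with a made-up user UUID:
#   prefix=$(site_prefix zzzzz-tpzed-000000000000000)
#   keep_service_host "$prefix" your.domain
#
# After creating the record, listing the registered services should show
# the new proxy entry (assumes the CLI offers "list" for this resource):
#   arv keep_service list
```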
Log into a host that is on a network external to your private Arvados network. The host should be able to contact your keepproxy server (e.g. keep.$uuid_prefix.arvadosapi.com), but not your keepstore servers (e.g. keep[0-9].$uuid_prefix.arvadosapi.com).
Install the Python SDK
ARVADOS_API_HOST and ARVADOS_API_TOKEN must be set in the environment.
You should now be able to use arv-put to upload collections and arv-get to fetch collections. For an example, see Testing keep on the keepstore install page.
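A simple round trip from the external host might look like the sketch below. It first checks that both variables are exported, then uploads a small file and reads it back. The helper names require_arvados_env and keep_round_trip are invented for this example, and the sketch assumes arv-put prints the portable data hash on standard output when given --portable-data-hash.

```shell
#!/bin/sh
# require_arvados_env succeeds only if both variables are exported to the
# environment with non-empty values. Hypothetical helper name.
require_arvados_env() {
    for name in ARVADOS_API_HOST ARVADOS_API_TOKEN; do
        if [ -z "$(printenv "$name")" ]; then
            echo "$name is not set in the environment" >&2
            return 1
        fi
    done
}

# keep_round_trip uploads a small file with arv-put and fetches it again
# with arv-get, printing the file contents on success.
keep_round_trip() {
    require_arvados_env || return 1
    tmp=$(mktemp -d) || return 1
    echo "hello world!" > "$tmp/hello.txt"
    status=0
    pdh=$(cd "$tmp" && arv-put --portable-data-hash hello.txt) \
        && arv-get "$pdh/hello.txt" || status=$?
    rm -rf "$tmp"
    return $status
}

# Example, run on the external host:
#   keep_round_trip
```

If the file contents come back, both the upload and the download went through keepproxy, provided the host really cannot reach the keepstore servers directly.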
The content of this documentation is licensed under the Creative Commons Attribution-Share Alike 3.0 United States licence.
Code samples in this documentation are licensed under the Apache License, Version 2.0.