Keepstore can store data in object storage compatible with the S3 API, such as Amazon S3, Google Cloud Storage, or Ceph RADOS Gateway.
Volumes are configured in the Volumes section of the cluster configuration file.
Note that each volume has a UUID, like zzzzz-nyw5e-0123456789abcde. You assign these manually: replace zzzzz with your Cluster ID, and replace 0123456789abcde with an arbitrary unique string of 15 alphanumerics. Once assigned, UUIDs should not be changed.
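Since the UUID suffix is assigned by hand, a small helper can reduce the chance of typos. The following is a minimal sketch, not part of Arvados: the function names `new_volume_uuid` and `is_volume_uuid` are hypothetical, and it assumes a 5-character lowercase alphanumeric Cluster ID, as in the zzzzz example above.

```python
import re
import secrets
import string

# "nyw5e" is the infix shown in the example volume UUID above.
VOLUME_INFIX = "nyw5e"
ALPHABET = string.ascii_lowercase + string.digits
# Assumption: the Cluster ID is 5 lowercase alphanumerics, as in "zzzzz".
VOLUME_UUID_RE = re.compile(r"[a-z0-9]{5}-nyw5e-[a-z0-9]{15}")


def new_volume_uuid(cluster_id: str) -> str:
    """Return cluster_id-nyw5e-<15 random lowercase alphanumerics>."""
    suffix = "".join(secrets.choice(ALPHABET) for _ in range(15))
    return f"{cluster_id}-{VOLUME_INFIX}-{suffix}"


def is_volume_uuid(value: str) -> bool:
    """Check that a string has the volume UUID shape described above."""
    return VOLUME_UUID_RE.fullmatch(value) is not None


if __name__ == "__main__":
    print(new_volume_uuid("zzzzz"))
```

Generate the UUID once, paste it into the configuration file, and keep it unchanged afterwards.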
Essential configuration values are highlighted in red. Remaining parameters are provided for documentation, with their default values.
Volumes:
  ClusterID-nyw5e-000000000000000:
    AccessViaHosts:
      # This section determines which keepstore servers access the
      # volume. In this example, keep0 has read/write access, and
      # keep1 has read-only access.
      #
      # If the AccessViaHosts section is empty or omitted, all
      # keepstore servers will have read/write access to the
      # volume.
      "http://keep0.ClusterID.example.com:25107": {}
      "http://keep1.ClusterID.example.com:25107": {ReadOnly: true}

    Driver: S3
    DriverParameters:
      # Bucket name.
      Bucket: example-bucket-name

      # IAM role name to use when retrieving credentials from
      # instance metadata. It can be omitted, in which case the
      # role name itself will be retrieved from instance metadata
      # -- but setting it explicitly may protect you from using
      # the wrong credentials in the event of an
      # installation/configuration error.
      IAMRole: ""

      # If you are not using an IAM role for authentication,
      # specify access credentials here instead.
      AccessKey: ""
      SecretKey: ""

      # Storage provider region. For Google Cloud Storage, use ""
      # or omit.
      Region: us-east-1

      # Storage provider endpoint. For Amazon S3, use "" or
      # omit. For Google Cloud Storage, use
      # "https://storage.googleapis.com".
      Endpoint: ""

      # Change to true if the region requires a LocationConstraint
      # declaration.
      LocationConstraint: false

      # Requested page size for "list bucket contents" requests.
      IndexPageSize: 1000

      # Maximum time to wait while making the initial connection
      # to the backend before failing the request.
      ConnectTimeout: 1m

      # Maximum time to wait for a complete response from the
      # backend before failing the request.
      ReadTimeout: 2m

      # Maximum eventual consistency latency.
      RaceWindow: 24h

    # How much replication is provided by the underlying bucket.
    # This is used to inform replication decisions at the Keep
    # layer.
    Replication: 2

    # If true, do not accept write or trash operations, even if
    # AccessViaHosts.*.ReadOnly is false.
    #
    # If false or omitted, enable write access (subject to
    # AccessViaHosts.*.ReadOnly, where applicable).
    ReadOnly: false

    # Storage classes to associate with this volume. See "Storage
    # classes" in the "Admin" section of doc.arvados.org.
    StorageClasses: null
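The Region and Endpoint comments in the example describe provider-dependent values. The sketch below restates them as a small helper, purely for illustration. The function name `s3_driver_parameters` and the provider labels "amazon" and "google" are hypothetical and not part of Arvados. Rejecting an empty region for Amazon S3 is an assumption of this sketch, not a documented Keepstore rule.

```python
import json

# Endpoint value taken from the Endpoint comment in the example above.
GCS_ENDPOINT = "https://storage.googleapis.com"


def s3_driver_parameters(provider: str, bucket: str, region: str = "") -> dict:
    """Build the provider-dependent part of DriverParameters.

    provider is a hypothetical label ("amazon" or "google") used only here.
    """
    if provider == "amazon":
        # Assumption of this sketch: require an explicit region for Amazon S3.
        if not region:
            raise ValueError("Amazon S3 needs a region")
        # For Amazon S3 the endpoint is left empty (or omitted).
        return {"Bucket": bucket, "Region": region, "Endpoint": ""}
    if provider == "google":
        # For Google Cloud Storage the region is left empty (or omitted).
        return {"Bucket": bucket, "Region": "", "Endpoint": GCS_ENDPOINT}
    raise ValueError(f"unknown provider: {provider!r}")


if __name__ == "__main__":
    params = s3_driver_parameters("google", "example-bucket-name")
    # JSON is also valid YAML, so the output can be adapted into the config file.
    print(json.dumps(params, indent=2))
```

The remaining parameters (credentials, timeouts, and so on) are unaffected by the choice of provider in this sketch and would be filled in as shown in the example configuration.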
The content of this documentation is licensed under the Creative Commons Attribution-Share Alike 3.0 United States licence.
Code samples in this documentation are licensed under the Apache License, Version 2.0.