The Simple Storage Service (S3) API is a de facto standard for object storage, originally developed by Amazon Web Services. Arvados supports accessing files in Keep using the S3 API.
S3 is supported by many “cloud native” applications, and client libraries exist in many languages for programmatic access.
To access Arvados S3 using an S3 client library, you must tell it to use the URL of the keep-web server (this is Services.WebDAVDownload.ExternalURL in the public configuration) as the custom endpoint. The keep-web server decides whether to treat a request as an S3 API request based on the presence of an AWS-format Authorization header. Requests without an Authorization header, or with a differently formatted Authorization header, are treated as WebDAV requests.
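The routing rule above can be approximated in a few lines. This is only a rough sketch, not keep-web's actual code: the function name classify_request is hypothetical, and it assumes the standard AWS header prefixes ("AWS4-HMAC-SHA256 " for Signature Version 4, "AWS " for Signature Version 2) are what counts as "AWS-format".

```python
def classify_request(headers):
    """Guess whether a request would be handled as S3 or as WebDAV.

    Assumption: an Authorization header is "AWS-format" if it starts with
    the V4 scheme name "AWS4-HMAC-SHA256 " or the V2 scheme name "AWS ".
    """
    # HTTP header names are case-insensitive, so normalize them first.
    normalized = {name.lower(): value for name, value in headers.items()}
    auth = normalized.get("authorization", "")
    if auth.startswith("AWS4-HMAC-SHA256 ") or auth.startswith("AWS "):
        return "s3"
    # No Authorization header, or some other scheme (e.g. Bearer).
    return "webdav"
```

For example, a request carrying "Authorization: Bearer ..." would be classified as WebDAV under this sketch, while one signed by an S3 client library would be classified as S3.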
The “bucket name” is an Arvados collection UUID, portable data hash, or project UUID.
Path-style and virtual host-style requests are supported.
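A path-style request uses the hostname indicated by Services.WebDAVDownload.ExternalURL, with the bucket name in the first path segment.
A virtual host-style request uses the hostname pattern indicated by Services.WebDAV.ExternalURL, with the bucket name in place of the leading wildcard (*).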
If you have wildcard DNS, TLS, and routing set up, an S3 client configured with endpoint collections.example.com should work regardless of which request style it uses.
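To make the two request styles concrete, the following sketch builds both kinds of URL for the same object. The helper names, the host download.example.com, and the collection UUID used in the usage note are illustrative assumptions; the sketch also assumes that the Services.WebDAV.ExternalURL host pattern begins with a "*" wildcard that is replaced by the bucket name.

```python
from urllib.parse import quote


def path_style_url(download_host, bucket, key=""):
    """Bucket name goes in the first path segment."""
    return f"https://{download_host}/{bucket}/{quote(key)}"


def virtual_host_style_url(wildcard_host, bucket, key=""):
    """Bucket name replaces the leading "*" of the host pattern."""
    if not wildcard_host.startswith("*"):
        raise ValueError("host pattern must start with '*'")
    return f"https://{bucket}{wildcard_host[1:]}/{quote(key)}"
```

With a hypothetical collection UUID zzzzz-4zz18-0123456789abcde, path_style_url("download.example.com", ...) puts the UUID after the host, while virtual_host_style_url("*.collections.example.com", ...) puts it in front of .collections.example.com.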
Supports the following request query parameters:
PutObject can be used to create or replace a file in a collection.
An empty PUT with a trailing slash and Content-Type: application/x-directory will create a directory within a collection if Arvados configuration option Collections.S3FolderObjects is true.
Missing parent/intermediate directories within a collection are created automatically.
PutObject cannot be used to create a collection or project.
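As an illustration of the directory-creating PUT described above, the sketch below assembles the parameters for such a request. The helper name directory_marker_put is hypothetical; the returned keys (Bucket, Key, Body, ContentType) follow the keyword arguments of boto3's put_object, which is one client that could send the request. Whether the directory is actually created still depends on Collections.S3FolderObjects being true on the cluster.

```python
def directory_marker_put(bucket, dir_path):
    """Build PutObject parameters for an empty "directory" object.

    The key must end with a slash, the body must be empty, and the
    content type must be application/x-directory.
    """
    trimmed = dir_path.strip("/")
    if not trimmed:
        raise ValueError("dir_path must name a directory inside the bucket")
    return {
        "Bucket": bucket,
        "Key": trimmed + "/",
        "Body": b"",
        "ContentType": "application/x-directory",
    }


# Possible usage with boto3 (not executed here; needs a configured client):
#   client.put_object(**directory_marker_put(bucket_uuid, "results/run1"))
```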
DeleteObject can be used to remove files from a collection.
If used on a directory marker, it will delete the directory only if the directory is empty.
HeadBucket can be used to determine if a bucket exists and if the client has read access to it.
HeadObject can be used to determine if an object exists and if the client has read access to it.
Bucket versioning is presently not supported, so GetBucketVersioning will always respond that bucket versioning is not enabled.
GetObject, HeadObject, and HeadBucket return Arvados object properties as S3 metadata headers, e.g.,
If the requested path indicates a file or directory placeholder inside a collection, or the top level of a collection, GetObject and HeadObject return the collection properties.
If the requested path indicates a directory placeholder corresponding to a project, GetObject and HeadObject return the properties of the project.
HeadBucket returns the properties of the collection or project corresponding to the bucket name.
Non-string property values are returned in a JSON representation, e.g.,
As in Amazon S3, property values containing non-ASCII characters are returned in BASE64-encoded form as described in RFC 2047, e.g.,
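A client reading these metadata headers may need to undo the RFC 2047 encoding itself. The following is a minimal sketch using Python's standard email.header module; the function names and the sample values are assumptions chosen for illustration. The encoder is included only to show what an encoded value looks like, and is not meant to reproduce the server's exact output.

```python
import base64
from email.header import decode_header, make_header


def encode_meta_value(text):
    """Illustrative RFC 2047 "B" encoding for non-ASCII text."""
    if text.isascii():
        return text
    b64 = base64.b64encode(text.encode("utf-8")).decode("ascii")
    return f"=?UTF-8?b?{b64}?="


def decode_meta_value(raw):
    """Decode an RFC 2047 encoded-word; plain ASCII passes through."""
    return str(make_header(decode_header(raw)))
```

Under these assumptions, decode_meta_value("=?UTF-8?b?4pu1?=") yields the single character U+26F5, and a plain value such as "bar" is returned unchanged.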
GetBucketTagging and GetObjectTagging APIs are not supported.
It is not possible to modify collection or project properties using the S3 API.
Keep-web accepts AWS Signature Version 4 (AWS4-HMAC-SHA256) as well as the older V2 AWS signature.
If your client uses V4 signatures exclusively and your Arvados token was issued by the same cluster you are connecting to, you can use the Arvados token’s UUID part as your S3 Access Key, and its secret part as your S3 Secret Key. This is preferred, where applicable.
Example using cluster zzzzz:
In all other cases, replace every / character in your Arvados token with _, and use the resulting string as both Access Key and Secret Key.
Example using a cluster other than zzzzz or an S3 client that uses V2 signatures:
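The two ways of deriving S3 credentials from an Arvados token can be sketched as a small helper. The function name s3_credentials is hypothetical, and the sketch assumes the token has the form v2/<uuid>/<secret>; the caller decides which rule applies, since the code cannot tell whether the token was issued by the cluster being contacted or which signature version the client uses.

```python
def s3_credentials(token, use_uuid_and_secret):
    """Return (access_key, secret_key) derived from an Arvados token.

    use_uuid_and_secret=True: V4-only client, token issued by the same
    cluster; the UUID part is the access key, the secret part the secret key.
    use_uuid_and_secret=False: every "/" becomes "_" and the resulting
    string is used for both keys.
    """
    if use_uuid_and_secret:
        parts = token.split("/")
        # Assumed layout: "v2", then the token UUID, then the secret.
        if len(parts) != 3 or parts[0] != "v2":
            raise ValueError("expected a token of the form v2/<uuid>/<secret>")
        return parts[1], parts[2]
    mangled = token.replace("/", "_")
    return mangled, mangled
```

For a made-up token v2/zzzzz-gj3su-0123456789abcde/examplesecret, the first rule gives the pair (zzzzz-gj3su-0123456789abcde, examplesecret), and the second gives v2_zzzzz-gj3su-0123456789abcde_examplesecret for both keys.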
The content of this documentation is licensed under the Creative Commons Attribution-Share Alike 3.0 United States licence.
Code samples in this documentation are licensed under the Apache License, Version 2.0.