Keep is a content-addressable storage system that yields high performance for I/O-bound workloads. Keep is designed to run on low-cost commodity hardware or cloud services and is tightly integrated with the rest of the Arvados system. It provides high fault tolerance and high aggregate performance to a large number of clients.
Collection objects implement a rich permission model.
Keep is a content-addressable file system. This means that files are managed using unique identifiers derived from the contents of the file (specifically, the MD5 hash), rather than human-assigned file names. This has a number of advantages.
In Keep, information is stored in data blocks. Data blocks are normally between 1 byte and 64 megabytes in size. If a file exceeds the maximum size of a single data block, it is split across as many data blocks as are needed to store the entire file. These data blocks may be stored and replicated across multiple disks, servers, or clusters. Each data block has its own identifier, derived from the contents of that specific data block.
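The splitting and identification steps described above can be sketched in a few lines of Python. This is only an illustration using the standard library, not Keep's client code: the names `split_into_blocks` and `block_id` are hypothetical, and the identifier shown is simply the bare MD5 hex digest of the block contents, which may be a simplification of what Keep actually records.

```python
import hashlib
from typing import List

# Upper bound on block size mentioned in the text (64 megabytes).
MAX_BLOCK_SIZE = 64 * 1024 * 1024


def split_into_blocks(data: bytes, block_size: int = MAX_BLOCK_SIZE) -> List[bytes]:
    """Cut data into consecutive chunks of at most block_size bytes (hypothetical helper)."""
    if block_size <= 0:
        raise ValueError("block_size must be positive")
    return [data[i:i + block_size] for i in range(0, len(data), block_size)]


def block_id(block: bytes) -> str:
    """Return the MD5 hex digest of a block's contents (illustrative identifier)."""
    return hashlib.md5(block).hexdigest()


# A tiny block size keeps the demonstration readable.
blocks = split_into_blocks(b"abcdefghij", block_size=4)
ids = [block_id(b) for b in blocks]
```

Because the identifier depends only on the bytes in the block, two blocks with identical contents always receive the same identifier, whatever file they came from.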
In order to reassemble the file, Keep stores a collection manifest, which lists in sequence the data blocks that make up the original file. A manifest may store the information for multiple files, including a directory structure. See manifest format for more information on how manifests are structured.
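The role of a manifest can be illustrated with a toy in-memory model. The sketch below is not the Arvados manifest format (see the manifest format documentation for that). It assumes a hypothetical `block_store` dictionary standing in for Keep and a hypothetical `manifest` dictionary mapping each file path to its ordered list of block identifiers.

```python
import hashlib
from typing import Dict, List

BLOCK_SIZE = 4  # deliberately tiny so the example is easy to follow

# Hypothetical stand-in for Keep: block identifier -> block contents.
block_store: Dict[str, bytes] = {}
# Hypothetical manifest: file path -> block identifiers, in file order.
manifest: Dict[str, List[str]] = {}


def put_file(path: str, data: bytes) -> None:
    """Split data into blocks, store each under its MD5 digest, and record the order."""
    ids = []
    for i in range(0, len(data), BLOCK_SIZE):
        block = data[i:i + BLOCK_SIZE]
        digest = hashlib.md5(block).hexdigest()
        block_store[digest] = block  # identical blocks map to the same key
        ids.append(digest)
    manifest[path] = ids


def get_file(path: str) -> bytes:
    """Reassemble a file by concatenating its blocks in manifest order."""
    return b"".join(block_store[digest] for digest in manifest[path])


put_file("docs/a.txt", b"hello world")
put_file("docs/b.txt", b"hello there")
restored = get_file("docs/a.txt")
```

In this toy model both files begin with the same four bytes, so they share one stored block, and the paths in the manifest carry the directory structure while the blocks themselves stay anonymous.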
The content of this documentation is licensed under the Creative Commons Attribution-Share Alike 3.0 United States licence.
Code samples in this documentation are licensed under the Apache License, Version 2.0.