Embodiments presented herein generally relate to storage management, and more specifically, to controlling object placement in an object store that uses consistent hashing-based techniques to store objects.
An object store is a data store that maintains an arbitrary number of objects (e.g., text files, audio/visual files, image files, and so on) and metadata associated with each object. Rather than manage data as files or blocks, the object store abstracts storage layers such that the data maintained by the object store can be exposed and managed as objects. Further, an object store can be distributed across multiple clustered storage nodes. Doing so generally provides the object store with scalability, high availability, and low latency.
Determining which node should store a given object is a known issue in managing object stores. One approach for controlling object placement is consistent hashing. In this approach, the object store maps each node to an identifier using a cryptographic hash function (e.g., SHA-1 or MD5). The same hash function is used to generate a value that uniquely identifies each object, and each object is stored on the node whose identifier corresponds to that value (typically, the node whose identifier next follows the object's hash value in the hash space). This results in a static mapping between objects and nodes.
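The placement scheme described above can be illustrated with a minimal sketch. It assumes a ring-style layout in which an object belongs to the first node whose identifier is at or after the object's hash value, wrapping around at the end of the hash space. The names `ring_position`, `HashRing`, and `node_for`, as well as the node and object names, are hypothetical and chosen only for illustration; an actual object store may organize its hash space differently.

```python
import bisect
import hashlib


def ring_position(key: str) -> int:
    """Map a string to an integer position in the SHA-1 hash space."""
    digest = hashlib.sha1(key.encode("utf-8")).digest()
    return int.from_bytes(digest, "big")


class HashRing:
    """Hypothetical consistent-hashing ring with one position per node."""

    def __init__(self, node_names):
        # Sort nodes by their hashed identifier so lookups can use bisection.
        self._ring = sorted((ring_position(name), name) for name in node_names)
        self._positions = [position for position, _ in self._ring]

    def node_for(self, object_name: str) -> str:
        # Find the first node at or after the object's position; the modulo
        # wraps around to the first node when the object hashes past the last.
        index = bisect.bisect_left(self._positions, ring_position(object_name))
        return self._ring[index % len(self._ring)][1]


ring = HashRing(["node-a", "node-b", "node-c"])
owner = ring.node_for("reports/q1.csv")
print(owner)
```

Because the result depends only on the hash values, any client holding the same node list computes the same owner without consulting a central directory, which is what makes the mapping static.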
The consistent hashing approach provides for uniform distribution of objects across nodes. Further, if nodes are disconnected from the cluster, consistent hashing allows the object store to re-map the objects to different nodes. Thus, attempts to disable the object store via attacks on an individual node are generally ineffective. In addition, consistent hashing allows for decentralized object lookups, which results in relatively fast and scalable object location.
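The re-mapping behavior can be sketched as follows, again assuming a ring-style layout with one position per node. The helper names `ring_position` and `locate`, and the node and object names, are hypothetical. The sketch removes one node and checks which objects change owner.

```python
import bisect
import hashlib


def ring_position(key: str) -> int:
    """Map a string to an integer position in the SHA-1 hash space."""
    return int.from_bytes(hashlib.sha1(key.encode("utf-8")).digest(), "big")


def locate(object_name: str, node_names) -> str:
    """Return the node at or after the object's position, wrapping around."""
    ring = sorted((ring_position(name), name) for name in node_names)
    positions = [position for position, _ in ring]
    index = bisect.bisect_left(positions, ring_position(object_name))
    return ring[index % len(ring)][1]


nodes = ["node-a", "node-b", "node-c", "node-d"]
objects = [f"object-{i}" for i in range(1000)]

# Placement while all four nodes are in the cluster.
before = {name: locate(name, nodes) for name in objects}

# Placement after "node-c" is disconnected.
survivors = [node for node in nodes if node != "node-c"]
after = {name: locate(name, survivors) for name in objects}

moved = [name for name in objects if before[name] != after[name]]
print(f"{len(moved)} of {len(objects)} objects were re-mapped")
```

In this layout, only the objects previously owned by the removed node are re-mapped; objects on the surviving nodes keep their placement.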
However, consistent hashing limits user and process control over which node stores a given object. That is, because the mapping between objects and nodes is static, users (or processes) are generally unable to specify a location in which to store a given object. In addition, the performance of rename operations can be adversely affected. Because objects are typically copied when renamed, applying the hash function to the renamed object may result in the object being copied to a different node. As a result, the rename operation can degrade the performance of latency-sensitive processes (e.g., distributed workloads that perform intensive write and rename operations).
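The rename problem can be illustrated with a short sketch. It assumes that placement is derived from a hash of the object name and that nodes are arranged in a ring with one position each; both are assumptions made for illustration, as are the helper names `ring_position` and `locate` and the `tmp/` and `final/` name prefixes.

```python
import bisect
import hashlib


def ring_position(key: str) -> int:
    """Map a string to an integer position in the SHA-1 hash space."""
    return int.from_bytes(hashlib.sha1(key.encode("utf-8")).digest(), "big")


def locate(object_name: str, node_names) -> str:
    """Return the node at or after the object's position, wrapping around."""
    ring = sorted((ring_position(name), name) for name in node_names)
    positions = [position for position, _ in ring]
    index = bisect.bisect_left(positions, ring_position(object_name))
    return ring[index % len(ring)][1]


nodes = ["node-a", "node-b", "node-c", "node-d"]

# A workload that writes to a temporary name and then renames the result.
renames = [(f"tmp/part-{i}", f"final/part-{i}") for i in range(1000)]

# A rename relocates the object whenever the old and new names hash to
# positions owned by different nodes.
relocated = [
    (old, new)
    for old, new in renames
    if locate(old, nodes) != locate(new, nodes)
]
print(f"{len(relocated)} of {len(renames)} renames change the owning node")
```

The exact count depends on where the node identifiers fall in the hash space, but every relocated rename implies copying the object's data across the network rather than updating a name in place.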