Many modern applications involve the analysis of large amounts of data. For a number of reasons including performance, availability and durability, the data may often be distributed among numerous storage servers or devices of a distributed system. In some cases a network-accessible storage service of a provider network may be employed, with the data potentially spread across multiple storage devices at one or more data centers. The particular storage server at which a data object is to be stored for a write request, or from which the data object is to be retrieved for a read request, may be selected for some applications by applying a hash function to one or more keys associated with the data object. Such applications may sometimes be referred to as distributed hashing applications or distributed hash table applications.
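The hash-based selection of a storage server described above may be illustrated with a minimal sketch. The function name `select_server`, the server names, the choice of SHA-256, and the simple modulo placement are all assumptions made for illustration only; a given distributed hashing application may use a different hash function or placement scheme.

```python
import hashlib


def select_server(key: str, servers: list[str]) -> str:
    """Pick the server responsible for `key` (hypothetical helper).

    The key is hashed and the digest is reduced modulo the number of
    servers, so the same key always maps to the same server for both
    write requests and read requests.
    """
    if not servers:
        raise ValueError("at least one storage server is required")
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    # Interpret the first 8 bytes of the digest as an unsigned integer.
    bucket = int.from_bytes(digest[:8], "big") % len(servers)
    return servers[bucket]


# Hypothetical server names, used only for this example.
servers = ["storage-0", "storage-1", "storage-2", "storage-3"]
write_target = select_server("customer-1234", servers)
read_target = select_server("customer-1234", servers)
```

Because the mapping depends only on the key and the server list, a read for a given key may be routed to the same server that handled the corresponding write without any central lookup. With plain modulo placement, changing the number of servers may remap most keys, which is one reason other placement schemes may be chosen in practice.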
In one variant of distributed hashing, a compound key may be employed, with one or more attribute values being used as a hash key to identify a particular subset of data, and one or more other attribute values being used as list keys or sort keys within the subset. For example, in an order management application associated with an inventory, each hash key may correspond to a different customer, while different orders submitted by a given customer may correspond to respective list keys associated with the customer's unique hash key. Common operations in such an order management application may include, for example, inserting new orders, canceling orders, retrieving data objects in list key order, and so on, which may typically require the use of some type of indexing for high performance.
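The compound-key arrangement described above may be sketched as follows, using the order management example. This is a single-process, in-memory illustration only: the class name `CompoundKeyStore`, its method names, and the use of a sorted Python list as the per-hash-key index are assumptions, and the sketch does not address distribution, concurrency, or consistency.

```python
import bisect
from collections import defaultdict


class CompoundKeyStore:
    """Hypothetical in-memory store keyed by (hash key, list key)."""

    def __init__(self):
        # hash key -> list keys kept in sorted order (a simple index)
        self._index = defaultdict(list)
        # (hash key, list key) -> stored data object
        self._objects = {}

    def insert(self, hash_key, list_key, value):
        """Insert or overwrite the object stored under the compound key."""
        if (hash_key, list_key) not in self._objects:
            bisect.insort(self._index[hash_key], list_key)
        self._objects[(hash_key, list_key)] = value

    def delete(self, hash_key, list_key):
        """Remove the object if present; return True if it was removed."""
        if (hash_key, list_key) not in self._objects:
            return False
        del self._objects[(hash_key, list_key)]
        keys = self._index[hash_key]
        keys.pop(bisect.bisect_left(keys, list_key))
        return True

    def scan(self, hash_key):
        """Return (list key, object) pairs for a hash key in list key order."""
        keys = self._index.get(hash_key, [])
        return [(k, self._objects[(hash_key, k)]) for k in keys]


# Hash key: customer identifier; list key: order identifier.
store = CompoundKeyStore()
store.insert("customer-A", 1003, "widgets")
store.insert("customer-A", 1001, "gears")
store.insert("customer-A", 1002, "bolts")
store.insert("customer-B", 2001, "nuts")
store.delete("customer-A", 1002)  # cancel an order
orders_for_a = store.scan("customer-A")
```

In this sketch the sorted list plays the role of the index: inserting a new order, canceling an order, and retrieving orders in list key order each operate only on the entries that share a hash key.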
Some large-scale distributed hashing-based applications may be implemented using non-relational data stores as the back-end. Such data stores may provide very high throughput rates (and low latencies) for individual reads and writes, but may not support at least some of the ACID (atomicity, consistency, isolation and durability) properties associated with the relational model. For example, some non-relational data stores may not implement consistent transactions involving multiple writes and/or may not support locking, at least in the way that relational databases usually do. Providing index-based operations that are both consistent and low-latency may present a challenge, especially in high throughput distributed hashing environments with hundreds or thousands of storage servers, and thousands of concurrent users.
While embodiments are described herein by way of example for several embodiments and illustrative drawings, those skilled in the art will recognize that embodiments are not limited to the embodiments or drawings described. It should be understood that the drawings and detailed description thereto are not intended to limit embodiments to the particular form disclosed, but on the contrary, the intention is to cover all modifications, equivalents and alternatives falling within the spirit and scope as defined by the appended claims. The headings used herein are for organizational purposes only and are not meant to be used to limit the scope of the description or the claims. As used throughout this application, the word “may” is used in a permissive sense (i.e., meaning having the potential to), rather than the mandatory sense (i.e., meaning must). Similarly, the words “include,” “including,” and “includes” mean including, but not limited to. When used in the claims, the term “or” is used as an inclusive or and not as an exclusive or. For example, the phrase “at least one of x, y, or z” means any one of x, y, and z, as well as any combination thereof.