In a distributed hash table (“DHT”), data is organized into a set of distributed partitions. In order to write data to a DHT, a key attribute is taken from the data, the key attribute is hashed, and the resultant hash value is used to identify the partition at which the data should be stored. In order to retrieve data from a DHT, a client provides the key attribute for the data to be retrieved, and the key attribute is hashed. The resultant hash value is then used to identify the partition from which the data is to be retrieved, and the identified partition is queried for the data. The partitions in a DHT can reside on different server computers to increase capacity, can be replicated on multiple server computers to increase redundancy, or both, so long as a scheme exists for identifying the appropriate partition for storing, retrieving, updating, and deleting data.
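The write and read paths described above can be illustrated with a minimal, single-process sketch. It is offered as an illustration only and not as any particular DHT implementation. The names `partition_for` and `SimpleDHT` are hypothetical, and both the use of SHA-256 and the reduction of the hash value modulo the partition count are assumptions, chosen as one simple scheme for identifying a partition.

```python
import hashlib


def partition_for(key: str, num_partitions: int) -> int:
    """Hash the key attribute and map the hash value to a partition index.

    Assumption: SHA-256 plus a modulo reduction stands in for whatever
    partition-identification scheme a real DHT would use.
    """
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_partitions


class SimpleDHT:
    """Hypothetical in-memory stand-in; each dict plays the role of a partition."""

    def __init__(self, num_partitions: int) -> None:
        self.partitions = [dict() for _ in range(num_partitions)]

    def put(self, key: str, value) -> None:
        # Write path: hash the key, identify the partition, store the data there.
        index = partition_for(key, len(self.partitions))
        self.partitions[index][key] = value

    def get(self, key: str):
        # Read path: hash the key, identify the partition, query only that partition.
        index = partition_for(key, len(self.partitions))
        return self.partitions[index].get(key)


dht = SimpleDHT(num_partitions=4)
dht.put("user:42", {"name": "Ada"})
print(dht.get("user:42"))
```

In this sketch the same function identifies the partition for both storing and retrieving, so a lookup never has to consult more than one partition.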
It is not uncommon for the partitions in a conventional DHT to be equally sized. As a result, it is also not uncommon for each partition in a conventional DHT to approach its maximum storage capacity at approximately the same time. When this occurs, one or more additional partitions must be added to increase the storage capacity, and repartitioning must be performed. For example, if a cluster of server computers storing a conventional DHT is approaching capacity, each server in the cluster is also approaching its storage capacity. Adding capacity by adding even a single server to the cluster requires changing every partition maintained by the servers in the cluster, which in turn requires moving data between servers. Movement of data in this manner can create a large input/output (“I/O”) load on the servers that store the DHT. The load can be so large, in fact, that adding hosts to a conventional DHT nearing its storage capacity may cause service outages due to the repartitioning I/O load.
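The cost of repartitioning can be made concrete with a short sketch. It assumes, purely for illustration, that a key is placed by reducing its hash value modulo the number of partitions; the passage above does not name a placement scheme, and the helper names `partition_for` and `repartition_stats` are hypothetical. Under that assumption, the sketch counts how many keys change partition when a fifth partition is added to four.

```python
import hashlib


def partition_for(key: str, num_partitions: int) -> int:
    # Assumed placement scheme: hash value modulo the partition count.
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_partitions


def repartition_stats(keys, old_count: int, new_count: int):
    """Return (fraction of keys that move, set of old partitions that lose keys)."""
    moved = 0
    touched = set()
    for key in keys:
        old = partition_for(key, old_count)
        new = partition_for(key, new_count)
        if old != new:
            moved += 1
            touched.add(old)
    return moved / len(keys), touched


# Synthetic keys, used only to exercise the placement function.
keys = [f"item-{i}" for i in range(10_000)]
fraction_moved, touched = repartition_stats(keys, old_count=4, new_count=5)
print(f"moved {fraction_moved:.0%} of keys; old partitions affected: {sorted(touched)}")
```

For a uniformly distributed hash, a key keeps its partition only when its hash value leaves the same remainder modulo 4 and modulo 5, which holds for 4 of every 20 residues. Roughly four in five keys would therefore be expected to move, and every existing partition gives up data, which is the I/O burden described above.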
It is with respect to these and other considerations that the disclosure made herein is presented.