A distributed database system may comprise a number of computing devices, each of which may host a portion of a large dataset. Techniques such as replication and partitioning may be employed to scale both the amount of data that may be stored and the ability of the system to respond to queries in a timely fashion. The division of data between the computing devices may, however, cause certain types of queries to be processed inefficiently.
Typically, each computing device maintains a portion of a table on a long-term storage device, such as a mechanical or solid-state drive. The range of data stored on each storage device is generally fixed, because of the large amounts of data involved and the time and complexity of transferring data from one partition to another. In many distributed database systems, however, it may be necessary to transfer data between partitions in order to utilize additional storage.
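By way of illustration only, the following minimal sketch shows one way a table might be divided into fixed key ranges, and why a query that does not filter on the partitioning key may have to consult every partition. The boundary values, the row layout, and the names `BOUNDS`, `partition_for`, `lookup_by_id`, and `lookup_by_city` are hypothetical and are not drawn from any particular database system.

```python
from bisect import bisect_right

# Hypothetical fixed range boundaries on the partitioning key "id".
# Partition 0 holds ids below 1000, partition 1 holds 1000-1999, and so on.
BOUNDS = [1000, 2000, 3000]  # three boundaries give four partitions


def partition_for(key: int) -> int:
    """Return the index of the partition whose fixed range contains key."""
    return bisect_right(BOUNDS, key)


def build_partitions(rows):
    """Place each row in the partition that owns its id."""
    parts = [[] for _ in range(len(BOUNDS) + 1)]
    for row in rows:
        parts[partition_for(row["id"])].append(row)
    return parts


def lookup_by_id(parts, key):
    """Query on the partitioning key: only one partition is consulted."""
    touched = [partition_for(key)]
    hits = [r for r in parts[touched[0]] if r["id"] == key]
    return hits, touched


def lookup_by_city(parts, city):
    """Query on another attribute: every partition must be consulted."""
    touched = list(range(len(parts)))
    hits = [r for p in parts for r in p if r["city"] == city]
    return hits, touched


# Illustrative sample data only.
ROWS = [
    {"id": 10, "city": "Oslo"},
    {"id": 1500, "city": "Lima"},
    {"id": 2500, "city": "Oslo"},
    {"id": 3500, "city": "Lima"},
]
PARTS = build_partitions(ROWS)
```

In this sketch, `lookup_by_id(PARTS, 1500)` consults a single partition, whereas `lookup_by_city(PARTS, "Oslo")` consults all four. Because the boundaries in `BOUNDS` are fixed, changing them would require moving the affected rows from one partition to another.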