Certain tasks performed by computing devices can be divided into sub-tasks that can be performed independently of one another. In such cases, the sub-tasks can be performed in parallel, such that the independent processing of each sub-task completes at approximately the same time, which can reduce the overall time required to perform the task by orders of magnitude. More specifically, the overall time required to perform such a task can be inversely related to the number of sub-tasks into which the task can be divided. In some instances, tasks can be divided into tens, or even hundreds, of thousands of sub-tasks. In those instances, the overall task can be completed up to tens or hundreds of thousands of times faster by performing the sub-tasks in parallel, assuming, of course, that tens or hundreds of thousands of independent computing devices are available to process the sub-tasks. Performing a task by dividing it into such a large number of sub-tasks is typically referred to as massively distributed computation.
When performing massively distributed computation, the processing performed by each individual computing device is, typically, based upon a defined subset of the overall data on which the overall task is being performed. For efficiency purposes, such a subset of data resides in the local memory or storage of the individual computing device that is performing processing based on that subset. In situations where a single collection of data is being processed by multiple tasks, each of which can be divided into a large number of sub-tasks, it often becomes necessary to redistribute data among the computing devices that are individually processing the sub-tasks. For example, the determination of the average age of each make and model of automobile currently registered in a given area can be divided into independent sub-tasks, where one computing device determines the average age of one make and model, while another computing device determines the average age of another make and model. Since the average age of one make and model of automobile is based only on the ages of the automobiles of that make and model, and is independent of the ages of automobiles of any other make and model, each of those exemplary sub-tasks can be performed independently of the others. To perform one such sub-task, however, a computing device needs to have, locally available in its own memory or storage, the registration information of each automobile that is of the specific make and model.
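As a minimal sketch of the exemplary division described above, the following Python fragment assigns each registration record to one of several simulated computing devices based on its make and model, so that each device can compute its averages from local data alone. The record layout, the sample registrations, the reference year, and the helper names (worker_for, partition, average_age_by_make_model) are all assumptions introduced for illustration only.

```python
from collections import defaultdict
from zlib import crc32

CURRENT_YEAR = 2024  # assumed reference year for computing ages

# Hypothetical registration records: (make, model, year_built, odometer_miles)
REGISTRATIONS = [
    ("Ford", "Focus", 2010, 120000),
    ("Ford", "Focus", 2014, 80000),
    ("Honda", "Civic", 2010, 100000),
    ("Honda", "Civic", 2020, 20000),
    ("Toyota", "Corolla", 2014, 60000),
]


def worker_for(key, num_workers):
    """Map a partition key to a device index with a deterministic hash."""
    return crc32(repr(key).encode("utf-8")) % num_workers


def partition(records, key_fn, num_workers):
    """Place every record on the device that owns its partition key."""
    partitions = [[] for _ in range(num_workers)]
    for record in records:
        partitions[worker_for(key_fn(record), num_workers)].append(record)
    return partitions


def average_age_by_make_model(local_records):
    """Sub-task run by one device, using only its locally held records."""
    ages = defaultdict(list)
    for make, model, year_built, _odometer in local_records:
        ages[(make, model)].append(CURRENT_YEAR - year_built)
    return {key: sum(values) / len(values) for key, values in ages.items()}


# All records of one make and model land on the same simulated device.
partitions = partition(REGISTRATIONS, lambda r: (r[0], r[1]), num_workers=3)

# Each device works independently; here the devices are simulated in a loop.
results = {}
for local_records in partitions:
    results.update(average_age_by_make_model(local_records))
```

Because every record of a given make and model hashes to the same device, no device needs data held by any other device to complete its sub-task.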
Should a subsequent task seek to, for example, determine the average number of miles listed on the odometers of automobiles built in specific years, it can become necessary to redistribute the data among the computing devices performing the processing. More specifically, the exemplary subsequent task can be divided into independent sub-tasks that can each, individually, determine the average number of miles listed on the odometers of automobiles built in one specific year. In order to perform such sub-tasks, however, each individual computing device requires local access, not to all of the registration information for a specific make and model, which is what each computing device currently possesses, but rather to all of the registration information for a specific manufacturing year, irrespective of the make and model of the automobile. In such an example, the data can be redistributed, or “re-partitioned”, among the computing devices, with each computing device sending to other computing devices the data it no longer needs and obtaining from such other computing devices the data that it now does need.
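The re-partitioning described above can be sketched, under stated assumptions, as follows. The fragment starts from data that is already grouped by make and model, recomputes the owning device of each record from its manufacturing year, and counts the records that would have to cross the network. The starting layout, the sample records, and the helper names (worker_for, repartition, average_odometer_by_year) are hypothetical, and the devices are simulated as in-memory lists rather than networked machines.

```python
from collections import defaultdict
from zlib import crc32


def worker_for(key, num_workers):
    """Map a partition key to a device index with a deterministic hash."""
    return crc32(repr(key).encode("utf-8")) % num_workers


def repartition(partitions, key_fn):
    """Move each record to the device that owns its new partition key.

    Returns the new layout and the number of records that changed device,
    each of which would be a network transfer in a real deployment.
    """
    num_workers = len(partitions)
    new_partitions = [[] for _ in range(num_workers)]
    moved = 0
    for source, local_records in enumerate(partitions):
        for record in local_records:
            destination = worker_for(key_fn(record), num_workers)
            if destination != source:
                moved += 1
            new_partitions[destination].append(record)
    return new_partitions, moved


def average_odometer_by_year(local_records):
    """Sub-task run by one device, using only its locally held records."""
    miles = defaultdict(list)
    for _make, _model, year_built, odometer in local_records:
        miles[year_built].append(odometer)
    return {year: sum(values) / len(values) for year, values in miles.items()}


# Assumed starting layout: one make and model per device.
# Records are (make, model, year_built, odometer_miles).
by_make_model = [
    [("Ford", "Focus", 2010, 120000), ("Ford", "Focus", 2014, 80000)],
    [("Honda", "Civic", 2010, 100000), ("Honda", "Civic", 2020, 20000)],
    [("Toyota", "Corolla", 2014, 60000)],
]

# Re-partition on manufacturing year, then run the new sub-tasks locally.
by_year, moved = repartition(by_make_model, lambda r: r[2])

results = {}
for local_records in by_year:
    results.update(average_odometer_by_year(local_records))
```

After the exchange, all records for a given manufacturing year reside on a single device, irrespective of make and model, which is the precondition for the subsequent sub-tasks.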
Such a repartitioning of data can introduce meaningful delays. In particular, the communication of large amounts of data over a network can, even with the fastest network connections, take a substantial amount of time as compared to the amount of time spent processing such data. In addition, the partitioning of data locally, by each individual computing device, in order to determine which of the data currently stored on that computing device should be sent to a different computing device, can also introduce delays. In particular, such local partitioning of data can result in a substantial number of random input/output operations, which are not efficiently performed by traditional magnetic storage media.
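To illustrate why local partitioning can produce scattered input/output, the following sketch counts how often consecutive records, in the order in which they are stored, are bound for different destination devices. Each such change is used here as a crude stand-in for a jump to a different output location; it is not a measurement of actual storage behavior. The stored order, the record layout, the modulo placement rule, and the helper names (destination_for, destination_switches) are assumptions made for illustration.

```python
NUM_WORKERS = 4  # assumed number of destination devices


def destination_for(record, num_workers):
    """Assumed placement rule: manufacturing year modulo the device count."""
    _make, year_built = record
    return year_built % num_workers


def destination_switches(records, num_workers):
    """Count changes of destination between consecutive records."""
    switches = 0
    previous = None
    for record in records:
        destination = destination_for(record, num_workers)
        if previous is not None and destination != previous:
            switches += 1
        previous = destination
    return switches


# Hypothetical local data, stored grouped by make with the years interleaved,
# as it might be after processing by make and model. Records: (make, year_built).
stored_order = [
    ("Make%d" % make_id, 2000 + year_offset)
    for make_id in range(50)
    for year_offset in range(10)
]

# Baseline for comparison: the same records laid out grouped by destination.
grouped_order = sorted(
    stored_order, key=lambda record: destination_for(record, NUM_WORKERS)
)

scattered = destination_switches(stored_order, NUM_WORKERS)
grouped = destination_switches(grouped_order, NUM_WORKERS)
print("destination changes in stored order:", scattered)
print("destination changes when grouped:", grouped)
```

In this sketch, data laid out for the earlier task changes destination on nearly every record, whereas records grouped by destination change only once per device, which is the difference between a random and a sequential access pattern.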