When working with large datasets, data is often divided across multiple partitions (e.g. disks, machines, etc.). Generally, a function is applied to a field or other part of a data entry to determine the partition in which that entry should be stored. If the amount of data grows, the existing number of partitions may become insufficient to store the data, and the data may therefore be repartitioned across a larger number of partitions. Current techniques for repartitioning data have exhibited various limitations. For example, repartitioning has conventionally been a time-intensive process and has restricted access to the data while the repartitioning takes place (e.g. which can lead to downtime for a system).
There is thus a need for addressing these and/or other issues associated with the prior art.
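By way of illustration only, the partitioning approach described above may be sketched as follows. The sketch assumes one common choice of function, a stable hash of a key field taken modulo the number of partitions; the names `partition_for`, `assign`, and `moved_keys`, the `user-N` keys, and the partition counts are hypothetical and are not drawn from any particular system.

```python
import hashlib


def partition_for(key: str, num_partitions: int) -> int:
    """Map a key to a partition index (hypothetical helper).

    Assumption: a stable hash modulo the partition count. SHA-256 is used
    because Python's built-in hash() is randomized per process for strings.
    """
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_partitions


def assign(keys, num_partitions):
    """Group keys by the partition that partition_for selects for them."""
    partitions = {i: [] for i in range(num_partitions)}
    for key in keys:
        partitions[partition_for(key, num_partitions)].append(key)
    return partitions


def moved_keys(keys, old_count, new_count):
    """Return the keys whose partition changes when the count changes."""
    return [
        k for k in keys
        if partition_for(k, old_count) != partition_for(k, new_count)
    ]


# Illustrative data: 1000 made-up keys, growing from 4 to 5 partitions.
keys = [f"user-{i}" for i in range(1000)]
before = assign(keys, 4)
moved = moved_keys(keys, 4, 5)
print(f"{len(moved)} of {len(keys)} keys change partition when growing from 4 to 5")
```

Under this assumed modulo scheme, a key keeps its partition only when its hash leaves the same remainder under both the old and the new partition count, so a large share of entries would need to be relocated when a single partition is added. This is one way in which repartitioning can become time intensive.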