In computer systems, where data are frequently organized into collections of records, it has been estimated that over one quarter of system resources, including time and space, are consumed in sorting the records of those collections. Rearranging records within a computer system is a tedious process that requires numerous and often inefficient steps.
It is well known that sorting records is more efficient in time if the records all have the same amount of data. If the records have the same amount of data, the location of successive records can easily be computed, and out-of-sequence records can be rearranged without shifting intervening records. However, in many collections of records, the records include different amounts of data. To gain time efficiencies, some prior art solutions have equalized the amount of data in the records before sorting.
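The advantage of equal-sized records can be illustrated with a minimal sketch. It is an illustration only and is not taken from any particular prior art system. The 8-byte record size, the underscore padding, and the names record_offset and swap_records are assumptions made for the example.

```python
RECORD_SIZE = 8  # hypothetical fixed record length, in bytes


def record_offset(index, record_size=RECORD_SIZE):
    """Location of record `index`: a single multiplication, no scanning."""
    return index * record_size


def swap_records(buf, i, j, record_size=RECORD_SIZE):
    """Exchange records i and j in place; intervening records do not move."""
    a = record_offset(i, record_size)
    b = record_offset(j, record_size)
    tmp = bytes(buf[a:a + record_size])
    buf[a:a + record_size] = buf[b:b + record_size]
    buf[b:b + record_size] = tmp


# Three records, each padded with '_' to exactly RECORD_SIZE bytes.
buf = bytearray(b"cherry__apple___banana__")
swap_records(buf, 0, 1)
print(bytes(buf))
```

With records of differing lengths, neither step is this simple: locating a record requires an index or a scan, and exchanging two records of unequal length generally requires shifting the records stored between them.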
Equalization is inefficient if the amount of data in a few large records is substantially greater than the amount of data in many small records. Because every record is expanded to the size of the largest, a large quantity of space is wasted on all but the largest records of the collection. The records having less data could otherwise have been stored in a substantially smaller amount of space.
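The space cost of equalization can be made concrete with a short calculation. The collection below (one 1000-byte record and ninety-nine 10-byte records) is invented for illustration, and the function name padding_waste is hypothetical. The sketch assumes a non-empty collection in which every record is padded to the length of the largest.

```python
def padding_waste(lengths):
    """Return (padded_total, actual_total, wasted) in bytes when every
    record is padded to the length of the largest record."""
    padded_total = max(lengths) * len(lengths)
    actual_total = sum(lengths)
    return padded_total, actual_total, padded_total - actual_total


# Hypothetical collection: one large record and many small ones.
lengths = [1000] + [10] * 99
padded_total, actual_total, wasted = padding_waste(lengths)
print(padded_total, actual_total, wasted)
print(f"{wasted / padded_total:.1%} of the equalized storage is padding")
```

In this invented case the equalized collection occupies 100000 bytes to hold 1990 bytes of data, so roughly 98 percent of the space is padding.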
The costs associated with equalization can be offset by using low-cost bulk storage devices, such as disks, to store the records while they are sorted. Unfortunately, disks generally have longer access latencies than more expensive but faster semiconductor storage devices.
It is a goal of the invention to enable records having different amounts of data to be sorted without wasting a considerable amount of storage space. It is also a goal of the invention to sort the records in less time.