Sorting is generally acknowledged to be one of the most time-consuming procedures for which computers are used. The amount of work recently published in this area clearly indicates the importance of efficient sorting techniques. Knuth, in The Art of Computer Programming, Vol. 3, p. 3, Addison-Wesley Publishing Co., 1973, stated that computer manufacturers estimated that over twenty-five percent of all computer running time was then being spent on sorting. It is believed that the figure is at least as high, and probably higher, today. Many installations spend more than half of their computing time on sorting. Rearranging data within a computer is essentially a tedious task performed by numerous complicated and often inefficient steps.
In order to reduce the amount of time spent on sorting, numerous proposals and improvements have been set forth in the prior art. These prior art procedures, which sort a group of N data records in a main file into a specified sequence, as determined by an identifying key assigned to each record, generally fall into one of three categories based on the time to sort N records. However, a procedure that is most efficient for a certain number of records is often inefficient and unnecessarily time-consuming for a different number of records. In the first category, the sort time, or time to sort a file of N records, is approximately equal to O.sub.s N.sup.2, where O.sub.s is a constant. This type of sort, including insertion, selection, and bubble sorts, is generally the simplest type and is most efficient for relatively small N, for example, 10 or fewer. Although other sorting procedures are more efficient for N larger than 10, say between 10 and 100, the overall time to sort such a moderately small file is generally so insignificant as not to warrant the use of a more complex procedure within that range of N.
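The quadratic growth of the first category can be illustrated with a minimal insertion sort sketch that counts key comparisons. The function name `insertion_sort` and the comparison counter are assumptions made for illustration only; they are not taken from any prior art reference.

```python
def insertion_sort(keys):
    """Return (sorted copy of keys, number of key comparisons made)."""
    a = list(keys)
    comparisons = 0
    for i in range(1, len(a)):
        current = a[i]
        j = i - 1
        # Shift larger keys one place right until the slot for `current` is found.
        while j >= 0:
            comparisons += 1
            if a[j] <= current:      # "<=" keeps equal keys in input order (stable)
                break
            a[j + 1] = a[j]
            j -= 1
        a[j + 1] = current
    return a, comparisons


# Reverse-ordered input forces every possible comparison: N(N-1)/2 of them.
_, c10 = insertion_sort(range(10, 0, -1))   # 45 comparisons
_, c20 = insertion_sort(range(20, 0, -1))   # 190 comparisons
```

Doubling N from 10 to 20 roughly quadruples the comparison count on this input, which is the N.sup.2 behavior described above.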
In the second category of sorting procedures, the sort time is approximately equal to O.sub.m N log.sub.2 (N), where O.sub.m is a constant larger than O.sub.s. This category includes Quicksort and Heapsort and is generally the most efficient type of sort for N between 10 and 100,000 records. Quicksort is a standard internal computer sort and is generally regarded as the best all-purpose sort; however, its efficiency can decrease dramatically for non-random data records. A presorted file may be the worst case for Quicksort, in which case the sort time may be proportional to N.sup.2.
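The sensitivity of Quicksort to presorted input can be seen in the following sketch, which assumes one particular variant: the first key of each partition is taken as the pivot. That pivot rule, the function name `quicksort_comparisons`, and the explicit stack (used instead of recursion so that the deep worst case does not exhaust the call stack) are all illustrative assumptions; other pivot rules have different worst cases.

```python
import random


def quicksort_comparisons(keys):
    """Sort a copy of keys with a first-key-pivot Quicksort.

    Returns (sorted list, number of key comparisons).
    """
    a = list(keys)
    comparisons = 0
    stack = [(0, len(a) - 1)]            # partitions still to be sorted
    while stack:
        lo, hi = stack.pop()
        if lo >= hi:
            continue
        pivot = a[lo]
        i = lo                           # a[lo+1..i] holds keys smaller than pivot
        for j in range(lo + 1, hi + 1):
            comparisons += 1
            if a[j] < pivot:
                i += 1
                a[i], a[j] = a[j], a[i]
        a[lo], a[i] = a[i], a[lo]        # place the pivot in its final position
        stack.append((lo, i - 1))
        stack.append((i + 1, hi))
    return a, comparisons


presorted = list(range(100))
shuffled = presorted[:]
random.Random(0).shuffle(shuffled)       # fixed seed so the run is repeatable

_, worst = quicksort_comparisons(presorted)   # 100 * 99 / 2 = 4950 comparisons
_, typical = quicksort_comparisons(shuffled)  # far fewer on shuffled input
```

On the presorted file every partition is maximally unbalanced, so the comparison count is N(N-1)/2, in line with the N.sup.2 worst case noted above.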
The third category of sorting procedures has a sort time approximately equal to O.sub.l N, where O.sub.l is a constant larger than O.sub.m. This category generally includes all types of bin sorting, such as radix sorting, and is generally the most efficient sort for over 100,000 records. Bin sorting typically starts by sorting the data records according to either the least significant digit (LSD) or the most significant digit (MSD) of the key, placing each key into a bin corresponding to the value of that digit or character. In an LSD type bin sort, after all the keys have been sorted on the least significant digit, they are then sorted on the next higher digit, taking care not to disturb the then-existing order among the keys. This process is repeated until the MSD is sorted, at which point the keys will be completely sorted. The LSD method requires that each key be processed a number of times equal to the number of digits in the key, since it starts at the last (i.e., "least significant") digit and cannot be finished until the most significant digit has been processed. This is an economical method when the keys are short relative to the number of records in the file, because under those conditions all the digits in most of the keys will be used at least once regardless of the sorting method employed.
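The LSD procedure just described can be sketched as follows for non-negative integer keys. This is a minimal illustration, not any particular prior art implementation; the function name `lsd_bin_sort` and the parameters `num_digits` and `base` are assumptions introduced here.

```python
def lsd_bin_sort(keys, num_digits, base=10):
    """Sort non-negative integers by making one bin pass per digit, LSD first."""
    keys = list(keys)
    divisor = 1
    for _ in range(num_digits):
        bins = [[] for _ in range(base)]          # one bin per possible digit value
        for key in keys:
            digit = (key // divisor) % base
            bins[digit].append(key)               # appending preserves the existing order
        keys = [key for b in bins for key in b]   # gather bins from lowest to highest
        divisor *= base
    return keys


ordered = lsd_bin_sort([170, 45, 75, 90, 802, 24, 2, 66], num_digits=3)
# ordered == [2, 24, 45, 66, 75, 90, 170, 802]
```

Every key passes through a bin once per digit, so the work is proportional to N times the key length, and correctness depends on each pass leaving the order from earlier passes undisturbed.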
In an MSD type bin sort, however, after all the keys have been sorted on the first MSD, the procedure distinguishes between the keys in the separate bins and sorts the keys in the first bin according to the second MSD. This process continues until all keys from the first bin have been completely sorted, at which point sorting will continue with the second MSD on the keys from the second bin. After all the keys which were in the second bin have been completely sorted, the sorting will continue with the second MSD on the keys from the third bin, and so forth, until all the keys are completely sorted. By treating separately the groups of keys which have already been distinguished on their most significant digits, the MSD method has an advantage over the LSD method when the least significant part of the keys is not needed to order the records, as is frequently the case. Thus, the MSD method has an expected time of O.sub.m N log.sub.2 (N), and a worst-case time of O.sub.l N.
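The MSD procedure can be sketched recursively for character-string keys. The function name `msd_bin_sort` is an assumption, as is the choice to keep only the occupied bins in a dictionary and to visit them in character order with Python's built-in `sorted`; a fixed array with one bin per possible character would serve equally well.

```python
def msd_bin_sort(keys, position=0):
    """Sort strings by binning on the character at `position`, MSD first."""
    if len(keys) <= 1:
        return list(keys)          # a lone key needs no further characters examined
    finished = []                  # keys with no character left; all are equal here
    bins = {}
    for key in keys:
        if position >= len(key):
            finished.append(key)
        else:
            bins.setdefault(key[position], []).append(key)
    result = finished              # shorter keys precede longer keys sharing a prefix
    for ch in sorted(bins):        # visit occupied bins in character order
        result.extend(msd_bin_sort(bins[ch], position + 1))
    return result


ordered = msd_bin_sort(["bca", "a", "abc", "b", "ab", "bca", "ca"])
# ordered == ['a', 'ab', 'abc', 'b', 'bca', 'bca', 'ca']
```

Because recursion stops as soon as a bin holds a single key, the trailing characters of that key are never examined, which is the advantage over the LSD method described above.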
All known sorting procedures perform ineffective and time-consuming steps some of the time. Prior sorting procedures perform a process of ordering keys even when it is not necessary to do so, resulting in excessive time spent on sorting. Prior art methods also tend to become very inefficient for those parts of a sort in which the number of distinct characters actually present in a given iteration is much smaller than the maximum number of characters possible (which corresponds to the number of available bins). In both cases, processing steps, and therefore time, are unnecessarily spent. In addition, many prior sorting procedures can only sort in ascending or descending order and are not capable of dealing efficiently with the internal representation of numbers, such as negative or exponential numbers, or with other unusual lexicographic sequences. For example, they may not efficiently perform alphabetic sorts in which the lower and upper case appearances of the same letter are to be grouped together, or handle numerical sorts in which the numbers are to be ordered by absolute value.
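The two orderings given as examples can be made concrete with a short sketch that uses Python's built-in `sorted` and its `key` argument. It shows only the desired results, not how any prior art procedure or the present invention obtains them; the sample data and variable names are assumptions.

```python
words = ["banana", "Apple", "apple", "Banana", "cherry"]
# Group upper and lower case forms of the same letter by comparing lowercased keys.
by_letter = sorted(words, key=str.lower)
# by_letter == ['Apple', 'apple', 'banana', 'Banana', 'cherry']

numbers = [-7, 3, -1, 4, -3]
# Order by magnitude, ignoring sign.
by_magnitude = sorted(numbers, key=abs)
# by_magnitude == [-1, 3, -3, 4, -7]
```

Python's `sorted` is stable, so records whose transformed keys are equal (such as 3 and -3) keep their original relative order.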
Because of the importance of sort procedures, the art has always sought more efficient and reliably faster sorting methods. Moreover, there is a need for a new, versatile sorting procedure which is more efficient and faster than prior art sorting procedures over a much larger range of N. It is an object of the present invention to provide a sorting method in which significantly increased time is not required to sort each item due to a pre-existing order among the items (i.e., a departure from pure randomness), as is the case with Quicksort and other comparable sorts, or due to the nature of the information associated with each key, or due to changes in the total number of items N. It is a further object of the invention to provide a sort method in which the time needed to sort each record decreases significantly when the items are originally partly pre-sorted. Stability is also an object of the present invention, i.e., to retain the original order of any two adjacent records, during the sort, when comparison of their corresponding keys shows them to be equal. A further object of the present invention is to provide a sorting method in which only a moderate amount of working space or memory is required and the sorting procedure consists of a minimum number of steps. A still further object is to provide a sorting method having the ability to deal efficiently with the internal representation of numbers or other unusual lexicographic sequences, and with variable length records and keys.