This invention relates broadly to an electronic data processing system and, more specifically, to a method of reducing the time and computer resources required for sorting large quantities of data in such a system.
In sorting statistical, business, scientific, and other data, a general purpose electronic digital computer is often required for relatively long periods of time, during which the sorting process not uncommonly consumes a substantial proportion of the available computer resources. This, taken together with the increased use of computers and the often voluminous and complex nature of the data being processed, has made the development of a method for efficiently and rapidly sorting data increasingly important. Sorting, in general, comprises two major steps, namely string generation and merging. String generation is the process of accepting unordered data and forming it into groups of ordered or sequenced data, commonly designated strings. Merging is the process of combining the generated strings into larger strings until one string or one set of strings, as desired, remains. Thus, after completion of the string generation and merging steps, the input body of data is in the form of a sequenced or ordered body of data. Generally, the string generation process operates only once on the input data, while the merging process operates on the data as often as necessary to arrive at a single string or a desired set of strings.
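By way of illustration only, the two steps described above may be sketched as follows. The sketch is not taken from any of the techniques cited herein; the function names (`generate_strings`, `merge_pass`, `sort_records`) and the parameters `capacity` (the number of records assumed to fit in working storage) and `merge_order` (the number of strings assumed to be combined at a time) are hypothetical and chosen solely for the example. String generation is performed here by simply sorting each buffer-load of records, which is one of several possible approaches.

```python
import heapq
from typing import Iterable, List, Tuple


def generate_strings(records: Iterable[int], capacity: int) -> List[List[int]]:
    """String generation: one pass over the unordered input.

    Records are collected into a buffer of at most `capacity` items
    (an assumed working-storage limit); each full buffer is sorted
    and emitted as one ordered string.
    """
    strings: List[List[int]] = []
    buffer: List[int] = []
    for record in records:
        buffer.append(record)
        if len(buffer) == capacity:
            strings.append(sorted(buffer))
            buffer = []
    if buffer:  # final, possibly short, string
        strings.append(sorted(buffer))
    return strings


def merge_pass(strings: List[List[int]], merge_order: int) -> List[List[int]]:
    """One merging pass: combine every `merge_order` strings into a larger string."""
    return [
        list(heapq.merge(*strings[i:i + merge_order]))
        for i in range(0, len(strings), merge_order)
    ]


def sort_records(
    records: Iterable[int], capacity: int = 4, merge_order: int = 2
) -> Tuple[List[int], int]:
    """Generate strings once, then merge as often as necessary.

    Returns the single ordered string and the number of merging passes used.
    """
    if capacity < 1 or merge_order < 2:
        raise ValueError("capacity must be >= 1 and merge_order must be >= 2")
    strings = generate_strings(records, capacity)
    passes = 0
    while len(strings) > 1:
        strings = merge_pass(strings, merge_order)
        passes += 1
    return (strings[0] if strings else []), passes


if __name__ == "__main__":
    ordered, passes = sort_records([9, 3, 7, 1, 8, 2, 6, 4, 5, 0])
    print(ordered, passes)
```

In this sketch the input is read once by `generate_strings`, whereas `merge_pass` is repeated until a single string remains; raising `merge_order` reduces the number of passes at the cost of handling more strings simultaneously.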
In the past, many different sorting techniques were developed, each of which was primarily designed to minimize the execution time of the merging process. Examples of a number of such techniques are discussed in Computer Sorting, Ivan Flores, Prentice-Hall, 1969, and "Some Improvements in a Technology of String Merging and Internal Sorting", Martin Goetz, Conference Proceedings of the American Federation of Information Processing Societies, Volume 25, pages 599-607, 1964. Because of the availability of random access devices in the widely varying types of data processing systems in use today, there is now a vehicle that can be utilized to reduce the data processing resources required in the sorting of large quantities of data. Sorting constitutes a major portion of the workload of present-day computer systems. Depending on the system, it may be desired to minimize the time required for throughput, i.e., the total sorting time. Other systems, such as time-sharing systems, may require that the use of central processing unit resources be minimized, or that the number of calls to the operating system be minimized. Yet other data processing systems may need to minimize input/output time and equipment, thereby economizing on such peripheral units as magnetic discs and drums. A flexible sorting method is therefore needed that not only materially reduces the computer resources required for sorting but also selectively optimizes the allocation of such computer resources as peripheral equipment, input/output calls, and input/output times.