The present invention relates to a method and a system for sorting information having a number of counts to be handled by a computer.
In recent years, as the quantity of information handled by computers has increased, techniques that improve the efficiency of information handling, and thereby overcome the problem of long processing times, have become important. In one frequently used method, the most efficient processing sequence is recorded as counts, the information is sorted in accordance with the counts, and the actual information processing is then carried out. With this method, repeated reference to the same information is avoided as far as possible, so that the processing time of the computer is significantly reduced.
For example, in a hierarchical information processing method which reduces the processing time of the computer, hierarchical information represented by counts is added to the input information as a pre-process, and the input information is sorted in accordance with the counts. Since this places the information belonging to the same hierarchical level close together, it is not necessary to search all of the information in order to retrieve and process the information in one hierarchical level; rather, the sorted information may simply be referred to and processed sequentially. Accordingly, retrieval time is saved.
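The hierarchical pre-process described above can be illustrated with a minimal Python sketch. The record layout `(level, payload)` and the function name `process_by_level` are assumptions made for illustration only and do not appear in the prior art description.

```python
from itertools import groupby
from operator import itemgetter


def process_by_level(records):
    """Sort (level, payload) records by their level count, then walk the
    sorted list once so that each hierarchical level is handled as one
    contiguous run. The record layout is a hypothetical example."""
    ordered = sorted(records, key=itemgetter(0))  # sort by the count
    grouped = []
    # groupby only merges adjacent items, which is exactly what the
    # preceding sort guarantees for records of the same level.
    for level, run in groupby(ordered, key=itemgetter(0)):
        grouped.append((level, [payload for _, payload in run]))
    return grouped
```

After sorting, no search over the whole input is needed to collect one level; a single sequential pass suffices.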
Another example is the determination of which patterns are close to one another in computer geometry. In a process to determine the closeness of patterns, the coordinate information of a pattern is read from the input information, and a pattern having close coordinates is retrieved from the input information. If the coordinates of the pattern information are treated as counts and the input information is sorted in accordance with the counts, a close pattern can be found by merely reading out the input information sequentially. Thus, the efficiency of the processing is improved and the processing time of the computer is reduced.
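As a rough illustration of the closeness search described above, the following Python sketch treats a single coordinate as the count. The one-dimensional simplification, the `(x, name)` record layout, and the names `close_pairs` and `max_gap` are assumptions introduced here, not part of the described method.

```python
def close_pairs(patterns, max_gap):
    """Return pairs of pattern names whose x coordinates differ by at
    most max_gap. patterns is a list of (x, name) tuples (hypothetical
    layout); only one coordinate is used to keep the sketch short."""
    ordered = sorted(patterns)  # sort by the coordinate used as the count
    pairs = []
    for i in range(len(ordered)):
        x_i, name_i = ordered[i]
        # Read the sorted list sequentially and stop as soon as the
        # gap is exceeded; later entries can only be farther away.
        for j in range(i + 1, len(ordered)):
            x_j, name_j = ordered[j]
            if x_j - x_i > max_gap:
                break
            pairs.append((name_i, name_j))
    return pairs
```

Because the input is sorted, each pattern is compared only with its immediate successors rather than with the entire input.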
As described above, the sorting of information having counts is an important technique for improving the efficiency of information processing, and various techniques for improving the efficiency of the sorting itself have therefore been studied.
Among the most efficient methods for sorting information having counts in a conventional computer are the Quick-Sort method, as described by R. L. Wainwright in "A Class of Sorting Algorithms Based on Quicksort", Communications of the ACM, Vol. 28, No. 4, pp. 396-402, April 1985, and the Radix-Sort method, mentioned by A. V. Aho et al. in "Data Structures and Algorithms", Addison-Wesley Publishing Company, Reading, Massachusetts, 1983.
In the Quick-Sort method, the count (sort key information) that serves as the reference for rearranging each information unit of the information to be sorted is checked to determine whether it is larger or smaller than an appropriately set intermediate value, and this bi-splitting of the information is repeated to obtain the sorted result. The bi-splitting process is repeated for each of the bi-split portions until individual information units are obtained in the n-th step. Thus, assuming that N information units are sorted by bi-splitting them n times until each split area includes only one information unit, n is approximately given by $$n = \log_2 N$$
Since the split process is applied to all N records of the information to be sorted at each step, the number of steps, or the computer processing time T, required to sort all the records by the Quick-Sort method is approximately $$T = C_q N \log_2 N$$
where $C_q$ is a proportionality constant. Accordingly, in this prior art technique for sorting information having counts, the cost of the computer increases in proportion to $N \log_2 N$ as the number of records N increases.
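The bi-splitting procedure of the Quick-Sort method can be sketched in Python as follows. This is a simplified, non-in-place illustration: the choice of the middle record's count as the intermediate value, the separate handling of equal counts, and the name `quick_sort` are assumptions, not details taken from the cited reference.

```python
def quick_sort(records, key=lambda r: r[0]):
    """Sort records by repeatedly splitting them around an intermediate
    value. Illustrative sketch only; it builds new lists instead of
    rearranging the records in place."""
    if len(records) <= 1:
        return list(records)  # a split area with one unit is finished
    # Assumed choice of intermediate value: the count of the middle record.
    pivot = key(records[len(records) // 2])
    smaller = [r for r in records if key(r) < pivot]
    equal = [r for r in records if key(r) == pivot]
    larger = [r for r in records if key(r) > pivot]
    # Each half is bi-split again until single units remain.
    return quick_sort(smaller, key) + equal + quick_sort(larger, key)
```

When the intermediate value splits the records roughly evenly, about $\log_2 N$ levels of splitting are needed and each level touches all N records, which gives the $N \log_2 N$ behaviour stated above.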
In the Radix-Sort method, an address memory is provided which is addressed by the sort key used as the reference for rearranging the information to be sorted. The information to be sorted is read sequentially, one information unit at a time, and each unit is written into list information 7 controlled by the address memory entry corresponding to its sort key. Finally, the listed information is read from the beginning and the sorting is completed. In order to control the list information, it is necessary to additionally write pointer information which points to the beginning of the next information unit in the list.
Assuming that N information units are to be sorted and that the maximum count is K, the number of steps, or the computer processing time T, required to sort all of the information is approximately $$T = C_r (N + K)$$ where $C_r$ is a proportionality constant.
Thus, the computer processing time T is improved over that of the Quick-Sort method. However, the Radix-Sort method requires pointer control, and because the continuity of the data is not assured, the waiting time for processing may become long depending on the capacity of the main memory of the computer. Further, pointer control is inherently a complex process.
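The list-based Radix-Sort described above can be sketched in Python as follows. The arrays `head`, `tail`, and `next_ptr`, the use of -1 as an end marker, and the `(count, payload)` record layout are assumptions made for this illustration; in particular, the `tail` array is a convenience added here and is not mentioned in the description.

```python
def radix_list_sort(records, max_count):
    """Sort (count, payload) records with 0 <= count <= max_count using
    an address memory of list heads and per-record next pointers.
    Illustrative sketch; names and layout are hypothetical."""
    head = [-1] * (max_count + 1)   # address memory, one entry per count
    tail = [-1] * (max_count + 1)   # assumed helper: last unit of each list
    next_ptr = [-1] * len(records)  # pointer to the next unit in a list

    # About N steps: read each unit once and link it into its list.
    for i, (count, _) in enumerate(records):
        if head[count] == -1:
            head[count] = i
        else:
            next_ptr[tail[count]] = i
        tail[count] = i

    # About K steps over the address memory plus N pointer follows.
    out = []
    for count in range(max_count + 1):
        i = head[count]
        while i != -1:
            out.append(records[i])
            i = next_ptr[i]
    return out
```

The final read-out follows `next_ptr` from one record to another at arbitrary positions, which illustrates why the continuity of the data is not assured under pointer control.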
The prior art methods for sorting count information in a computer thus rely on either a complex, time-consuming method or a method that requires pointer control and many steps. Consequently, where the quantity of information is large and the count information to be sorted is complex, an increase in computer cost has been unavoidable.