Data sorting is one of the basic operations performed by a computer. In sort processing, an input data string is rearranged into ascending order (from small to large) or descending order (from large to small) according to the information in a target field, and one series of sorted data strings is output. In merge processing, two or more series of data (sort target data or record) strings, each already sorted in ascending order (or descending order), are received, and one series of data strings sorted in ascending order or descending order is output.
The number of input series may be 2, 3, 4, and so on, but the case of 2 is particularly efficient, since the position of one data item in the output can be determined by a single comparison.
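The two-input case can be illustrated with a minimal sketch. The function name merge_two and the comparison counter are assumptions introduced here for illustration only; ascending order and list inputs are also assumed.

```python
def merge_two(a, b):
    """Merge two ascending lists into one ascending list.

    Returns the merged list and the number of key comparisons made.
    Hypothetical helper, not taken from the source text.
    """
    out = []
    i = j = 0
    comparisons = 0
    # Each comparison fixes the output position of exactly one item.
    while i < len(a) and j < len(b):
        comparisons += 1
        if a[i] <= b[j]:
            out.append(a[i])
            i += 1
        else:
            out.append(b[j])
            j += 1
    # One input is exhausted; the rest of the other is already in order.
    out.extend(a[i:])
    out.extend(b[j:])
    return out, comparisons


merged, count = merge_two([1, 4, 7], [2, 3, 9])
print(merged, count)
```

In this sketch the number of comparisons never exceeds the number of output items minus one, which is consistent with a merge time proportional to the number of data.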
Since the time required for merge processing is proportional to the number of data (n), the merge operation is advantageous compared with the sort operation, for which the required time is proportional to n^2 or n log_2 n. However, the required time still increases as the data count increases, so a high-speed algorithm (calculation method) is demanded. The development of a high performance algorithm may expand the application range of merge/sort. However, because the merge operation itself is simple, consisting only of the comparison of two key values, there is little room to find factors that decrease the time. The present invention enables parallel execution of merge processing that takes a pair of sorted partial data strings as input, which has been considered difficult in conventional technology.
Reducing processing time by merging large volumes of data simultaneously in parallel on a parallel processor has been proposed. However, in many cases a special topology is required for the connection network between processors, or processors with special functions and structure are required, so such methods are difficult to implement on a general purpose computer system, such as a tightly-coupled multi-processor sharing a main storage. The present invention implements a merge/sort method which has high parallelism even when general purpose processors with no special configuration or connection are used. Without such a method, processing efficiency becomes very low, as the following example shows.
As an example of a parallel merge/sort method that can be applied to a general purpose parallel processor system, the parallel two-branching merge/sort calculation method will be described. FIG. 13 is a flow chart thereof, and FIG. 14 is a diagram depicting the processing when the number of processors is eight.
With reference to FIG. 13, the unsorted data string to be processed (input data) is divided into p sets of data strings (where p = 2^q, and q is an integer), p corresponding to the number of processors (S100). Using the p processors, the divided p sets of unsorted data strings are sorted independently and in parallel by a quick sort method, for example (S101). By performing q steps of merge processing on the p sets of sorted partial data strings (S102), one set of sorted data strings is finally acquired as a whole (S103).
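The flow of S100 to S103 can be sketched as follows. This is an illustration under stated assumptions, not the implementation of FIG. 13: the function name is hypothetical, Python's built-in sorted stands in for the quick sort, heapq.merge stands in for the two-input merge, and a thread pool stands in for the p processors (in CPython, threads show the structure of the method but do not give true parallel speed-up for this kind of work).

```python
from concurrent.futures import ThreadPoolExecutor
from heapq import merge


def parallel_two_way_merge_sort(data, q):
    """Sort data using p = 2**q workers: parallel sort, then q merge steps."""
    p = 2 ** q
    # S100: divide the input into p partial data strings (some may be empty).
    size = -(-len(data) // p)  # ceiling division
    chunks = [data[k * size:(k + 1) * size] for k in range(p)]

    with ThreadPoolExecutor(max_workers=p) as pool:
        # S101: sort the p partial data strings independently.
        runs = list(pool.map(sorted, chunks))
        # S102: q merge steps; each step halves the number of sorted runs,
        # so only len(runs) // 2 workers have work in a given step.
        for _ in range(q):
            pairs = [(runs[k], runs[k + 1]) for k in range(0, len(runs), 2)]
            runs = list(pool.map(lambda pr: list(merge(pr[0], pr[1])), pairs))

    # S103: a single sorted data string remains.
    return runs[0]


print(parallel_two_way_merge_sort([5, 3, 8, 1, 9, 2, 7, 6, 4, 0], 3))
```

With q = 3 the sketch uses eight workers for the sort and then four, two, and one worker for the three merge steps, matching the arrangement of FIG. 14.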
With reference to FIG. 14, processing when the number of processors is eight will be described. In FIG. 14, a circle indicates a processor, and a square indicates a data string or an area where a data string (D) is stored. The symbol in the circle indicates the content of processing to be executed by the processor, where S indicates sort processing, M indicates merge processing, and V indicates transfer processing to another storage area, which is executed when necessary.
The unsorted data strings in the input area are divided into eight partial data strings. The eight processors P1-P8 execute sort processing of these partial data strings simultaneously in parallel. The processing results are stored in the areas D11-D18.
Then the merge processing in the first step is executed. The data strings D11 and D12 are merged by the processor P1 and stored in the area D21, the data strings D13 and D14 are merged by the processor P3 and stored in the area D22, the data strings D15 and D16 are merged by the processor P5 and stored in the area D23, and the data strings D17 and D18 are merged by the processor P7 and stored in the area D24. In the first step, the processors P2, P4, P6 and P8 are not used. (Here the processors are assigned for the sake of convenience. The same applies to the description below.)
Then the merge/sort in the second step is executed. The data strings D21 and D22 are merged by the processor P1, and stored in the area D31, and the data strings D23 and D24 are merged by the processor P5 and stored in the area D32. In the second step, the processors P2, P3, P4, P6, P7 and P8 are not used.
Then the merge/sort in the third step is executed. The data strings D31 and D32 are merged by the processor P1 and stored in the area D4. The merge/sort is now completed. In the third step, the processors P2, P3, P4, P5, P6, P7 and P8 are not used. The acquired result, D4, is transferred to the final output area by the eight processors if necessary.
In the above-mentioned conventional method, the number of unused processors increases as the merge processing steps advance, so the processing capability of the processors is wasted. In the case of the above example, only 50% of the processors are used in the merge processing in the first step, and the processor utilization ratio drops to 25% in the second step and to 12.5% in the third step. This is because the number of sorted partial data strings is halved each time the merge processing advances by one step.
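The utilization ratios quoted above follow directly from the halving of active processors at each step. A minimal sketch, assuming p is a power of two and using a hypothetical function name:

```python
def merge_step_utilization(p):
    """Return the fraction of p processors busy in each merge step."""
    ratios = []
    active = p // 2  # the first merge step pairs up p sorted runs
    while active >= 1:
        ratios.append(active / p)
        active //= 2  # each step halves the number of runs to merge
    return ratios


print(merge_step_utilization(8))  # [0.5, 0.25, 0.125]
```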
In addition, the number of data to be merged by one processor doubles at each step, so the processing time of each step increases. In a system which gives priority to high-speed processing, this increase in processing time, caused by the increase in the data volume handled by one processor, is a more serious problem than the drop in the utilization ratio of the processors.
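The growth of the per-processor workload can be made concrete with a short calculation. The sketch below assumes that n is divisible by p, that p is a power of two, and that the time of a merge is proportional to the number of items merged; the function name and the figures n = 8000, p = 8 are illustrative assumptions only.

```python
def items_per_active_processor(n, p):
    """Items merged by each active processor in each merge step."""
    sizes = []
    run = n // p  # length of each sorted partial data string after S101
    while run < n:
        run *= 2  # merging two runs doubles the length handled
        sizes.append(run)
    return sizes


sizes = items_per_active_processor(8000, 8)
print(sizes)       # [2000, 4000, 8000]
print(sum(sizes))  # 14000
```

Under these assumptions the steps together handle n(2p - 2)/p items in sequence, which approaches 2n as p grows. The merge phase therefore does not become faster when more processors are added, and the final step alone always processes all n items on a single processor.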
In view of the above problems, it is an object of the present invention to provide a parallel merge/sort processing device, method, and program which can increase the utilization efficiency of the processors in merge/sort processing using parallel processors, and can decrease processing time.