1. Field of the Invention
The present invention relates to a merge sort apparatus and method and a program to execute that method on a computer.
2. Description of Related Art
In recent years, with the advance of the information-based society, large-scale databases have come to be used in many places. A desired record is usually retrieved from such a large-scale database by using, as index keys, items within the records that are associated with the addresses at which the records are stored. Character strings in full-text searches can also be treated as index keys.
Because the index keys can be expressed as bit strings, searching a database reduces to searching for bit strings in the database.
Furthermore, the processing of a database, as described in patent document 1 and patent document 2 cited below, includes merge sorting of the records in the database. This merge sort also reduces to a merge sort of bit strings.
A basic merge sort method consists of dividing the data into pairs, ordering each pair, and then combining the ordered pairs. In other words, the process is divided into an initial stage, in which the data to be sorted is repeatedly divided and sorted to obtain several groups of sorted data, and a later stage, in which the sorted groups are repeatedly merged until all of the data is completely sorted.
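The two stages described above can be sketched as follows. This is a minimal illustration only, not the method of any of the cited patent documents; the function names `merge` and `merge_sort` are assumptions introduced for the example.

```python
def merge(left, right):
    """Merge two already-sorted lists into one sorted list."""
    out = []
    i = j = 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            out.append(left[i])
            i += 1
        else:
            out.append(right[j])
            j += 1
    # One of the inputs is exhausted; append what remains of the other.
    out.extend(left[i:])
    out.extend(right[j:])
    return out


def merge_sort(data):
    """Sort data in two stages: build small sorted groups, then merge them."""
    # Initial stage: divide the data into groups of at most two items
    # and order each group.
    runs = [sorted(data[k:k + 2]) for k in range(0, len(data), 2)]
    # Later stage: repeatedly merge neighbouring sorted groups
    # until a single sorted group remains.
    while len(runs) > 1:
        runs = [
            merge(runs[k], runs[k + 1]) if k + 1 < len(runs) else runs[k]
            for k in range(0, len(runs), 2)
        ]
    return runs[0] if runs else []


result = merge_sort([13, 8, 53, 24, 9, 22])
```

Here `result` holds the six input values in ascending order.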
Patent document 2 discloses the later-stage processing of the merge sort shown in FIG. 1A.
As shown in the example in FIG. 1A, sorted data is stored in block 1 to block N. The minimum value in block 1 is 13, and the minimum value in block 2 is 8, the next larger data value in that block being 22. In the same way, the example shows that the minimum value in block 3 is 53, the minimum value in block 4 is 24, and the minimum value in block N is 9.
In the later-stage processing of a merge sort, assuming the existence of the above-described block 1 to block N, a minimum value array is first generated from the minimum value in each block. In the example shown in FIG. 1A, the minimum value array <13, 8, 53, 24, . . . , 9> is generated from the minimum values in block 1, block 2, block 3, block 4, . . . , block N. Next, that minimum value array is sorted to generate the sorted minimum value array <8, 9, 13, 15, . . . , 100>, and the minimum value “8” in this array is output. Then the next data item “22” is extracted from block 2, which originally held the minimum value “8”, the insertion position of the data item 22 in the sorted minimum value array is obtained, the data item 22 is inserted, and the next smallest value is output. By repeating this process, the data stored in block 1 to block N is merged and the sort of the data is completed.
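The later-stage processing described above can be sketched as follows. This is an illustration under stated assumptions, not the implementation of patent document 2: the function name `merge_blocks` is hypothetical, and the block contents other than the minimum values 13, 8, 53, 24, 9 and the data item 22 are invented for the example. The standard library function `bisect.insort` stands in for obtaining the insertion position and inserting.

```python
from bisect import insort


def merge_blocks(blocks):
    """Merge sorted blocks by maintaining a sorted array of block minima."""
    # Each entry is (value, block index, position within the block).
    # Sorting these tuples yields the sorted minimum value array.
    minima = sorted((blk[0], b, 0) for b, blk in enumerate(blocks) if blk)
    out = []
    while minima:
        # Output the smallest value currently in the minimum value array.
        value, b, pos = minima.pop(0)
        out.append(value)
        nxt = pos + 1
        if nxt < len(blocks[b]):
            # Extract the next data item from the block that held the value
            # just output, and insert it at its position in the sorted array.
            insort(minima, (blocks[b][nxt], b, nxt))
    return out


# Hypothetical blocks; only the leading values follow FIG. 1A.
blocks = [[13, 40], [8, 22], [53, 60], [24, 31], [9, 15]]
merged = merge_blocks(blocks)
```

With these blocks, the value 8 is output first, after which 22 is taken from the second block and inserted between 13 and 24 in the minimum value array.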
As described above, in the later-stage processing of a merge sort, it is necessary to obtain the insertion position in the sorted minimum value array for the next data item. To obtain the insertion position, the next data item is used as a key and compared with the data included in the minimum value array. This comparison processing, together with the insertion processing that accompanies it, amounts to search processing of the sorted minimum value array with the next data item as the search key; in other words, the processing is reduced to bit string search processing.
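The reduction to a bit string search can be illustrated as follows. The sketch assumes that keys are encoded as bit strings of one fixed width (8 bits here, an arbitrary choice), so that comparing the strings character by character agrees with comparing the underlying values. The helper names `to_bit_string` and `insertion_position` are hypothetical, and the array holds only an illustrative subset of the minimum values from FIG. 1A.

```python
from bisect import bisect_left


def to_bit_string(value, width=8):
    """Encode a non-negative integer as a fixed-width bit string."""
    return format(value, "0{}b".format(width))


def insertion_position(sorted_keys, key):
    """Return the index at which key must be inserted to keep the order."""
    # Binary search over the sorted bit strings with key as the search key.
    return bisect_left(sorted_keys, key)


# Sorted minimum value array after "8" has been output (illustrative subset).
sorted_minima = [to_bit_string(v) for v in (9, 13, 24, 53)]
position = insertion_position(sorted_minima, to_bit_string(22))
```

In this example `position` is 2, that is, the data item 22 belongs between 13 and 24.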
Many different bit string search methods are known. To perform the above-noted searching for bit strings at high speed, the conventional art has made various refinements to the data structure in which the bit strings are stored. One of these is a tree structure known as a Patricia tree.
FIG. 1B shows an example of a Patricia tree used for search processing in the above-noted conventional art. A node of a Patricia tree includes an index key, a test bit position for a search key, and right and left link pointers. Although it is not explicitly shown, a node of course also includes information for accessing the record corresponding to the index key.
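The node layout described above can be sketched as a small data structure. The class name `PatriciaNode` and the field names are assumptions introduced for illustration, and the `record` field is a hypothetical stand-in for the information used to access the record.

```python
class PatriciaNode:
    """One node of a Patricia tree as described in the text."""

    def __init__(self, index_key, test_bit, record=None):
        self.index_key = index_key  # index key as a bit string, e.g. "100010"
        self.test_bit = test_bit    # test bit position for a search key
        self.left = None            # left link pointer (None if blank)
        self.right = None           # right link pointer (None if blank)
        self.record = record        # hypothetical handle to the record


# A single node whose right pointer points back to the node itself
# and whose left pointer is blank.
node = PatriciaNode("000111", 3)
node.right = node
```

Pointers are held as object references here, so a pointer that returns to the same node or to a node higher in the tree is simply a reference to an existing object.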
In the example described in FIG. 1B, the node 1750a that holds the index key “100010” is a root node, the test bit position of which is 0. The node 1750b is connected to the left link 1740a of the node 1750a, and the node 1750f is connected to the right link 1741a. 
The index key held by the node 1750b is “010011”, and the test bit position 1730b is 1. The node 1750c is connected to the left link 1740b of the node 1750b, and the node 1750d is connected to the right link 1741b of the node 1750b. The index key held by the node 1750c is “000111”, and the test bit position is 3. The index key held by the node 1750d is “011010”, and the test bit position is 2.
The parts connected to the node 1750c by solid lines represent the right and left link pointers of the node 1750c. The left pointer 1740c, from which no dotted line extends, indicates that that field is blank. The dotted line extending from the right pointer 1741c indicates the address held by the pointer, and in this case shows that the right pointer points to the node 1750c itself.
The right pointer 1741d of the node 1750d points to the node 1750d itself, and the node 1750e is connected to the left link 1740d. The index key held by 1750e is “010010”, and the test bit position is 5. The left pointer 1740e of the node 1750e points to the node 1750b, and the right pointer 1741e of the node 1750e points to the node 1750e. 
The index key held by the node 1750f is “101011”, and the test bit position 1730f is 2. The node 1750g is connected to the left link 1740f of the node 1750f and the node 1750h is connected to the right link 1741f of the node 1750f. 
The index key held by the node 1750g is “100011”, and the test bit position 1730g is 5. The left pointer 1740g of the node 1750g points to the node 1750a, and the right pointer 1741g of the node 1750g points to the node 1750g. 
The index key held by the node 1750h is “101100”, and the test bit position 1730h is 3. The left pointer 1740h of the node 1750h points to the node 1750f, and the right pointer 1741h of the node 1750h points to the node 1750h. 
In the example of FIG. 1B, the configuration is such that, as the tree is traversed downward from the root node 1750a, the test bit position of successive nodes increases.
When a search is performed with some search key, the bit values of the search key at the test bit positions held in the nodes are tested successively, starting from the root node. At each node a judgment is made as to whether the bit value at the test bit position is 1 or 0; the right link is followed if the bit value is 1, and the left link is followed if the bit value is 0. If the test bit position of the link target node is not larger than that of the link origin node, that is, if the link does not lead downward but returns upward (such returning links, shown by the dotted lines in FIG. 1B, are called back links), the index key of the link target is compared with the search key. If the result of the comparison is that the values are equal, the search succeeds; if they are not equal, the search fails.
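The search procedure described above can be sketched as follows. This is a minimal illustration under stated assumptions: the names `Node` and `search` are hypothetical, bit positions are counted from 0 at the left end of the key, and only the right-hand part of FIG. 1B (nodes 1750a, 1750f, 1750g and 1750h) is built, with the left link of the root left blank for brevity.

```python
class Node:
    def __init__(self, index_key, test_bit):
        self.index_key = index_key
        self.test_bit = test_bit
        self.left = None
        self.right = None


def search(root, key):
    """Return True if key is found by following test bits from root."""
    node = root
    while True:
        # Follow the right link for bit value 1, the left link for 0.
        target = node.right if key[node.test_bit] == "1" else node.left
        if target is None:
            return False  # blank link: nothing to compare against
        if target.test_bit <= node.test_bit:
            # Back link: compare the whole key once, then stop.
            return target.index_key == key
        node = target


# Right-hand part of FIG. 1B.
a = Node("100010", 0)
f = Node("101011", 2)
g = Node("100011", 5)
h = Node("101100", 3)
a.right = f
f.left, f.right = g, h
g.left, g.right = a, g   # back links to 1750a and to 1750g itself
h.left, h.right = f, h   # back links to 1750f and to 1750h itself

found = search(a, "100010")
missing = search(a, "101000")
```

The key "100010" is reached by testing bits 0, 2 and 5 and then following the back link to the root, where the single full comparison succeeds. The key "101000" follows the back link from 1750h to 1750f, where the comparison with "101011" fails.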
As described above, search processing using a Patricia tree has the advantages that a search can be performed by testing only the required bits and that an overall key comparison need be performed only once. It has, however, the following disadvantages: an increase in storage capacity caused by the two links that every node inevitably has, added complexity in the decision processing because of the existence of back links, a delay in search processing because the comparison with an index key is made only after returning by a back link, and difficulty of data maintenance such as adding and deleting nodes.
The art disclosed in patent document 3 below is an attempt to solve these problems of the Patricia tree. In the Patricia tree described in patent document 3, the storage capacity for pointers is reduced by storing the lower-level left and right nodes in contiguous regions, and the back link decision processing is reduced by providing a bit at each node that indicates whether or not the next link is a back link.
Even in the art disclosed in patent document 3, however, one node always occupies an index key region and a pointer region, and storing the lower-level left and right nodes in contiguous regions only reduces the pointers to one per node, so the effect of reducing the storage capacity is not very great. For example, it is still necessary to assign the same capacity to the left pointer 1740c and the right pointer 1741h, which are the lowermost parts in FIG. 1B. In addition, there is no improvement regarding the delay in search processing caused by back links or the difficulty of adding and deleting nodes.
Thus, when a merge sort is to be executed on a huge amount of data, vast amounts of computer resources are monopolized for a long time and the cost increases greatly.
Patent document 1: Japanese Published Patent Application 2000-010761
Patent document 2: Japanese Published Patent Application 2006-163565
Patent document 3: Japanese Published Patent Application 2001-357070