1. Field of the Invention
The present invention relates to a search apparatus, search method, and program that searches for a desired bit string from a set of bit strings, and more particularly to a field of art in which refinement is done to the data structure in which bit strings are stored to effect an improvement in search speed and the like.
2. Description of Related Art
In recent years, with advancements in information-based societies, large-scale databases have come to be used in various places. To search such large-scale databases, it is usual to search for a desired record, retrieving the desired record by using as indexes items within records associated with addresses at which each record is stored. Character strings in full-text searches can also be treated as index keys.
Because the index keys can be expressed as bit strings, the searching of a database is equivalent to searching for bit strings in the database.
In order to perform the above-noted searching for bit strings at high speed, conventional art makes various refinements on the data structure in which bit strings are stored. One of these is a tree structure known as a Patricia tree.
FIG. 1 shows an example of a Patricia tree used for search processing in the above-noted conventional art. A node of a Patricia tree is formed to include an index key, a test bit position for a search key, and right and left link pointers. Although it is not explicitly shown, a node of course includes information for the purpose of accessing a record corresponding to the index key.
In the example shown in FIG. 1, the node 1750a that holds the index key “100010” is a root node, the test bit position 1730a of which is 0. The node 1750b is connected to the left link 1740a of the node 1750a, and the node 1750f is connected to the right link 1741a of the node 1750a. 
The index key held by the node 1750b is “010011”, and the test bit position 1730b is 1. The node 1750c is connected to the left link 1740b of the node 1750b, and the node 1750d is connected to the right link 1741b of the node 1750b. The index key held by the node 1750c is “000111”, and the test bit position is 3. The index key held by the node 1750d is “011010”, and the test bit position is 2.
The parts connected to the node 1750c by a solid lines show the right and left link pointers of the node 1750c, and the left pointer 1740c that is not connected by the dotted line indicates that that field is blank. The dotted line connection destination of the right pointer 1741c that is connected by a dotted line expresses the address indicated by the pointer, and in this case this indicates that the right pointer points to the node 1750c. 
The right pointer 1741d of the node 1750d points to the node 1750d itself, and the node 1750e is connected to the left link 1740d. The index key held by 1750e is “010010”, and the test bit position is 5. The left pointer 1740e of the node 1750e points to the node 1750b, and the right pointer 1741e of the node 1750e points to the node 1750e. 
The index key held by the node 1750f is “101011”, and the test bit position 1730f is 2. The node 1750g is connected to the left link 1740f of the node 1750f and the node 1750h is connected to the right link 1741f of the node 1750f. 
The index key held by the node 1750g is “100011”, and the test bit position 1730g is 5. The left pointer 1740g of the node 1750g points to the node 1750a, and the right pointer 1741g of the node 1750g points to the node 1750g. 
The index key held by the node 1750h is “101100”, and the test bit position 1730h is 3. The left pointer 1740h of the node 1750h points to the node 1750f, and the right pointer 1741h of the node 1750h points to the node 1750h. 
In the example of FIG. 1, the configuration is such that, as the tree is traversed downward from the root node 1750a the test bit position of successive nodes increases. When a search is performed with some search key, the search keys' bit values corresponding to test bit positions held in nodes are successively tested from the root node, and a judgment is made as to whether the bit value at a test bit position is 1 or 0, the right link being followed if the bit value is 1, and the left link being followed if the bit value is 0. Unless the test bit position of a link target node is larger than the bit position of the link origin node, that is, if the link target is not below but rather returns upward (the returning links shown by the dotted lines in FIG. 16 being called back links), a comparison is performed between the index key of the link target and the search key. It is assured that if the result of the comparison is that the values are equal the search succeeds, but if the result is non-equal, the search fails.
As described above, although search processing using a Patricia tree has the advantages of being able to perform a search by testing only the required bits, and of it only being necessary to perform an overall key comparison one time, there are the disadvantages of an increase in storage capacity caused by the inevitable two links from each node, the added complexity of the decision processing because of the existence of back links, delay in the search processing by comparison with an index key for the first time by returning by a back link, and the difficulty of data maintenance such as adding and deleting a node.
In order to resolve these disadvantages of the Patricia tree, there is, for example, the technology disclosed in Patent Reference 1 below. In the Patricia tree described in Patent Reference 1 below, by storing lower level sibling nodes in a contiguous area, the space need for pointers is reduced as well as by setting a bit in each node to show whether or not the next link is a back link the determination processing for back links is reduced.
However, even in the disclosure of Patent Reference 1 below, since each node always reserves an area for the index key and the area for a pointer, and a single pointer is used for storing lower level sibling nodes in a contiguous area as shown for example even in the parts of left pointer 1740c, right pointer 1741h, etc. that are the lowest level parts of the Patricia shown in FIG. 17, the same amount of space must be allocated, etc., and there is not a very big space reduction effect. Also the problem of the delay in the search processing caused by a back links, and the difficulty of processing such as adding and deleting, etc., is not improved.
Also, if a record is to be searched for in a database, not only are searches performed with the values of items corresponding 1 to 1 with database records but are normally performed with the values of arbitrary items that compose a record as a search key. Because the values of those items, depending on the record, are not restricted to being unique, searches are performed with duplicate keys in a plurality of records. One example of handling such duplicate keys is cited in Patent Reference 2 below.
Patent document 1: Japanese Published Patent Application 2001-357070.
Patent document 2: Japanese Published Patent Application H11-96058.