Conventionally, data retrieval using a hash table is known as one of the available data retrieval methods.
Here, a hash table is a table-type data structure that uses a hash function as the means for determining the position in the table at which data is to be stored or the position in the table from which data is to be acquired.
A data retrieval method that uses a hash table is advantageous in that its data retrieval efficiency is high; namely, the time required to retrieve the retrieval target data, or to determine that no retrieval target data exists, is short in comparison with other data retrieval methods (for example, a binary tree search method).
However, the data retrieval method that uses a hash table relies on a hash table produced by storing each piece of data at a storage position specified based on a hash value calculated by a hash function. The hash values calculated by the hash function may be equal for different pieces of data (a collision), and in this case the data retrieval efficiency degrades depending upon the manner in which the colliding data are stored into and retrieved from the hash table.
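The effect of equal hash values on retrieval can be illustrated with a minimal sketch. It assumes separate chaining as the manner of storing colliding data, which is only one of several possible schemes, and it uses a deliberately weak toy hash function. The names `bucket_index` and `ChainedHashTable` are hypothetical and chosen only for this illustration.

```python
def bucket_index(key: str, table_size: int) -> int:
    """Toy hash function (assumed for illustration): sum of character codes modulo the table size."""
    return sum(ord(ch) for ch in key) % table_size


class ChainedHashTable:
    """Hash table that keeps colliding entries in a per-position list (separate chaining)."""

    def __init__(self, table_size: int) -> None:
        self.table_size = table_size
        self.buckets = [[] for _ in range(table_size)]

    def store(self, key: str, value) -> None:
        bucket = self.buckets[bucket_index(key, self.table_size)]
        for i, (existing_key, _) in enumerate(bucket):
            if existing_key == key:
                bucket[i] = (key, value)  # overwrite an existing entry
                return
        bucket.append((key, value))

    def retrieve(self, key: str):
        """Return (value, comparisons); value is None when the key is absent."""
        bucket = self.buckets[bucket_index(key, self.table_size)]
        comparisons = 0
        for existing_key, value in bucket:
            comparisons += 1
            if existing_key == key:
                return value, comparisons
        return None, comparisons


table = ChainedHashTable(8)
table.store("ab", 1)
table.store("ba", 2)  # same character sum as "ab", so the same storage position
table.store("cd", 3)

print(table.retrieve("ab"))  # (1, 1): found on the first comparison
print(table.retrieve("ba"))  # (2, 2): the collision costs one extra comparison
print(table.retrieve("zz"))  # (None, 0): empty position, absence decided at once
```

In this sketch every additional piece of data that shares a storage position adds one comparison to the worst-case retrieval at that position, which is the degradation described above.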
To address this, it is conceivable to use a perfect hash function, which guarantees that hash values do not overlap among all data stored in the hash table. When a hash table produced using a perfect hash function is used, storage positions do not overlap among different data, so a storage position can be specified uniquely upon data retrieval, and the theoretically highest data retrieval efficiency is guaranteed.
However, it is not easy to determine a perfect hash function for a given data set, and it is particularly difficult for a large-scale data set. For example, the calculation cost of determining a perfect hash function by a brute force method increases exponentially with the number of pieces of data.
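A brute force search of this kind can be sketched as follows. The sketch assumes a family of candidate hash functions indexed by an integer seed and built from SHA-256; this family, and the names `seeded_hash` and `find_perfect_seed`, are assumptions made for illustration and are not taken from any particular method. Seeds are tried one by one until no two keys share a storage position.

```python
import hashlib


def seeded_hash(key: str, seed: int, table_size: int) -> int:
    """Candidate hash function selected by `seed` (an assumed family based on SHA-256)."""
    digest = hashlib.sha256(f"{seed}:{key}".encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % table_size


def find_perfect_seed(keys, table_size: int, max_tries: int = 1_000_000) -> int:
    """Try seeds in order and return the first one that maps all keys to distinct positions."""
    for seed in range(max_tries):
        positions = {seeded_hash(key, seed, table_size) for key in keys}
        if len(positions) == len(keys):  # no two keys collided
            return seed
    raise RuntimeError("no perfect hash function found within max_tries")


keys = ["apple", "banana", "cherry", "grape", "lemon", "melon"]
seed = find_perfect_seed(keys, table_size=len(keys))
positions = [seeded_hash(key, seed, len(keys)) for key in keys]
print(seed, positions)
```

If each candidate behaves like a random mapping of n keys onto n positions, the chance that a single candidate is perfect is n!/n^n, which shrinks roughly as e^(-n). The expected number of candidates to test therefore grows exponentially with n, so the loop above is practical only for small key sets.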
As a method for suppressing the calculation cost of determining a perfect hash function, a CHD (compress, hash, and displace) algorithm is available, in which a large-scale data set is divided into a plurality of small-scale data sets and an individual perfect hash function is calculated for each of the plurality of small-scale data sets.
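The divide-and-solve idea can be sketched as below. This is a simplified illustration in the spirit of CHD rather than the published algorithm: it omits the compression step, and it replaces the displacement values of CHD with a per-bucket integer seed for an assumed SHA-256-based hash family. The names `seeded_hash`, `build_chd_like`, and `lookup_position` are hypothetical.

```python
import hashlib


def seeded_hash(key: str, seed: int, modulus: int) -> int:
    """Hash function selected by `seed` (an assumed family based on SHA-256)."""
    digest = hashlib.sha256(f"{seed}:{key}".encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % modulus


def build_chd_like(keys, num_buckets: int, table_size: int, max_seed: int = 1_000_000):
    """Split distinct keys into small buckets, then find one collision-free seed per bucket."""
    # First level: seed 0 assigns every key to a small-scale bucket.
    buckets = [[] for _ in range(num_buckets)]
    for key in keys:
        buckets[seeded_hash(key, 0, num_buckets)].append(key)

    occupied = [False] * table_size
    seeds = [0] * num_buckets
    # Place the largest buckets first, while the table is still mostly empty.
    order = sorted(range(num_buckets), key=lambda b: len(buckets[b]), reverse=True)
    for b in order:
        if not buckets[b]:
            continue
        for seed in range(1, max_seed):
            slots = [seeded_hash(key, seed, table_size) for key in buckets[b]]
            distinct = len(set(slots)) == len(slots)
            if distinct and not any(occupied[s] for s in slots):
                for s in slots:
                    occupied[s] = True
                seeds[b] = seed
                break
        else:
            raise RuntimeError("no seed found for a bucket within max_seed")
    return seeds


def lookup_position(key: str, seeds, table_size: int) -> int:
    """Two-step lookup: pick the bucket, then apply that bucket's own seed."""
    bucket = seeded_hash(key, 0, len(seeds))
    return seeded_hash(key, seeds[bucket], table_size)


keys = [f"key{i}" for i in range(20)]
table_size = 25
seeds = build_chd_like(keys, num_buckets=8, table_size=table_size)
positions = [lookup_position(key, seeds, table_size) for key in keys]
print(len(set(positions)) == len(keys))
```

Because each seed search involves only the few keys of a single bucket, the exponential cost applies to the bucket size rather than to the size of the whole data set.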