Conventionally, when a large amount of data is managed in a tree structure, a data structure called a B-tree is used in the majority of cases. Because a B-tree stores multiple data entries in one block, it has the advantage over a simple binary tree of limiting the extent to which the tree structure changes even as more data entries are added. For this reason, B-trees are often used as a data management method for disks, such as hard disks.
However, when data managed by tree structures is searched on a disk, multiple data blocks have to be read. Typically, input/output (I/O) with respect to the disk is a relatively slow process compared to memory access; consequently, data searches performed with respect to a disk are troublesome and time consuming.
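As a rough illustration of why a search reads multiple blocks, the following sketch estimates the minimum number of tree levels, and therefore of uncached block reads, needed to reach an entry. It assumes a fully packed tree in which each level visited costs one block read; the function name `min_levels` and its parameters are hypothetical and not taken from any cited technique.

```python
def min_levels(num_entries: int, entries_per_block: int) -> int:
    """Smallest number of levels a fully packed tree needs to hold num_entries.

    Assumption: a block holding m entries has m + 1 children, so a full tree
    with h levels holds (m + 1) ** h - 1 entries. With m = 1 this reduces to
    a simple binary tree.
    """
    fanout = entries_per_block + 1
    levels = 0
    capacity = 0  # entries that a full tree with `levels` levels can hold
    while capacity < num_entries:
        levels += 1
        capacity = fanout ** levels - 1
    return levels


if __name__ == "__main__":
    for per_block in (1, 100):
        print(per_block, min_levels(1_000_000, per_block))
```

Under these assumptions, one million entries need at least 20 levels with one entry per block but only 3 levels with 100 entries per block. Real B-trees are not fully packed, so the actual number of reads can be higher.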
For this reason, countermeasures to avoid search delays caused by disk I/O have recently been considered, such as holding the tree structure in memory. Nevertheless, as the number of data entries grows, the amount of memory required increases correspondingly. Consequently, a scheme has also been considered in which only the portions of the tree structure that are read most often are stored in memory (a cache).
Meanwhile, a data structure called a Bloom filter has recently come to be known. A Bloom filter is a data structure for efficiently determining whether an entry belongs to an existing set. Further, in the management of electronic private branch exchange dial pulses, group processing of a pulse speed bit and an even/odd bit provided in a dial pulse has been disclosed. In addition, a method of repeated transposition and substitution by a data mixer circuit applicable to encryption and authentication has been disclosed.
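The membership test performed by a Bloom filter can be sketched as follows. This is a minimal illustration only, not the structure used in any of the cited publications: the class name `BloomFilter`, the bit-array size, the number of hash functions, and the use of SHA-256 to derive bit positions are all assumptions made for the example.

```python
import hashlib


class BloomFilter:
    """Minimal Bloom filter sketch (sizes and hashing scheme are assumptions)."""

    def __init__(self, num_bits: int = 1024, num_hashes: int = 3):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray(num_bits)  # one byte per bit, for clarity

    def _positions(self, key: str):
        # Derive num_hashes bit positions by salting the key with an index.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{key}".encode("utf-8")).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, key: str) -> None:
        for pos in self._positions(key):
            self.bits[pos] = 1

    def might_contain(self, key: str) -> bool:
        # False means "definitely not registered";
        # True means "possibly registered" (false positives can occur).
        return all(self.bits[pos] for pos in self._positions(key))


if __name__ == "__main__":
    bf = BloomFilter()
    bf.add("entry-1")
    print(bf.might_contain("entry-1"))
```

Note that the filter stores only bits, not the entries themselves, so it can answer whether an entry may exist but cannot return the entry or its location.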
A technique has also been disclosed that reduces processing time by merging a “user index” for each user, a “group index” used by multiple users, and a “system shared-index” used by all of the users. Yet another technique has been disclosed in which a variable-length index is added to a fixed-length area and, if overflow is determined, key frame information is removed from the index to establish an available area. Refer to Japanese Laid-Open Patent Publication No. 2007-52698, Japanese Laid-Open Patent Publication No. H4-18895, Japanese Laid-Open Patent Publication No. H7-177139, and Japanese Laid-Open Patent Publication No. 2003-289495 for examples of the aforementioned techniques.
As described, because a B-tree can handle a large quantity of data, disk I/O can be reduced if a cache is properly implemented. However, the number of disk I/O operations cannot be reduced beyond a given amount. Further, if the tree structure changes due to the addition of data entries, I/O for tree structure management becomes necessary. As for the Bloom filter, because it indicates only whether a data entry exists, it cannot be used as is for data management.
If an index is removed when an available area overflows, a bit string in the Bloom filter changes. As a result, during a search, data that is actually registered may be erroneously determined not to be in the retrieved block. Conversely, data that is not actually registered may be erroneously determined to be in the retrieved block, whereby the occurrence of false positives increases.
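The first failure mode described above, in which registered data is reported as absent, can be illustrated with a small sketch. It assumes that removing an index is modeled as clearing that key's bits in the filter, which is one possible interpretation and is not confirmed by the cited publications. The bit positions in `POSITIONS` are hypothetical fixed values chosen so that the two keys share one bit; a real filter would derive them from hash functions.

```python
# Hypothetical bit positions for two keys; both keys share bit 4.
POSITIONS = {"key_a": (1, 4), "key_b": (4, 6)}


def register(bits: list, key: str) -> None:
    for pos in POSITIONS[key]:
        bits[pos] = 1


def clear(bits: list, key: str) -> None:
    # Assumed model of "removing an index": reset the key's bits to 0.
    for pos in POSITIONS[key]:
        bits[pos] = 0


def might_contain(bits: list, key: str) -> bool:
    return all(bits[pos] for pos in POSITIONS[key])


def demo() -> tuple:
    bits = [0] * 8
    register(bits, "key_a")
    register(bits, "key_b")
    before = might_contain(bits, "key_b")  # key_b is registered and found
    clear(bits, "key_a")                   # also clears the shared bit 4
    after = might_contain(bits, "key_b")   # key_b is still registered but missed
    return before, after


if __name__ == "__main__":
    print(demo())
```

Here `demo()` returns `(True, False)`: after the bits of `key_a` are cleared, `key_b` is no longer found even though it was never removed. The sketch does not model the increase in false positives.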