A conventional approximate nearest neighbor search method using a hash function is disclosed in P. Indyk and R. Motwani, “Approximate Nearest Neighbors: Towards Removing the Curse of Dimensionality,” in Proceedings of the 30th ACM Symposium on Theory of Computing (STOC'98), pp. 604-613, May 1998, and in M. Datar, P. Indyk, N. Immorlica, and V. Mirrokni, “Locality-Sensitive Hashing Scheme Based on p-Stable Distributions,” in Proceedings of the 20th Annual Symposium on Computational Geometry (SCG2004), June 2004. As shown in FIG. 5, these documents disclose a method in which a training pattern set is projected onto an arbitrary vector, a hash function that divides the existence range on the vector at constant intervals is defined, and the nearest neighbor pattern is searched for among a limited set of nearest neighbor candidates in the training pattern set, the candidates being selected by the hash value of an input pattern. In FIG. 5, the horizontal axis shows the vector onto which the patterns are projected, and the vertical axis shows the cumulative frequency of training patterns on this vector.
As shown in FIG. 5, the stored patterns are projected onto an arbitrary vector, and a hash function that divides the existence range on the vector at constant intervals is used.
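The conventional scheme described above can be illustrated by the following minimal sketch. It is not taken from the cited documents; the function names `make_hash`, `build_table`, and `search`, the parameter `width` (the constant interval), and the sample data are assumptions introduced only for illustration.

```python
import math
from collections import defaultdict


def make_hash(vector, width):
    """Return h(x) = floor((vector . x) / width): the index of the
    constant-width interval that the projection of x falls into."""
    def h(x):
        projection = sum(v * xi for v, xi in zip(vector, x))
        return math.floor(projection / width)
    return h


def build_table(patterns, h):
    """Group the indices of the training patterns by hash value."""
    table = defaultdict(list)
    for index, pattern in enumerate(patterns):
        table[h(pattern)].append(index)
    return table


def search(query, patterns, table, h):
    """Return the index of the closest training pattern that shares the
    query's hash value, or None if no pattern shares it."""
    candidates = table.get(h(query), [])
    if not candidates:
        return None
    return min(candidates, key=lambda i: math.dist(query, patterns[i]))


# Hypothetical data: project onto the first coordinate, interval width 1.0.
patterns = [[0.2, 0.0], [0.7, 5.0], [1.5, 0.0]]
h = make_hash([1.0, 0.0], 1.0)
table = build_table(patterns, h)
result = search([0.95, 0.0], patterns, table, h)  # index 0
```

In this example the query [0.95, 0.0] shares a hash value with patterns 0 and 1 only, so the search returns pattern 0 (distance 0.75), although pattern 2 (distance 0.55) is the true nearest neighbor. The result is therefore approximate.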
However, the number of training patterns in each space area obtained by division by the hash function (hereinafter referred to as a bucket), and hence in the bucket containing the input pattern that is the search object, varies according to the distribution of the training pattern set. In a bucket with a high training pattern density, the search time becomes long. In a bucket with a low training pattern density, the ratio of the error between the distance to the true nearest neighbor pattern and the distance to the obtained approximate nearest neighbor pattern (hereinafter referred to as an error ratio) becomes high.
Furthermore, when no training pattern exists in the bucket containing the input pattern that is the search object, the search cannot be performed.
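The dependence of bucket occupancy on the training pattern distribution can be illustrated by the following minimal sketch. The projected values, the interval `width`, and the function name `bucket_counts` are hypothetical and chosen only to show a dense bucket, a sparse bucket, and an empty bucket under constant-interval division.

```python
import math
from collections import Counter


def bucket_counts(projections, width):
    """Count the projected training patterns in each constant-width bucket."""
    return Counter(math.floor(p / width) for p in projections)


# Hypothetical projected values: dense between 0.1 and 0.4, sparse elsewhere.
projections = [0.1, 0.2, 0.25, 0.3, 0.35, 0.4, 1.2, 3.7]
counts = bucket_counts(projections, width=1.0)

dense = counts[0]   # 6 patterns: many distance computations per query
sparse = counts[1]  # 1 pattern: the only candidate may be far from the query
empty = counts[2]   # 0 patterns: a query hashed here has no candidate
```

With a constant interval, reducing `width` shortens the search in the dense bucket but produces more sparse and empty buckets, whereas increasing it has the opposite effect.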