Kernel smoothing is a known method for calculating (estimating) the occurrence probability of an arbitrary vector in a d-dimensional Euclidean space (d is an integer equal to or larger than 1) using a plurality of training vectors existing in that Euclidean space. Each of the training vectors and the arbitrary vector has d components, which are the coordinates of a point in the Euclidean space, so that each vector corresponds to one point in the space.
In the above-described method, as indicated in Equation (1), a kernel function K is used to calculate occurrence probabilities K(x, xi) of an arbitrary vector x relative to each of N training vectors xi (N is an integer equal to or larger than 2). The N resulting occurrence probabilities K(x, xi) are summed and the sum is divided by N, whereby the occurrence probability P(x) of the arbitrary vector x relative to the N training vectors is calculated.
$$P(x) = \frac{1}{N} \sum_{i=1}^{N} K(x, x_i) \tag{1}$$
The occurrence probabilities K(x, xi) of an arbitrary vector x relative to the training vectors xi are calculated based on the distance between each training vector xi and the arbitrary vector x. The kernel function K used in practice is therefore a function of the distance between two vectors, such as a homoscedastic Gaussian (a Gaussian having the same variance for every training vector). Such a kernel function K has a fixed degree of smoothing.
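The calculation of Equation (1) with a fixed-smoothing Gaussian kernel can be sketched as follows. This is a minimal illustration, not the method of any cited document: the function names `gaussian_kernel` and `occurrence_probability`, the bandwidth parameter `sigma`, and its default value are assumptions introduced here only for the example.

```python
import numpy as np


def gaussian_kernel(x, xi, sigma=1.0):
    """Isotropic Gaussian centred on xi, evaluated at x.

    sigma is the fixed degree of smoothing (assumed name and default);
    it is the same for every training vector (homoscedastic).
    """
    d = x.shape[0]
    sq_dist = float(np.sum((x - xi) ** 2))           # squared Euclidean distance
    norm = (2.0 * np.pi * sigma ** 2) ** (d / 2.0)   # Gaussian normalising constant
    return np.exp(-sq_dist / (2.0 * sigma ** 2)) / norm


def occurrence_probability(x, training, sigma=1.0):
    """Equation (1): the mean of K(x, xi) over the N training vectors."""
    n = len(training)
    return sum(gaussian_kernel(x, xi, sigma) for xi in training) / n


# Hypothetical data: N = 3 training vectors in a d = 2 Euclidean space.
training = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]])
p_near = occurrence_probability(np.array([0.2, 0.2]), training)
p_far = occurrence_probability(np.array([5.0, 5.0]), training)
```

In this sketch a vector close to the training vectors (`p_near`) receives a higher value than a distant one (`p_far`), and `sigma` does not change with the local density of the training vectors.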
However, training vectors are typically distributed unevenly rather than uniformly in the Euclidean space. An example is disclosed in D. Qin, C. Wengert, and L. v. Gool, “Query Adaptive Similarity for Large Scale Object Retrieval”, Computer Vision and Pattern Recognition (CVPR), 2013. This document describes a technique for increasing the accuracy with which the occurrence probability P(x) of an arbitrary vector x is calculated, by adaptively controlling the degree of smoothing using the distribution of the N training vectors xi (specifically, the distribution of the distances of the N training vectors xi relative to the arbitrary vector x).
The above-described technique, however, assumes a Euclidean space rather than a binary space.
The Euclidean space is a d-dimensional space in which the value in each dimension is represented with a continuous value. The values of the d components (coordinates) indicating a point in the Euclidean space are therefore also continuous values. By contrast, the binary space is a d-dimensional space in which the value in each dimension is represented in binary, that is, with a non-continuous value that is either 0 or 1. The value of each of the d components (coordinates) indicating a point in the binary space is therefore also either 0 or 1.
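The contrast between the two spaces can be illustrated with a short sketch. The Hamming distance used here is an assumption made for the example (the text above does not name a distance for the binary space), and `hamming_distance` is a hypothetical helper. The sketch shows that distances between binary vectors take only the d + 1 integer values 0 to d, whereas distances between Euclidean vectors vary continuously.

```python
import itertools

import numpy as np


def hamming_distance(a, b):
    """Number of components in which two binary vectors differ (assumed metric)."""
    return int(np.count_nonzero(a != b))


d = 4
# Enumerate all 2**d points of the d-dimensional binary space.
points = [np.array(bits, dtype=np.uint8)
          for bits in itertools.product((0, 1), repeat=d)]

# Collect every distinct pairwise distance that occurs in the binary space.
distinct = sorted({hamming_distance(p, q) for p in points for q in points})

# For binary vectors, the squared Euclidean distance equals the Hamming distance.
a = np.array([0, 1, 1, 0], dtype=np.uint8)
b = np.array([1, 1, 0, 0], dtype=np.uint8)
sq_euclid = int(np.sum((a.astype(int) - b.astype(int)) ** 2))
```

Because the set `distinct` contains only a handful of integer values, a distribution of distances in the binary space is a discrete histogram rather than a continuous curve.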
Because of this difference between the Euclidean space and the binary space, the above-described technique cannot be applied directly to calculate, with high accuracy, the occurrence probability of an arbitrary vector in the binary space using a plurality of training vectors existing in the binary space.