In the field of searching for similar data using feature vectors that represent features of data such as fingerprints, images, and sounds, related techniques are known that relax the strictness of the search to speed up search processing. In one such technique, for example, feature vectors are converted into binary strings while preserving the distance relationship between the feature vectors, and the Hamming distance between the binary strings is calculated in place of the distance between the feature vectors, so that calculation cost can be reduced.
As a technique for converting feature vectors into binary strings while preserving the distance relationship between the feature vectors, locality-sensitive hashing (LSH) is known. For example, an information processing device determines a plurality of hyperplanes that divide a feature vector space and converts each feature vector into a binary string in which each bit indicates whether the inner product of the normal vector of a hyperplane and the feature vector is positive or negative. In other words, the information processing device divides the feature vector space into a plurality of regions using the hyperplanes and converts each feature vector into a binary string indicating the region in which the feature vector lies.
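The conversion described above can be illustrated with a minimal sketch. It assumes hyperplanes that pass through the origin and whose normal vectors are drawn at random from a Gaussian distribution; the function names (random_hyperplanes, to_binary_string, hamming_distance) are hypothetical and are introduced only for illustration.

```python
import random


def random_hyperplanes(num_planes, dim, seed=0):
    """Draw one normal vector per hyperplane (assumption: planes pass through the origin)."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(num_planes)]


def to_binary_string(vector, normals):
    """Emit one bit per hyperplane: 1 if the inner product with the normal is positive, else 0."""
    bits = []
    for normal in normals:
        inner = sum(n * x for n, x in zip(normal, vector))
        bits.append(1 if inner > 0 else 0)
    return bits


def hamming_distance(bits_a, bits_b):
    """Count the positions at which two binary strings differ."""
    return sum(a != b for a, b in zip(bits_a, bits_b))


normals = random_hyperplanes(num_planes=16, dim=3)
code_a = to_binary_string([1.0, 0.2, -0.5], normals)
code_b = to_binary_string([0.9, 0.25, -0.45], normals)  # close to the first vector
code_c = to_binary_string([-1.0, -0.2, 0.5], normals)   # opposite side of every hyperplane

print(hamming_distance(code_a, code_b), hamming_distance(code_a, code_c))
```

Because the third vector is the negation of the first, the two lie on opposite sides of every hyperplane through the origin, so their binary strings differ in every bit, whereas a nearby vector tends to share most bits.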
When a label representing similarity among data, such as an ID identifying the individual who registered the data, is appended to the data, it is preferable to determine hyperplanes that classify the data by label, so that classification of newly registered data is simplified. A technique is known that uses pairs of data appended with different labels to learn a set of hyperplanes that classify data by label.
For example, an information processing device randomly selects one of the feature vectors as a reference vector and then selects, from among the feature vectors appended with a label different from the label appended to the reference vector, the feature vector having the highest similarity with the reference vector. Then, by learning a hyperplane that separates the two selected feature vectors, the information processing device determines a hyperplane near the boundary between data appended with different labels.

Non Patent Document 1: M. Datar, N. Immorlica, P. Indyk, V. S. Mirrokni: Locality-Sensitive Hashing Scheme Based on p-Stable Distributions, Proceedings of the Twentieth Annual Symposium on Computational Geometry (SCG 2004)
Non Patent Document 2: M. Norouzi and D. Fleet: Minimal Loss Hashing for Compact Binary Codes, Proceedings of the 28th International Conference on Machine Learning (ICML '11)
Non Patent Document 3: Ran Gilad-Bachrach, Amir Navot, Naftali Tishby: Margin Based Feature Selection - Theory and Algorithms (ICML 2004)
In the aforementioned technique of learning a hyperplane, the hyperplane that is learned separates a randomly selected reference vector from the feature vector having the highest similarity with the reference vector among the feature vectors appended with a different label. Because each hyperplane is learned only from such a nearest pair, the technique is disadvantageous in that a hyperplane that comprehensively classifies the feature vectors is not learned.
For example, let a group composed of data having the same label as the reference vector be a reference group. The information processing device learns a hyperplane that only locally separates the reference group from an adjacent group composed of data appended with a label different from that of the data included in the reference group. If there are other groups composed of data appended with labels different from that of the data included in the reference group, it is desirable to learn a hyperplane that divides the feature vector space more comprehensively so as to classify a larger number of groups.
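The locality of the pair selection in the related technique can be illustrated with a minimal sketch. It assumes that similarity is measured by Euclidean distance (a smaller distance meaning higher similarity), which the description above does not specify; the function names (nearest_other_label, select_pair) and the toy data are hypothetical.

```python
import math
import random


def nearest_other_label(vectors, labels, ref):
    """Return the index of the vector closest to vectors[ref] among those with a different label."""
    candidates = [i for i in range(len(vectors)) if labels[i] != labels[ref]]
    if not candidates:
        raise ValueError("no vector with a different label exists")
    # Assumption: highest similarity is taken to mean smallest Euclidean distance.
    return min(candidates, key=lambda i: math.dist(vectors[ref], vectors[i]))


def select_pair(vectors, labels, rng):
    """Pick a random reference vector and pair it with its nearest differently labeled vector."""
    ref = rng.randrange(len(vectors))
    return ref, nearest_other_label(vectors, labels, ref)


# Three groups on a line: A near 0, B near 1, C near 10.
vectors = [(0.0, 0.0), (0.2, 0.1), (1.0, 0.0), (1.2, 0.1), (10.0, 0.0), (10.2, 0.1)]
labels = ["A", "A", "B", "B", "C", "C"]

rng = random.Random(0)
pairs = {tuple(sorted((labels[r], labels[n])))
         for r, n in (select_pair(vectors, labels, rng) for _ in range(100))}
print(pairs)
```

With this toy data, every selected pair joins group B to either group A or group C, because B lies between them. No pair ever joins A and C directly, so each hyperplane learned from such a pair reflects only a local boundary between adjacent groups.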