The exemplary embodiment relates to comparison of database objects based on representations of the objects and finds particular application in the retrieval of images from a database in response to a query image.
To compare digital objects, such as images, a representation of one object (the query object) can be generated, which can then be compared with a set of similarly generated representations of database objects in order to find the most similar. These representations may be referred to as signatures since they represent the respective objects. In the case where the objects are document images, they may be represented by runlength histograms (vectorial representations) or other representations generated by various description techniques. For example, it could be useful to be able to retrieve scanned images of a particular structured form which have been filled in and submitted by users. In this case, the query object may be a representation of a template (blank) version of the form and the database could include representations of a large collection of scanned documents, potentially including scanned images of the particular form.
Such a comparison process may entail computing the similarity between one query object and a very large number (millions or even hundreds of millions) of database objects. When dealing with a large number of database objects, two issues arise. The first is the computational cost. To reduce this cost, the similarity between two objects should be computed efficiently. The second is the memory cost. Ideally, the memory footprint of the database objects should be small enough that all objects fit into the main memory (RAM) of the computer performing the comparison. If this is not the case, a significant portion of the dataset may have to be stored on external/removable memory, such as a disk. As a result, the response time of a query can be too long because disk access is much slower than RAM access. Although solid state disks (SSD) offer much faster access than conventional disks, they tend to be costly, which limits their deployment.
For example, even when relatively small descriptors of 1,680 dimensions are used (e.g., in the case of runlength histograms as document representations) and each dimension is encoded as a 4-byte floating point value, each document signature takes 6,720 bytes and a million documents would need approximately 6.72 GB (about 6.26 GiB) of memory storage.
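The storage estimate above is simple arithmetic and can be checked with a short sketch. The function name `signature_storage_bytes` is hypothetical and introduced only for illustration; the dimension count, value size, and collection size are the ones given in the text.

```python
def signature_storage_bytes(num_dims, bytes_per_value, num_objects):
    """Total bytes needed to hold uncompressed signatures in memory."""
    return num_dims * bytes_per_value * num_objects


# Figures from the text: 1,680 dimensions, 4-byte floats, one million documents.
per_document = signature_storage_bytes(1680, 4, 1)
total = signature_storage_bytes(1680, 4, 1_000_000)

print("bytes per signature:", per_document)
print("total in GB (10**9 bytes):", total / 10**9)
print("total in GiB (2**30 bytes):", total / 2**30)
```

The two totals differ only in the unit convention (decimal gigabytes versus binary gibibytes).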
To address these two interrelated issues, several binary embedding techniques have been proposed, in which the image signatures are transformed into a binary space where the Hamming distance (which counts the number of bits that differ between two binary signatures) is a meaningful measure of dissimilarity. The compression yields a binary representation of an image while largely preserving the ability to retrieve visually similar images. See, for example, P. Indyk, et al., “Approximate nearest neighbors: towards removing the curse of dimensionality,” in STOC '98: Proc. 30th annual ACM Symp. on Theory of Computing, pp. 604-613 (ACM New York, N.Y., USA, 1998); M. Charikar, “Similarity estimation techniques from rounding algorithms,” in ACM Symp. on Theory of Computing (2002); Y. Weiss, et al., “Spectral hashing,” in Neural Information Processing Systems (hereinafter NIPS) (2008); B. Kulis, et al., “Kernelized locality-sensitive hashing for scalable image search,” in Proc. 12th Int'l Conf. on Computer Vision (2009); M. Raginsky, et al., “Locality-sensitive binary codes from shift-invariant kernels,” in NIPS (2009); J. Wang, et al., “Semi-supervised hashing for large scale search,” in IEEE Conf. on Computer Vision & Pattern Recognition (June 2010).
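As a rough illustration of what a binary embedding does, the sketch below uses random-hyperplane (sign of random projection) hashing, in the spirit of the Charikar reference. It is not taken from the present text and is not necessarily the embedding used in any embodiment; the function names, the 16-bit code length, and the toy 4-dimensional signature are all assumptions made for the example.

```python
import random


def make_hyperplanes(num_bits, num_dims, seed=0):
    """Draw one random Gaussian direction per output bit (hypothetical helper)."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(num_dims)]
            for _ in range(num_bits)]


def binary_embed(signature, hyperplanes):
    """Map a real-valued signature to bits: 1 if the projection is >= 0, else 0."""
    return [
        1 if sum(w * x for w, x in zip(plane, signature)) >= 0 else 0
        for plane in hyperplanes
    ]


# Toy example: a 4-dimensional signature compressed to a 16-bit code.
planes = make_hyperplanes(num_bits=16, num_dims=4)
signature = [0.2, 0.1, 0.4, 0.3]
code = binary_embed(signature, planes)
print(code)
```

Because each bit depends only on the sign of a projection, signatures separated by a small angle tend to receive codes that differ in few bits, which is what makes the Hamming distance informative in the embedded space.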
Binary embeddings address both the computational and the memory issues. The Hamming distance Ha(a,b) between two binary signatures a and b may be computed efficiently using binary operations and look-up tables, and the memory footprint can be drastically reduced. However, these techniques suffer from a major disadvantage: they require the query object to be compressed so that it lies in the same space as the database object representations. Information is therefore lost when the query object is compressed.
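A minimal sketch of the Hamming distance computation mentioned above is given here, assuming the binary signatures are packed into byte strings of equal length. The names `POPCOUNT_TABLE` and `hamming_distance` are illustrative. The exclusive-OR marks the bit positions where the two signatures differ, and a 256-entry look-up table counts the set bits in each resulting byte.

```python
# Look-up table: number of set bits for every possible byte value (0..255).
POPCOUNT_TABLE = [bin(i).count("1") for i in range(256)]


def hamming_distance(a: bytes, b: bytes) -> int:
    """Count differing bits between two equal-length packed binary signatures."""
    if len(a) != len(b):
        raise ValueError("signatures must have the same length")
    # XOR leaves a 1 wherever the bits differ; the table counts those 1s.
    return sum(POPCOUNT_TABLE[x ^ y] for x, y in zip(a, b))


a = bytes([0b10110010, 0b00001111])
b = bytes([0b10010011, 0b11110000])
print(hamming_distance(a, b))  # 2 differing bits in byte 0 + 8 in byte 1 = 10
```

In a production setting the same idea is usually applied to wider machine words, often with a hardware population-count instruction instead of a table.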
The exemplary embodiment addresses this problem and others with an asymmetric embedding approach in which the loss of information on the query side can be reduced while maintaining the benefits of a quantized compression (e.g., binary embedding) on the database side.