Modern computer vision applications are widely used for a range of practical purposes, including image registration, object recognition and 3D reconstruction, to name a few. Derivation of a mathematically processable representation of an image is a prerequisite of any such application or algorithm. In most existing models, such a representation is created using the following stages:
1. Interest point detection, i.e. detection of image features robust to noise, detection errors, and geometric and photometric deformations.
2. Feature description, i.e. creation of a distinctive mathematical description of the detected features, providing the means to numerically measure their similarity.
3. Matching of pairs of feature descriptors based on their similarity scores, e.g. Euclidean distances between descriptor vectors.
See FIG. 3.
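By way of illustration only, the matching stage described above may be sketched as a brute-force nearest-neighbour search over Euclidean distances. The function names `euclidean` and `match_descriptors` and the representation of descriptors as lists of floats are assumptions made for this sketch and are not taken from any cited work.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length descriptor vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def match_descriptors(query, reference):
    """For each query descriptor, find the nearest reference descriptor.

    Returns a list of (query_index, reference_index, distance) tuples.
    Hypothetical helper: a plain exhaustive search, O(len(query) * len(reference)).
    """
    matches = []
    for qi, q in enumerate(query):
        best_ri, best_d = min(
            ((ri, euclidean(q, r)) for ri, r in enumerate(reference)),
            key=lambda pair: pair[1],
        )
        matches.append((qi, best_ri, best_d))
    return matches
```

Every comparison in such a search performs one floating point subtraction and multiplication per descriptor element, so the cost grows with both the descriptor length and the size of the database.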
Historically, most interest point detectors are based on first and second spatial derivatives. The most notable representatives are the Harris corner detector [Harris C., 1988], which uses the second-moment matrix, and its scale-invariant modifications, such as the Laplacian of Gaussian, SIFT's difference of Gaussians [Lowe, 2004] and the Fast-Hessian detector used in SURF [Herbert Bay, 2008]. Among these, SURF's Fast-Hessian detector shows the best performance while producing output robust to distortions and lighting changes. Of the non-Harris-based detectors, FAST [Rosten and Drummond, 2006] is worth mentioning. FAST outperforms SURF in speed, but is not scale-invariant. As a result of interest point detection, a triple of coordinates and scale (x, y, σ) is produced for each interest point and passed as input to the feature descriptor.
The most notable feature descriptor implementations are SIFT and SURF [Lowe, 2000; Funayama, 2012], which are considered the industry standard for modern applications. However, the size of these descriptors varies from 144 to 512 bytes, which imposes serious memory consumption limitations on large-scale applications. Theoretically, a 144-byte (1152-bit) descriptor may distinguish between 2^1152 different objects. This is far more than is required in any practical application. Furthermore, SIFT and SURF descriptors are based on vectors of floating point numbers; distance computation therefore requires many time-consuming floating point operations.
Accordingly, what is desired, and has not heretofore been developed, is a feature descriptor that uses only bitwise and integer operations for distance computation and has a significantly smaller memory footprint. Developing such a descriptor would considerably speed up the matching step for large image databases.
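To make the contrast with floating point distances concrete, the following is a minimal sketch of one distance that uses only bitwise and integer operations, namely the Hamming distance between two binary strings. The choice of Hamming distance, the function name `hamming_distance` and the representation of a descriptor as a `bytes` object are assumptions made for illustration; the passage above does not specify which distance the desired descriptor would use.

```python
def hamming_distance(a: bytes, b: bytes) -> int:
    """Number of differing bits between two equal-length binary descriptors.

    Illustrative assumption: descriptors are packed bit strings stored as bytes.
    Only integer conversion, XOR and bit counting are used.
    """
    if len(a) != len(b):
        raise ValueError("descriptors must have the same length")
    # XOR leaves a 1 in every bit position where the descriptors differ.
    diff = int.from_bytes(a, "big") ^ int.from_bytes(b, "big")
    # Count the set bits of the XOR result.
    return bin(diff).count("1")
```

For example, `hamming_distance(bytes([0b10110000]), bytes([0b10011000]))` evaluates to 2, since the two bytes differ in exactly two bit positions.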