This invention relates to an approach to efficient computation of inner products, and in particular relates to efficient inner product computation in image or video processing.
A number of image and video analysis approaches involve computation of feature vector representations for an entire image or video, or for portions (e.g., spatial patches) of an image or video. Approaches to determining similarity of vector representations include distance-based and direction-based approaches. An example of a distance-based approach uses a Euclidean distance (i.e., the square root of the sum of squared differences of corresponding elements of the vectors), while an example of a direction-based approach uses an inner product metric (i.e., a sum of the products of corresponding elements of the vectors). Some approaches involve projection of a vector representation onto basis vectors from a predetermined set. Such projections also involve inner product calculations.
Projection approaches include basis selection approaches in which the basis vectors to represent a particular feature vector are selected from a larger predetermined “dictionary” of basis vectors. One such approach is called “Orthogonal Matching Pursuit (OMP)” in which a series of sequential decisions to add basis vectors for the representation are made. These decisions involve computations of inner products between the as-yet unselected basis vectors from the dictionary and a residual vector formed from the component of the feature vector not yet represented in the span of the selected basis vectors from the dictionary.
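The sequential selection described above can be illustrated with a minimal sketch, assuming NumPy and a least-squares refit after each selection. The function name `omp`, the dictionary layout (one basis vector per column), and the toy orthonormal dictionary are assumptions made for illustration only; a practical dictionary would typically hold more basis vectors than the feature dimension.

```python
import numpy as np

def omp(dictionary, x, k):
    """Greedily pick k dictionary columns to represent x (illustrative sketch)."""
    residual = x.astype(float).copy()
    selected = []
    coeffs = np.zeros(0)
    for _ in range(k):
        # Inner products between every basis vector and the current residual.
        scores = np.abs(dictionary.T @ residual)
        if selected:
            scores[selected] = -np.inf  # skip basis vectors already chosen
        selected.append(int(np.argmax(scores)))
        # Refit x on the span of the selected basis vectors.
        basis = dictionary[:, selected]
        coeffs, *_ = np.linalg.lstsq(basis, x, rcond=None)
        # Residual: the component of x not yet represented in that span.
        residual = x - basis @ coeffs
    return selected, coeffs, residual

# Toy data (assumed): an orthonormal 16 x 16 dictionary and a 2-sparse signal.
rng = np.random.default_rng(0)
Q, _ = np.linalg.qr(rng.standard_normal((16, 16)))
x = 2.0 * Q[:, 3] - 1.5 * Q[:, 7]
selected, coeffs, residual = omp(Q, x, k=2)
```

Each iteration costs one inner product per unselected basis vector, which is the computation that the approximation techniques discussed next aim to make cheaper.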
One prior approach to computation of an inner product between two vectors $u$ and $v$ uses a random projection technique. The Johnson-Lindenstrauss theorem is a basis for “Locality Sensitive Hashing” (LSH): for a given data vector $v$, a bit vector $h(v) \in \{0,1\}^p$ is computed such that
$$h_i(v) = \begin{cases} 1 & r_i^T v \ge 0 \\ 0 & \text{otherwise} \end{cases} \qquad i \in 1 \ldots p$$
Here, the $r_i$ are random projection vectors, and $p$ is the number of projections. Let $\lfloor x \rfloor$ denote an operator such that $\lfloor x \rfloor = 1$ if $x \ge 0$ and $\lfloor x \rfloor = 0$ otherwise. Let $P$ be a projection matrix of random vectors, $P = [r_1 \ldots r_p]^T$. The bit-vector construction can then be written as $h(v) = \lfloor P v \rfloor$.
As a consequence of the Johnson-Lindenstrauss theorem, the dot product between two data vectors, $u$ and $v$, can be approximated using the Hamming distance between their bit vectors, $\|h(u) - h(v)\|_1$:
$$u^T v \approx \|u\|_2 \, \|v\|_2 \, \cos\left( \frac{\pi \, \|h(u) - h(v)\|_1}{p} \right)$$
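The hashing and the cosine approximation above can be tried numerically with a short sketch, assuming NumPy and Gaussian random projection vectors. The function names `sign_hash` and `approx_inner_product`, the dimensions, and the test vectors are illustrative assumptions, not part of the described approach.

```python
import numpy as np

def sign_hash(P, v):
    """Bit vector h(v): 1 where the projection P v is >= 0, else 0."""
    return (P @ v >= 0).astype(np.uint8)

def approx_inner_product(u, v, P):
    """Estimate u^T v from the Hamming distance between h(u) and h(v)."""
    p = P.shape[0]
    hamming = np.count_nonzero(sign_hash(P, u) != sign_hash(P, v))
    return np.linalg.norm(u) * np.linalg.norm(v) * np.cos(np.pi * hamming / p)

rng = np.random.default_rng(0)
d, p = 64, 4096                       # feature dimension, number of projections
P = rng.standard_normal((p, d))       # rows are the random vectors r_i
u = rng.standard_normal(d)
v = u + 0.5 * rng.standard_normal(d)  # a vector correlated with u

exact = float(u @ v)
estimate = float(approx_inner_product(u, v, P))
```

The estimate still needs the two vector norms, and its accuracy improves as the number of projections `p` grows, since the Hamming distance divided by `p` is an empirical estimate of the angle between the vectors divided by π.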
Another prior approach provides a way of choosing P to be sparse, with non-zero entries that are ±1. An approach referred to as “Comparison Random Projection” (CRP) uses a construction of P as:
$$P_{ij}^{CRP} = \frac{q}{m} \begin{cases} +1 & \text{with probability } \frac{1}{2q} \\ 0 & \text{with probability } 1 - \frac{1}{q} \\ -1 & \text{with probability } \frac{1}{2q} \end{cases}$$
for example, where $q = 1$ or $3$. Because the projection $Pv$ does not require multiplications, the overall computation is reduced.
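A minimal sketch of this construction follows, assuming NumPy. The common scale factor in front of the entries is omitted here on the assumption that only the signs of the projections are used for hashing, and a positive scale does not change a sign. The function names `sparse_sign_matrix` and `project_without_multiplies` are hypothetical.

```python
import numpy as np

def sparse_sign_matrix(p, d, q, rng):
    """Entries are +1 or -1 with probability 1/(2q) each, and 0 otherwise."""
    values = np.array([1, 0, -1], dtype=np.int8)
    probs = [1 / (2 * q), 1 - 1 / q, 1 / (2 * q)]
    return rng.choice(values, size=(p, d), p=probs)

def project_without_multiplies(P, v):
    """Compute P v using only additions and subtractions of elements of v."""
    plus = np.where(P == 1, v, 0.0).sum(axis=1)    # add where the entry is +1
    minus = np.where(P == -1, v, 0.0).sum(axis=1)  # subtract where it is -1
    return plus - minus

rng = np.random.default_rng(0)
P = sparse_sign_matrix(256, 64, 3, rng)  # q = 3: about two thirds zeros
v = rng.standard_normal(64)
projection = project_without_multiplies(P, v)
bits = (projection >= 0).astype(np.uint8)  # bit vector h(v)
```

With `q = 1` the matrix is dense with entries ±1; with `q = 3` roughly two thirds of the entries are zero, so each projection touches only about a third of the elements of `v`.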
Another prior approach provides a way of choosing $P$ as the product of a sparse random projection (SRP) matrix $P^{SRP}$, with $s$ non-zero elements per row drawn from a normal distribution, a Hadamard matrix $H$, and a random ±1 diagonal matrix $D$:
$$P^{FJLT} = P^{SRP} H D$$
Note that applying this projection to a sparse feature vector $v$ can be computed as $P^{FJLT} v = P^{SRP} (H D v) = P^{SRP} \tilde{v}$, where forming $\tilde{v} = H D v$ has the effect of making $\tilde{v}$ non-sparse even if $v$ is sparse.
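The densifying effect of the Hadamard and diagonal factors can be seen in a small sketch, assuming NumPy, an unnormalized Hadamard matrix in Sylvester ordering, and a feature dimension that is a power of two. The helper names `fwht` and `sparse_gaussian_rows` and all sizes are illustrative assumptions.

```python
import numpy as np

def fwht(x):
    """Unnormalized fast Walsh-Hadamard transform, i.e. H x (Sylvester order).

    len(x) must be a power of two.
    """
    x = np.asarray(x, dtype=float).copy()
    n = x.shape[0]
    h = 1
    while h < n:
        # Butterfly step: combine blocks of length h pairwise.
        x = x.reshape(-1, 2, h)
        x = np.stack((x[:, 0, :] + x[:, 1, :], x[:, 0, :] - x[:, 1, :]), axis=1)
        h *= 2
    return x.reshape(n)

def sparse_gaussian_rows(p, d, s, rng):
    """Matrix with s normally distributed non-zero entries in each row."""
    P = np.zeros((p, d))
    for i in range(p):
        cols = rng.choice(d, size=s, replace=False)
        P[i, cols] = rng.standard_normal(s)
    return P

rng = np.random.default_rng(0)
d, p, s = 16, 8, 3
v = np.zeros(d)
v[5] = 3.0                                # sparse input: one non-zero element
D_signs = rng.choice([-1.0, 1.0], size=d) # diagonal of D
v_tilde = fwht(D_signs * v)               # H D v, dense even though v is sparse
P_srp = sparse_gaussian_rows(p, d, s, rng)
projected = P_srp @ v_tilde               # P^SRP (H D v)
```

Here a single non-zero input element spreads to every element of `v_tilde`, so the sparsity of the input cannot be exploited in the final multiplication by the sparse matrix.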
There is a need for computationally efficient approaches to determining inner products between feature vectors, and more specifically, there is a need for efficient and accurate basis selection for techniques such as Orthogonal Matching Pursuit.