This invention relates to image analysis using selective basis feature representations, and more particularly using histograms of coefficient values determined using Orthogonal Matching Pursuit processing of feature vectors.
A number of image and video analysis approaches involve computation of feature vector representations for an entire image or video, or for portions (e.g., spatial patches) of the image or video. One application of such features is classification based on collections of features, for example, scene classification using a collection of feature vectors determined from the image or video.
Some approaches to computation of feature vectors involve first computing a direct feature vector, for example, as a vector of pixel values or D-SIFT features, and then determining a representation of that direct feature vector in another basis using a projection approach. Projection approaches include basis selection approaches in which the basis vectors used to represent a particular feature vector are selected from a larger predetermined “dictionary” of basis vectors. One such approach is called “Orthogonal Matching Pursuit” (OMP), in which a series of sequential decisions is made to add basis vectors to the representation. These decisions involve computations of inner products between the as-yet unselected basis vectors from the dictionary and a residual vector formed from the component of the feature vector not yet represented in the span of the selected basis vectors from the dictionary.
Generally, the OMP approach can be summarized as follows. A dictionary $\Phi = [a_i;\ i = 1, \ldots, n]$ with $a_i \in \mathbb{R}^m$, such that $m \ll n$ and $\mathbb{R}^m = \mathrm{Span}(\Phi)$, is predetermined before processing the directly computed feature vectors, which have dimension $m$. Very generally, the OMP process involves an iteration selecting vectors $a_{k_1}, a_{k_2}, \ldots$ from the dictionary for representing a feature vector $v$ such that at the $p$th iteration, $k_p$ is chosen such that
$$k_p = \arg\max_k \left| a_k^T v_{p-1} \right|$$
where $v_p$ is the residual $(I - P_{S_p})v$, $P_{S_p}$ is the projection onto the span of $S_p = \{a_{k_1}, \ldots, a_{k_p}\}$, and $v_0 = v$. The coefficients of the selected dictionary entries are chosen to minimize $\|v - \Phi^T \alpha\|$, where $\alpha$ has non-zero entries at the selected elements $k_1, k_2, \ldots, k_p$.
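The iteration summarized above can be illustrated with a minimal NumPy sketch. It is an illustration under stated assumptions, not a definitive implementation: the function name `omp`, the parameter `num_atoms` (a fixed iteration count used as the stopping rule), and the storage of the dictionary as an n-by-m array with one basis vector per row are all hypothetical choices made here. The selection score is taken as the magnitude of the inner product, and the coefficients are refit by least squares after each selection so that the residual is orthogonal to the span of the selected vectors.

```python
import numpy as np

def omp(atoms, v, num_atoms):
    """Greedy Orthogonal Matching Pursuit sketch.

    atoms:     (n, m) array, one dictionary vector a_k per row (assumed layout).
    v:         (m,) feature vector to represent.
    num_atoms: number of iterations p to run (assumed stopping rule).
    Returns (alpha, selected); alpha has length n and is zero except at the
    selected indices.
    """
    atoms = np.asarray(atoms, dtype=float)
    v = np.asarray(v, dtype=float)
    n = atoms.shape[0]
    available = np.ones(n, dtype=bool)   # marks as-yet unselected vectors
    selected = []
    coeffs = np.zeros(0)
    residual = v.copy()                  # v_0 = v
    for _ in range(num_atoms):
        # Score every dictionary vector against the current residual.
        scores = np.abs(atoms @ residual)
        scores[~available] = -np.inf     # never reselect a vector
        k = int(np.argmax(scores))
        selected.append(k)
        available[k] = False
        # Refit all coefficients on the selected vectors by least squares,
        # then form the residual v_p = (I - P_{S_p}) v.
        basis = atoms[selected].T        # shape (m, p)
        coeffs, *_ = np.linalg.lstsq(basis, v, rcond=None)
        residual = v - basis @ coeffs
    alpha = np.zeros(n)
    alpha[selected] = coeffs
    return alpha, selected

# Small illustrative dictionary: n = 5 unit vectors in dimension m = 3.
s = np.sqrt(0.5)
atoms = np.array([[1, 0, 0], [0, 1, 0], [0, 0, 1], [s, s, 0], [0, s, s]])
v = np.array([2.0, 0.0, 0.5])
alpha, selected = omp(atoms, v, num_atoms=2)
```

With the row layout assumed here, the reconstruction is `atoms.T @ alpha`, which corresponds to the product of the transposed dictionary and the coefficient vector in the norm above. Other stopping rules, such as a threshold on the residual norm, are equally plausible and are not specified by the summary.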