Motion estimation is able to be implemented in a number of ways. One implementation utilizes phase correlation, a frequency-domain approach for estimating the relative translational offset between two similar images.
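By way of illustration only, the frequency-domain approach described above is able to be sketched as follows. The function names, the `eps` guard against division by zero, and the sign convention (the peak appears at the displacement of the second image relative to the first) are assumptions made for this sketch and are not taken from any cited reference. The array is stored as `s[y, x]`, so the returned peak is reordered to (xk, yk).

```python
import numpy as np

def phase_correlation_surface(f, g, eps=1e-12):
    """Return the phase correlation surface of two equally sized images.

    Assumed convention: if g is f circularly shifted by (dy, dx),
    the surface peaks at row dy, column dx (modulo the image size).
    """
    F = np.fft.fft2(f)
    G = np.fft.fft2(g)
    cross = np.conj(F) * G
    # Keep only the phase of the cross-power spectrum.
    S = cross / (np.abs(cross) + eps)
    return np.real(np.fft.ifft2(S))

def integer_peak(s):
    """Return (xk, yk), the integer location of the largest surface value."""
    yk, xk = np.unravel_index(np.argmax(s), s.shape)
    return int(xk), int(yk)
```

For a pure circular shift, the surface is close to a single impulse, so the integer peak alone recovers the whole-pel motion. The sub-pel methods discussed below refine that estimate.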
There are a number of conventional methods to determine sub-pel precision from the phase correlation surface. These sub-pel methods can generally be categorized as 1-D methods or 2-D methods. A 1-D method operates on each spatial dimension independently. In other words, the vertical and horizontal sub-pel components of motion are determined separately.
FIG. 1 illustrates the notation used by the different sub-pel methods. The entries aij are the values of s[x,y] (the phase correlation surface) in the neighborhood of the peak in the phase correlation surface. The peak value is a22, and aij=s[xk+j−2,yk+i−2], where (xk,yk) is the location of the peak. Note that due to properties of the FFT, indices into the phase correlation surface s[x,y] are evaluated modulo N. The 2-D sub-pel methods are able to use some or all of the correlation values in the figure. Some 2-D methods use even larger windows, which make use of correlation values that extend beyond the 3×3 neighborhood shown.
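The indexing rule aij=s[xk+j−2,yk+i−2] with modulo-N evaluation is able to be expressed as the following minimal sketch. The function name `peak_neighborhood` is hypothetical, and the surface is assumed to be stored as a NumPy array indexed `s[y, x]` (row first), so the row index corresponds to i and the column index to j.

```python
import numpy as np

def peak_neighborhood(s, xk, yk):
    """Return the 3x3 array of a_ij values around the peak (xk, yk).

    a[i-1, j-1] holds a_ij = s[xk + j - 2, yk + i - 2] in the text's
    notation; indices wrap around modulo the surface dimensions.
    """
    n_rows, n_cols = s.shape
    a = np.empty((3, 3), dtype=float)
    for i in range(1, 4):
        for j in range(1, 4):
            a[i - 1, j - 1] = s[(yk + i - 2) % n_rows, (xk + j - 2) % n_cols]
    return a
```

The wrap-around matters when the peak lies on the border of the surface, for example at (0,0), where the left and upper neighbors are taken from the opposite edge.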
The 1-D sub-pel methods consider horizontal and vertical sub-pel components independently, and use the correlation values shown in FIG. 2.
It is shown by H. Foroosh et al. in “Extension of Phase Correlation to Subpixel Registration” that the phase correlation surface in the presence of translational motion is very well approximated by a sinc function. Derivations in Foroosh et al. lead to a relatively simple 1-D sub-pel method that operates in each spatial direction independently. The method is applied to the neighborhood near the phase correlation peak.
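As an illustration of why a sinc model yields a simple closed form, consider an ideal 1-D surface sin(π(x−Δ))/(π(x−Δ)) sampled at the peak and at its larger neighbor. The ratio of the two samples depends only on Δ, which gives Δ = c1/(c1 + c0) for a right-hand neighbor c1 and peak c0. The sketch below implements that relation under stated assumptions: an ideal, noise-free sinc, a positive larger neighbor, and a hypothetical function name. It is derived from the sinc model only and is not presented as the exact procedure of Foroosh et al.

```python
import numpy as np

def sinc_two_point_offset(left, peak, right):
    """Estimate the sub-pel offset from three samples of an ideal sinc.

    Uses the peak and whichever neighbor is larger. A positive result
    means the true maximum lies toward the right-hand neighbor.
    """
    if right >= left:
        return right / (right + peak)
    return -left / (left + peak)
```

For samples taken from np.sinc(x − 0.3) at x = −1, 0, 1, the function returns 0.3 up to rounding. Real correlation surfaces are noisy, so the estimate is only approximate in practice.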
In “Television Motion Measurement for DATV and Other Applications” by G. A. Thomas, a 1-D quadratic function is fit to the three points in the neighborhood of the phase correlation peak (either the horizontal or vertical values shown in FIG. 2). In “Practical Approach to the Registration of Multiple Frames of Video Images” by I. E. Abdou, a 1-D Gaussian function is fit in a similar fashion. Results for the methods of Thomas and Abdou are marginal because, as shown in Foroosh et al., the phase correlation surface is neither quadratic nor Gaussian, so both methods are limited by inappropriate fitting functions. Also, in many cases the 1-D sub-pel methods do not provide as complete a fit in the peak neighborhood as is possible when using the 2-D sub-pel methods.
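The three-point fits described above have well-known closed forms, sketched below. The function names are hypothetical. The quadratic version returns the vertex of the parabola through the samples at −1, 0 and +1. The Gaussian version applies the same formula to the logarithms of the samples and therefore assumes all three values are positive. These are the standard textbook formulas and are not claimed to reproduce every detail of the cited methods.

```python
import numpy as np

def quadratic_offset(left, peak, right):
    """Vertex of the parabola through (-1, left), (0, peak), (1, right)."""
    return (right - left) / (2.0 * (2.0 * peak - left - right))

def gaussian_offset(left, peak, right):
    """Maximum of the Gaussian through the three samples.

    The logarithm of a Gaussian is a parabola, so the quadratic
    formula is applied to log values. Requires positive samples.
    """
    return quadratic_offset(np.log(left), np.log(peak), np.log(right))
```

Each function is applied once to the horizontal triple and once to the vertical triple of FIG. 2, giving the two sub-pel components separately.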
In “A Study of Sub-Pixel Motion Estimation Using Phase Correlation” by V. Argyriou et al., the following modified sinc function is considered:
$$h(x) = \exp(-x^2)\,\frac{\sin(\pi x)}{\pi x}.$$
The three parameters A, B, and C that best fit the function A·h(B[x−C]) to the observed phase correlation surface, in a least squares sense, are then determined. Determining such a fit is complicated, since there is no closed-form solution. It thus requires a numerical solution, which can be computationally demanding. Note that this method is also a 1-D sub-pel method, so it shares the limitation mentioned previously for 1-D sub-pel methods compared to 2-D sub-pel methods.
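To illustrate the computational burden of fitting A·h(B[x−C]) numerically, the following sketch uses a brute-force grid search over B and C, solving for A in closed form at each grid point (A enters the model linearly). The grid search is a stand-in chosen for simplicity and is not the optimizer of Argyriou et al.; the function names, the sample positions and the grid ranges are all assumptions.

```python
import numpy as np

def h(x):
    """Modified sinc: exp(-x^2) * sin(pi x) / (pi x), with h(0) = 1."""
    # np.sinc(x) is defined as sin(pi x) / (pi x).
    return np.exp(-x ** 2) * np.sinc(x)

def fit_modified_sinc(xs, vals, b_grid, c_grid):
    """Least-squares fit of A*h(B*(x - C)) by exhaustive search over B, C."""
    best_err, best = np.inf, (0.0, 0.0, 0.0)
    for b in b_grid:
        for c in c_grid:
            basis = h(b * (xs - c))
            denom = basis @ basis
            if denom == 0.0:
                continue
            a = (basis @ vals) / denom  # optimal A for this (B, C)
            err = np.sum((vals - a * basis) ** 2)
            if err < best_err:
                best_err, best = err, (a, b, c)
    return best
```

Even this small example evaluates the model at every grid point, which shows why such a fit costs far more than the closed-form three-point methods.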
In “High-Accuracy Subpixel Image Registration Based on Phase-Only Correlation” by Takita et al., it is proposed to fit a 2-D Gaussian function to the phase correlation surface. First, a frequency-domain Gaussian pre-filter (applied to S[m,n], the phase correlation surface in the Fourier domain) is used to smooth the phase correlation surface. Second, least squares is used to fit the 7×7 neighborhood of the correlation peak to a Gaussian function. The large window size combined with a least-squares optimization for the complicated Gaussian function can lead to an overly complex algorithm.
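A simplified illustration of fitting a 2-D Gaussian to a window around the peak is given below. It replaces the nonlinear least-squares fit with a linear least-squares fit to the logarithm of the window, which is exact only for a noise-free, axis-aligned Gaussian with strictly positive values. This log-linear shortcut, the function name, and the omission of the frequency-domain pre-filter are assumptions of the sketch and do not reproduce the method of Takita et al.

```python
import numpy as np

def fit_gaussian_2d(window):
    """Estimate the (dx, dy) peak offset of a square, odd-sized window.

    Fits log(window) = c0 + c1*x + c2*y + c3*x^2 + c4*y^2 by linear
    least squares, with (0, 0) at the window center.
    """
    half = window.shape[0] // 2
    ys, xs = np.mgrid[-half:half + 1, -half:half + 1]
    x = xs.ravel().astype(float)
    y = ys.ravel().astype(float)
    design = np.column_stack([np.ones_like(x), x, y, x ** 2, y ** 2])
    coef, *_ = np.linalg.lstsq(design, np.log(window.ravel()), rcond=None)
    # The quadratic in each axis peaks at -linear / (2 * quadratic).
    dx = -coef[1] / (2.0 * coef[3])
    dy = -coef[2] / (2.0 * coef[4])
    return dx, dy
```

With a 7×7 window the design matrix has 49 rows, compared with the three samples per axis used by the 1-D methods.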
Finally, in “Robust Motion Estimation for Video Sequences Based on Phase-Only Correlation” by L. H. Chien et al., it is proposed to fit the following 2-D function to the phase correlation surface near the correlation peak:
$$h(x,y) = \frac{\alpha}{N^2}\,\frac{\sin[\pi(x+\Delta x)]\,\sin[\pi(y+\Delta y)]}{\sin\left[\frac{\pi}{N}(x+\Delta x)\right]\,\sin\left[\frac{\pi}{N}(y+\Delta y)\right]}$$
An unspecified frequency-domain pre-filter is used to smooth the phase correlation surface. The size of the fitting window is also unspecified, although it appears from a figure that the window size may be 7×7. The complicated nature of the equation h(x,y) leads to a computationally demanding least-squares solution for Δx and Δy, and for α, which must be estimated as part of the solution.
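To make the form of h(x,y) concrete, the following sketch evaluates it for given Δx, Δy, α and N. The function names are hypothetical. The 0/0 case at x+Δx=0 (or y+Δy=0) is replaced by its limit N, and the helper is intended only for arguments of magnitude less than N. The sketch evaluates the model only; it does not perform the least-squares fit.

```python
import numpy as np

def sine_ratio(u, n):
    """sin(pi*u) / sin(pi*u/n), using the limit n where u is (near) 0.

    Intended for |u| < n, where u = 0 is the only denominator zero.
    """
    u = np.asarray(u, dtype=float)
    den = np.sin(np.pi * u / n)
    near_zero = np.abs(den) < 1e-12
    safe_den = np.where(near_zero, 1.0, den)
    return np.where(near_zero, float(n), np.sin(np.pi * u) / safe_den)

def chien_model(x, y, dx, dy, alpha, n):
    """Evaluate h(x, y) = alpha/N^2 times the product of the two sine ratios."""
    return alpha / n ** 2 * sine_ratio(x + dx, n) * sine_ratio(y + dy, n)
```

With Δx=Δy=0 the model equals α at the origin and is zero at every other integer position, and a non-zero (Δx, Δy) moves the maximum to (−Δx, −Δy) under the sign convention of the equation as written.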
The methods described above all perform sub-pixel estimation based on the neighborhood of the phase correlation peak, which uses the s[x,y] values introduced previously. Alternative configurations exist that work in the Fourier domain directly on the S[m,n] values. One such method is that of “Subspace Identification Extension to the Phase Correlation Method” by Hoge. Hoge performs a singular value decomposition of the N×N array S[m,n] to form a rank-1 approximation of the phase surface. The resulting length-N vectors from the rank-1 approximation are then processed separately to give the horizontal and vertical motion. This method avoids the need for the IFFT and peak finding but requires other complicated procedures in their place: singular value decomposition, phase unwrapping and least-squares line fitting.
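The three steps named above (singular value decomposition, phase unwrapping and least-squares line fitting) are able to be sketched as follows for an idealized, noise-free phase surface. The function name, the use of centered frequency indices, and the sign convention S[m,n] = exp(j2π(m·d0 + n·d1)/N) are assumptions of this sketch. Any masking or weighting of unreliable frequencies that a practical implementation may need is omitted.

```python
import numpy as np

def subspace_shift(S, freqs):
    """Estimate the shift along each array axis from a phase surface.

    S     : N x N complex array, assumed close to rank 1.
    freqs : length-N centered frequency indices for both axes.
    """
    n = len(freqs)
    U, _, Vh = np.linalg.svd(S)
    # Dominant singular vectors form the rank-1 approximation.
    u, v = U[:, 0], Vh[0, :]
    # The unwrapped phase of each vector is a line whose slope is 2*pi*d/N.
    slope0 = np.polyfit(freqs, np.unwrap(np.angle(u)), 1)[0]
    slope1 = np.polyfit(freqs, np.unwrap(np.angle(v)), 1)[0]
    return slope0 * n / (2 * np.pi), slope1 * n / (2 * np.pi)
```

The arbitrary constant phase that the decomposition assigns to each singular vector affects only the intercept of the fitted line, so the slope, and hence the shift estimate, is unaffected.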