Image registration is the process of determining the correspondence between pixel elements in a pair of images that have common subject matter. In particular, image registration involves determining the parameters of a transformation that relates the pixel elements in the pair of images. Image registration is therefore an important aspect of image matching, where two images are compared for common subject matter under the assumption that some geometrical transformation exists that relates substantial portions of the two images. Image registration is also used in satellite and medical imagery, where a series of partially overlapping images is obtained and the individual images have to be combined into a mosaic to form a single large image.
Image registration is also useful in camera calibration, where an image of a known object is captured and the location of that known object within the image is calculated to determine some unknown parameters of the imaging device. Yet another application of image registration is as part of a system for determining the relative video camera orientation and position between video frames in a video sequence.
A simple form of image registration may be used when the two images are related through translation only. In such a case, a matched filtering method may be used to find the translation relating the two images. Cross-correlation and phase-correlation are two examples of such matched filtering methods.
Cross-correlation may be performed in the spatial or frequency domain. Consider two images I1(x, y) and I2(x, y) that are functions of pixel coordinates x and y. The cross-correlation of the two images I1(x, y) and I2(x, y) in the spatial domain is defined by:
$$C(x', y') = \sum_{x} \sum_{y} I_1(x, y)\, I_2(x + x', y + y') \qquad (1)$$
If the two images I1(x, y) and I2(x, y) are related by a simple translation (Δx, Δy) whereby:
$$I_2(x, y) = I_1(x - \Delta x, y - \Delta y), \qquad (2)$$
then the cross-correlation C(x′, y′) has a maximum value at the translation coordinates (Δx,Δy) where:
$$C(\Delta x, \Delta y) = \sum_{x} \sum_{y} I_1(x, y)^2 \qquad (3)$$
Thus, by calculating the cross-correlation C(x′, y′) of the two images I1(x, y), and I2(x, y), the translation (Δx,Δy) that registers the two images I1(x, y), and I2(x, y) may be determined.
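By way of illustration only, the spatial-domain search of Equations (1) to (3) may be sketched in Python with NumPy as follows. The sketch assumes periodic (wrap-around) image boundaries and arrays indexed as [y, x]. The function name `cross_correlation_spatial`, the 8×8 random test image and the applied shift are hypothetical and chosen only for the example.

```python
import numpy as np


def cross_correlation_spatial(i1, i2):
    """Evaluate C(x', y') = sum_x sum_y I1(x, y) * I2(x + x', y + y').

    Assumption: indices wrap around at the image borders.
    """
    height, width = i1.shape
    c = np.empty((height, width), dtype=float)
    for dy in range(height):
        for dx in range(width):
            # Rolling by (-dy, -dx) yields I2(x + dx, y + dy) with wrap-around.
            shifted = np.roll(i2, shift=(-dy, -dx), axis=(0, 1))
            c[dy, dx] = np.sum(i1 * shifted)
    return c


rng = np.random.default_rng(0)
image1 = rng.random((8, 8))
# Hypothetical translation: delta_y = 2 rows, delta_x = 3 columns.
image2 = np.roll(image1, shift=(2, 3), axis=(0, 1))

correlation = cross_correlation_spatial(image1, image2)
peak_y, peak_x = np.unravel_index(np.argmax(correlation), correlation.shape)
print(peak_x, peak_y)  # location of the correlation maximum
```

In this sketch the maximum of the correlation lies at the applied shift, and its value equals the sum of the squared pixel values of the first image, as in Equation (3).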
Cross-correlation is generally performed using the Fast Fourier Transform (FFT). For an image In(x, y), the discrete Fourier transform ℑ(I) is defined by:
$$\mathfrak{I}[I_n](u, v) = \sum_{x=0}^{N_x} \sum_{y=0}^{N_y} I_n(x, y)\, e^{-2\pi i\, xu/N_x}\, e^{-2\pi i\, yv/N_y}, \qquad (4)$$
where Nx and Ny are the image dimensions in the x and y dimensions respectively. The inverse discrete Fourier transform ℑ−1(F) is defined by:
$$\mathfrak{I}^{-1}(F) = \frac{1}{N_x N_y} \sum_{u=0}^{N_x} \sum_{v=0}^{N_y} F[I](u, v)\, e^{2\pi i\, xu/N_x}\, e^{2\pi i\, yv/N_y}. \qquad (5)$$
The FFT is a computationally efficient method of calculating the Discrete Fourier Transform ℑ(I) and its inverse ℑ−1(F).
The cross-correlation C may be calculated through:
$$C = \mathfrak{I}^{-1}\left(\mathfrak{I}(I_1)\, \mathfrak{I}(I_2)^{*}\right), \qquad (6)$$
where ℑ(I2)* denotes the complex conjugation of the Discrete Fourier Transform ℑ(I2). Thus, taking the inverse FFT ℑ−1 ( ) of the product of the FFT of one image ℑ(I1) and the complex conjugate of the FFT of the other image ℑ(I2)* leads to a further image which contains the values of the cross-correlation C which is equivalent to that defined in Equation (1).
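As a minimal sketch only, the frequency-domain calculation may be written in Python with NumPy's `numpy.fft.fft2` and `numpy.fft.ifft2`. The function name `cross_correlation_fft` and the test data are hypothetical. In this sketch the conjugate is placed on the transform of the first image so that, under NumPy's transform sign convention and periodic boundaries, the result reproduces the spatial sum of Equation (1) lag for lag; conjugating the other factor instead yields the same values mirrored about the origin.

```python
import numpy as np


def cross_correlation_fft(i1, i2):
    """Circular cross-correlation of two real images via the FFT."""
    f1 = np.fft.fft2(i1)
    f2 = np.fft.fft2(i2)
    # Product of one spectrum with the complex conjugate of the other,
    # followed by the inverse transform; the imaginary part is round-off.
    return np.real(np.fft.ifft2(np.conj(f1) * f2))


rng = np.random.default_rng(1)
image1 = rng.random((16, 16))
# Hypothetical translation: delta_y = 2 rows, delta_x = 3 columns.
image2 = np.roll(image1, shift=(2, 3), axis=(0, 1))

correlation = cross_correlation_fft(image1, image2)
peak_y, peak_x = np.unravel_index(np.argmax(correlation), correlation.shape)

# Direct evaluation of Equation (1) at the peak lag, for comparison.
direct = np.sum(image1 * np.roll(image2, shift=(-2, -3), axis=(0, 1)))
print(peak_x, peak_y, np.isclose(correlation[2, 3], direct))
```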
Phase-correlation C′ is another matched filtering method that is often used and is defined by:
$$C' = \mathfrak{I}^{-1}\left( \frac{\mathfrak{I}(I_1)}{\left|\mathfrak{I}(I_1)\right|}\, \frac{\mathfrak{I}(I_2)^{*}}{\left|\mathfrak{I}(I_2)\right|} \right) \qquad (7)$$
That is, rather than using the discrete Fourier transforms ℑ(I) of the images I1(x, y) and I2(x, y) in full, only the complex phase part of each discrete Fourier transform is used. If the images I1(x, y) and I2(x, y) are related by a translational offset (Δx, Δy), the phase correlation C′ will have a very sharp peak at the translation coordinates (Δx, Δy) that relate the two images I1(x, y) and I2(x, y), and small values elsewhere in the phase correlation C′.
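The sharpness of the phase-correlation peak may be illustrated with the following Python sketch, which is an assumption-laden example rather than a definitive implementation. The function name `phase_correlation`, the guard value `eps` against division by zero, and the test data are hypothetical. The conjugate is placed on the transform of the first image so that, for periodic images, the peak appears at the applied shift.

```python
import numpy as np


def phase_correlation(i1, i2, eps=1e-12):
    """Correlate two real images using only the phase of their spectra."""
    f1 = np.fft.fft2(i1)
    f2 = np.fft.fft2(i2)
    cross = np.conj(f1) * f2
    # Dividing by the magnitude keeps only the complex phase;
    # eps (an assumption of this sketch) avoids division by zero.
    cross /= np.maximum(np.abs(cross), eps)
    return np.real(np.fft.ifft2(cross))


rng = np.random.default_rng(2)
image1 = rng.random((16, 16))
# Hypothetical translation: delta_y = 2 rows, delta_x = 3 columns.
image2 = np.roll(image1, shift=(2, 3), axis=(0, 1))

surface = phase_correlation(image1, image2)
peak_y, peak_x = np.unravel_index(np.argmax(surface), surface.shape)

# Largest magnitude found anywhere other than the peak.
off_peak = surface.copy()
off_peak[peak_y, peak_x] = 0.0
print(peak_x, peak_y, surface[peak_y, peak_x], np.max(np.abs(off_peak)))
```

For a pure circular shift such as the one assumed here, the surface approaches a single unit impulse at the shift, with values near zero elsewhere.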
Matched filtering as described above is used when the two images I1(x, y) and I2(x, y) are related through translation only. Consider instead the case where the two images I1(x, y) and I2(x, y) are related by a rotation and a scale transformation, such that image I2(x, y) is a rotated and scaled version of image I1(x, y), i.e.
$$I_2(x, y) = I_1\left(s(x\cos\theta + y\sin\theta),\; s(-x\sin\theta + y\cos\theta)\right), \qquad (8)$$
wherein s is a scale factor and θ is a rotation angle. In that case the unknown rotation θ and scale s parameters may be determined by transforming the images I1(x, y) and I2(x, y) into a log-polar coordinate space through:
$$\rho = \frac{1}{2}\log\left(x^2 + y^2\right), \qquad \phi = \tan^{-1}\frac{y}{x} \qquad (9)$$
The transformation above leads to a relationship between the images I1(x, y) and I2(x, y) in the log-polar space as follows:
$$I_2(\rho, \phi) = I_1(\rho + \log s, \phi + \theta). \qquad (10)$$
The matched filtering methods described above may then be used to determine the scale and rotation parameters s and θ from the peak in the correlation C at coordinate (log s, θ).
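The way in which a rotation and scale become offsets in log-polar space may be checked numerically for a single coordinate, as in the following Python sketch. The function names `to_log_polar` and `rotate_scale`, the sample point and the parameter values are hypothetical. The sketch uses `numpy.arctan2` in place of a plain arctangent of y/x so that the quadrant is resolved, and it reports only the magnitude of the angular offset, because the direction of that offset depends on the sign convention chosen for the rotation.

```python
import numpy as np


def to_log_polar(x, y):
    """Map Cartesian coordinates to (rho, phi) as in Equation (9)."""
    rho = 0.5 * np.log(x ** 2 + y ** 2)
    phi = np.arctan2(y, x)  # quadrant-aware arctangent (assumption)
    return rho, phi


def rotate_scale(x, y, s, theta):
    """Apply the coordinate mapping written in Equation (8)."""
    xr = s * (x * np.cos(theta) + y * np.sin(theta))
    yr = s * (-x * np.sin(theta) + y * np.cos(theta))
    return xr, yr


# Hypothetical sample point and transformation parameters.
x, y = 3.0, 4.0
s, theta = 1.5, 0.3

rho0, phi0 = to_log_polar(x, y)
rho1, phi1 = to_log_polar(*rotate_scale(x, y, s, theta))

d_rho = rho1 - rho0        # offset along the log-radius axis
d_phi = abs(phi1 - phi0)   # magnitude of the offset along the angle axis
print(d_rho, np.log(s), d_phi, theta)
```

Because the offsets are the same for every coordinate, the rotated and scaled image appears as a translated image in log-polar space, which is what allows a matched filter to recover s and θ.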
Image registration is also often applied to a pair of images I1(x, y) and I2(x, y) where the correspondence between pixel elements is not a simple transformation, such as a translation, or rotation and scale. It may be necessary to register two images I1(x, y) and I2(x, y) that are captured using different imaging devices, or by the same imaging device but where each image is captured using a different configuration of the imaging device. In such cases the transformation includes translation, rotation and scaling.
Consider two images I1(x, y) and I2(x, y) that are related by a translation as well as a rotation and a scale, such that:
$$I_2(x, y) = I_1\left(s(x\cos\theta + y\sin\theta) + \Delta x,\; s(-x\sin\theta + y\cos\theta) + \Delta y\right) \qquad (11)$$
Current methods of registering images related by translation, rotation and scale suffer from poor signal-to-noise ratios over a large class of images when the transformation parameters are such that the overlap between the two images is significantly reduced. Furthermore, current methods contain a 180-degree ambiguity, which leads to computational inefficiencies. This ambiguity results from the fact that the Fourier magnitude of a real function is symmetric.
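The symmetry underlying the 180-degree ambiguity may be observed with the following Python sketch, offered only as an illustration. The variable names and the random test image are hypothetical. The sketch checks that the Fourier magnitude of a real image is unchanged when the frequency coordinates (u, v) are replaced by (−u, −v), and that an image and its 180-degree rotation therefore share the same Fourier magnitude.

```python
import numpy as np

rng = np.random.default_rng(3)
image = rng.random((8, 8))  # hypothetical real-valued image

magnitude = np.abs(np.fft.fft2(image))

# Magnitude sampled at (-u, -v): reverse both axes, then roll by one so
# that index 0 maps to itself and index k maps to N - k.
mirrored = np.roll(magnitude[::-1, ::-1], shift=(1, 1), axis=(0, 1))
is_symmetric = np.allclose(magnitude, mirrored)

# The same image rotated by 180 degrees has an identical Fourier magnitude.
rotated = image[::-1, ::-1]
rotated_magnitude = np.abs(np.fft.fft2(rotated))
same_magnitude = np.allclose(magnitude, rotated_magnitude)

print(is_symmetric, same_magnitude)
```

A method that relies on the Fourier magnitude alone therefore cannot distinguish a rotation of θ from a rotation of θ plus 180 degrees without a further test.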