The present invention relates to the testing of video signals, and more particularly to a joint spatial-temporal alignment of video sequences that aligns a test video signal with a reference video signal for testing video fidelity.
In video processing and/or transmission systems, temporal latency and spatial shift may be introduced into the video signal. An example is a video transmission system using MPEG-2 compression for bandwidth reduction and digital bit stream transport. In order to test the video signal fidelity of a video processing system, it is necessary to accurately align the signal under test with the original reference signal both spatially and temporally. The alignment task includes spatial and temporal shift detection and correction. A video sequence has sampled images equally spaced in time. For interlaced video, each sampled image is called a video field.
There are many alignment methods for two-dimensional images, such as the well-known phase correlation method for spatial shift detection, as discussed in "The Phase Correlation Image Alignment Method" by C. D. Kuglin and D. C. Hines, Jr. in the Proceedings of the IEEE 1975 International Conference on Cybernetics and Society, September 1975, pp. 163-165. The phase correlation method is based upon the fact that most of the information about the alignment between two images is contained in the phase of their cross-power spectrum. The discrete phase correlation function is obtained by first computing the discrete two-dimensional Fourier transforms, F1 and F2, of two sampled images, then calculating the cross-power spectrum and extracting its phase for each frequency bin. The phase array is calculated by multiplying the Fourier transform F1 by the complex conjugate of F2, and dividing by the magnitude of the product. By performing an inverse Fourier transform of the phase array, a phase correlation surface is obtained.
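The computation described above may be sketched as follows. This is a minimal illustration using NumPy, not an implementation taken from the Kuglin et al. article; the function name `phase_correlation_surface` and the small guard value `eps`, which avoids division by zero in empty frequency bins, are assumptions introduced here.

```python
import numpy as np

def phase_correlation_surface(img1, img2, eps=1e-12):
    """Return the phase correlation surface of two equally sized images."""
    # Discrete two-dimensional Fourier transforms of the two sampled images.
    F1 = np.fft.fft2(img1)
    F2 = np.fft.fft2(img2)
    # Cross-power spectrum: F1 multiplied by the complex conjugate of F2.
    cross = F1 * np.conj(F2)
    # Keep only the phase by dividing each bin by its magnitude.
    # eps is an assumed guard against bins whose magnitude is zero.
    phase = cross / np.maximum(np.abs(cross), eps)
    # The inverse transform of the phase array is the correlation surface.
    return np.real(np.fft.ifft2(phase))
```

With this sign convention, if `img1` is a circularly shifted copy of `img2` (for example `np.roll(img2, (5, -3), axis=(0, 1))`), the surface has its peak at the index corresponding to that shift, modulo the image size.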
The location of the peak of the phase correlation surface provides information about the amount of spatial shift between the two images, while the height of the peak corresponds to the similarity of the two images. As an example of an ideal case, when an image is shifted by a vector translation S, equation (2) from the Kuglin et al. article yields a unit-height delta function centered at location S of the correlation surface. For fractional pixel shifts, some form of curve fitting around the peak may be used to refine the peak location to fractional pixel precision.
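As an illustration of reading the shift from a correlation surface, the sketch below locates the peak, converts its index to a signed shift, and refines it to fractional pixel precision. The text only calls for "some form of curve fitting"; the three-point parabolic fit used here is one common choice and is an assumption, as are the names `locate_peak` and `_parabolic_offset`.

```python
import numpy as np

def _parabolic_offset(left, centre, right):
    """Vertex offset, in samples, of a parabola through three neighbours."""
    denom = left - 2.0 * centre + right
    if denom == 0.0:
        return 0.0  # flat neighbourhood: no refinement possible
    return 0.5 * (left - right) / denom

def locate_peak(surface):
    """Return ((row_shift, col_shift), peak_height) for a correlation surface."""
    rows, cols = surface.shape
    r, c = np.unravel_index(np.argmax(surface), surface.shape)
    height = float(surface[r, c])
    # Fit a parabola along each axis; neighbours wrap around because the
    # surface comes from a circular (FFT-based) correlation.
    dr = _parabolic_offset(surface[(r - 1) % rows, c], height,
                           surface[(r + 1) % rows, c])
    dc = _parabolic_offset(surface[r, (c - 1) % cols], height,
                           surface[r, (c + 1) % cols])
    # Indices past the midpoint represent negative shifts.
    sr = r - rows if r > rows // 2 else r
    sc = c - cols if c > cols // 2 else c
    return (float(sr + dr), float(sc + dc)), height
```

For the ideal case of a unit-height delta function, the neighbours of the peak are zero, so the fit leaves the integer location unchanged and the returned height is 1.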
The phase correlation alignment method provides a basic tool for solving the problem of spatial-temporal alignment of video sequences. To detect the spatial-temporal shift of the video sequences relative to each other, the ambiguities between spatial shift and temporal shift that exist in certain video signals need to be considered. For example, temporal shift cannot be determined with certainty for static scenes. For video captured by perfect camera panning, an observed shift may be attributed to either a temporal shift or a spatial shift. The ambiguity may be resolved only when significant spatial-temporal changes occur in the video sequences.
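The two ambiguities can be reproduced with a small synthetic experiment. The sketch below is illustrative only: the random test image, the four-field sequences, the pan rate of 2 pixels per field, and the helper name `correlation_peak` are all assumptions made for the demonstration.

```python
import numpy as np

def correlation_peak(img1, img2, eps=1e-12):
    """Return (peak_index, peak_height) of the phase correlation surface."""
    cross = np.fft.fft2(img1) * np.conj(np.fft.fft2(img2))
    surface = np.real(np.fft.ifft2(cross / np.maximum(np.abs(cross), eps)))
    idx = np.unravel_index(np.argmax(surface), surface.shape)
    return (int(idx[0]), int(idx[1])), float(surface[idx])

rng = np.random.default_rng(0)
base = rng.random((32, 32))

# Static scene: every reference field is identical, so one test field
# matches all of them equally well and the temporal shift is undetermined.
static_ref = [base.copy() for _ in range(4)]
static_results = [correlation_peak(base, field) for field in static_ref]

# Perfect panning: each field is the previous one moved 2 pixels sideways.
# A test field matches every reference field with full peak height; only
# the peak location changes, so time offset and spatial offset trade off.
pan_ref = [np.roll(base, 2 * t, axis=1) for t in range(4)]
test_field = pan_ref[2]
pan_results = [correlation_peak(test_field, field) for field in pan_ref]
```

In both cases every candidate temporal offset produces a peak of essentially unit height, so peak height alone cannot select the correct field pairing; in the panning case the candidates differ only in the spatial shift they imply.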
What is desired is a joint spatial-temporal alignment of video sequences that resolves in a robust way the inherent ambiguities possible in the video sequences.