A super-resolution algorithm enhances the spatial resolution (or detail) of the images created by an imaging system. Typically, such an algorithm works by fusing together several low-resolution (LR) images to reconstruct a single high-resolution (HR) or super-resolution (SR) image or sequence of images. The resolution enhancement factor is the basic parameter of a super-resolution algorithm and denotes the ratio of the number of pixels in the HR image to the number of pixels in the LR image. For example, pixel grids for an LR image and an HR image with a resolution enhancement factor of 4 (i.e., a factor of 2 along each axis) are illustrated in FIG. 1, where fi is the LR image, Fi is the HR image, and the stars in Fi represent the new pixels that result from the enhancement and lie on the half-pixel grid of fi.
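The relationship between the enhancement factor and the half-pixel grid can be checked with a short sketch. The function names and the 3x4 LR size are assumptions chosen for illustration only; they do not come from FIG. 1.

```python
import numpy as np

def enhancement_factor(lr_shape, hr_shape):
    """Ratio of the HR pixel count to the LR pixel count."""
    return (hr_shape[0] * hr_shape[1]) / (lr_shape[0] * lr_shape[1])

def hr_grid_in_lr_units(lr_shape, scale_per_axis=2):
    """Coordinates of the HR pixels, expressed in LR pixel units."""
    rows, cols = lr_shape
    ys = np.arange(rows * scale_per_axis) / scale_per_axis
    xs = np.arange(cols * scale_per_axis) / scale_per_axis
    return ys, xs

# Hypothetical 3x4 LR image, doubled along each axis.
ys, xs = hr_grid_in_lr_units((3, 4))
print(enhancement_factor((3, 4), (len(ys), len(xs))))
print(xs)  # whole numbers coincide with LR pixels; the .5 values are new pixels
```

Doubling the sampling density along each axis quadruples the pixel count, which is why a factor of 4 corresponds to a half-pixel grid.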
The basic premise for enhancing spatial resolution in an SR algorithm is the availability of multiple LR images covering the same scene. Two main assumptions used in any super-resolution process are: (1) the LR images are generated from an underlying continuous image with aliasing (i.e., they carry information about spatial frequencies higher than the Nyquist frequency of the LR image); and (2) there is a relative shift between the LR images. When the motion between the LR images is complex and not aligned with the full-pixel grid (which is true for virtually all captured videos) and when aliasing is present, each LR image carries some new information that can be used to generate an HR image. Suitable LR images can be obtained from a single camera (assuming there is relative motion between the frames of the video sequence) or from multiple cameras located in different positions. The relative motion between LR frames can occur as a result of controlled motion in the imaging system, e.g., images acquired from orbiting satellites. This motion can also be the result of uncontrolled motion in the scene, e.g., movement of local objects or vibrating imaging systems. If the scene motion is known or can be estimated with subpixel accuracy, SR image reconstruction is possible.
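A minimal sketch of these two assumptions follows. It is heavily idealized: the "continuous" scene is stood in for by a small HR array, there is no blur or noise, and the shifts are exact multiples of half an LR pixel. The function name `simulate_lr_frames` is hypothetical.

```python
import numpy as np

def simulate_lr_frames(hr, scale=2):
    """Sample one HR image at every integer offset (dy, dx) below `scale`.

    An offset of (dy, dx) HR pixels is a shift of (dy/scale, dx/scale)
    LR pixels, i.e. a subpixel shift on the LR grid.
    """
    return {(dy, dx): hr[dy::scale, dx::scale]
            for dy in range(scale) for dx in range(scale)}

hr = np.arange(64, dtype=float).reshape(8, 8)  # stand-in for the scene
frames = simulate_lr_frames(hr)

# Frames with different subpixel shifts contain different samples,
# so each one contributes information the others lack.
print(np.array_equal(frames[(0, 0)], frames[(0, 1)]))
```

Because no low-pass filter is applied before the decimation, each frame is aliased, and it is exactly this aliasing together with the differing shifts that makes the frames complementary.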
In the non-uniform interpolation approach to super-resolution, SR image reconstruction usually consists of three steps: (a) registration, or the estimation of relative motion (if the motion information is not known); (b) non-uniform interpolation of color intensities, producing an improved-resolution image; and (c) restoration, which often involves a deblurring process that depends on the observation model. These steps can be implemented separately or simultaneously according to the reconstruction methods adopted. In the registration step, the relative motion between LR images is estimated with fractional pixel accuracy. Accurate subpixel motion estimation is an important factor in the success of the SR image reconstruction algorithm. Since the motion between LR images is arbitrary, the registered data from the LR images (i.e., the data warped onto the reference coordinate system of an HR frame) will not always match up to a uniformly spaced HR grid. Consequently, non-uniform interpolation of color intensities is used to obtain a uniformly spaced HR image from a non-uniformly spaced composite of LR images. Finally, image restoration is applied to the up-sampled image to remove blurring and noise. With regard to all of the foregoing, see Sung Cheol Park, Min Kyu Park, and Moon Gi Kang, Super-Resolution Image Reconstruction: A Technical Overview, IEEE Signal Processing Magazine (May 2003), pp. 21-36.
As just noted, the first step in a typical super-resolution algorithm is registration. Image registration is the process of estimating a mapping between two or more images of the same scene taken at different times, from different viewpoints, and/or by different sensors. It geometrically aligns two images—the reference image and the so-called “matching” image.
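As one illustration of registration, the sketch below estimates a pure translation between a reference image and a matching image by phase correlation. Phase correlation is a technique swapped in here for illustration and is not named in the text above; the sketch also assumes circular (wrap-around) shifts and returns whole-pixel estimates only, so a practical SR system would still need a subpixel refinement step.

```python
import numpy as np

def phase_correlation_shift(reference, matching):
    """Estimate the integer (dy, dx) shift that maps `reference` onto `matching`."""
    f_ref = np.fft.fft2(reference)
    f_mat = np.fft.fft2(matching)
    # Normalized cross-power spectrum: keeps only the phase difference.
    cross = f_mat * np.conj(f_ref)
    cross /= np.maximum(np.abs(cross), 1e-12)
    # Its inverse transform peaks at the displacement.
    corr = np.fft.ifft2(cross).real
    peak = np.unravel_index(np.argmax(corr), corr.shape)
    # Peaks past the midpoint correspond to negative shifts.
    return tuple(int(p) if p <= n // 2 else int(p) - n
                 for p, n in zip(peak, corr.shape))

rng = np.random.default_rng(0)
reference = rng.random((32, 32))
matching = np.roll(reference, shift=(3, -2), axis=(0, 1))
print(phase_correlation_shift(reference, matching))
```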
Generally, there are two categories of image differences that need to be registered. Differences in the first category are due to changes in camera position and pose. These sorts of changes cause the images to be spatially misaligned, i.e., the images differ from each other by translation, rotation, scale, and other geometric transformations. This category of difference is sometimes referred to as global transformation or global camera motion (GCM).
The second category of differences cannot be modeled by a parametric spatial transform alone. This category of differences can be attributed to factors such as object movements, scene changes, lighting changes, using different types of sensors, or using similar sensors but with different sensor parameters. This second category of differences is sometimes referred to as independent object motion or local object motion (LOM). Such differences might not be fully removed by registration because LOM rarely conforms to an exact parametric geometric transform. In addition, the innovation that occurs in video frames in the form of occlusions and newly exposed areas cannot be described using any predictive model. In general, the more LOM- or innovation-type differences exist, the more difficult it is to achieve accurate registration. See Zhong Zhang and Rick S. Blum, A Hybrid Image Registration Technique for a Digital Camera Image Fusion Application, Information Fusion 2 (2001), pp. 135-149.
Parametric coordinate transformation algorithms for registration assume that objects remain stationary while the camera or the camera lens moves; this includes transformations such as pan, rotation, tilt, and zoom. If a video sequence contains a global transformation between frames, the estimated motion field can be highly accurate due to the large ratio of observed image pixels to unknown motion model parameters. A parametric model that is sometimes used to estimate the global transformation that occurs in the real world is the eight-parameter projective model, which can precisely describe camera motion in terms of translation, rotation, zoom, and tilt. To estimate independent object motion, Horn-Schunck optical flow estimation is commonly used, though it often requires a large number of iterations to converge. See Richard Schultz, Li Meng, and Robert L. Stevenson, Subpixel Motion Estimation for Multiframe Resolution Enhancement, Proceedings of the SPIE (International Society for Optical Engineering), Vol. 3024 (1997), pp. 1317-1328, as to the foregoing and the details of the eight-parameter projective model.
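For illustration, an eight-parameter projective mapping of pixel coordinates can be sketched as follows. The parameter names a1 through a8 and their ordering are assumptions made here (a common textbook parameterization); the cited paper may use a different convention.

```python
import numpy as np

def projective_warp(points, params):
    """Map (x, y) points through an eight-parameter projective model.

    Assumed parameterization:
        x' = (a1*x + a2*y + a3) / (a7*x + a8*y + 1)
        y' = (a4*x + a5*y + a6) / (a7*x + a8*y + 1)
    """
    a1, a2, a3, a4, a5, a6, a7, a8 = params
    h = np.array([[a1, a2, a3],
                  [a4, a5, a6],
                  [a7, a8, 1.0]])
    pts = np.asarray(points, dtype=float)
    homogeneous = np.hstack([pts, np.ones((len(pts), 1))])
    mapped = homogeneous @ h.T
    # Divide by the third homogeneous coordinate to return to the image plane.
    return mapped[:, :2] / mapped[:, 2:3]

# A pure translation by (2.5, -1.0) is the special case a3 = 2.5, a6 = -1.0.
print(projective_warp([(0.0, 0.0), (4.0, 3.0)], (1, 0, 2.5, 0, 1, -1.0, 0, 0)))
```

With a7 = a8 = 0 the model reduces to an affine transform; nonzero values introduce the perspective foreshortening associated with camera tilt.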
Once the relative motion has been estimated in the registration phase, one obtains an HR image on non-uniformly spaced sampling points by the process sometimes referred to as “shift-and-add”. The analysis of an irregularly spaced data series is more complicated than that of a regularly spaced data series. More importantly, practically all modern systems for image storage and display use a regular grid for image representation. Consequently, it is necessary to re-sample a given irregularly sampled data series onto a regular grid. This re-sampling typically requires some form of interpolation or, in the presence of noise, reconstruction (effectively assuming certain properties of an “underlying” continuous function) of color intensities. Overall, the goal of interpolation/estimation is to provide the highest possible image fidelity at the output resolution. See H.-M. Adorf, Interpolation of Irregularly Sampled Data Series - A Survey, in Astronomical Data Analysis Software and Systems IV, ASP Conference Series, Vol. 77, 1995. This non-uniform interpolation is sometimes referred to as irregular-to-regular interpolation, and it is different from the regular-to-irregular interpolation described below. As noted earlier, non-uniform interpolation is usually the second step of a classical super-resolution algorithm.
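A minimal sketch of shift-and-add followed by a crude irregular-to-regular re-sampling is given below. It assumes purely translational motion that is already known, expressed in LR pixel units, and it simply rounds each registered sample to the nearest HR grid node and averages samples that collide. This rounding is a simplification chosen for brevity; a practical system would use a proper non-uniform interpolation or reconstruction method. The function name is hypothetical.

```python
import numpy as np

def shift_and_add(lr_frames, shifts, scale=2):
    """Fuse registered LR frames onto a regular HR grid.

    `shifts` holds one (dy, dx) offset per frame, in LR pixel units.
    HR nodes that receive no sample are returned as NaN.
    """
    rows, cols = lr_frames[0].shape
    acc = np.zeros((rows * scale, cols * scale))
    cnt = np.zeros_like(acc)
    yy, xx = np.mgrid[0:rows, 0:cols]
    for frame, (dy, dx) in zip(lr_frames, shifts):
        # Position of every LR sample in HR coordinates, snapped to the grid.
        hy = np.rint((yy + dy) * scale).astype(int)
        hx = np.rint((xx + dx) * scale).astype(int)
        ok = (hy >= 0) & (hy < acc.shape[0]) & (hx >= 0) & (hx < acc.shape[1])
        # np.add.at accumulates correctly even when indices repeat.
        np.add.at(acc, (hy[ok], hx[ok]), frame[ok])
        np.add.at(cnt, (hy[ok], hx[ok]), 1)
    out = np.full(acc.shape, np.nan)
    filled = cnt > 0
    out[filled] = acc[filled] / cnt[filled]
    return out
```

With four frames shifted by (0, 0), (0, 0.5), (0.5, 0), and (0.5, 0.5) LR pixels and a scale of 2, every HR node receives exactly one sample; with arbitrary shifts some nodes stay empty (NaN) and would have to be filled by the interpolation step.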
A problem related to SR techniques is image restoration. The goal of image restoration is to recover the original image from a degraded (e.g., blurred, noisy) image. Image restoration and SR reconstruction are closely related theoretically, and SR reconstruction can be considered a second-generation problem of image restoration. As noted earlier, image restoration is usually the third step of a non-uniform interpolation algorithm for super-resolution, though it might also be performed as a standalone process to remove blocking and/or quantization artifacts, for example.
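As one example of a restoration step, the sketch below applies frequency-domain Wiener deconvolution to a blurred image. Wiener filtering is a technique chosen here for illustration and is not prescribed by the text above. The sketch assumes that the blur kernel (`psf`) is known, that the blur is a circular convolution, and that the noise-to-signal ratio can be approximated by a single constant; all names are hypothetical.

```python
import numpy as np

def wiener_deconvolve(blurred, psf, noise_to_signal=1e-3):
    """Restore `blurred`, assuming it was circularly convolved with `psf`."""
    # Zero-pad the kernel to the image size and move to the frequency domain.
    h = np.fft.fft2(psf, s=blurred.shape)
    g = np.fft.fft2(blurred)
    # Wiener filter: inverse filter damped where the blur response is weak.
    w = np.conj(h) / (np.abs(h) ** 2 + noise_to_signal)
    return np.fft.ifft2(w * g).real

rng = np.random.default_rng(0)
image = rng.random((32, 32))
psf = np.ones((3, 3)) / 9.0  # hypothetical 3x3 box blur
blurred = np.fft.ifft2(np.fft.fft2(image) * np.fft.fft2(psf, s=image.shape)).real
restored = wiener_deconvolve(blurred, psf, noise_to_signal=1e-6)
print(np.abs(restored - image).mean() < np.abs(blurred - image).mean())
```

The constant `noise_to_signal` trades sharpness against noise amplification: a larger value suppresses frequencies that the blur has nearly eliminated instead of boosting them.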
Super-resolution and image restoration are useful in many applications, including, for example, the enhancement of the LR video that is produced by digital cameras in mobile telephones. The enhanced videos might then be displayed on a computer or distributed on an Internet video site such as MySpace or YouTube.