Static images obtained with digital still cameras and, more recently, with cellular telephones equipped with digital cameras are typically far superior in quality to static images derived from video frames captured with digital video recorders. Video frames typically have lower resolution than still frames, which results in the relatively lower image quality.
Various signal processing techniques have been applied to improve the quality of static images obtained from video frames. One such technique for enhancing resolution in the static images is single frame-based interpolation. However, conventional interpolation techniques typically result in visual degradation because single frame-based interpolation adds no information beyond what is already contained in the single frame.
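By way of illustration only, the limitation of single frame-based interpolation may be sketched in one dimension as follows. The function name `upsample_linear` and the sample values are hypothetical and are not taken from any particular conventional technique. The sketch assumes simple linear interpolation by a factor of two.

```python
def upsample_linear(samples, factor=2):
    """Upsample a 1-D signal by linear interpolation between neighbouring samples."""
    out = []
    for i in range(len(samples) - 1):
        a, b = samples[i], samples[i + 1]
        for k in range(factor):
            t = k / factor
            # k == 0 reproduces the original sample; k > 0 blends the two neighbours
            out.append((1 - t) * a + t * b)
    out.append(samples[-1])
    return out


# Hypothetical low resolution scan line taken from a single frame
lr_frame = [10.0, 20.0, 10.0, 40.0]
hr_estimate = upsample_linear(lr_frame)

# Every second output sample is an untouched input sample; the samples in
# between are weighted averages of their neighbours and carry no new detail.
recovered = hr_estimate[::2]
```

In this sketch, `recovered` equals `lr_frame`, which indicates that the denser grid contains only the original samples and values computed from them.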
Another conventional technique for enhancing resolution is the super-resolution (SR) technique. Under this technique, information from multiple successive frames of the same scene is combined to improve spatial resolution. If subpixel displacements have occurred among the multiple frames, each frame contributes additional information. As such, the subsampled low-resolution (LR) frames are combinable to synthesize an image with relatively higher resolution (HR). Generally speaking, SR techniques include two main steps: (1) registering the LR images with subpixel accuracy and mapping them to the HR grid, and (2) synthesizing the HR target image.
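The two steps may be sketched, under strongly simplifying assumptions, with a one-dimensional example. Here the registration is assumed to be known exactly, each LR frame is assumed to be a pure decimation of the scene with no blur or noise, and the offsets are expressed in HR samples, so that an offset of one HR sample corresponds to a half-pixel displacement on the LR grid. The names `capture_lr` and `fuse_registered` are hypothetical.

```python
def capture_lr(scene, offset, factor=2):
    """Simulate an LR frame: keep every `factor`-th HR sample, starting at `offset`."""
    return scene[offset::factor]


def fuse_registered(frames, offsets, factor=2):
    """Map each LR sample onto the HR grid at the position given by its known offset."""
    # Assumes the frames together cover every position of the HR grid exactly once.
    hr = [None] * sum(len(frame) for frame in frames)
    for frame, offset in zip(frames, offsets):
        for i, value in enumerate(frame):
            hr[i * factor + offset] = value
    return hr


# Hypothetical HR scene and two LR frames displaced by half an LR pixel
scene = [3, 7, 1, 9, 4, 6, 8, 2]
frame_a = capture_lr(scene, 0)
frame_b = capture_lr(scene, 1)

hr = fuse_registered([frame_a, frame_b], [0, 1])
```

Under these idealized assumptions the fused result reproduces `scene`. In practice the displacements must be estimated, are rarely exact fractions of a pixel, and the LR frames are blurred and noisy, which is why the synthesis step generally requires interpolation or iterative reconstruction.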
Conventional SR techniques range from direct non-uniform interpolation to Iterative Back Projection (IBP), Projection Onto Convex Sets (POCS), and Maximum A Posteriori (MAP) estimation, as well as other approaches. Each of these techniques has its own assumptions and hence is restricted to particular kinds of imaging environments. For example, MAP techniques achieve better results where suitable prior knowledge is available, such as in face image SR, while iterative techniques are suited to image areas with small registration error, because otherwise the error may accumulate during iteration.
Most conventional SR techniques assume that registration is known or can be calculated accurately, and the accuracy of the registration is of great importance in successfully performing SR tasks. However, it is generally known that accurate subpixel registration is not always possible due to the ill-posedness of the registration problem, the aperture problem, and the presence of covered and uncovered regions in images.
As such, given several auxiliary frames and a target image, some regions in the target image are typically registered well, while other regions, such as those with relatively complex motion or occlusion, are registered poorly. Well registered auxiliary information generally improves resolution in the target image, while poorly registered auxiliary information is known to degrade the quality of the target image below that of the original image. In one regard, therefore, existing SR techniques are prone to fail when facing a scene with complex motion or occlusion, which often occurs in videos.
A large number of algorithms have been investigated for improving the robustness of SR techniques to registration error, such as the use of a confidence map, joint estimation of the motion vectors and the HR image, replacement of the L2 norm with the L1 norm to reduce the effects of outliers, and learning the missing high-frequency components of image blocks from training samples, as well as other approaches. Recently, D. Barreto et al., “Region-Based Super-Resolution for Compression,” Multidimensional Systems and Signal Processing, 2007, vol. 18, pp. 59-81, proposed integrating SR techniques into compression tasks, in which blocks of an IBP group are segmented into three types before the downsampling and encoding procedures. However, the method disclosed therein is designed for compressing video sequences with better quality.
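The effect of replacing the L2 norm with the L1 norm may be illustrated, as a minimal sketch only, by fusing several registered samples of a single HR pixel. For a scalar estimate, the value that minimizes the sum of squared residuals is the mean, and the value that minimizes the sum of absolute residuals is the median. The sample values below are hypothetical, with the last one standing in for a poorly registered contribution.

```python
from statistics import mean, median

# Hypothetical registered samples for one HR pixel; the last value is an
# outlier of the kind produced by a registration error.
samples = [100, 102, 98, 101, 180]

l2_estimate = mean(samples)    # minimizes the sum of squared residuals
l1_estimate = median(samples)  # minimizes the sum of absolute residuals

# Reference value: the average of the four well registered samples
inlier_mean = mean(samples[:-1])
```

In this sketch the L1 estimate remains close to the well registered samples, whereas the L2 estimate is pulled toward the outlier. Practical robust SR formulations apply such norms to a full image reconstruction problem rather than to isolated pixels.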
An improved approach to enhancing resolution under the SR technique would therefore be beneficial.