Low-end imaging devices such as webcams and mobile phones often record low-resolution images or videos. With the recent proliferation of high-resolution displays such as high-definition televisions (HDTVs), users often want to display video recorded on a low-end imaging device on displays that have a higher resolution. Reconstructing high-resolution frames from a sequence of low-resolution video frames may be accomplished using techniques generally classified as multi-frame super resolution. In a simple example of super resolution, multiple low-resolution frames may represent the same scene, with the only difference between the frames being small linear movements, or translations, of the pixels between adjacent frames. Super resolution infers a high-resolution frame by exploiting the fact that small pixel shifts between frames produce a sequence of low-resolution frames with slightly different sample values. Taking into account the different sample values in a set of low-resolution frames enables the extraction of higher-frequency details of the scene, resulting in a super-resolved high-resolution frame.
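The simple translational case described above can be illustrated with a minimal one-dimensional sketch. It assumes an idealized setting that real video rarely satisfies: the shifts between frames are known exactly, each shift equals a whole number of high-resolution samples, and there is no blur or noise. The function names `make_low_res_frames` and `shift_and_add` and the upsampling `factor` are hypothetical and chosen only for illustration.

```python
import numpy as np

def make_low_res_frames(high_res, factor):
    """Simulate `factor` low-resolution frames of one scene.

    Each frame keeps every `factor`-th sample, starting at a different
    offset, which mimics a small translation between frames.
    """
    return [high_res[offset::factor] for offset in range(factor)]

def shift_and_add(frames, factor):
    """Interleave the low-resolution frames onto a high-resolution grid.

    Assumption: frame number `offset` was sampled at that known offset.
    """
    total = sum(len(frame) for frame in frames)
    out = np.empty(total, dtype=frames[0].dtype)
    for offset, frame in enumerate(frames):
        out[offset::factor] = frame
    return out

# A 64-sample "scene" observed as four 16-sample frames.
scene = np.sin(np.linspace(0, 8 * np.pi, 64))
frames = make_low_res_frames(scene, factor=4)
restored = shift_and_add(frames, factor=4)
```

In this idealized case the four frames jointly contain every sample of the scene, so interleaving them recovers it exactly. With unknown or fractional shifts, blur, or noise, the reconstruction is no longer a simple interleave.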
The process of accurately restoring a high-resolution frame from a sequence of low-resolution frames becomes more complex when the imaged scene in the input sequence changes, as is often the case with recorded video sequences. Furthermore, factors related to the recording process itself may contribute to image degradation, and failure to take these factors into consideration when super-resolving a set of low-resolution images may yield results that are blurry, aliased, or otherwise distorted. For example, the recording lens and/or the atmospheric environment between the recording device and an imaged object may introduce blur into a recorded image, and quantization imposed by the sensor array in the recording device may introduce noise into the recorded image. Other sources of degradation from the recording process may include distortion and aliasing effects.
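The degradation sources named above can be sketched as a simple one-dimensional forward model. This is an illustrative assumption rather than a model taken from any particular approach: blur is approximated by a box filter, sensor sampling by decimation (which can introduce aliasing), noise by additive Gaussian noise, and quantization by rounding to 256 levels. The function name `degrade` and all of its parameters are hypothetical.

```python
import numpy as np

def degrade(scene, factor=4, blur_width=3, noise_std=0.01, levels=256, seed=0):
    """Produce one low-resolution frame from a scene with values in [0, 1]."""
    rng = np.random.default_rng(seed)
    # Lens or atmospheric blur, approximated here by a box filter.
    kernel = np.ones(blur_width) / blur_width
    blurred = np.convolve(scene, kernel, mode="same")
    # Sensor sampling: keep every `factor`-th sample.
    sampled = blurred[::factor]
    # Additive noise, assumed Gaussian for this sketch.
    noisy = sampled + rng.normal(0.0, noise_std, size=sampled.shape)
    # Quantization to a fixed number of intensity levels.
    clipped = np.clip(noisy, 0.0, 1.0)
    return np.round(clipped * (levels - 1)) / (levels - 1)

scene = 0.5 + 0.5 * np.sin(np.linspace(0, 8 * np.pi, 64))
frame = degrade(scene)
```

A super-resolution method that ignores any of these steps is effectively inverting the wrong model, which is one way to see why the results may come out blurry or aliased.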
Initially, super-resolution work focused primarily on dealing with the ill-posed nature of reconstructing a high-resolution frame from a sequence of low-resolution frames. Most approaches addressed the lack of constraints in this ill-posed problem by making strict assumptions about the input video sequence, including assumptions related to the image degradation factors described above, and/or by imposing constraints on the reconstruction of the high-resolution frame. For example, some approaches have employed spatial priors when reconstructing the high-resolution frame. Other approaches have jointly estimated the translational pixel motion and the high-resolution frame and/or considered motion blur using a simple affine motion model. To better approximate the complex motion of faces, some super-resolution techniques used more complex motion models designed to capture additional motion details.