Multi-frame super-resolution algorithms have been a focus of image processing research for several decades. These algorithms typically seek to produce a high-resolution image by combining sampling-limited low-resolution images of the same scene collected over a short time interval. Classical approaches rely on accurate sub-pixel motion estimation between the low-resolution images, which can be a difficult problem for arbitrary motion patterns. As a result, successful super-resolution performance was initially demonstrated only in cases of simple global motion, e.g., uniform translations or rotations, and then in more general cases where the motion was described using a global affine motion model. More recently, some progress has been achieved in cases containing relatively simple local motion fields, with one or two objects moving through a scene in a straightforward manner and with small relative displacements accumulated over the whole sequence.
Modern optical flow (OF) algorithms can estimate such motion fields with the required sub-pixel accuracy. However, these model scenes are rarely representative of real-life situations. The more complex motion patterns present in real-life image sequences typically violate the models underlying OF algorithms and cause those algorithms significant problems. Uncertainty of motion estimation can be modeled using a Bayesian approach, leading to significant improvement in super-resolution. However, the errors arising from estimation of the most complex scene movements still produce distortions in the final high-resolution images.
Therefore, there is still a need in the art for super-resolving images containing complex motion patterns that present a significant challenge to motion estimation techniques despite the recent progress of OF algorithms. The most prevalent modern approach to motion estimation is to pose the problem in a variational form with a global smoothness constraint. Some recent OF algorithms can accurately compute motion fields containing irregularities, e.g., discontinuities, occlusions, and brightness constancy violations. However, estimation of large displacements remains difficult because the solution obtained by local optimization is biased towards the initialization, which is usually a zero motion field. The coarse-to-fine approach adopted by modern OF algorithms can somewhat alleviate this problem by first computing an estimate at coarser scales and then refining this estimate at finer scales. However, these algorithms tend to bias the motion of finer features towards the motion of the larger-scale structures. Thus, motion patterns in which small structures move differently from larger-scale structures, and motion patterns in which the relative motion of small-scale structures is larger than their own scale, represent the most difficult problem for modern OF algorithms.
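For illustration only, one well-known instance of a variational formulation with a global smoothness constraint is the classical Horn-Schunck energy. The text above does not specify a particular functional, so the form and all symbols below are assumptions: I_x, I_y, and I_t denote the spatial and temporal image derivatives, (u, v) the motion field over the image domain Omega, and alpha a weight on the smoothness term.

```latex
E(u, v) = \iint_{\Omega} \Big[ \big( I_x u + I_y v + I_t \big)^2
        + \alpha^2 \big( \lVert \nabla u \rVert^2 + \lVert \nabla v \rVert^2 \big) \Big] \, dx \, dy
```

The first term is a linearized brightness constancy constraint, which holds only for small displacements. Minimizing such an energy by local optimization from a zero motion field is therefore consistent with the bias towards the initialization described above.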
One category of image sequence in which such motion arises involves human motion, because relatively small body parts such as hands or legs move extremely fast relative to their own size. The most recent attempt to resolve nonconforming motion of different-scale structures has been made by the addition of local descriptors such as SIFT and HOG features to the variational OF formulation. Unfortunately, the interpolation error of such algorithms does not improve dramatically compared to optical flow algorithms that do not use descriptors. In addition, rich features generally rely on large pixel counts, which diminishes the applicability of these optical flow algorithms to the low-resolution images used in super-resolution processing.
Within the context of super-resolution, failure to estimate local motion details leads to lack of resolution, spurious image distortions, and reduced dynamic range of the resulting image. Super-resolution techniques that use implicit motion estimation via block matching have been developed recently and are generally free of the motion-induced image processing artifacts inherent to classical algorithms. They are able to provide image resolution enhancement to real-life video sequences. However, the demonstrated resolution enhancement factor of these nonlocal methods has generally been modest. Additionally, nonlocal techniques experience block matching difficulties with large displacements, rotational motion, and blurred edges.
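The difficulty that block matching has with large displacements can be sketched with a one-dimensional toy example. This is a minimal illustration under stated assumptions, not the method of any particular nonlocal technique: the helper names (ssd, match_block, make_frame), the sum-of-squared-differences cost, the synthetic frames, and the search radius are all hypothetical.

```python
def ssd(a, b):
    """Sum of squared differences between two equal-length sequences."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def match_block(reference, target, start, size, radius):
    """Find the integer shift in [-radius, radius] that best aligns a block.

    The block reference[start:start + size] is compared against each
    candidate window of target; windows falling outside target are skipped.
    Returns (best_shift, best_cost).
    """
    block = reference[start:start + size]
    best_shift, best_cost = None, float("inf")
    for shift in range(-radius, radius + 1):
        lo = start + shift
        if lo < 0 or lo + size > len(target):
            continue
        cost = ssd(block, target[lo:lo + size])
        if cost < best_cost:
            best_shift, best_cost = shift, cost
    return best_shift, best_cost


def make_frame(length, position, pattern):
    """Build a zero-valued 1-D frame with pattern placed at position."""
    frame = [0] * length
    frame[position:position + len(pattern)] = pattern
    return frame


pattern = [3, 9, 4, 7]                      # hypothetical small structure
reference = make_frame(40, 10, pattern)
small_move = make_frame(40, 12, pattern)    # structure displaced by 2 samples
large_move = make_frame(40, 25, pattern)    # structure displaced by 15 samples

# With a search radius of 5, only the small displacement is reachable.
near = match_block(reference, small_move, start=10, size=4, radius=5)
far = match_block(reference, large_move, start=10, size=4, radius=5)
print("small displacement:", near)
print("large displacement:", far)
```

In this sketch the small displacement is recovered with zero matching cost, whereas the large displacement lies outside the search window, so every candidate window has the same nonzero cost and the returned shift is arbitrary. Enlarging the radius would recover the match here, at the price of a larger search.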
Accordingly, there remains a need in the art for a super-resolution algorithm that addresses the problems of motion estimation errors.