In numerous applications related to computer vision and computational photography, the topic of image registration plays a central role: in particular, it is used when the need arises to associate the pixels of one image with corresponding pixels (pixels depicting the same point in the captured scene) in another image.
The problem of matching pixels in different images is quite common in computer vision applications based on extracting information from multiple images in order to increase their quality or to detect the geometry of the scene. In this context, typical challenges addressed by image registration are the elimination of perspective differences observed in stereo images, manipulation of dynamic scenes, where motion in the set of captured images can be observed (e.g. two sequential images capturing moving objects), and compensation of the camera movement where sequential images are captured with handheld cameras.
Accordingly, image registration plays an important role in manipulation scenarios which fall under the scope of computational photography, such as high dynamic range imaging (HDRI). HDRI makes it possible to computationally increase the dynamic range of images captured using digital cameras, which are incapable of reproducing the entire dynamic range of real-world scenes, mainly due to hardware limitations. An exemplary approach for HDRI relies on capturing several images (sequentially, or simultaneously with stereo/multiple cameras) at different exposures (exposure bracketing), and then merging them into a high dynamic range image.
Based on the information gained from the differently-exposed input images, the inverse camera response function (CRF) is estimated, which enables the reconstruction of a scene radiance map (an estimate of the original light energy received as input by the camera sensor) with increased dynamic range. The extended dynamic range is then remapped, using a tone mapping operation, into a low dynamic range format that can be shown on common displays. The radiance map is normally computed as a weighted sum of the corresponding pixels from the input images, with different weights assigned according to the level of exposure of the original images.
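By way of a non-limiting illustration, the weighted-sum computation of the radiance map and the subsequent tone mapping may be sketched as follows. The function names, the triangle-shaped weighting, the linear default for the inverse CRF, and the simple global tone mapping curve are all assumptions made for this sketch; as stated, any method for CRF estimation, weighting and tone mapping may be used instead.

```python
def hat_weight(z, z_min=0, z_max=255):
    """Triangle weight (assumed): trusts mid-range values, distrusts clipped ones."""
    mid = 0.5 * (z_min + z_max)
    return float(z - z_min) if z <= mid else float(z_max - z)


def merge_radiance(images, exposure_times, inv_crf=None):
    """Weighted sum of corresponding pixels from aligned, differently exposed images.

    images: list of 2D lists holding 8-bit pixel values, all of the same size.
    exposure_times: one exposure time per image.
    inv_crf: maps a pixel value to sensor exposure (assumed linear if omitted).
    """
    if inv_crf is None:
        inv_crf = lambda z: z / 255.0  # ASSUMPTION: linear camera response
    height, width = len(images[0]), len(images[0][0])
    radiance = [[0.0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            weighted_sum = weight_total = plain_sum = 0.0
            for img, t in zip(images, exposure_times):
                z = img[y][x]
                estimate = inv_crf(z) / t  # radiance estimate from this image
                w = hat_weight(z)
                weighted_sum += w * estimate
                weight_total += w
                plain_sum += estimate
            if weight_total > 0:
                radiance[y][x] = weighted_sum / weight_total
            else:
                # Every sample is clipped: fall back to the unweighted mean.
                radiance[y][x] = plain_sum / len(images)
    return radiance


def tone_map(radiance, max_value=255):
    """Simple global curve r / (1 + r) (assumed), scaled to the display range."""
    return [[int(round(max_value * r / (1.0 + r))) for r in row] for row in radiance]
```

Note that the sketch reads the same pixel location in every input image, so it is only valid if the images are geometrically aligned.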
The present disclosure can be applied to any method for CRF estimation, tone mapping and weighting for scene radiance map computation.
In the case where two or more images are taken sequentially, and assuming the depicted scene and the camera are static (no motion during the capturing process), the high dynamic range (HDR) computation does not present particular problems. This is because the input images can be assumed to be perfectly geometrically aligned, which in turn means that all image content has the same location in all input images. Nevertheless, this is a very uncommon case in practice: typically, during the time needed to capture the input images with different exposure settings, movements of the capturing device or of objects in the scene occur. Consequently, the captured images are no longer perfectly aligned. When images are captured with a stereo/multi-camera setup, the simultaneous image capturing does not suffer from the problem of moving objects that occurs in sequential capturing. However, the baseline difference between the cameras causes occlusions and perspective distortions that need to be compensated.
If the motion or disparity compensation is not carried out correctly, the radiance map in certain areas of the image is computed by weighting the wrong pixels, which results in severe and evident ghosting artifacts. Areas with distortions caused by dynamic changes or occlusion will also be referred to as “ghosted areas”.
Therefore, the quality of HDRI, and of other applications based on multiple images of the same scene, depends strongly on the motion compensation (or disparity compensation in the stereo/multi-camera scenario) applied between the different images in order to generate consistent results.
The compensation of motion due to moving cameras or objects in the scene, or of disparity due to the stereo/multi-camera baseline, represents an important research field with the goal of reducing the artifacts observable in the ghosted areas. This is especially the case for the 2-dimensional (2D) scenario of sequential images, for which several approaches have been developed.
The existing methods for motion and disparity compensation can be classified into two main groups. The first group of motion compensation techniques aims at reducing the visible artifacts in the ghosted areas by computing the radiance values in those areas on the basis of only one image. Accordingly, this requires a detection of the ghosted areas.
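Purely as an illustration of this first group, the following sketch flags a pixel as ghosted when the exposure-normalized values of the input images disagree, and in that case takes the radiance from a single reference image only. The function name, the relative tolerance, the linear-response assumption and the omission of clipping handling are simplifications chosen for this sketch and do not reproduce any specific published method.

```python
def ghost_aware_radiance(images, exposure_times, ref_index=0, tolerance=0.1):
    """Return (radiance, ghost_mask) for aligned-by-assumption input images.

    ASSUMPTION: linear camera response, so pixel_value / exposure_time is
    proportional to scene radiance. Clipped pixels are not treated specially.
    """
    height, width = len(images[0]), len(images[0][0])
    radiance = [[0.0] * width for _ in range(height)]
    ghost_mask = [[False] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            estimates = [img[y][x] / t for img, t in zip(images, exposure_times)]
            ref = estimates[ref_index]
            # A pixel is ghosted if any image disagrees with the reference.
            ghosted = any(abs(e - ref) > tolerance * max(ref, 1e-6) for e in estimates)
            ghost_mask[y][x] = ghosted
            if ghosted:
                radiance[y][x] = ref  # only one image contributes here
            else:
                radiance[y][x] = sum(estimates) / len(estimates)
    return radiance, ghost_mask
```

Because ghosted pixels are reconstructed from one exposure while their neighbours use all exposures, the two kinds of regions may differ visibly in noise and dynamic range.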
Recently, a method for detecting misalignments due to camera motion during the capturing process was proposed. The detection of the translational misalignment starts with applying an exposure-invariant transformation to the input images, and then determines the translation that yields the maximum similarity between the transformed maps.
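A minimal sketch of such a shift search is given below. The text does not name the exposure-invariant transformation; the sketch assumes thresholding each image at its own median value, which is one commonly used option, and an exhaustive search over small integer shifts. All function names and the `max_shift` parameter are hypothetical.

```python
def median_bitmap(img):
    """Binarize an image at its own median (assumed exposure-invariant transform)."""
    flat = sorted(v for row in img for v in row)
    median = flat[len(flat) // 2]
    return [[1 if v > median else 0 for v in row] for row in img]


def mismatch(a, b, dx, dy):
    """Fraction of overlapping pixels that differ when b is shifted by (dx, dy)."""
    height, width = len(a), len(a[0])
    differing = count = 0
    for y in range(height):
        for x in range(width):
            sy, sx = y - dy, x - dx  # source position in the unshifted b
            if 0 <= sy < height and 0 <= sx < width:
                differing += a[y][x] != b[sy][sx]
                count += 1
    return differing / count if count else 1.0


def estimate_translation(ref, other, max_shift=2):
    """Return the integer shift (dx, dy) of `other` that best matches `ref`."""
    a, b = median_bitmap(ref), median_bitmap(other)
    candidates = (
        (mismatch(a, b, dx, dy), abs(dx) + abs(dy), dx, dy)
        for dy in range(-max_shift, max_shift + 1)
        for dx in range(-max_shift, max_shift + 1)
    )
    _, _, best_dx, best_dy = min(candidates)  # ties resolved toward small shifts
    return best_dx, best_dy
```

The sketch covers global translation only; rotation and local object motion are outside its scope.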
Based on this method, an additional technique was proposed for detecting translational and rotational misalignments due to camera motion, as well as blurred regions due to objects moving during the capturing.
The second group of motion/disparity compensation techniques is based on image registration, so that no prior knowledge of the camera settings (e.g. exposure times, inverse CRF . . . ) is required and no detection of ghosted areas is needed. An exemplary method computes a dense motion field using an energy-based optical flow approach, which is invariant to illumination difference between the input images. The computed motion vectors are used for image registration and therefore for motion compensation. HDR images are computed on the basis of such registered images.
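To illustrate the registration step only, the following sketch warps an input image onto the reference image using a dense motion field that is assumed to be already available. The estimation of the motion field itself (for example by optical flow) is not shown. The function name, the motion vector convention and the nearest-neighbour sampling are assumptions made for this sketch.

```python
def warp_with_flow(image, flow):
    """Register `image` to the reference frame using a dense motion field.

    ASSUMED convention: flow[y][x] = (u, v) means that pixel (x, y) of the
    reference corresponds to pixel (x + u, y + v) of `image`.
    Pixels without a valid correspondence are returned as None.
    """
    height, width = len(image), len(image[0])
    registered = [[None] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            u, v = flow[y][x]
            sx, sy = int(round(x + u)), int(round(y + v))  # nearest neighbour
            if 0 <= sx < width and 0 <= sy < height:
                registered[y][x] = image[sy][sx]
    return registered
```

For example, with a uniform motion of one pixel along the x axis, each reference pixel takes the value of its right-hand neighbour in the input image, and the last column has no correspondence. Any error in the motion vectors is carried directly into the registered image and hence into the HDR result.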
The above-described solutions for motion and disparity compensation present several limitations, which are mostly related to the performance of the underlying algorithm. For instance, the first group of methods treats ghosted areas differently from the rest of the image (namely, by using only one image for the HDR computation), which results in visible artifacts. Such artifacts can be reduced with the present disclosure.
Similarly, methods which fall under the scope of image registration through motion estimation may decrease the dependency on camera settings, but they rely strongly on the performance of the motion/disparity estimation technique, such as optical flow. In addition, the results of the motion estimation step will be inaccurate unless proper preprocessing is performed to reduce the luminance difference between the input images.
For example, the European patent application EP 2 395 748 A2 shows a conventional system and method for generating a high dynamic range image. The system and method shown there are disadvantageous, since the resulting HDR image still contains residual errors.