A digital camera is a component often included in commercial electronic media device platforms. Digital cameras are now available in wearable form factors (e.g., video capture earpieces, video capture headsets, and video capture eyeglasses), as well as embedded within smartphones, tablet computers, and notebook computers.
Often, a digital camera user wishes to fill in a region of an image after the image is captured, for example to remove a foreground object from a scene. Image inpainting is a technique used to fill regions in digital images and may be used to remove unwanted objects. From a captured image, a user can specify a target, or destination, region to be filled. The target region is automatically replaced with hallucinated image contents that look plausible and combine naturally with retained parts of the image scene. In one conventional approach to image filling illustrated in FIG. 1, image patches sourced from background region 103 of image 120 captured by digital camera 110 are combined to fill in target region 105 and replace foreground object 102. Background patches that are similar to the known content along the boundary of the target region are identified as exemplars and transferred to that boundary. This process is repeated along the target contour until the target region is completely filled. However, when only a single image viewpoint is employed, inpainting of relatively large portions of a scene can result in an output image 190 that suffers from significant visual artifacts 195.
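By way of illustration only, the single-viewpoint exemplar filling loop described above may be sketched as follows. This is a minimal sketch under stated assumptions, not the specific conventional method of FIG. 1: the function name `exemplar_fill`, the 3x3 default patch size, the sum-of-squared-differences similarity measure, and the "most known pixels first" fill order are all hypothetical choices made for the example, and the image is assumed to be a single-channel NumPy array.

```python
import numpy as np


def exemplar_fill(image, mask, patch=3):
    """Greedy exemplar-based fill of a 2-D grayscale image (illustrative only).

    image: 2-D array of intensities.
    mask:  2-D boolean array, True where pixels belong to the target region.
    patch: odd patch side length (hypothetical default of 3).
    """
    if patch < 3 or patch % 2 == 0:
        raise ValueError("patch must be an odd integer >= 3")
    img = np.array(image, dtype=float)   # working copy; the input is not modified
    todo = np.array(mask, dtype=bool)    # pixels that still have to be filled
    h, w = img.shape
    r = patch // 2

    # Source exemplars: patch centers whose whole patch lies in the background.
    sources = [(y, x)
               for y in range(r, h - r) for x in range(r, w - r)
               if not todo[y - r:y + r + 1, x - r:x + r + 1].any()]
    if not sources:
        raise ValueError("no fully known source patch available")

    while todo.any():
        # Assumed fill order: the unfilled pixel whose surrounding patch
        # already contains the most known pixels (i.e., a contour pixel).
        best_known, center = -1, None
        for y, x in zip(*np.nonzero(todo)):
            cy = min(max(y, r), h - r - 1)   # keep the patch inside the image
            cx = min(max(x, r), w - r - 1)
            unknown = todo[cy - r:cy + r + 1, cx - r:cx + r + 1].sum()
            if patch * patch - unknown > best_known:
                best_known, center = patch * patch - unknown, (cy, cx)

        cy, cx = center
        window = (slice(cy - r, cy + r + 1), slice(cx - r, cx + r + 1))
        known = ~todo[window]
        target = img[window]                 # view into img

        # Most similar exemplar, compared on the known pixels only (SSD).
        best_cost, best_src = np.inf, None
        for sy, sx in sources:
            src = img[sy - r:sy + r + 1, sx - r:sx + r + 1]
            cost = np.sum((src[known] - target[known]) ** 2)
            if cost < best_cost:
                best_cost, best_src = cost, src

        # Transfer exemplar pixels into the unknown part of the target patch.
        target[~known] = best_src[~known]
        todo[window] = False
    return img
```

Because every filled pixel is copied from a background patch of the same image, a large target region can only be covered with content that happens to be visible elsewhere in that one view, which is consistent with the artifacts noted above for single-viewpoint inpainting.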
Advanced mobile devices with multiple cameras embedded in the same device are now becoming commercially available. For such a platform, multiple images may be captured from different viewpoints of a scene at one instant in time. Some conventional stereo inpainting techniques have employed depth information derived from disparity between images collected from stereo cameras. These techniques, however, have thus far proven too computationally intensive for ultra-light, low-power mobile platforms.
Computationally inexpensive automated image filling techniques that reduce visual artifacts by leveraging the richer set of input information available through multiple image viewpoints are therefore highly advantageous.