Depth images can be used for 3D scene analysis and other 3D computer vision applications. Typically, the resolution of depth images is substantially lower than that of comparable optical images, i.e., gray scale or color images. The resolution of images acquired by optical cameras can easily be ten megapixels or greater, while the resolution of depth images is typically less than 0.02 megapixels. As defined herein, pixels in optical images only have associated intensities, perhaps in multiple color channels. Pixels in a depth image only have associated depths.
One way to increase the resolution of the depth images is to use a high resolution optical image that is registered to a low resolution depth image. In general, depth and color boundaries of a scene are correlated. An abrupt depth transition often leads to an abrupt color transition.
Depth upsampling can be global or local. Global methods formulate depth upsampling as an optimization problem. A large cost is induced when neighboring pixels having similar color intensities are assigned very different depths, which encourages upsampled depth boundaries to coincide with color boundaries. Those methods can produce accurate depth images but are generally too slow for real-time applications. Local methods, based on filtering operations, can be performed in real time. Joint bilateral upsampling is a local method that uses bilateral filtering in a joint optical and spatial space.
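As an illustration of the local approach, the following is a minimal sketch of joint bilateral upsampling in its commonly described form, not the implementation of any cited reference. The function name, parameters (`factor`, `radius`, `sigma_s`, `sigma_r`), and the assumption of a single-channel guide image with intensities in [0, 1] are hypothetical choices made for this example.

```python
import math

def joint_bilateral_upsample(depth_lo, guide_hi, factor,
                             radius=1, sigma_s=1.0, sigma_r=0.1):
    """Upsample depth_lo (list of rows) to the size of guide_hi.

    Assumptions of this sketch: guide_hi is grayscale with values in
    [0, 1], and low-res sample (qy, qx) corresponds to high-res pixel
    (qy * factor, qx * factor).
    """
    H, W = len(guide_hi), len(guide_hi[0])
    h, w = len(depth_lo), len(depth_lo[0])
    out = [[0.0] * W for _ in range(H)]
    for y in range(H):
        for x in range(W):
            # Fractional position of the output pixel in the low-res grid.
            yl, xl = y / factor, x / factor
            cy = min(int(round(yl)), h - 1)
            cx = min(int(round(xl)), w - 1)
            num = den = 0.0
            for qy in range(max(0, cy - radius), min(h, cy + radius + 1)):
                for qx in range(max(0, cx - radius), min(w, cx + radius + 1)):
                    # Guide pixel that corresponds to this low-res sample.
                    gy = min(qy * factor, H - 1)
                    gx = min(qx * factor, W - 1)
                    # Spatial kernel, evaluated in low-res coordinates.
                    ds2 = (qy - yl) ** 2 + (qx - xl) ** 2
                    w_s = math.exp(-ds2 / (2.0 * sigma_s ** 2))
                    # Range kernel, evaluated on the high-res guide image.
                    dr = guide_hi[y][x] - guide_hi[gy][gx]
                    w_r = math.exp(-dr * dr / (2.0 * sigma_r ** 2))
                    num += w_s * w_r * depth_lo[qy][qx]
                    den += w_s * w_r
            # Fall back to the nearest sample if all weights underflow.
            out[y][x] = num / den if den > 0.0 else depth_lo[cy][cx]
    return out
```

Because the spatial and range kernels are evaluated separately and multiplied, a low-resolution sample across a depth boundary still receives a large weight whenever its color is similar to that of the output pixel.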
Geodesic distances have been used in various applications, such as colorization, image matting, and image de-noising. A fast marching procedure and a geodesic distance transform are two commonly used implementations for the distance determination. Both have linear time complexity. However, those techniques are too slow to determine all shortest path pairs for geodesic upsampling in real time.
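To make the notion of a geodesic distance on an image concrete, the following sketch computes distances from a set of seed pixels using Dijkstra's algorithm on the 4-connected pixel grid. This is neither the fast marching procedure nor the geodesic distance transform mentioned above, and it runs in O(N log N) rather than linear time. The function name, the `gamma` parameter, and the additive edge cost (one spatial step plus a weighted intensity difference) are assumptions made for illustration.

```python
import heapq

def geodesic_distance(image, seeds, gamma=1.0):
    """Geodesic distance from the nearest seed to every pixel.

    image: grayscale image as a list of rows.
    seeds: iterable of (row, col) seed positions.
    Assumed edge cost between 4-neighbors: 1 + gamma * |intensity diff|.
    """
    H, W = len(image), len(image[0])
    dist = [[float("inf")] * W for _ in range(H)]
    heap = []
    for y, x in seeds:
        dist[y][x] = 0.0
        heapq.heappush(heap, (0.0, y, x))
    while heap:
        d, y, x = heapq.heappop(heap)
        if d > dist[y][x]:
            continue  # stale queue entry, a shorter path was already found
        for dy, dx in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            ny, nx = y + dy, x + dx
            if 0 <= ny < H and 0 <= nx < W:
                nd = d + 1.0 + gamma * abs(image[ny][nx] - image[y][x])
                if nd < dist[ny][nx]:
                    dist[ny][nx] = nd
                    heapq.heappush(heap, (nd, ny, nx))
    return dist
```

For example, with `image = [[0, 9, 0], [0, 0, 0]]` and a seed at `(0, 0)`, the shortest path to `(0, 2)` goes around the bright pixel through the bottom row rather than across it. A single call yields distances from one seed set only, so obtaining shortest paths for all pixel pairs would require repeating the computation for each source.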
U.S. Pat. No. 7,889,949, “Joint bilateral upsampling,” issued to Cohen et al. on Feb. 15, 2011, uses a high-resolution input signal having a first resolution, and a low-resolution solution set having a second resolution computed from a downsampled version of the high-resolution signal. A joint bilateral upsampling, using the low-resolution solution set and the high-resolution signal, is performed to generate a high-resolution solution set having a resolution equivalent to the first resolution. The method described therein uses bilateral upsampling. Bilateral upsampling uses separate color and spatial kernels, which causes blurry depth boundaries and depth bleeding artifacts, particularly when the colors of the surfaces across the depth boundaries are similar. In addition, fine scene structures are not preserved during upsampling.
Criminisi et al., in “Geodesic image and video editing,” ACM Transactions on Graphics, volume 29, issue 5, October 2010, describe image editing tasks including n-dimensional segmentation, edge-aware denoising, texture flattening, cartooning, soft selection, and panoramic stitching. The editing uses a symmetric operator based on the generalized geodesic distance to process high-resolution color images without the need for depth images or spatial upsampling.