One of the key challenges in computer vision applications is acquiring high-resolution depth maps of scenes. A number of common tasks, such as object reconstruction, robotic navigation, and automotive driver assistance, can be significantly improved by complementing intensity data from optical cameras with high-resolution depth maps. However, with current sensor technology, direct acquisition of high-resolution depth maps is very expensive.
The cost and limited availability of such sensors impose significant constraints on the capabilities of computer vision systems and hinder the adoption of methods that rely on high-resolution depth maps. Thus, a number of methods provide numerical alternatives to increase the spatial resolution of the measured depth data.
One of the most popular and widely used classes of techniques for improving the spatial resolution of depth data is guided depth superresolution. These techniques jointly acquire depth maps of the scene using a low-resolution depth sensor and optical images using a high-resolution optical camera. The data acquired by the camera is subsequently used to superresolve the low-resolution depth map. Such techniques exploit the property that both modalities share common features, such as edges and joint texture changes; these features in the optical camera data provide information and guidance that significantly enhances the superresolved depth map.
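Schematically, many guided superresolution methods can be cast as an inverse problem of the following form (the notation here is illustrative rather than taken from any specific method):

```latex
\widehat{\mathbf{x}} \;=\; \arg\min_{\mathbf{x}} \;
\frac{1}{2}\,\|\mathbf{y} - \mathbf{H}\mathbf{x}\|_2^2
\;+\; \tau\,\mathcal{R}(\mathbf{x};\,\mathbf{g}),
```

where $\mathbf{y}$ denotes the measured low-resolution depth, $\mathbf{H}$ models subsampling of the high-resolution depth $\mathbf{x}$, $\mathbf{g}$ is the high-resolution intensity image, and the regularizer $\mathcal{R}$ encodes the prior, with the guidance entering through its dependence on $\mathbf{g}$. The methods reviewed below differ chiefly in their choice of $\mathcal{R}$.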
In the past, most of those methods operated on a single optical image and a single low-resolution depth map. However, in most practical uses of such methods and systems, one acquires a video with the optical camera and a sequence of snapshots of the depth maps.
One approach models the co-occurrence of edges in depth and intensity with Markov Random Fields (MRF). An alternative approach is based on joint bilateral filtering, where intensity is used to set the weights of a filter. The bilateral filtering can be refined by incorporating local statistics of depths. In another approach, geodesic distances are used for determining the weights. That approach can be extended to dynamic sequences to compensate for different data rates in the depth and intensity sensors. A guided image filtering approach can further improve edge preservation.
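The idea of setting filter weights from intensity can be sketched as follows. This is a minimal illustration of joint bilateral filtering, assuming a registered guide image; the parameter names and window sizes are chosen for exposition, not taken from any particular method:

```python
import numpy as np

def joint_bilateral_filter(depth, intensity, radius=3, sigma_s=2.0, sigma_r=0.1):
    """Smooth a depth map with spatial Gaussian weights and range weights
    computed from a registered intensity (guide) image.

    Minimal sketch: parameters are illustrative, not from any specific paper.
    """
    h, w = depth.shape
    out = np.zeros((h, w), dtype=float)
    # Spatial Gaussian over the window offsets (fixed for all pixels).
    ys, xs = np.mgrid[-radius:radius + 1, -radius:radius + 1]
    w_spatial = np.exp(-(xs**2 + ys**2) / (2.0 * sigma_s**2))
    d = np.pad(depth.astype(float), radius, mode='edge')
    g = np.pad(intensity.astype(float), radius, mode='edge')
    for i in range(h):
        for j in range(w):
            dw = d[i:i + 2 * radius + 1, j:j + 2 * radius + 1]
            gw = g[i:i + 2 * radius + 1, j:j + 2 * radius + 1]
            # Range weights come from the guide image, not from the depth,
            # so intensity edges steer the depth smoothing.
            w_range = np.exp(-(gw - intensity[i, j])**2 / (2.0 * sigma_r**2))
            wgt = w_spatial * w_range
            out[i, j] = np.sum(wgt * dw) / np.sum(wgt)
    return out
```

Because the range weights collapse near intensity edges, the averaging is confined to regions of similar intensity, which is what preserves depth discontinuities aligned with those edges.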
More recently, sparsity-promoting regularization, which is an essential component of compressive sensing, has provided substantial improvements in the quality of depth superresolution. For example, improvements have been demonstrated by combining dictionary learning and sparse coding methods. Another method relies on weighted total generalized variation (TGV) regularization for imposing a piecewise polynomial structure on depth.
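As a point of reference, one common form of the second-order TGV functional is given below; the exact formulation and weighting vary by method, so this should be read as a representative instance rather than the definitive one:

```latex
\mathrm{TGV}_{\alpha}^{2}(u) \;=\; \min_{\mathbf{v}} \;
\alpha_1 \int_{\Omega} \bigl|\nabla u - \mathbf{v}\bigr| \,\mathrm{d}x
\;+\; \alpha_0 \int_{\Omega} \bigl|\mathcal{E}(\mathbf{v})\bigr| \,\mathrm{d}x,
```

where $\mathcal{E}(\mathbf{v}) = \tfrac{1}{2}(\nabla \mathbf{v} + \nabla \mathbf{v}^{T})$ is the symmetrized gradient and $\alpha_0, \alpha_1 > 0$ balance the first- and second-order terms. Since the auxiliary field $\mathbf{v}$ can absorb linear trends in $u$, the functional penalizes deviations from piecewise polynomial (affine) structure rather than from piecewise constant structure. In the weighted variant, the first term is modulated by an anisotropic tensor computed from the intensity gradients, so that depth edges are favored where the guide image has edges.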
The conventional MRF approach can be combined with an additional term promoting transform-domain sparsity of the depth in an analysis form. One method uses the MRF model to jointly segment objects and recover a higher-quality depth map. Depth superresolution can also be performed by taking several snapshots of a static scene from slightly displaced viewpoints and merging the measurements using sparsity of the weighted gradient of the depth.
Many natural images contain repetitions of similar patterns and textures. State-of-the-art image denoising methods, such as nonlocal means (NLM), and block matching and 3D filtering (BM3D), take advantage of this redundancy by processing the image as a structured collection of patches. The formulation of NLM can be extended to more general inverse problems using specific NLM regularizers. Similarly, a variational approach can be used for general BM3D-based image reconstruction. In the context of guided depth superresolution, NLM has been used for reducing the amount of noise in the estimated depth. Another method combines a block-matching procedure with low-rank constraints for enhancing the resolution of a single depth image.
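The patch-based averaging that underlies NLM can be sketched as follows. This is a minimal, direct implementation for a single-channel image; the patch size, search-window size, and smoothing parameter h are illustrative assumptions, not values prescribed by the original method:

```python
import numpy as np

def nlm_denoise(img, patch=3, search=7, h=0.1):
    """Nonlocal means: each pixel becomes a weighted average of pixels whose
    surrounding patches look similar to its own.

    Minimal sketch with illustrative parameters; not an optimized implementation.
    """
    half_p, half_s = patch // 2, search // 2
    pad = half_p + half_s
    p = np.pad(img.astype(float), pad, mode='reflect')
    rows, cols = img.shape
    out = np.zeros((rows, cols), dtype=float)
    for i in range(rows):
        for j in range(cols):
            ci, cj = i + pad, j + pad
            ref = p[ci - half_p:ci + half_p + 1, cj - half_p:cj + half_p + 1]
            weights, values = [], []
            # Compare the reference patch against every patch in the
            # search window; similar patches get exponentially larger weight.
            for di in range(-half_s, half_s + 1):
                for dj in range(-half_s, half_s + 1):
                    ni, nj = ci + di, cj + dj
                    cand = p[ni - half_p:ni + half_p + 1,
                             nj - half_p:nj + half_p + 1]
                    dist2 = np.mean((ref - cand)**2)
                    weights.append(np.exp(-dist2 / h**2))
                    values.append(p[ni, nj])
            weights = np.asarray(weights)
            out[i, j] = np.dot(weights, values) / weights.sum()
    return out
```

The exponential weighting means that patches repeated elsewhere in the image contribute strongly to the average, which is precisely the redundancy that NLM and BM3D exploit; BM3D goes further by stacking matched patches into 3D groups and filtering them jointly in a transform domain.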