In active depth sensing, a projector projects patterns of light such as infrared (IR) dots to illuminate a region being sensed. The projected patterns are captured by a camera/sensor (two or more in stereo systems), with the image (or images) processed to compute a depth map or the like, e.g., per frame. Infrared is advantageous because color (RGB) images result in very noisy depth values.
In stereo systems, stereo cameras capture two images from different viewpoints. One way to perform depth estimation with a stereo pair of images is to find correspondences between the images, e.g., to correlate each projected and sensed IR dot in one image with a counterpart IR dot in the other image. Once the dots are matched, the projected patterns within the images may be correlated with one another, and disparities between one or more features of the correlated dots may be used to estimate a depth to that particular dot pair. For example, a dense depth map at the original (native) camera resolution may be obtained by area matching (e.g., via a window of size 5×5).
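By way of illustration only, the area matching and disparity-to-depth steps described above may be sketched as follows. This is a minimal sketch under stated assumptions, not the method of any particular system: it assumes the two images are already rectified (so that a counterpart pixel lies on the same row), uses a sum-of-absolute-differences cost over the 5×5 window, and converts disparity to depth with the standard pinhole relation Z = f·B/d. The function names and the parameters `max_disp`, `focal_px`, and `baseline_m` are hypothetical and chosen for this example.

```python
import numpy as np


def box_sum(img, r):
    """Sum each pixel's (2r+1) x (2r+1) neighborhood, replicating edge pixels."""
    h, w = img.shape
    padded = np.pad(img, r, mode="edge")
    out = np.zeros_like(img)
    for dy in range(2 * r + 1):
        for dx in range(2 * r + 1):
            out += padded[dy:dy + h, dx:dx + w]
    return out


def block_match_disparity(left, right, max_disp=16, window=5):
    """Per-pixel integer disparity by area matching (assumed SAD cost).

    Assumes rectified grayscale/IR images of equal shape, where a point at
    column x in the left image appears at column x - d in the right image.
    """
    left = left.astype(np.float32)
    right = right.astype(np.float32)
    h, w = left.shape
    r = window // 2
    invalid = np.float32(1e6)  # penalty where the shifted image has no data
    best_cost = np.full((h, w), np.inf, dtype=np.float32)
    disparity = np.zeros((h, w), dtype=np.int32)
    for d in range(max_disp + 1):
        # Absolute difference between left[x] and right[x - d].
        diff = np.full((h, w), invalid, dtype=np.float32)
        diff[:, d:] = np.abs(left[:, d:] - right[:, :w - d])
        # Aggregate the per-pixel differences over the matching window.
        cost = box_sum(diff, r)
        better = cost < best_cost
        best_cost[better] = cost[better]
        disparity[better] = d
    return disparity


def disparity_to_depth(disparity, focal_px, baseline_m):
    """Pinhole triangulation Z = f * B / d; zero disparity is left as 0 (no depth)."""
    disparity = disparity.astype(np.float32)
    depth = np.zeros_like(disparity)
    valid = disparity > 0
    depth[valid] = focal_px * baseline_m / disparity[valid]
    return depth
```

In such a sketch, a window that contains few or no reflected dots yields nearly identical costs for every candidate disparity, so the selected disparity in that window is essentially arbitrary; a practical system would typically reject such low-confidence matches rather than report a depth for them.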
However, not all surfaces reflect IR light particularly well. As a result, in any part of an image that corresponds to a poorly IR-reflective surface, there is generally not enough IR data (e.g., reflected dots) in the stereo images to correlate with one another, and thus the depth data for that part is missing or very sparse. This is problematic even with a single two-dimensional depth map; in point cloud applications, such as those that use depth data to construct a mesh, the lack of adequate depth data in certain regions may be even more pronounced.