A depth camera may produce a depth image, that is, an image that contains depth information for a scene, such as how far objects in the scene are from the camera's viewpoint. Such images may be fundamental building blocks in perceptual computing for applications such as gesture tracking and object recognition, for example. A stereo camera system is one way to achieve such depth imaging and may be more cost effective than alternatives such as time-of-flight techniques and structured light techniques.
In stereo camera systems, depth imaging may be obtained by correlating left and right stereoscopic images to match pixels between the stereoscopic images. The pixels may be matched by determining which pixels are the most similar between the left and right images, for example. Such correlation may include aggregating a correlation function over a support window around an individual pixel of, for example, the left image. Information obtained by aggregating the correlation function over the support window may be used for matching the pixel to a corresponding pixel in the other (right) image.
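As an illustration of aggregating a correlation function over a support window, the following minimal sketch uses the sum of absolute differences (SAD) as the correlation function. SAD is only one possible choice and is not named in the description above. The function names `sad_cost` and `best_disparity`, the square window defined by `half`, and the search limit `max_disp` are hypothetical. The sketch assumes rectified grayscale images stored as lists of rows, so that a left-image pixel at column `col` is compared against right-image pixels at column `col - d` on the same row.

```python
def sad_cost(left, right, row, col, d, half):
    """Aggregate absolute differences over a (2*half+1) x (2*half+1) support
    window centered on (row, col) in the left image, for candidate disparity d.
    Assumption: the caller keeps the window inside both images."""
    total = 0
    for dy in range(-half, half + 1):
        for dx in range(-half, half + 1):
            l_val = left[row + dy][col + dx]
            r_val = right[row + dy][col + dx - d]  # same row, shifted by d
            total += abs(l_val - r_val)
    return total


def best_disparity(left, right, row, col, half=1, max_disp=4):
    """Return the candidate disparity with the lowest aggregated cost,
    i.e. the right-image pixel most similar to the left-image pixel."""
    best_d, best_cost = 0, None
    for d in range(max_disp + 1):
        if col - half - d < 0:
            break  # shifted window would leave the right image
        cost = sad_cost(left, right, row, col, d, half)
        if best_cost is None or cost < best_cost:
            best_d, best_cost = d, cost
    return best_d
```

A lower aggregated cost indicates a better match, so the returned value is the horizontal offset at which the window around the left-image pixel most resembles the right image.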
Pixels correlated between the left and right stereoscopic images may be used to determine depth information. For example, a disparity between the location of the pixel in the left image and the location of the pixel in the right image may be used to calculate the depth information using binocular disparity techniques.
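For example, under the common assumption of a rectified stereo pair with parallel optical axes, depth may be computed as the focal length multiplied by the baseline and divided by the disparity. The sketch below illustrates that relation. The function name `depth_from_disparity` and its parameters are hypothetical, and the description above does not specify this particular formula.

```python
def depth_from_disparity(disparity_px, focal_length_px, baseline_m):
    """Depth (in meters) of a matched pixel for a rectified stereo pair.

    Assumptions: disparity and focal length are in pixels, the baseline
    (distance between the two camera centers) is in meters, and the
    optical axes are parallel.
    """
    if disparity_px <= 0:
        # Zero disparity corresponds to a point at infinity; negative
        # values are treated the same way in this sketch.
        return float("inf")
    return focal_length_px * baseline_m / disparity_px
```

Under these assumptions, a disparity of 20 pixels with a focal length of 600 pixels and a baseline of 0.05 m corresponds to a depth of roughly 1.5 m. Because depth is inversely proportional to disparity, an error of one pixel in the match has a larger effect on distant objects than on near ones.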
The size and shape of the support window may determine the performance of the correlation and/or the determination of the depth information for a scene. For example, choosing a small support window may support reconstructing small to tiny objects in a scene and may provide sharper object boundaries. However, the output from small support windows may be noisier, especially around areas of the scene that include little or no information. Choosing large support windows may provide smoother output, with reconstructed areas having a higher signal-to-noise ratio. However, large support windows may blur output and incorrectly connect smaller objects such as the fingers of a hand, for example. Further, a large support window may span objects at different depths, such that the disparity varies across the window, which may cause pixel mismatches and incorrect disparity values.
Current techniques for choosing a support window size include performing color segmentation and fitting the support window into segments of an image having the same color. Such solutions may work well for colorful images; however, they may degrade for images that are less colorful, as is the situation with many real-life images. Further, such techniques may be computationally intensive and may not work with active stereo, in which a pattern is projected on a scene, since in such pattern-dominant situations it is difficult to segment objects in a scene based on color.
Since depth imaging may be used in a wide variety of applications, it may be desirable to make depth imaging more accurate and reliable.