A widely used technique for estimating depth values in structured-light three-dimensional (3D) camera systems, also referred to as stereo-camera systems, is to search for the best match between a patch in the image and a patch in a reference pattern. To reduce the overall computational burden of such a search, the image patch is assumed to lie within a near-horizontal neighborhood of the reference pattern. In addition, the reference pattern is designed to contain only a finite set of unique sub-patterns, which are repeated horizontally and vertically to fill the entire projection space, further simplifying the search process. The known arrangement of the unique sub-patterns in the reference pattern is used to identify the “class” of an image patch and, in turn, to determine the disparity between the image patch and the reference patch. The image patch is also assumed to be centered at a depth pixel location, which simplifies the depth-estimation calculation.
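The horizontal patch search described above can be illustrated with a minimal sketch. It is not taken from any particular system: the matching cost (sum of absolute differences), the function names `patch_at`, `sad`, and `best_disparity`, the parameters `half` and `max_disp`, and the synthetic reference pattern are all assumptions chosen for illustration. The sketch also omits the sub-pattern “class” lookup and searches only along a single row.

```python
def patch_at(img, row, col, half):
    """Return the (2*half+1) x (2*half+1) patch centered at (row, col) as a flat list."""
    return [img[r][c]
            for r in range(row - half, row + half + 1)
            for c in range(col - half, col + half + 1)]


def sad(a, b):
    """Sum of absolute differences between two equal-length patches (assumed cost)."""
    return sum(abs(x - y) for x, y in zip(a, b))


def best_disparity(image, reference, row, col, half=1, max_disp=4):
    """Search horizontal shifts of the reference for the best match to the image patch.

    Returns (disparity, cost). Shifts whose patch would fall outside the
    reference are skipped.
    """
    width = len(reference[0])
    target = patch_at(image, row, col, half)
    best_d, best_cost = None, None
    for d in range(-max_disp, max_disp + 1):
        c = col + d
        if c - half < 0 or c + half >= width:
            continue  # candidate patch would leave the reference pattern
        cost = sad(target, patch_at(reference, row, c, half))
        if best_cost is None or cost < best_cost:
            best_d, best_cost = d, cost
    return best_d, best_cost


# Synthetic data (hypothetical): a reference whose columns are all distinct,
# and an image that is the reference shifted left by two columns.
ROWS, COLS, SHIFT = 3, 12, 2
reference = [[c * c for c in range(COLS)] for _ in range(ROWS)]
image = [[reference[r][c + SHIFT] if c + SHIFT < COLS else 0
          for c in range(COLS)] for r in range(ROWS)]

disparity, cost = best_disparity(image, reference, row=1, col=5)
print(disparity, cost)
```

In this sketch, each depth pixel requires one cost evaluation per candidate shift, and each evaluation touches every pixel in the patch, so the work per pixel scales with the patch area multiplied by the width of the search range.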
Nevertheless, as the image patch size and the search range become large, patch searching becomes time-consuming and computationally intensive, making real-time depth estimation difficult to achieve. In addition to incurring significant computational cost, some structured-light 3D-camera systems may also suffer from significant noise in depth estimation. As a consequence, such structured-light 3D-camera systems have high power consumption and may be sensitive to image flaws, such as pixel noise, blur, distortion, and saturation.