Camera arrays are typically passive depth acquisition devices that rely on texture in the scene to estimate depth. In image processing, the term texture or image texture is used to describe the spatial arrangement of colors or intensities in a region of an image. A region is considered to have texture when there is significant variation in color and/or intensity within the region. A region is said to be textureless when color and/or intensity are uniform or vary gradually. Disparity estimation processes used in multi-baseline stereo systems and camera arrays determine depth by finding correspondences between features visible in a set of images captured by the cameras in the system. While this works for scenes with texture, depth estimation can fail in regions of a scene that lack texture, because those regions contain too few features from which to determine pixel correspondences. Other depth cues, including (but not limited to) shape from shading, depth from defocus, or other photogrammetry cues, can be used to compensate for an inability to recover depth based upon disparity in such flat (i.e., textureless) regions.
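The distinction between textured and textureless regions can be illustrated with a minimal sketch that flags pixels whose local intensity variance falls below a threshold. The function names, the window radius, and the threshold value are assumptions chosen for illustration only; they do not appear in any of the systems discussed here, and local variance is only one of many possible texture measures.

```python
import numpy as np


def local_variance(image, radius=2):
    """Intensity variance in a (2*radius+1) x (2*radius+1) window per pixel."""
    img = np.asarray(image, dtype=np.float64)
    # Replicate edge pixels so the output has the same shape as the input.
    padded = np.pad(img, radius, mode="edge")
    size = 2 * radius + 1
    # sliding_window_view is available in NumPy 1.20 and later.
    windows = np.lib.stride_tricks.sliding_window_view(padded, (size, size))
    return windows.var(axis=(-2, -1))


def textureless_mask(image, radius=2, threshold=1.0):
    """True where the local variance is below the (hypothetical) threshold."""
    return local_variance(image, radius) < threshold


# Synthetic test image: uniform left half, random (textured) right half.
rng = np.random.default_rng(0)
image = np.full((32, 64), 128.0)
image[:, 32:] = rng.integers(0, 256, size=(32, 32))

mask = textureless_mask(image)
```

In this sketch, pixels in the uniform half are flagged as textureless, which are the pixels for which a disparity search would have no distinctive features to match. In practice the threshold would need to be set relative to sensor noise.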
A research report published in May 1984 by the Artificial Intelligence Laboratory of the Massachusetts Institute of Technology, entitled “PRISM: A Practical Real-Time Imaging Stereo Matcher” by Nishihara (A.I. Memo 780), discloses a process for determining depth using binocular stereo in which a projector illuminates the scene with an unstructured texture pattern. The illumination is intended to provide suitable matching targets on surfaces in which surface contrast is low compared with sensor noise and other inter-image distortions. The disclosed process illuminates the scene with a random pattern, and the depth estimation process assumes no a priori knowledge of the illumination pattern.
Following the publication of the research report by the Artificial Intelligence Laboratory of the Massachusetts Institute of Technology, a number of research groups have observed that the use of random projected patterns with binocular stereo cameras can lead to regions of depth ambiguity, because the projected pattern is too self-similar in specific regions. Accordingly, alternative projection patterns have been proposed to avoid self-similar regions. J. Lim, “Optimized projection pattern supplementing stereo systems,” in ICRA, 2009 proposes utilizing patterns generated using De Bruijn sequences, and K. Konolige, “Projected Texture Stereo,” in ICRA, 2010 proposes utilizing patterns generated based upon Hamming codes.
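The property that makes De Bruijn sequences attractive for avoiding self-similarity can be shown with a short sketch. A De Bruijn sequence over k symbols with window length n contains every possible length-n window exactly once (cyclically), so no two windows along the sequence can be confused with each other. The code below uses the standard Lyndon-word construction and is not taken from the cited papers; how symbols would be mapped to projected intensities or colors is left open and would be an assumption of any particular system.

```python
def de_bruijn(k, n):
    """Return a De Bruijn sequence over symbols 0..k-1 with window length n.

    Standard recursive construction that concatenates Lyndon words
    whose lengths divide n.
    """
    a = [0] * (k * n)
    sequence = []

    def db(t, p):
        if t > n:
            if n % p == 0:
                sequence.extend(a[1:p + 1])
        else:
            a[t] = a[t - p]
            db(t + 1, p)
            for j in range(a[t - p] + 1, k):
                a[t] = j
                db(t + 1, t)

    db(1, 1)
    return sequence


def cyclic_windows(seq, n):
    """All length-n windows of seq, wrapping around at the end."""
    m = len(seq)
    return [tuple(seq[(i + j) % m] for j in range(n)) for i in range(m)]


# Hypothetical example: 3 symbol levels, windows of 4 symbols.
seq = de_bruijn(3, 4)
windows = cyclic_windows(seq, 4)
all_unique = len(set(windows)) == len(windows)
```

Here `all_unique` evaluates to true: each of the 3^4 = 81 windows occurs exactly once. A random sequence of the same length offers no such guarantee, which is the source of the depth ambiguity described above.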