The subject matter discussed in the background section should not be assumed to be prior art merely as a result of its mention in the background section. Similarly, a problem mentioned in the background section or associated with the subject matter of the background section should not be assumed to have been previously recognized in the prior art. The subject matter in the background section merely represents different approaches, which in and of themselves may also correspond to embodiments of the claimed subject matter.
Conventional cameras capture a single image from a single optical focal point. Such cameras are enabled to capture pixels corresponding to an object in a scene, but in so doing they lose the depth information for that object, namely where within the scene the object is positioned in terms of its distance from the camera.
Conversely, stereo cameras have two or more lenses, each with a separate image sensor, and the two or more lenses allow the camera to capture three-dimensional images through a process known as stereo photography. With such conventional stereo cameras, triangulation is used to determine the depth to an object in a scene, and triangulation in turn relies on a process known as correspondence. Correspondence presents a problem, however, of ascertaining which parts of one image, captured at a first of the lenses, correspond to parts of another image, captured at a second of the lenses. That is to say, the problem is to determine which elements of the two images correspond to one another because they represent the same portion of an object in the scene, such that triangulation may be performed to determine the depth to that object in the scene.
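By way of illustration only, the triangulation step may be sketched for the simplest textbook case. The sketch assumes a rectified pair of pinhole cameras with parallel optical axes, for which depth Z relates to disparity d (the horizontal offset between matched points) as Z = f * B / d, where f is the focal length in pixels and B is the baseline between the lenses. This background does not specify any particular geometry or formula; the function name and the numeric values below are hypothetical.

```python
def depth_from_disparity(focal_length_px, baseline_m, disparity_px):
    """Depth along the optical axis for a rectified stereo pair: Z = f * B / d.

    Assumes pinhole cameras with parallel optical axes (an assumption of this
    sketch, not something stated in the text above).
    """
    if disparity_px <= 0:
        # Zero disparity corresponds to a point at infinity; negative values
        # indicate a bad match under this geometry.
        raise ValueError("disparity must be positive for a point at finite depth")
    return focal_length_px * baseline_m / disparity_px


# Hypothetical values: 700 px focal length, 6 cm baseline, 14 px disparity.
depth_m = depth_from_disparity(700.0, 0.06, 14.0)
print(f"estimated depth: {depth_m:.2f} m")
```

Because depth is inversely proportional to disparity, the disparity itself must first be found, which is the correspondence problem described above.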
Given two or more images of the same three-dimensional scene, taken from different points of view via the two or more lenses of the stereo camera, correspondence processing requires identifying a set of points in one image which can be correspondingly identified as the same points in another image by matching points or features in one image with the corresponding points or features in another image.
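One common way to perform such matching, offered here only as a minimal sketch and not as the method of any particular stereo camera, is block matching: a small window around a pixel in one image is compared against candidate windows in the other image, and the candidate with the lowest sum of absolute differences (SAD) is taken as the match. The sketch assumes rectified images, so that the search runs along a single row, and it uses hypothetical names and hand-made pixel rows.

```python
def sad(a, b):
    """Sum of absolute differences between two equal-length pixel windows."""
    return sum(abs(x - y) for x, y in zip(a, b))


def match_along_scanline(left_row, right_row, x, half_window, max_disparity):
    """Return (best_disparity, costs) for the pixel at column x of left_row.

    A point at column x in the left image is searched for at column x - d in
    the right image, for d = 0 .. max_disparity.
    """
    lo, hi = x - half_window, x + half_window + 1
    if lo < 0 or hi > len(left_row):
        raise ValueError("window falls outside the row")
    patch = left_row[lo:hi]
    costs = {}
    for d in range(max_disparity + 1):
        if lo - d < 0:
            break  # candidate window would leave the right image
        costs[d] = sad(patch, right_row[lo - d:hi - d])
    best = min(costs, key=costs.get)
    return best, costs


# Hypothetical rows: a bright feature shifted 3 pixels between the two views.
left_row = [10, 10, 10, 10, 10, 80, 200, 90, 10, 10, 10, 10]
right_row = [10, 10, 80, 200, 90, 10, 10, 10, 10, 10, 10, 10]
best_disparity, costs = match_along_scanline(
    left_row, right_row, x=6, half_window=1, max_disparity=4
)
print(best_disparity, costs)
```

Even in this one-dimensional sketch, every pixel requires one window comparison per candidate disparity, which suggests why the cost grows quickly with image resolution and search range.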
This processing, however, is computationally intensive. It therefore either requires additional computing hardware to process higher quality imagery or necessitates a delay between image capture and completion of the correspondence processing from which the depth to an object may be determined, and such a delay eliminates the possibility of real-time image processing as is required with moving video. Moreover, further complexities are introduced by variables such as movement of the camera, the elapse of time and/or movement of objects between the images, variability in lighting conditions, and so forth. Still further, the scene from which the depth to an object is to be measured may be nearly featureless, such that the correspondence processing cannot ascertain points which match one another in the images. Consider, for instance, capturing images of a white wall or another featureless scene and trying to identify matching points within the scene. The correspondence processing will likely fail to identify sufficient correspondence points between the images, thus making triangulation ineffective.
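The white wall case can be made concrete with a small sketch. Assuming, purely for illustration, a window-based matching cost such as the sum of absolute differences along one row of a rectified image pair, a uniform scene yields the same cost for every candidate offset, so no single match can be selected. The helper names and pixel values are hypothetical.

```python
def sad_costs(left_row, right_row, x, half_window, max_disparity):
    """SAD matching cost for each candidate disparity at column x."""
    lo, hi = x - half_window, x + half_window + 1
    patch = left_row[lo:hi]
    return {
        d: sum(abs(p - q) for p, q in zip(patch, right_row[lo - d:hi - d]))
        for d in range(max_disparity + 1)
        if lo - d >= 0  # keep the candidate window inside the right image
    }


def tied_best_disparities(costs):
    """All disparities that share the minimum cost."""
    best = min(costs.values())
    return [d for d, c in costs.items() if c == best]


# Hypothetical featureless scene: every pixel has the same intensity.
wall_left = [128] * 12
wall_right = [128] * 12
wall_costs = sad_costs(wall_left, wall_right, x=6, half_window=1, max_disparity=4)
ambiguous = tied_best_disparities(wall_costs)
print(f"{len(ambiguous)} candidate disparities tie for the best cost")
```

When more than one disparity ties for the best cost, any depth derived from the chosen match is arbitrary, which is the sense in which triangulation becomes ineffective for such scenes.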
The present state of the art may therefore benefit from the systems, methods, and apparatuses for implementing maximum likelihood image binarization in a coded light range camera as is described herein.