The subject matter discussed in the background section should not be assumed to be prior art merely as a result of its mention in the background section. Similarly, a problem mentioned in the background section or associated with the subject matter of the background section should not be assumed to have been previously recognized in the prior art. The subject matter in the background section merely represents different approaches, which in and of themselves may also correspond to embodiments of the claimed subject matter.
Conventional cameras capture a single image from a single optical focal point. Such cameras capture pixels corresponding to an object in a scene, but in so doing they lose the depth information indicating where within the scene that object is positioned in terms of distance from the camera.
Conversely, stereo cameras have two or more lenses, either on the same image sensor or on separate image sensors, and the two or more lenses allow the camera to capture three-dimensional images through a process known as stereo photography. With such conventional stereo cameras, triangulation is used to determine the depth to an object in a scene using a process known as correspondence. Correspondence presents a problem, however, of ascertaining which parts of one image captured at a first of the lenses correspond to parts of another image captured at a second of the lenses. That is to say, it must be determined which elements of the two images correspond to one another because they represent the same portion of an object in the scene, such that triangulation may be performed to determine the depth to that object in the scene.
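By way of illustration only, the triangulation step may be sketched as follows once a correspondence has been found. The sketch assumes a rectified two-lens arrangement in which depth follows from similar triangles as Z = f * B / d. The function name and the parameters focal_length_px, baseline_m, and disparity_px are hypothetical and do not appear in the description above.

```python
def depth_from_disparity(focal_length_px, baseline_m, disparity_px):
    """Return depth in meters for one matched point pair.

    Assumes a rectified stereo pair (an assumption of this sketch):
      focal_length_px: lens focal length expressed in pixels
      baseline_m:      distance between the two optical centers, in meters
      disparity_px:    horizontal shift of the point between the two images
    """
    if disparity_px <= 0:
        # Zero disparity corresponds to a point at infinity; negative
        # disparity indicates an invalid match under this convention.
        raise ValueError("disparity must be positive")
    return focal_length_px * baseline_m / disparity_px


# Hypothetical values chosen only to exercise the formula.
depth_m = depth_from_disparity(focal_length_px=700.0, baseline_m=0.05, disparity_px=35.0)
```

Because depth is inversely proportional to disparity, a fixed error in the matched position produces a larger depth error for distant objects than for near ones.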
Given two or more images of the same three-dimensional scene, taken from different points of view via the two or more lenses of the stereo camera, correspondence processing requires identifying a set of points in one image that can be identified as the same points in another image, by matching points or features in the one image with the corresponding points or features in the other image.
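One simple way such matching may be illustrated is window-based matching along a single image row using a sum of absolute differences (SAD) cost. This technique is chosen here only as an example and is not stated above to be the matching method of any particular system. The sketch assumes rectified images, so that a point in the left image appears at the same row and at an equal or smaller column in the right image. All names are hypothetical.

```python
def match_along_row(left_row, right_row, x_left, half_window, max_disparity):
    """Find the disparity that best matches a window of left_row in right_row.

    Compares the window centered at x_left in left_row against windows
    centered at x_left - d in right_row for d = 0..max_disparity, and
    returns (best_disparity, best_cost) under the SAD cost.
    """
    if len(left_row) != len(right_row):
        raise ValueError("rows must have equal length")
    lo, hi = x_left - half_window, x_left + half_window + 1
    if lo < 0 or hi > len(left_row):
        raise ValueError("window falls outside the left row")

    patch = left_row[lo:hi]
    best_d, best_cost = None, None
    for d in range(max_disparity + 1):
        r_lo, r_hi = lo - d, hi - d
        if r_lo < 0:
            break  # candidate window would run off the edge of the right row
        candidate = right_row[r_lo:r_hi]
        cost = sum(abs(a - b) for a, b in zip(patch, candidate))
        # Strict comparison keeps the smallest disparity when costs tie.
        if best_cost is None or cost < best_cost:
            best_d, best_cost = d, cost
    return best_d, best_cost


# Hypothetical intensity rows: the bright feature sits two pixels further
# left in the right row than in the left row.
left = [10, 10, 10, 80, 200, 90, 10, 10, 10, 10]
right = [10, 80, 200, 90, 10, 10, 10, 10, 10, 10]
disparity, cost = match_along_row(left, right, x_left=4, half_window=1, max_disparity=3)
```

On a row of uniform intensity, every candidate window yields the same cost, so the returned disparity is arbitrary. This illustrates why correspondence is difficult in regions of a scene that lack texture.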
Other three-dimensional (3D) processing methodologies exist besides correspondence-based triangulation, such as laser time-of-flight and projection of coded light.
When determining depth to an object in a scene, the detectors must receive light from the scene by which the objects may be observed, such that depth can be determined. Many scenes, however, lack sufficient ambient light, especially where objects are farther from the camera, where the scene observed by the camera is large, or where the natural light within the scene being imaged is scarce, such as is common with indoor environments.
Certain 3D imaging and depth sensing systems have incorporated a laser projector to improve correspondence processing by providing both assisted lighting of the scene and artificial texturing of the scene. However, the conventional solutions applied to 3D imaging and depth sensing systems suffer from a variety of drawbacks.
Fundamentally, there is a risk with regard to use of laser projection in 3D camera technologies, especially in the consumer space, due to the very simple fact that a laser may easily surpass a safety limit for its use and thus pose a very real threat of causing injury or even blindness to a human subject.
The present state of the art may therefore benefit from the systems, methods, and apparatuses for implementing a stereo depth camera using a VCSEL projector with spatially and temporally interleaved patterns as is described herein.