Three-dimensional image capture has a variety of uses, including in areas such as virtual reality, object modeling, and general image capture. A number of solutions are available for performing such three-dimensional image capture, including time-of-flight based image capture, structured light based image capture, and stereo vision. Each of these processes varies in terms of computational complexity, number of sensors required, available resolution (e.g., image quality), whether color images are available, and whether an additional light source is required.
For example, in a time-of-flight image capture process, the time taken for light to travel to a subject and back is measured, and a depth distance in an image is calculated from that time of flight. With increased time granularity, finer depth calculations can be made. However, to achieve depth accuracy to within a millimeter, measurements must typically be made at the picosecond level, which requires substantial computational resources. Additionally, a special-purpose sensor may be needed (e.g., a SPAD array). In such cases, a larger pitch of such a special-purpose sensor may limit resolution in the X-Y direction, limiting image quality. Still further, in some cases, a special-purpose light source, such as a VCSEL (laser array) or LED array (e.g., NIR LED), may be required.
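The relationship between timing granularity and depth accuracy can be illustrated with a minimal sketch. It assumes an idealized direct time-of-flight measurement in which light travels to the subject and back at the vacuum speed of light; the function names are hypothetical and chosen only for illustration.

```python
# Illustrative sketch only: idealized direct time-of-flight geometry.
# Function names are hypothetical; no sensor effects are modeled.

SPEED_OF_LIGHT_M_PER_S = 299_792_458.0  # speed of light in vacuum


def tof_depth_m(round_trip_s: float) -> float:
    """Depth for a measured round-trip time (out and back, so divide by 2)."""
    return SPEED_OF_LIGHT_M_PER_S * round_trip_s / 2.0


def timing_resolution_s(depth_resolution_m: float) -> float:
    """Round-trip timing step needed to resolve a given depth step."""
    return 2.0 * depth_resolution_m / SPEED_OF_LIGHT_M_PER_S


depth_at_10_ns = tof_depth_m(10e-9)             # roughly 1.5 m
step_for_1_mm = timing_resolution_s(1e-3)       # roughly 6.67e-12 s
```

Under these assumptions, resolving a depth difference of one millimeter corresponds to a round-trip timing difference of roughly 6.67 picoseconds, which is consistent with the picosecond-level measurement noted above.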
In the case of structured light based image capture, a pattern of light is projected onto a subject, and deformation of the light pattern by the subject is observed to detect a shape of the subject. A camera offset from the pattern projector can observe the shape of the pattern and calculate a distance/depth for each point within a field of view. Such systems are generally fast and relatively accurate since they can scan multiple points or an entire field of view at once; however, such systems require a very specific illumination source to accomplish depth calculations.
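One common way to compute depth from an offset projector and camera is planar triangulation. The following is a simplified sketch under stated assumptions, not a description of any particular system: the projector and camera lie on a common baseline, each angle is measured between the baseline and the ray to the same illuminated point, and the function name is hypothetical.

```python
import math

# Illustrative sketch only: planar triangulation between a pattern projector
# and an offset camera. Names and geometry are assumptions for illustration.


def triangulated_depth_m(baseline_m: float,
                         projector_angle_rad: float,
                         camera_angle_rad: float) -> float:
    """Perpendicular distance from the baseline to the illuminated point.

    Both angles are interior angles of the triangle formed by the projector,
    the camera, and the point, so each must be positive and their sum must be
    less than pi for the two rays to intersect in front of the baseline.
    """
    if baseline_m <= 0:
        raise ValueError("baseline must be positive")
    if not (projector_angle_rad > 0 and camera_angle_rad > 0
            and projector_angle_rad + camera_angle_rad < math.pi):
        raise ValueError("rays do not intersect in front of the baseline")
    tan_p = math.tan(projector_angle_rad)
    tan_c = math.tan(camera_angle_rad)
    # depth = baseline / (cot(projector) + cot(camera))
    return baseline_m * tan_p * tan_c / (tan_p + tan_c)


# Example: 0.2 m baseline with both rays at 45 degrees gives about 0.1 m.
example_depth = triangulated_depth_m(0.2, math.radians(45), math.radians(45))
```

In this simplified model, the deformation of the projected pattern shows up as a change in the camera angle for a known projector angle, which is what allows a depth to be assigned to each observed point.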
For this reason, many optical depth detection systems employ stereo vision techniques. Such systems typically employ two or more video cameras spaced from each other. By analyzing slight differences between the images captured by each camera, a distance at each point in the images can be determined. Although this does not require an additional light source (as in the case of a structured light based image capture process), it does require two sensors and significant computation to identify a matching point at which the two or more cameras can be focused. Matching points may be difficult to identify for objects with little or no texture.
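Once a matching point has been identified, the conversion from image difference (disparity) to distance is comparatively simple. The sketch below assumes two rectified pinhole cameras with identical focal length and parallel optical axes; the function and parameter names are hypothetical, and the costly matching step itself is not shown.

```python
# Illustrative sketch only: depth from disparity for an idealized,
# rectified two-camera setup. Names are hypothetical.


def stereo_depth_m(focal_length_px: float,
                   baseline_m: float,
                   disparity_px: float) -> float:
    """Depth of a matched point: focal length times baseline over disparity."""
    if disparity_px <= 0:
        # Zero disparity corresponds to a point at infinity; negative values
        # indicate a bad match under this camera model.
        raise ValueError("disparity must be positive")
    return focal_length_px * baseline_m / disparity_px


# Example: 800 px focal length, 0.1 m baseline, 20 px disparity -> about 4 m.
example_depth = stereo_depth_m(800.0, 0.1, 20.0)
```

Because depth is inversely proportional to disparity in this model, a one-pixel matching error has a larger effect on distant points than on near ones, which is one reason reliable matching matters for low-texture objects.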
As seen above, where additional light sources or sensors are required, cost, power, and computational complexity are all generally increased. However, in typical scenarios, such additional light sources or sensors are required for improved image quality.