Recent improvements in imaging include the use of depth information to produce three dimensional images. In consumer applications, and in gaming systems in particular, depth image sensing has become increasingly common in commercial products following the introduction of game systems that sense player motion, such as the Kinect system from Microsoft. In these systems, players can use motion of their hands, heads, or bodies to interact with the system and change the image that the game displays on their television, instead of using only a handheld controller. 3D depth sensing thus enables natural user interaction with machines using hand and body gestures in place of keystrokes, keyboard entry, or controller buttons. Many other applications are emerging, such as hand or body control of computers and portable devices, intelligent machine vision applications, and surveillance and analytics.
Depth imaging involves imaging a scene in front of an imager (a camera or image sensing system) in three dimensions: x, y and z. The depth image sensor can detect movement for gesture control and other three dimensional (3D) machine vision applications.
In the known prior solutions, depth imaging has required highly specialized technology: a dedicated imaging sensor produced using advanced semiconductor processes, a specialized illumination system, and/or specialized information processing systems.
In the approaches known to date, stereo vision, structured light, or dedicated Time of Flight (“TOF”) sensing schemes have been used to determine the depth information. Stereo vision systems require two sensors, spaced apart, and also require very intensive post processing of the images to extract the 3D information. Structured light systems require a specialized diffraction emitter and, further, a dedicated post processing processor (yet another specialized integrated circuit or chip).
TOF sensing uses the time delay involved in the transit of light from an illumination source, or emitter, to the objects in a scene and back to the sensor pixels. The transit time is measured as a phase shift, and the phase shift can be used to calculate the distance, i.e., the depth. In the known prior systems, the TOF measurements use a dedicated image sensor optimized for the depth image. This dedicated sensor is produced in a modified semiconductor process technology, for example a specialized modification of a CMOS pixel structure. The dedicated sensor is used to demodulate the incoming light and thus measure the phase shift of the incoming light relative to the outgoing light from the illumination source. The known methods require significant and costly additional hardware, either in the form of dedicated and specialized depth image sensors or in the form of significant post processing solutions, and are far more complex and costly than the hardware and processors needed to produce a conventional two dimensional imaging system, such as a digital camera. Further, in applications that require both depth image sensing for 3D and a visual or display image, such as a two dimensional (“2D”) color image, multiple sensors are needed to produce the two kinds of images. This is the case for the conventional 3D gesture control applications currently known, such as gaming consoles.
FIG. 1 depicts, in an example system diagram, the TOF measurements as performed using known prior approaches. In FIG. 1, a system 10 is shown including a controlled infrared illumination source 13 that is coupled to a sensor 15. Optic lens 17 collects light and focuses the collected light onto the sensor 15. A person 11 is shown standing in front of the sensor 15 and the illumination source 13.
In performing depth imaging, a time of flight (TOF) measurement may be used. The controlled illumination source 13, which may be one infrared (“IR”) LED element or an array of such elements, for example, is turned on. The time that the IR illumination takes to leave the illumination system, reach the objects in the scene, and return as reflected light to the sensor 15 is the ‘time of flight’. Because the light travels at the speed of light, this time of flight is the time the light takes to travel twice the distance between the person 11 in the scene and the sensor 15. That distance is the depth of the object from the sensor (depth into the scene).
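The relation between time of flight and depth described above can be illustrated with a minimal sketch. The function names are hypothetical and chosen only for illustration, and the sketch assumes the emitter and sensor are effectively co-located so that the light path is twice the depth.

```python
# Illustrative sketch only: function names are hypothetical.
# Assumes the emitter and sensor sit at the same position, so the light
# travels the depth twice (out to the object and back).

SPEED_OF_LIGHT_M_PER_S = 299_792_458.0  # speed of light in vacuum


def depth_from_round_trip_time(round_trip_time_s: float) -> float:
    """Return the depth in meters for a measured round-trip time in seconds."""
    if round_trip_time_s < 0:
        raise ValueError("round-trip time cannot be negative")
    # The light covers twice the depth, so halve the total path length.
    return SPEED_OF_LIGHT_M_PER_S * round_trip_time_s / 2.0


def round_trip_time_for_depth(depth_m: float) -> float:
    """Return the round-trip time in seconds for an object at depth_m meters."""
    if depth_m < 0:
        raise ValueError("depth cannot be negative")
    return 2.0 * depth_m / SPEED_OF_LIGHT_M_PER_S


if __name__ == "__main__":
    t = round_trip_time_for_depth(2.0)
    print(f"round trip for 2 m: {t * 1e9:.2f} ns")
    print(f"recovered depth: {depth_from_round_trip_time(t):.3f} m")
```

For an object 2 meters away the round trip is on the order of 13 nanoseconds, which suggests why timing the flight directly calls for very fast timing circuitry.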
In some conventional approaches, TOF measurements are made directly: the time is measured using a digital or analog time keeper. Alternatively, the TOF measurement can be made indirectly, by obtaining the phase difference between the emitted light (from the illumination source) and the received modulated light (reflected from the objects in the scene). In most conventional systems the phase difference approach is used. It is particularly popular for depth image applications that operate over relatively short distances, such as 1 meter to 20 meters.
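The phase difference approach can be illustrated with a minimal sketch, under assumptions that the text above does not state: the illumination is taken to be modulated continuously at a single frequency, and the function names and the 20 MHz example value are hypothetical. Under those assumptions a phase shift of 2π corresponds to one full modulation period of round-trip delay, so the depth is c·φ/(4π·f), and depths beyond c/(2·f) wrap around and become ambiguous.

```python
# Illustrative sketch only. Assumes continuous-wave illumination modulated
# at a single frequency; names and example values are hypothetical.
import math

SPEED_OF_LIGHT_M_PER_S = 299_792_458.0  # speed of light in vacuum


def unambiguous_range_m(modulation_hz: float) -> float:
    """Largest depth measurable before the phase wraps past 2*pi."""
    if modulation_hz <= 0:
        raise ValueError("modulation frequency must be positive")
    return SPEED_OF_LIGHT_M_PER_S / (2.0 * modulation_hz)


def depth_from_phase(phase_rad: float, modulation_hz: float) -> float:
    """Convert a measured phase shift (radians) to depth in meters."""
    if modulation_hz <= 0:
        raise ValueError("modulation frequency must be positive")
    # Phase is only known modulo 2*pi, so fold it into [0, 2*pi).
    wrapped = phase_rad % (2.0 * math.pi)
    # Round-trip delay is wrapped / (2*pi*f); depth is half the path length.
    return SPEED_OF_LIGHT_M_PER_S * wrapped / (4.0 * math.pi * modulation_hz)


if __name__ == "__main__":
    f_mod = 20e6  # assumed example modulation frequency, 20 MHz
    print(f"unambiguous range: {unambiguous_range_m(f_mod):.3f} m")
    print(f"depth at pi/2 phase: {depth_from_phase(math.pi / 2, f_mod):.3f} m")
```

Under these assumptions, a lower modulation frequency extends the unambiguous range while a higher one gives finer depth resolution per unit of phase, which is one plausible reason the approach is associated with ranges of a few meters to a few tens of meters.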
In order to optimize the depth imaging, several specialized pixel structures have been fabricated. Known approaches are illustrated, for example, by systems made and sold by PMD Tech, described at the world wide web url address: http://www.pmdtec.com/technology/unique_features.php/. PMD Tech apparently offers a dedicated integrated circuit produced in a CMOS process that is optimized for depth imaging. Another similar solution is offered by Softkinetic, described at the world wide web url address: http://www.softkinetic.com/products/depthsensesensors.aspx. Another known approach is offered by Mesa Imaging, described at the world wide web url address: http://www.mesa-imaging.ch/products/product-overview/.
The prior known approaches require the use of a modified CMOS semiconductor process to build a dedicated depth imaging sensor. While TOF measurements are available from such a specialized sensor, a typical depth imaging application also needs a visual image of reasonably high quality. To provide both the depth imaging features and a visual display image such as a 2D color image, the prior known approaches therefore require, in addition to the costly depth imaging sensor, a second image sensor for the visual image, along with the optics to use with that sensor. Thus, in order to meet the needs of a typical application, the known prior system requires two complete image sensors (and the associated hardware and software) to form the desired 3D image.
Improvements in the methods used to produce depth images for 3D applications, as well as images for visual display, are therefore needed. Solutions for depth and visual imaging are needed that do not require additional proprietary and expensive depth image sensors, and which are cost effective, simple to implement, robust and ready to market.