Acquiring 3D geometric information from real environments is an essential task for many applications in computer graphics. Prominent examples such as virtual and augmented environments and human-machine interaction, e.g. for gaming, clearly benefit from simple and accurate devices for real-time range image acquisition. However, even for static scenes there is no low-price off-the-shelf system that provides full-range, high-resolution distance information in real time. Laser triangulation techniques, which merely sample a scene row by row with a single laser device, are rather time-consuming and therefore impractical for dynamic scenes. Stereo vision camera systems suffer from the inability to match correspondences in homogeneous object regions.
Time-of-Flight (ToF) technology, based on measuring the time that light emitted by an illumination unit requires to travel to an object and back to a detector, is used in LIDAR (Light Detection and Ranging) scanners for high-precision distance measurements. Recently, this principle has been the basis for the development of new range-sensing devices, so-called ToF cameras, which are realized in standard CMOS or CCD technology; in the context of photogrammetry, ToF cameras are also called Range Imaging (RIM) sensors. Unlike other 3D systems, the ToF camera is a very compact device which already provides most of the features stated above as desirable for real-time distance acquisition. There are two main approaches currently employed in ToF technology. The first one utilizes modulated, incoherent light, and is based on a phase measurement. The second approach is based on an optical shutter technology, which was first used for studio cameras and was later developed for miniaturized cameras.
Within the last three years, the number of research activities in the context of ToF cameras has increased dramatically. While the initial research focused on more basic questions like sensor characteristics and the application of ToF cameras for the acquisition of static scenes, other application areas have recently come into focus, e.g. human-machine interaction and surveillance.
A variety of safety-enhancing automobile features can be enabled by digital signal processors that can sense and analyze the dynamic 3D environment inside and outside the vehicle. Safety features may include collision warning and avoidance, smart airbag deployment, obstacle detection such as backup warning, and parking assistance. Common to these applications is the need to detect, isolate, measure, locate, recognize, and track objects such as people, traffic, and roadside features.
It is often proposed to perform these tasks using conventional 2D imaging sensors and analysis software, but achieving cost-effective and reliable performance during all vehicular usage scenarios is a formidable challenge. The appearance of objects in a 2D image varies greatly, depending on illumination conditions, surface materials, and object orientation. These variations in the image complicate the task of software that must interpret the scene. On the other hand, the 3D shape of objects is invariant to those confounding effects.
Stereovision-based 3D recovery is computationally complex and fails on unpatterned surfaces. RADAR, ultrasonic, scanning LADAR, and other ranging technologies are similarly proposed, but they have difficulty discriminating objects due to limited temporal or angular resolution; moreover, the need for specialized sensors for each safety function poses system integration challenges. A single high-frame-rate focal-plane-array 3D sensor is desirable because it can serve multiple safety and convenience functions simultaneously, allowing applications to jointly exploit shape and appearance information in a dynamic scene. The output of the sensor should be a sequence of 2D arrays of pixel values, where each pixel value describes the brightness and Cartesian X,Y,Z coordinates of a 3D point on the surface of the scene.
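The sensor output described above can be pictured as a simple data structure. The following is a minimal sketch only, under stated assumptions: the names RangePixel, Frame, and make_frame are hypothetical and are not taken from any particular sensor interface, and the units (meters for coordinates) are assumed for illustration.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class RangePixel:
    """One pixel of a range image (hypothetical layout, for illustration)."""
    brightness: float  # amount of light returned from the scene point
    x: float           # Cartesian coordinates of the 3D surface point,
    y: float           # assumed here to be expressed in meters
    z: float


# A frame is a 2D array (rows x columns) of pixels; the sensor output
# is then a sequence of such frames over time.
Frame = List[List[RangePixel]]


def make_frame(rows: int, cols: int) -> Frame:
    """Create an empty frame with every pixel initialized to zero."""
    return [
        [RangePixel(0.0, 0.0, 0.0, 0.0) for _ in range(cols)]
        for _ in range(rows)
    ]
```

Because each pixel carries both brightness and position, an application can use shape and appearance together without registering two separate sensors.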
Growing government legislation, increasing liability concerns, and the inevitable consumer desire for improved safety make the introduction of new safety features a high priority for automakers. Today, various sensing technologies play a key role in delivering these features, detecting conditions both inside and outside of the vehicle in applications like parking assistance, adaptive cruise control, and pre-crash collision mitigation. Each of these applications is characterized by a unique customized technology (e.g. ultrasonic, RADAR, LADAR, or digital image sensing), which generally provides either a ranging function or an object recognition function.
The need for investment in multiple disparate technologies makes it challenging to deploy individual safety features as quickly or as broadly as desired.
Future applications pose even more difficulties, as multiple features must be provided in a single vehicle. Moreover, virtually all of the new sensing applications on automakers' roadmaps (e.g. pedestrian detection being planned in Europe and Japan) require both ranging and object recognition functions. Combining two incongruent technologies to accomplish this task (such as RADAR and digital image sensing) is expensive, difficult to implement, and poses the additional problem of inefficient development.
The use of vision gives added levels of discernment to air bag systems by providing static or dynamic occupant classification and position sensing. Further, the addition of a vision system inside the cabin enables other value-added applications such as abandoned baby/pet detection, personalization, and security. Applications for vision-based sensing outside the car include blind spot detection, lane departure detection, rear-vision safety, sensing the proximity of other vehicles around the vehicle, and proximity sensing for off-road and heavy equipment. The benefits of vision sensors are twofold. First, they provide enhanced visual feedback to assist the driver in operating the vehicle. Second, and more importantly, when vision sensors also provide range data, they supply the information that advanced algorithms need to achieve a higher level of discernment and a more accurate analysis of object motion dynamics. With such sensors, for instance, the system can use the shape differences between a person and a large box sitting in the front seat to decide whether or not to deploy the air bag.
In addition to depth values, ToF cameras also provide intensity values, representing the amount of light sent back from a specific point.
Due to the periodicity of the modulation signal, ToF cameras have a defined non-ambiguous range. Within this range, distances can be computed uniquely. The range depends on the modulation frequency of the camera, which defines the wavelength of the emitted signal. As shown in FIG. 1, to compute distances the camera evaluates the phase shift between a reference (emitted) signal 101 and the received signal 102. The phase shift is proportional to the distance d.
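The proportionality between phase shift and distance can be made concrete with a short sketch. This is an illustration only, assuming an ideal sinusoidal modulation and a phase shift already measured in radians; the function name phase_to_distance and its parameters are hypothetical and do not describe any specific camera's processing.

```python
import math

C = 299_792_458.0  # speed of light in meters per second


def phase_to_distance(phase_rad: float, mod_freq_hz: float) -> float:
    """Distance d in meters for a measured phase shift in radians.

    The light travels to the object and back, so the round trip is 2*d.
    The phase shift is therefore 2*pi*f * (2*d / c), which gives
    d = c * phase / (4*pi*f). Assumes 0 <= phase_rad < 2*pi.
    """
    return C * phase_rad / (4.0 * math.pi * mod_freq_hz)


if __name__ == "__main__":
    # A phase shift of half a cycle (pi radians) at 20 MHz modulation
    # corresponds to half of the non-ambiguous range.
    print(f"{phase_to_distance(math.pi, 20e6):.2f} m")
```

A phase shift that exceeds one full cycle wraps around, which is why distances beyond the non-ambiguous range cannot be distinguished from shorter ones by this calculation alone.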
Currently, most ToF cameras operate at a modulation frequency of about 20 MHz for applications such as gaming, gesture-based TV control, and digital signage. At this frequency, a single modulation wavelength is about 15 meters, and the unique range of these ToF cameras is approximately 7.5 meters, i.e. half the wavelength, because the light must travel to the object and back. For automotive use, where a car moves at 60 miles per hour or faster, this frequency may be changed to obtain longer range coverage. The range can be adjusted by adjusting the modulation frequency of the active illumination depending on the automobile's speed.
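The arithmetic above, and the idea of choosing the modulation frequency from the vehicle's speed, can be sketched as follows. This is a rough illustration under stated assumptions, not a definitive implementation: the function names are hypothetical, and the look-ahead time used to turn a speed into a required range is an assumed policy introduced only for this example.

```python
C = 299_792_458.0          # speed of light in meters per second
MPH_TO_MPS = 0.44704       # exact conversion from miles per hour to m/s


def modulation_wavelength(mod_freq_hz: float) -> float:
    """Wavelength of the modulation signal in meters."""
    return C / mod_freq_hz


def unambiguous_range(mod_freq_hz: float) -> float:
    """Unique range in meters: half the wavelength (round-trip travel)."""
    return C / (2.0 * mod_freq_hz)


def frequency_for_range(range_m: float) -> float:
    """Highest modulation frequency in Hz whose unique range covers range_m."""
    return C / (2.0 * range_m)


def frequency_for_speed(speed_mph: float, lookahead_s: float) -> float:
    """Hypothetical policy: cover the distance driven in lookahead_s seconds."""
    required_range_m = speed_mph * MPH_TO_MPS * lookahead_s
    return frequency_for_range(required_range_m)


if __name__ == "__main__":
    print(f"{modulation_wavelength(20e6):.2f} m")  # 14.99 m
    print(f"{unambiguous_range(20e6):.2f} m")      # 7.49 m
    # Assumed example: 60 mph with a 3 second look-ahead.
    print(f"{frequency_for_speed(60.0, 3.0) / 1e6:.2f} MHz")
```

Lowering the modulation frequency lengthens the unique range, so a faster vehicle would use a lower frequency under this assumed policy.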