Three-dimensional (3D) cameras (or sensors) based on the time-of-flight (TOF) principle acquire distance information from object(s) in a scene being imaged. Distance information is produced independently at each pixel of the camera's sensor. Exemplary such systems are described in U.S. Pat. No. 6,323,942 “CMOS-Compatible Three-Dimensional Image Sensor IC” (2001), and U.S. Pat. No. 6,515,740 “Methods for CMOS Compatible Three-Dimensional Image Sensing Using Quantum Efficiency Modulation” (2003), which patents are assigned to Canesta, Inc., presently of Sunnyvale, Calif.
As described in U.S. Pat. No. 6,323,942, a TOF system emits optical energy and determines how long it takes until at least some of that energy reflected by a target object arrives back at the system to be detected. Emitted optical energy traversing to more distant surface regions of a target object before being reflected back toward the system will define a longer TOF than if the target object were closer to the system. If the roundtrip TOF is denoted t1, then the distance between target object and the TOF system is Z1, where Z1=t1·C/2, and where C is the velocity of light. Such systems can acquire both luminosity data (signal amplitude) and TOF distance, and can realize three-dimensional images of a target object in real time.
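By way of illustration only, the relation Z1=t1·C/2 may be sketched as follows. The function name and the 20 ns roundtrip value are assumptions chosen for this example and do not appear in the referenced patents.

```python
# Illustrative sketch only: names and the sample value are hypothetical.
C = 299_792_458.0  # velocity of light in vacuum, m/s


def tof_to_distance(t1_seconds: float) -> float:
    """Return the one-way distance Z1 (metres) for a roundtrip TOF t1."""
    # The emitted energy travels to the target and back, hence the factor 1/2.
    return t1_seconds * C / 2.0


# A roundtrip TOF of 20 ns corresponds to a target roughly 3 m away.
z1 = tof_to_distance(20e-9)
```

The sketch shows why direct TOF measurement demands fine timing resolution: each nanosecond of roundtrip time corresponds to about 15 cm of distance.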
A more sophisticated TOF system is described in U.S. Pat. No. 6,515,740, wherein TOF is determined by examining relative phase shift between transmitted light signals and light signals reflected from a target object. FIG. 1A depicts an exemplary phase-shift detection system 100 according to the '740 patent. Detection of the reflected light signals over multiple locations in the system pixel array results in measurement signals that are referred to as depth images. The depth images represent a three-dimensional image of the target object surface.
Referring to FIG. 1A, TOF system 100 includes a two-dimensional array 130 of pixel detectors 140, each of which has dedicated circuitry 150 for processing detection charge output by the associated detector. In a typical application, array 130 might include 100×100 pixels 140, and thus include 100×100 processing circuits 150. IC 110 may also include a microprocessor or microcontroller unit 160, memory 170 (which preferably includes random access memory or RAM and read-only memory or ROM), a high speed distributable clock 180, and various computing and input/output (I/O) circuitry 190. Among other functions, controller unit 160 may perform distance to object and object velocity calculations.
Under control of microprocessor 160, a source of optical energy 120 is periodically energized via exciter 115, and emits optical energy via lens 125 toward a target object 20. Typically the optical energy is light, for example emitted by a laser diode, VCSEL (vertical-cavity surface emitting laser) or LED device 120. Some of the optical energy emitted from device 120 will be reflected off the surface of target object 20, and will pass through an aperture field stop and lens, collectively 135, and will fall upon two-dimensional array 130 of pixel detectors 140 where an image is formed. In some implementations, each imaging pixel detector 140 captures the time-of-flight (TOF) required for optical energy transmitted by emitter 120 to reach target object 20 and be reflected back for detection by two-dimensional sensor array 130. Using this TOF information, distances Z can be determined. Advantageously system 100 can be implemented on a single IC 110, without moving parts and with relatively few off-chip components.
Typically optical energy source 120 emits preferably low power (e.g., perhaps 1 W peak) periodic waveforms, producing optical energy emissions of known frequency (perhaps 30 MHz to many hundred MHz) for a time period known as the shutter time (perhaps 10 ms). Optical energy from emitter 120 and detected optical energy signals within pixel detectors 140 are synchronous to each other such that phase difference and thus distance Z can be measured for each pixel detector. The detection method used is referred to as homodyne detection in the '740 and '496 patents. Phase-based homodyne detection TOF systems are also described in U.S. Pat. No. 6,906,793, Methods and Devices for Charge Management for Three-Dimensional Sensing, assigned to Canesta, Inc., assignee herein.
The optical energy detected by the two-dimensional imaging sensor array 130 will include light source amplitude or intensity information, denoted as “A”, as well as phase shift information, denoted as φ. As depicted in exemplary waveforms in FIGS. 1B and 1C, the received phase shift information (FIG. 1C) varies with TOF and can be processed to yield Z data. For each pulse train of optical energy transmitted by emitter 120, a three-dimensional image of the visible portion of target object 20 is acquired, from which intensity and Z data are obtained (DATA). As described in U.S. Pat. Nos. 6,515,740 and 6,580,496, obtaining depth information Z requires acquiring at least two samples of the target object (or scene) 20 with 90° phase shift between emitted optical energy and the pixel detected signals. While two samples is a minimum figure, preferably four samples, 90° apart in phase, are acquired to permit reduction of detection error due to mismatches in pixel detector performance, mismatches in associated electronic implementations, and other errors. On a per pixel detector basis, the measured four sample data are combined to produce actual Z depth information data. Further details as to implementation of various embodiments of phase shift systems may be found in U.S. Pat. Nos. 6,515,740 and 6,580,496.
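One common way of combining four samples taken 90° apart may be sketched as follows. The pixel response model (a constant offset plus a cosine term), the function names, and the numeric values are assumptions made for illustration; the exact combination used in the '740 and '496 patents is not reproduced here.

```python
import math


def sample(phi, theta_deg, brightness=2.0, offset=5.0):
    """Hypothetical pixel response for an inserted phase delay theta.

    Assumed model: offset + B*cos(phi - theta). The offset stands in for
    ambient light and fixed detector mismatch.
    """
    return offset + brightness * math.cos(phi - math.radians(theta_deg))


def recover_phase(s0, s90, s180, s270):
    """Combine four samples, 90 degrees apart, into phase and brightness."""
    i = s0 - s180    # equals 2*B*cos(phi); the common offset cancels
    q = s90 - s270   # equals 2*B*sin(phi); the common offset cancels
    phase = math.atan2(q, i) % (2.0 * math.pi)
    brightness = 0.5 * math.hypot(i, q)
    return phase, brightness


true_phi = 1.0  # radians, chosen arbitrarily for the example
samples = [sample(true_phi, t) for t in (0, 90, 180, 270)]
phi_est, b_est = recover_phase(*samples)
```

Under this assumed model, subtracting samples taken 180° apart removes any offset common to both, which suggests why four samples can reduce errors that two samples cannot.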
FIG. 1D is similar to what is described with respect to the fixed phase delay embodiment of FIG. 10 in U.S. Pat. No. 6,580,496, entitled Systems for CMOS-Compatible Three-Dimensional Image Sensing Using Quantum Efficiency Modulation, or in U.S. Pat. No. 6,906,793, entitled Methods and Devices for Charge Management for Three-Dimensional Sensing, both patents assigned to Canesta, Inc., assignee herein. In FIG. 1D, generated photocurrent from each quantum efficiency modulated differential pixel detector, e.g., 140-1, is differentially detected (DIF·DETECT) and differentially amplified (AMP) to yield signals B·cos(φ), B·sin(φ), where B is a brightness coefficient.
During normal run-time operation of the TOF system, a fixed 0° or 90° phase shift delay (DELAY) is switchably insertable responsive to a phase select control signal (PHASE SELECT). Homodyne mixing occurs using quantum efficiency modulation to derive phase difference between transmitted and received signals (see FIGS. 1B, 1C), and to derive TOF, among other data. A more detailed description of homodyne detection in phase-based TOF systems is found in the '496 patent. Although sinusoidal type periodic waveforms are indicated in FIG. 1D, non-sinusoidal waveforms may instead be used. As described later herein, the detection circuitry of FIG. 1D may be used with embodiments of the present invention.
In many applications it is advantageous to have geometric information, as such information makes it easier to perceive and interact with the real world. As noted, three-dimensional TOF camera systems including exemplary system 100 in FIG. 1A accomplish this task using a modulated light source 120 (e.g., an LED, a laser, a VCSEL, etc.) to illuminate a scene containing a target object 20. The light reflected from the scene is processed in the camera's sensor pixels to determine the phase delay (φ) between the transmitted light and reflected light. Phase delay (or simply phase herein) is proportional to the (Z) distance between the sensor and the target. However, phase delay is a relative quantity and is not per se equal to Z distance. For example, as Z increases, phase φ increases, but after an increase of 360° the phase folds over, and further increases in Z will produce further increases in φ, again starting from 0°. It is thus necessary to disambiguate or de-alias the phase data to obtain a true measure of Z.
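The fold-over described above may be illustrated with the standard relation between roundtrip phase and distance for a modulation frequency f, namely φ=4π·f·Z/C taken modulo 360°. The 50 MHz modulation frequency, the 7 m range limit, and all function names below are assumptions chosen for this sketch.

```python
import math

C = 299_792_458.0  # velocity of light, m/s


def unambiguous_range(f_mod_hz):
    """Distance over which phase advances by a full 360 degrees."""
    return C / (2.0 * f_mod_hz)


def wrapped_phase(z_m, f_mod_hz):
    """Roundtrip phase in radians, folded into [0, 2*pi)."""
    return (4.0 * math.pi * f_mod_hz * z_m / C) % (2.0 * math.pi)


def candidate_distances(phase_rad, f_mod_hz, max_range_m):
    """All Z values up to max_range_m that would produce the same phase."""
    step = unambiguous_range(f_mod_hz)
    z = C * phase_rad / (4.0 * math.pi * f_mod_hz)
    out = []
    while z <= max_range_m:
        out.append(z)
        z += step
    return out


f_mod = 50e6  # hypothetical modulation frequency, Hz
near = wrapped_phase(1.0, f_mod)
far = wrapped_phase(1.0 + unambiguous_range(f_mod), f_mod)
candidates = candidate_distances(near, f_mod, max_range_m=7.0)
```

In this example, targets at about 1 m, 4 m, and 7 m all yield the same measured phase, so a single phase reading cannot by itself identify which of the candidate distances is correct.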
Furthermore, the sensor's pixels measure phase delay along a certain radial angle that is different for each pixel 140 in array 130. However many applications prefer using Cartesian (or real world X,Y,Z) coordinates instead of radial information. A mechanism is needed to establish correspondence or mapping between phase and real world coordinates. Such a mechanism is obtained through a calibration process.
Thus, one function of calibration may be defined as creating a mapping from the sensor 140 response to geometrical coordinates, which are X, Y, and Z information with respect to a known reference. As used herein, X and Y coordinates are the horizontal and vertical offsets from the optical axis of the system, and Z is the perpendicular distance between the sensor and the target object (e.g., object in a scene). Typically the calibration process includes several steps, where each step creates one kind of mapping. For instance, the mapping for real-world Z coordinates is done by a step called Z (distance or depth) calibration, while the mapping for real-world X,Y coordinates is done by another step called XY calibration.
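The relationship between a per-pixel radial measurement and X, Y, Z coordinates may be sketched using an ideal pinhole model. This model, and the parameters `cx`, `cy`, and `focal_px` (principal point and focal length expressed in pixels), are assumptions made for illustration; a real system would obtain the corresponding mapping through XY calibration, including lens distortion that is ignored here.

```python
import math


def radial_to_xyz(r, u, v, cx, cy, focal_px):
    """Convert radial distance r measured at pixel (u, v) to (X, Y, Z).

    Assumes an ideal pinhole camera: the ray through pixel (u, v) has
    direction (x_n, y_n, 1) with x_n and y_n the normalized image offsets.
    """
    x_n = (u - cx) / focal_px
    y_n = (v - cy) / focal_px
    # r is the length of the ray; Z is its component along the optical axis.
    z = r / math.sqrt(1.0 + x_n * x_n + y_n * y_n)
    return x_n * z, y_n * z, z


# Hypothetical 100x100 array with principal point at (50, 50).
x, y, z = radial_to_xyz(r=5.0, u=125, v=50, cx=50, cy=50, focal_px=100.0)
```

For the on-axis pixel the radial distance and Z coincide, while for off-axis pixels Z is smaller than the radial distance, here 4 m against a radial measurement of 5 m.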
In addition to geometrical calibration, one must perform other types of calibration to account for certain environmental factors, including without limitation temperature and ambient lighting conditions. For example, temperature changes in sensor array 130 can increase so-called dark current in pixels 140, which dark current can in turn change measured phase φ. Ambient light can interfere with system-emitted light from source 120, and can result in phase errors. A complete calibration procedure preferably will include steps to model the effects of such environmental changes. So doing can allow these effects to be removed dynamically during run-time operation, when the environmental conditions may change.
Consider for example distance (Z) calibration techniques, according to the prior art. One known calibration method for a three-dimensional system captures sensor phase response for a number of known Z distance values as the target object is successively moved or relocated in the XY plane. This prior art calibration method will be referred to herein as the “by-example” method. Using this method, sensor data from array 130 are captured for each target object location and stored in memory. The resultant phase-vs.-distance curve is constructed as a calibration table of sensor response-distance pairs that is sampled at several values of distance. During actual run-time operation of the TOF system so calibrated, perhaps system 100, the stored calibration table data are bracketed and interpolated to determine Z distance for a given sensor phase response. Thus, a given phase response from the sensor array is converted to distance by interpolating the values stored in the calibration table. However, the phase-vs.-distance transfer function curve contains harmonics, and sufficient data points must be stored in the calibration table to model these harmonics to avoid loss of accuracy due to insufficient sampling. There is also interpolation error that can only be reduced by increasing the size of the table.
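The run-time lookup of the “by-example” method may be sketched as a bracketing step followed by linear interpolation. The table values below are invented for illustration and do not represent measured data; linear interpolation is likewise an assumption, as the prior art method is not limited to any particular interpolation scheme.

```python
from bisect import bisect_right

# Hypothetical calibration table for one pixel: phase (degrees) captured at
# known target distances (metres). Phase values must be in increasing order.
CAL_PHASE_DEG = [10.0, 95.0, 185.0, 270.0, 350.0]
CAL_Z_M = [0.25, 1.0, 2.0, 3.0, 4.0]


def phase_to_z(phase_deg):
    """Bracket phase_deg in the table and linearly interpolate Z."""
    if not CAL_PHASE_DEG[0] <= phase_deg <= CAL_PHASE_DEG[-1]:
        raise ValueError("phase outside calibrated range")
    # Index of the upper bracketing entry, clamped to the last table row.
    hi = min(bisect_right(CAL_PHASE_DEG, phase_deg), len(CAL_PHASE_DEG) - 1)
    lo = hi - 1
    frac = (phase_deg - CAL_PHASE_DEG[lo]) / (CAL_PHASE_DEG[hi] - CAL_PHASE_DEG[lo])
    return CAL_Z_M[lo] + frac * (CAL_Z_M[hi] - CAL_Z_M[lo])


z_mid = phase_to_z(140.0)  # halfway between the 95 and 185 degree entries
```

Between table entries the sketch assumes a straight line, so any harmonic ripple in the true phase-vs.-distance curve between those entries appears directly as distance error unless more entries are stored.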
Although the “by-example” method is straightforward to implement with relatively fast run-time processing, it has several disadvantages. Sampling only a subset of the operating range and subsequently interpolating results in errors that can be several cm in magnitude. Further, as the operating range of the sensor is increased, more data must be stored in the calibration table to maintain accuracy. This generates larger calibration tables, requiring more storage, as well as longer interpolation times. Storage can be on the order of several MB, i.e., very large for use with embedded systems. Another problem from a practical standpoint is the large physical space needed to capture data from the sensor for large field of view (FOV) and operating ranges as the target object is repositioned. For example, a sensor with a 100° FOV and 5 m operating range requires a target object of approximately 12 m×12 m, which target object must be moved between 0 and 5 m during calibration. Given enough physical space for target object relocation during calibration, and given enough time for the calibration procedure, such prior art “by-example” calibration can be carried out. But such a prior art calibration procedure has high costs and is not very suitable for calibrating a high-volume product.
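The approximately 12 m×12 m figure follows from simple geometry, as the following sketch shows. It assumes a flat target perpendicular to the optical axis and the same 100° FOV in both the horizontal and vertical directions; the function name is hypothetical.

```python
import math


def target_side_m(fov_deg, z_m):
    """Side length of a flat target that fills the FOV at distance z_m."""
    return 2.0 * z_m * math.tan(math.radians(fov_deg) / 2.0)


# 100 degree FOV at the 5 m end of the operating range.
side = target_side_m(100.0, 5.0)  # about 11.9 m
```

Because the required side length grows linearly with distance and with the tangent of the half-angle, widening the FOV or extending the operating range quickly increases the physical space needed for calibration.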
What is needed are more efficient methods and systems to implement detected phase to distance calibration for three-dimensional camera systems. Such methods and systems should require less time and smaller physical space to be carried out, and the calibration data should require less space for storage for use during system run-time operation. Preferably such calibration should provide a first model that depends upon electrical rather than physical characteristics of the sensors in the system under calibration, and should provide a second model that depends upon physical rather than electrical characteristics of the sensors.
The present invention provides such methods and systems.