CMOS image sensors are used in, for example, video cameras, and generally include a two-dimensional array of pixels that is fabricated on a semiconductor substrate using standardized CMOS fabrication techniques. Each pixel includes a sensing element (e.g., a photodiode) that is capable of converting a portion of an optical image into an electronic (e.g., voltage) signal, and access circuitry that selectively couples the sensing element to control circuits disposed on a periphery of the pixel array by way of metal address and signal lines. The metal address and signal lines are supported in insulation material that is deposited over the upper surface of the semiconductor substrate, and are positioned along the peripheral edges of the pixels to allow light to pass between the metal lines to the sensing elements through the insulation material. CMOS image sensors typically contain millions of pixels, which transform photons coming from a photographed scene into millions of corresponding voltage signals. These signals are stored on a memory device, and are then read from the memory device and used to regenerate the optical image on, for example, a liquid crystal display (LCD) device.
A conventional method for utilizing a CMOS image sensor to capture an image involves detecting the amount of light applied to each pixel using a fully pinned photodiode (PD), which enables the charge to be read using a correlated double sampling (CDS) methodology. The CDS methodology includes an integration phase and a readout phase. The integration phase includes “resetting” the charge on a particular photodiode (i.e., full transfer of all electrons in the photodiode to the system voltage source (VDD)), then decoupling the photodiode from the voltage source for a predetermined integration time, and then measuring the collected charge at the end of the integration time. During the integration time photoelectrons accumulate at the PD at a rate that is directly proportional to the amount of light received by the photodiode. A floating diffusion (FD) can be coupled to the photodiode by a transfer gate (TG) transistor or to VDD by a reset transistor. The CDS readout phase involves performing two sample and hold (S/H) operations. The first S/H operation involves coupling the FD to VDD and measuring the resulting voltage on the FD to provide an S/H reset value, which is used as a reference voltage. Next, the FD is coupled to the photodiode by turning on the TG so that all photoelectrons are transferred from the PD to the FD, causing the FD voltage to drop. The second S/H operation (S/H signal) is performed immediately after all photoelectrons are transferred from the photodiode and the TG is deactivated, and again involves measuring the resulting voltage on the FD to provide an S/H signal value. Since the reference voltage is present in both the S/H reset and S/H signal values, subtracting the two values results in a noiseless signal value that accurately represents the amount of light received by that pixel.
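By way of illustration only, the two S/H operations of the CDS readout phase described above can be sketched as follows. The function name `cds_readout`, the conversion gain, the VDD level and the reset-noise magnitude are all hypothetical values chosen for the sketch; none of them is taken from a particular sensor.

```python
import random


def cds_readout(photo_electrons, conversion_gain_uv_per_e=50.0,
                vdd_v=3.3, reset_noise_sigma_v=0.002, rng=None):
    """Model one CDS readout of a single pixel (all defaults are assumed values).

    Returns (sh_reset, sh_signal, cds_value) in volts.
    """
    rng = rng or random.Random()

    # First S/H: the FD is coupled to VDD and sampled. The reset leaves a
    # random offset on the FD, modeled here as Gaussian noise.
    reset_offset_v = rng.gauss(0.0, reset_noise_sigma_v)
    sh_reset = vdd_v + reset_offset_v

    # The TG is turned on: photoelectrons move from the PD to the FD,
    # which lowers the FD voltage in proportion to the transferred charge.
    signal_drop_v = photo_electrons * conversion_gain_uv_per_e * 1e-6

    # Second S/H: the same reset offset is still present on the FD.
    sh_signal = sh_reset - signal_drop_v

    # Subtracting the two samples cancels the common reference and offset.
    return sh_reset, sh_signal, sh_reset - sh_signal


if __name__ == "__main__":
    for seed in (1, 2, 3):
        reset, signal, value = cds_readout(10_000, rng=random.Random(seed))
        print(f"reset={reset:.4f} V  signal={signal:.4f} V  CDS={value:.4f} V")
```

In this simplified model the S/H reset value differs from run to run, while the subtracted value depends only on the number of transferred photoelectrons, which is the property the CDS methodology relies on.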
Although CMOS image sensors have some merits compared to the human eye (e.g., capture speed, or performing relatively well in extreme environmental conditions), the human eye currently performs better than CMOS image sensors operated using conventional CDS methods when it comes to image processing or dynamic range. Dynamic range is defined as the largest signal (in the non-saturated region) in the pixel divided by the smallest signal that can be correctly detected under dark conditions (typically dominated by the sigma of the temporal noise of the read circuits). The human eye typically can capture 90 dB of scene dynamic range, while a standard image sensor for imaging applications is capable of recording between 60 and 72 dB in its linear operating range. Problems associated with correctly capturing (i.e., “photographing”) the dynamic range in a scene are known from the early days of photography, where photographers used to “underexpose” a photographic film in order to capture highlight (bright) details of a scene, and “overexpose” a film in order to observe lowlight (dark) details in the scene. Although CMOS image sensors have improved significantly in the last decade in their ability to observe details in the dark (lowlight) areas of the scene (mainly by reducing the electronic readout noise, for example, with the use of pinned diode-type photodiodes with CDS), the dynamic range of CMOS image sensors still remains well below that of the human eye in terms of the ability to capture, using one exposure, all details from shadows to bright areas in an uncontrolled lighting environment. That is, photodiodes exhibit a linear operating range at relatively low exposure (exposure is the flux of light over a given integration time), wherein the charge at the end of the integration time is directly proportional to the amount of received light.
In contrast, when exposure conditions exceed the linear operating range of the photodiode (i.e., the light is too bright) and the photodiode approaches saturation during the integration time, the photodiode begins to react in a nonlinear manner, or stops collecting electrons altogether. In this case the pixel cannot correctly represent the amount of received light; moreover, the cross-over point between the linear region and saturation is not well defined, and this causes the spatial noise to rise significantly from a typical 0.8% to more than 5%. When the light reaches a maximum brightness, the photodiode becomes entirely saturated during the integration time, and essentially the same readout signal is produced for all light having the maximum brightness or higher.
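To put the dynamic range figures quoted above into perspective, the following sketch converts between a largest-to-smallest signal ratio and decibels using the usual 20·log10 convention for image sensor signals. The function names and the example full-well and noise-floor values are assumptions made for illustration.

```python
import math


def dynamic_range_db(largest_signal, smallest_signal):
    """Dynamic range in dB for a given largest/smallest detectable signal."""
    return 20.0 * math.log10(largest_signal / smallest_signal)


def ratio_from_db(db):
    """Largest-to-smallest signal ratio that corresponds to a dB figure."""
    return 10.0 ** (db / 20.0)


if __name__ == "__main__":
    # Ratios implied by the dB figures mentioned in the text.
    for db in (60, 72, 90):
        print(f"{db} dB corresponds to a ratio of about {ratio_from_db(db):,.0f}:1")

    # Hypothetical pixel: 20,000 e- linear full well, 5 e- noise floor.
    print(f"example pixel: {dynamic_range_db(20_000, 5):.1f} dB")
```

Under this convention, 60 dB corresponds to a ratio of 1,000:1 and 90 dB to a ratio of roughly 31,600:1, so closing the gap to the human eye requires extending the usable signal range by more than an order of magnitude.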
There are several known methods to increase the dynamic range of a CMOS image sensor pixel beyond its normal linear range (herein “Wide Dynamic Range” or “WDR” methods), including time-to-saturation, multiple capture, synchronous self reset with multiple capture, and asynchronous self reset with multiple capture. Of these WDR methods, the present invention is focused on the time-to-saturation (herein “TTS”) method, which is known in the art and is described briefly below. Additional description and a discussion of the advantages and disadvantages of all four of the methods mentioned above are provided, for example, in “Quantitative Study of High-Dynamic-Range Image Sensor Architectures,” S. Kavusi and A. El Gamal, Proceedings of the SPIE, vol. 5301, pp. 264-275, June 2004.
The TTS method achieves high dynamic range with high signal-to-noise ratio (SNR) by converting each photocurrent into its time-to-saturation tsat(iph) according to Equation 1 (below):
$$t_{sat} = \frac{q \times \mathrm{Full\_Well}}{i_{ph}} \qquad \text{(Eq. 1)}$$

In Equation 1, Full_Well is the maximum well capacity of the pixel in its linear range, and q is the electron charge. In effect, the TTS method involves deriving the amount of illumination on a pixel by determining how long it takes for the photodiode to become saturated, where short tsat values indicate a relatively bright image region (high illumination), and long tsat values indicate a relatively dark image region (low light). In one conventional TTS pixel (e.g., see “Design and fabrication of a high dynamic range image sensor in TFA technology,” T. Lulé, B. Schneider, and M. Böhm, IEEE Journal of Solid-State Circuits 34(5), pp. 704-711, May 1999), each tsat value is determined by integrating the photodiode current on an integration capacitance. On every rising edge of the clock input, the resulting voltage is compared to a reference voltage. If the integrated signal is smaller than the reference, the integration time is extended. If the signal is higher than the reference, the comparator terminates the integration via the switches. With every clock, the time-stamp input climbs up one step, and it is sampled and held in the time-stamp capacitance at the moment the integration is terminated. The information at every pixel consists of two voltages that are read out: the integrated signal and the time stamp voltage (where the time stamp voltage is only important if the pixel did not reach saturation at the end of integration). This TTS method thus contrasts with the conventional CDS method, discussed above, where a constant integration time is used, and where each pixel integrates the photocurrent into charge (Q(Iph)=tint×Iph).
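The clocked compare-and-terminate behavior described above can be sketched, in simplified behavioral form, as follows. This is not the circuit of the cited reference: the function names, the integration capacitance, the reference voltage, the clock period and the number of clocks are all hypothetical, and the integrated voltage is modeled as an ideal linear ramp.

```python
def tts_pixel(i_ph, c_int=10e-15, v_ref=1.0, t_clk=1e-4, n_clocks=100):
    """Behavioral model of a clocked time-to-saturation pixel.

    i_ph is the photocurrent in amperes. Returns (v_int, step): the
    integrated signal voltage and the time-stamp step that was held when
    the integration terminated. All default values are assumptions.
    """
    v_int = 0.0
    for step in range(1, n_clocks + 1):
        # Voltage integrated on the capacitance up to this rising clock edge.
        v_int = i_ph * step * t_clk / c_int
        # Comparator: terminate the integration once the reference is reached
        # and hold the current time-stamp step.
        if v_int >= v_ref:
            return v_int, step
    # The reference was never reached: integration ran for the full time.
    return v_int, n_clocks


def estimate_photocurrent(v_int, step, c_int=10e-15, t_clk=1e-4):
    """Recover the photocurrent from the two values read out of the pixel."""
    return c_int * v_int / (step * t_clk)


if __name__ == "__main__":
    for i_ph in (2e-13, 3e-12, 3e-11):  # dim, medium and bright pixels
        v_int, step = tts_pixel(i_ph)
        i_est = estimate_photocurrent(v_int, step)
        print(f"i_ph={i_ph:.1e} A  v_int={v_int:.2f} V  step={step}  "
              f"estimate={i_est:.1e} A")
```

With these assumed parameters, a brighter pixel terminates at an earlier time-stamp step, and the pair of read-out values (integrated signal and time stamp) is sufficient to recover the photocurrent in each case.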
In the conventional CDS case, although the pixel size is small, it is clear that no additional data on the local illumination intensity is available once the pixel reaches a charge above its capacity (Qmax=q×Full_Well). This limitation is avoided in the TTS method by timing how long it takes for the photodiode charge to reach saturation.
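The difference between the two approaches can be illustrated numerically with the relationships given above, Q(Iph)=tint×Iph limited to Qmax=q×Full_Well for the fixed integration time case, and Equation 1 for the TTS case. The full-well capacity, integration time and photocurrents below are hypothetical, and saturation is modeled as an ideal hard limit rather than the gradual nonlinearity of a real photodiode.

```python
Q_E = 1.602176634e-19  # electron charge, in coulombs


def cds_charge(i_ph, t_int, full_well):
    """Charge collected with a fixed integration time, limited to Qmax."""
    q_max = Q_E * full_well
    return min(i_ph * t_int, q_max)


def time_to_saturation(i_ph, full_well):
    """Equation 1: time for the photocurrent to fill the well."""
    return Q_E * full_well / i_ph


if __name__ == "__main__":
    full_well = 20_000  # electrons (assumed)
    t_int = 10e-3       # seconds (assumed)
    for i_ph in (1e-13, 1e-12, 1e-11):
        q = cds_charge(i_ph, t_int, full_well)
        t_sat = time_to_saturation(i_ph, full_well)
        print(f"i_ph={i_ph:.0e} A  Q={q:.3e} C  t_sat={t_sat:.3e} s")
```

In this example the two brighter photocurrents yield the same collected charge (Qmax) with a fixed integration time, whereas their tsat values still differ by a factor of ten.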
Although the TTS method avoids the limitations of conventional CDS methods, TTS pixels are generally impractical for high resolution sensors because they typically require a large number of transistors to perform the TTS operation (e.g., to provide both the integrated signal and time stamp values). A similar problem exists for the other methods used to increase the dynamic range of a CMOS image sensor pixel beyond the normal linear range.
What is needed is a pixel for a CMOS image sensor that both provides a high dynamic range (i.e., 90 dB or greater), and has a low fill-factor such that the pixel can be used in the production of CMOS image sensors having very high resolution.