An image sensor has a large number of identical sensor elements (pixels), generally more than 1 million, arranged in a Cartesian (square) grid. The distance between adjacent pixels is called the pitch (p), so the area of a pixel is p². The area of the photosensitive element, i.e., the part of the pixel that converts light into an electrical signal, is normally only about 20% to 30% of the surface area of the pixel.
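As a rough numerical illustration of this geometry, the following sketch computes the pixel area p² and the corresponding photosensitive area for a given fill fraction. The 2.0 μm pitch and the function name are hypothetical and chosen only for illustration; only the 20% to 30% range comes from the text.

```python
def photosensitive_area(pitch_um: float, fill_fraction: float) -> float:
    """Return the light-sensitive area (square micrometers) of a square pixel."""
    if pitch_um <= 0:
        raise ValueError("pitch must be positive")
    if not 0.0 <= fill_fraction <= 1.0:
        raise ValueError("fill_fraction must be between 0 and 1")
    pixel_area = pitch_um ** 2  # area of a square pixel is p^2
    return pixel_area * fill_fraction


# Hypothetical example pitch; not a value taken from the text.
pitch = 2.0
for fraction in (0.20, 0.30):
    area = photosensitive_area(pitch, fraction)
    print(f"fill {fraction:.0%}: {area:.2f} um^2 of {pitch ** 2:.2f} um^2")
```

Under these assumed numbers, most of each pixel's surface does not itself convert light, which is why the rest of the pixel must channel light toward the photosensitive element.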
The challenge for a designer is to channel as much as possible of the light impinging on the pixel to its photosensitive element. A number of factors reduce the amount of light that reaches the photosensitive element. One factor is the manner in which the image sensor is constructed. A complementary metal oxide semiconductor (CMOS) image sensor is manufactured by etching and depositing a number of layers of silicon oxides, metal, and nitride on top of crystalline silicon. The layers of a typical sensor are listed in Table I and shown in FIG. 1.
TABLE I

Typical Layer   Description      Thickness (μm)
15              OVERCOAT         2.00
14              MICRO LENS       0.773
13              SPACER           1.40
12              COLOR FILTER     1.20
11              PLANARIZATION    1.40
10              PASS3            0.600
9               PASS2            0.150
8               PASS1            1.00
7               IMD5B            0.350
6               METAL3           1.18
5               IMD2B            0.200
4               METAL2           1.18
3               IMD1B            0.200
2               METAL1           1.18
1               ILD              0.750
In Table I, the first layer on the silicon substrate is typically the ILD layer and the topmost layer is the overcoat. ILD refers to an inter-level dielectric layer; METAL1, METAL2 and METAL3 refer to different metal layers; IMD1B, IMD2B and IMD5B refer to different inter-metal dielectric layers, which are spacer layers; and PASS1, PASS2 and PASS3 refer to different passivation layers (typically dielectric layers).
The total thickness of the layers above the silicon substrate is the stack height (s) of the image sensor, i.e., the sum of the thicknesses of the individual layers. In the example of Table I, this sum is about 11.6 micrometers (μm).
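The stack-height arithmetic can be checked with a short sketch. The thicknesses are transcribed from Table I, with each metal layer read as 1.18 μm; that reading is an assumption made here. With these values, the layers from the ILD up to and including the micro lens sum to about 11.6 μm, and adding the 2.00 μm overcoat gives about 13.6 μm. The function and variable names are hypothetical.

```python
# Layer thicknesses in micrometers, transcribed from Table I, bottom to top.
# Assumption: each metal layer is read as 1.18 um.
LAYERS_UM = [
    ("ILD", 0.750),
    ("METAL1", 1.18),
    ("IMD1B", 0.200),
    ("METAL2", 1.18),
    ("IMD2B", 0.200),
    ("METAL3", 1.18),
    ("IMD5B", 0.350),
    ("PASS1", 1.00),
    ("PASS2", 0.150),
    ("PASS3", 0.600),
    ("PLANARIZATION", 1.40),
    ("COLOR FILTER", 1.20),
    ("SPACER", 1.40),
    ("MICRO LENS", 0.773),
    ("OVERCOAT", 2.00),
]


def stack_height(layers):
    """Sum the thicknesses of (name, thickness_um) pairs."""
    return sum(thickness for _, thickness in layers)


# Height up to and including the micro lens, then with the overcoat added.
up_to_micro_lens = stack_height(
    [layer for layer in LAYERS_UM if layer[0] != "OVERCOAT"]
)
with_overcoat = stack_height(LAYERS_UM)

print(f"up to micro lens: {up_to_micro_lens:.3f} um")
print(f"with overcoat:    {with_overcoat:.3f} um")
```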
The space above the photosensitive element of a pixel must be transparent to light to allow incident light from a full color scene to impinge on the photosensitive element located in the silicon substrate. Consequently, no metal layers are routed across the photosensitive element of a pixel, leaving the layers directly above the photosensitive element clear.
The ratio of pixel pitch to stack height (p/s) determines the cone of light (F number) that the pixel can accept and convey to the photosensitive element on the silicon. As pixels become smaller and the stack height increases, this ratio decreases, thereby lowering the efficiency of the pixel.
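One simple way to see why a smaller p/s ratio narrows the accepted cone is a purely geometric sketch. It treats the pixel as an opening of width p at the top of a stack of height s and ignores refraction, the micro lens, and obstruction by metal layers. These simplifications are assumptions made here, not the model used in the text. Under that idealization the half-angle of the cone is atan(p / 2s) and the equivalent F number is s/p. The pitch values below are hypothetical; the 11.6 μm stack height is the figure quoted for Table I.

```python
import math


def acceptance_half_angle_deg(pitch_um: float, stack_um: float) -> float:
    """Half-angle (degrees) of the light cone in the idealized geometry."""
    if pitch_um <= 0 or stack_um <= 0:
        raise ValueError("pitch and stack height must be positive")
    return math.degrees(math.atan2(pitch_um / 2.0, stack_um))


def equivalent_f_number(pitch_um: float, stack_um: float) -> float:
    """F number N = 1 / (2 * tan(half_angle)), which reduces to s / p here."""
    if pitch_um <= 0 or stack_um <= 0:
        raise ValueError("pitch and stack height must be positive")
    return stack_um / pitch_um


stack = 11.6  # stack height in um, as quoted for Table I
for pitch in (5.8, 2.9, 1.45):  # hypothetical pixel pitches in um
    angle = acceptance_half_angle_deg(pitch, stack)
    n = equivalent_f_number(pitch, stack)
    print(f"p = {pitch} um: p/s = {pitch / stack:.3f}, "
          f"half-angle = {angle:.1f} deg, F number = {n:.1f}")
```

In this idealization, halving the pitch at a fixed stack height doubles the F number, so the pixel accepts a narrower cone of light.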
More importantly, a taller stack with a greater number of metal layers obstructs light from being transmitted through the stack to the photosensitive element, in particular rays that impinge on the sensor element at an angle. One solution is to decrease the stack height by a significant amount (i.e., >2 μm). However, this is difficult to achieve in a standard CMOS process.
Another issue, possibly the one that most limits the performance of conventional image sensors, is that less than about one-third of the light impinging on the image sensor is transmitted to the photosensitive element, such as a photodiode. In conventional image sensors, the three components of light must be distinguished so that the colors of a full color scene can be reproduced; to do so, two of the components are filtered out at each pixel. For example, the red pixel has a filter that absorbs green and blue light and allows only red light to pass to the sensor.
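The one-third figure follows from simple bookkeeping, sketched below under idealized assumptions: the incident light is split into three components of equal energy, and each filter passes exactly one component. The equal-energy scene, the peak_transmittance parameter, and the 0.9 value are hypothetical and serve only to show why the real fraction falls below one-third.

```python
def transmitted_fraction(scene, passband, peak_transmittance=1.0):
    """Fraction of incident light that passes a single-component color filter.

    scene maps component names to relative energies; the filter is assumed
    to block every component except passband.
    """
    total = sum(scene.values())
    if total <= 0:
        raise ValueError("scene must contain some light")
    return peak_transmittance * scene[passband] / total


# Hypothetical scene with equal energy in each component.
white = {"red": 1.0, "green": 1.0, "blue": 1.0}

ideal = transmitted_fraction(white, "red")
lossy = transmitted_fraction(white, "red", peak_transmittance=0.9)
print(f"ideal red filter: {ideal:.3f}")
print(f"red filter with assumed 90% peak transmittance: {lossy:.3f}")
```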