Cameras
There are two well-known techniques for generating digital images. In the first technique, an analog film is exposed to an energy field, for example, visible light. The film is developed and digitized using a scanner. The resulting image includes pixel values that reflect the intensity distribution of the energy field. The image can also show the frequency distribution, or ‘color’, of the energy field.
In the second technique, an array of digital sensors is arranged on an image plane in an energy field. The sensors measure the intensity of the energy directly. The sensors can also be made frequency selective by using filters. As an advantage, the second technique produces results immediately and does not consume any film. For these and other reasons, digital cameras are rapidly replacing analog cameras.
In a conventional analog camera, the light sensitive material is film. In a conventional digital camera, the light sensitive element is a charge coupled device (CCD). The amount of light reaching the film or CCD is known as exposure. Exposure is a function of the aperture size and shutter speed. In some cases, the depth of focus may need to be sacrificed to obtain an acceptable exposure for a particular exposure time.
In the output digital image, the energy field is expressed as a grid of pixel values. Each pixel value corresponds, for the most part, to the amount of sensed energy. However, the pixel values can also include the results of uncorrelated noise, e.g., noise due to heat, quantization errors, discrete electromagnetic flux, and imperfections in the sensors, circuits and process.
Both techniques impose the same choices and same tasks on the user, such as selecting a scene, a field-of-view, and a camera location. For more complicated cameras that are not limited to simple ‘point-and-shoot’ operation, the user must also select the exposure time, lens aperture, and other settings. The settings are a compromise that best captures the appearance of the scene. In many poorly lit scenes, it is difficult to avoid over-exposure, under-exposure, noise, and motion blur due to camera limitations.
Both conventional analog and digital cameras have exposure, lens and sensor limitations. These limitations result in zoom-dependent chromatic aberration, color metamerism, color balance errors from mixed illumination, coupled zoom and focus adjustments, glare or ‘blooming’ of excessively brilliant objects, and lens flare caustics.
Alternate techniques to conventional intensity sensing cameras are also described in the prior art. Several ‘smart sensing’ chips integrate photo-detecting elements and processing circuits to obtain better performance, or to make the sensing and processing components more compact.
The silicon retina and adaptive retina described by Mead in “Analog VLSI implementation of neural systems,” Chapter Adaptive Retina, pages 239–246, Kluwer Academic Publishers, 1989, use a chip-based model of the vertebrate retina.
Funatsu, et al., in “An artificial retina chip with a 256×256 array of N-MOS variable sensitivity photodetector cells,” Proc. SPIE. Machine Vision App., Arch., and Sys. Int., vol. 2597, pages 283–291, 1995, described modulation of an input image by directly modulating output from photo-detectors. However, most of those systems have special hardware designed for specific applications, such as feature detection. As a result, those systems only display selected features, and cannot reconstruct a complete output image from the original 2D intensity field.
With the development of high-speed, complementary metal oxide semiconductor (CMOS) imaging sensors, it became possible to acquire and process multiple input images before generating the output image. The imaging architecture that acquires multiple input images and produces a single output image is referred to as ‘multiple capture single image’ (MCSI), see Xiao, et al., “Image analysis using modulated light sources,” Proc. SPIE Image Sensors, vol. 4306, pages 22–30, 2001.
Single instruction, multiple data (SIMD) processor arrays with programmable circuits that analyze continuous pixel values are available in the Ranger™ camera made by Integrated Vision Products, Wallenbergs gata 4, SE-583 35, Linkoping, Sweden. The Ranger cameras allow the user to upload microcode to operate on pixel values, see Johansson, et al. “A multiresolution 100 GOPS 4 gpixels/s programmable CMOS image sensor for machine vision,” IEEE Workshop on CCD and Advanced Image Sensors, 2003.
Another technique acquires MCSI images and decodes high frequency optical codes from strobing LEDs, even in the presence of ambient light, see Matsushita, et al., “ID CAM: A smart camera for scene capturing and ID recognition,” ISMAR, pages 227–236, 2003.
Image Generation
The number of image generation methods that are known is too large to detail here. Most methods operate directly on pixel intensity values. Other methods extract image gradients from the intensity values of the output image. The gradients are then further processed to produce images with high dynamic range (HDR) tone mapping, shadow removal, and other image editing operations.
Another technique is based on the observation that the human visual system is more sensitive to local contrast rather than absolute light intensities. That technique uses a tone mapping scheme for rendering high dynamic range images on conventional displays, see Fattal, et al., in “Gradient domain high dynamic range compression,” ACM SIGGRAPH, pages 249–256, 2002.
Another technique applies edge-based compression to images, see J. Elder, “Are Edges Incomplete?,” International Journal of Computer Vision, 34(2/3):97–122, 1999. Analysis of gradients, which have a zero-peaked histogram, provides schemes for super-resolution and image demosaicing from Bayer patterns, see Tappen, et al., “Exploiting the sparse derivative prior for super-resolution and image demosaicing,” 3rd Intl. Workshop on Stats. and Computl. Theories of Vision, 2003.
Techniques for acquiring HDR images have mostly relied on multiple exposures or on adjusting the sensitivity of each individual pixel according to the intensity of incident light, see Mann, et al, “On Being undigital with digital cameras: Extending dynamic range by combining differently exposed pictures,” Proc. of IST 46th Annual Conf., pages 422–428, 1995, Kang, et al., “High dynamic range video,” ACM Trans. Graphics, 22(3):319–325, July 2003, and Debevec et al., “Recovering high dynamic range radiance maps from photographs,” ACM SIGGRAPH, pages 369–378, 1997.
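The multiple-exposure idea above can be sketched in a few lines. This is a minimal illustration, not the method of any cited paper: it assumes a linear camera response, pixel values normalized to [0, 1], and uses a simple ‘hat’ weight that trusts mid-range pixels most, a common but here hypothetical choice.

```python
def merge_exposures(images, exposure_times):
    """Merge differently exposed images (lists of pixel values in [0, 1])
    into one radiance estimate per pixel.

    Sketch only: assumes a linear camera response. Each pixel is divided
    by its exposure time to estimate radiance, then estimates are blended
    with a hat weight that falls to zero at the saturated extremes.
    """
    merged = []
    for pixels in zip(*images):
        num = den = 0.0
        for p, t in zip(pixels, exposure_times):
            w = 1.0 - abs(2.0 * p - 1.0)   # hat weight, peaks at p = 0.5
            num += w * (p / t)             # radiance estimate from this image
            den += w
        merged.append(num / den if den > 0.0 else 0.0)
    return merged
```

For a constant-radiance scene photographed at exposure times 1.0 and 0.5, the half-time image records half the value, and the merged estimate recovers the original radiance.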
Log-responding cameras have been used to acquire HDR images. However, those output images have a reduced linear resolution at higher intensities.
One HDR technique performs an adaptive attenuation of pixel intensity values, and spatially varies pixel exposures, see Nayar, et al., “Adaptive dynamic range imaging: Optical control of pixel exposures over space and time,” Proc. Int'l Conf. Computer Vision, 2003.
Another imaging system uses a programmable array of micro-mirrors. The array enables modulation of scene rays, see Nayar, et al., “Programmable imaging using a digital micromirror array,” Proc. Conf. Computer Vision and Pattern Recognition, 2004.
Measurement Methods
Most conventional analog and digital cameras measure static light intensities. That is, the cameras average the intensities over time according to:

Id(m, n) = (k Is(m, n))^y,  (1)

where Id is a normalized output intensity value, in a range 0.0 ≤ Id ≤ 1.0, at a ‘display’ pixel (m, n), Is is a sensed energy acquired at a corresponding photosensor (m, n), k is exposure, e.g., gain, light sensitivity or film speed, and y is contrast sensitivity. Typically, the contrast sensitivity y for CCDs is approximately one. If the sensitivity y is less than one, then the contrast is decreased; otherwise, if the sensitivity is greater than one, then the contrast is increased.
Equation (1) can be expressed logarithmically, where differences directly correspond to contrast ratios:

log(Id) = y (log(Is) + log(k)).  (2)

Equation (2) reveals that the contrast sensitivity y is a scale factor for contrast, and the exposure k, in logarithmic units, is an offset.
In most conventional cameras, the contrast sensitivity y is uniform across the entire acquired image. This ensures that each pixel value is within a predetermined range of intensity values. Pixel-to-pixel intensity variations in k and y have strong effects on the appearance of the output image.
However, as stated above, the display intensity Id also includes noise due to discrete photon arrivals, thermal noise in sensor devices, non-uniform sensor materials and circuit components, e.g., fixed-pattern noise, and outside interference, e.g., EMI/RFI, ‘light leaks,’ and noise induced by processing imperfections. Noise ‘hides’ the precise display value Id, so that A/D conversion precision beyond twelve to fourteen bits rarely improves assessments of the signal for sensors that are not artificially cooled. Many conventional digital cameras measure raw pixel sensor intensities as ten or twelve bit values.
Most conventional digital cameras are also “quasi-linear,” where the displayed values Id are intended to be directly proportional to scene intensity values Is, but include some contrast compression, e.g., y = 0.455, to compensate for the contrast exaggeration of conventional computer displays, e.g., y = 2.2, so that the resulting apparent contrast sensitivity is approximately one.
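This compensation can be checked numerically. The sketch below uses the example exponents given above, 0.455 on the camera side and 2.2 on the display side:

```python
def camera_encode(i_s, y=0.455):
    """Camera-side contrast compression: Id = Is**y."""
    return i_s ** y

def display_decode(i_d, gamma=2.2):
    """Display-side contrast exaggeration: output = Id**gamma."""
    return i_d ** gamma

# Net response: (Is**0.455)**2.2 = Is**(0.455 * 2.2) = Is**1.001,
# i.e., an apparent contrast sensitivity of approximately one.
```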
Specifying the exposure k by the scene intensity Is causes display intensity values Id=1.0 to display ‘white’. Contrast limitations of the display device then appear as a lack of details in dark areas of the display device.
The A/D resolution and contrast sensitivity y also set an upper limit on the contrast range of conventional quasi-linear digital cameras. With 2^b uniform quantization levels for display intensity values Id, a fixed k and a fixed y, the largest ratio of scene intensities the camera can acquire is

Cmax = Ismax / Ismin = 2^(b/y),

or equivalently,

log(1) − log(2^−b) = y (log(Ismax) − log(Ismin)).
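As a numerical check of this limit, the sketch below evaluates Cmax for the ten- and twelve-bit A/D depths mentioned earlier, with y = 1 as the typical CCD value:

```python
def max_contrast_ratio(bits, y=1.0):
    """Largest scene contrast Cmax = 2**(bits / y) that a quasi-linear
    camera with 2**bits uniform quantization levels can represent."""
    return 2.0 ** (bits / y)

# A 12-bit A/D at y = 1 spans a contrast ratio of 4096:1;
# a 10-bit A/D spans 1024:1.
```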
Some conventional digital cameras can also imitate a ‘knee’ and ‘shoulder’ response of photographic film, where the contrast sensitivity y smoothly approaches zero for Id values near the extremes of 0.0 and 1.0 intensities. The less-abrupt termination of camera contrast response can preserve some detail in otherwise black shadows and white highlights. However, just as with film, those efforts are still insufficient to produce detailed images of HDR scenes.
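One way to imitate such a knee-and-shoulder response is a smooth S-curve whose slope, the local contrast sensitivity, falls to zero at both extremes. The sketch below uses a smoothstep polynomial chosen purely for illustration; it is not the response of any specific film or camera:

```python
def knee_shoulder(i_d):
    """Smoothstep tone curve 3x**2 - 2x**3: its slope approaches zero
    near 0.0 and 1.0, softening the onset of black shadows and white
    highlights, while the midtones pass through unchanged."""
    x = min(max(i_d, 0.0), 1.0)
    return x * x * (3.0 - 2.0 * x)
```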
Many interesting scenes contain contrasts that are far too large for most A/D converters. In those HDR scenes, conventional exposure-control methods often fail, and the user must unfortunately select which visible scene features are lost due to glaring white or featureless black.
Quantization levels for log-responding digital cameras follow Fechner's law. Fechner found intensity changes of about one or two percent are a ‘just noticeable difference’ for the human vision system. However, like conventional cameras, the quantization levels selected for log-responding cameras must span the entire dynamic range of scene intensities Is, and all scene intensities that are not within that range are irretrievably lost.
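Under Fechner's law, the quantization levels of a log-responding camera form a geometric progression. The number of levels needed to span a given contrast range can be sketched as follows, using the roughly one-percent ‘just noticeable difference’ step mentioned above:

```python
import math

def log_levels_needed(contrast_ratio, step=0.01):
    """Number of geometric quantization steps of relative size `step`
    (e.g., 0.01 for a 1% just-noticeable difference) needed to span
    a scene contrast ratio under Fechner's law."""
    return math.ceil(math.log(contrast_ratio) / math.log(1.0 + step))

# A 10,000:1 scene at 1% steps needs 926 levels, which fits in 10 bits;
# but any intensity outside the chosen range is still lost.
```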
Therefore, it is desired to provide a camera for generating images that can reproduce high dynamic range details, and that overcomes the many problems of the prior art.