The invention relates generally to capture of visual detail over a high dynamic range. In particular, the invention relates to sensing accurate tonal distinction for real-time high-speed video conditions. In the present configuration, real-time implementation is not included, but conceivably could be. Also, the inventive concept is not limited to high-speed video, but can be used on any imaging system, for example a point-and-shoot camera, a cell phone, X-ray, infrared, security cameras, etc.
Conventionally, the US Navy utilizes high-speed videography to capture energetic testing events for quantitative and qualitative assessment. Frequently the optical acquisition of these events leads to the oversaturation of the camera sensor. Conventional manufacturers are aware of this problem and several systems are in place to mitigate the effects. Vision Research utilizes an extended dynamic range mode, and Photron utilizes a dual slope mode. Both methods involve monitoring the pixel saturation over the course of some percentage of the integration time. If the sensor site is saturated, the accumulated charge is sent to ground and the cell acquires the scene for the remainder of the total integration time. These methods can help extend the dynamic range by about 20 dB, or a factor of ten. This provides most high-speed cameras a dynamic range of 80 dB, or 10000:1.
The first use of high-speed photography dates back to 1851, when William Henry Fox Talbot exposed a portion of the London Times to a wet plate camera via spark illumination from Leyden jars. The first high-speed video was created by Eadweard Muybridge on Jul. 15, 1878 with twelve frames of a galloping horse exposed at 1/1000th second to determine whether all four hooves were off the ground at any given time. From that time onward, high-speed videography has been used to capture, and in many cases quantify, event information that happened too fast for humans to readily observe. In the 1980s, the development of the charge-coupled device (CCD) marked the beginning of digital high-speed videography, which advanced with the development of active pixel sensors. As memory capacity, speed, sensitivity, and noise suppression continue to advance, high-speed digital cameras are rapidly replacing film due to their ease of use and satisfactory imaging ability throughout defense, automotive, industrial, and research applications.
One of the challenges high-speed camera manufacturers face when compared to film is dynamic range. The dynamic range a digital camera is capable of acquiring is defined as the ratio of the deep well capacity to the noise floor of the camera. High-speed cameras must have high sensor gains, large fill factors, and large pixels to account for the very short exposure times required in high-speed videography; otherwise they succumb to shot noise, read noise, and dark current noise limitations. These high-speed operational requirements limit the total dynamic range within which most high-speed digital cameras can operate, typically to about 60 dB.
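The ratio definition above can be turned into decibel figures with a few lines of arithmetic. The following is a minimal sketch, assuming the 20·log10 convention that is implied by the figures quoted in this section (20 dB for a factor of ten, 80 dB for 10000:1). The function names and the example sensor values are hypothetical and are not taken from any particular camera.

```python
import math

def dynamic_range_db(well_capacity_e, noise_floor_e):
    """Dynamic range in dB from well capacity and noise floor (both in electrons).

    Assumes the 20*log10 convention, which matches 80 dB <-> 10000:1.
    """
    return 20.0 * math.log10(well_capacity_e / noise_floor_e)

def db_to_ratio(db):
    """Convert a dynamic range in dB back to a plain ratio."""
    return 10.0 ** (db / 20.0)

# Hypothetical sensor: 30000 e- well capacity, 30 e- noise floor (1000:1).
example_db = dynamic_range_db(30000.0, 30.0)

# A 20 dB extension multiplies the usable ratio by ten.
extended_ratio = db_to_ratio(example_db + 20.0)
```

Under these assumed values the sensor sits at the 60 dB figure cited for typical high-speed cameras, and the 20 dB extension brings it to 10000:1.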
One assumption of this type of process is that the scene is invariant to change, such as movement of objects within the frame, during all exposures. Such scene change leads to ghosting within the image due to the differences in the image series. With enough exposures to statistically define the static objects, the objects in motion can be statistically removed. This method fails if the object in motion needs to be recorded or there are not enough samples to remove ghost objects. This has led to significant challenges in the development of high dynamic range (HDR) video. Several researchers have formed different solutions to correct the underlying problem, such as rapid bracketing, split aperture imaging, and Fresnel-based imaging systems with multiple imagers. Frame bracketing is an effective method to capture HDR video; however, the method requires that a camera run significantly faster than the desired output video's playback rate, limiting the low-light sensitivity. The camera must also be able to modify the integration time rapidly to capture every sequence. This method also assumes that objects in motion do not create a significant amount of blurring over the course of any of the bursts. Therefore, this method only works for objects that move slowly relative to the longest integration time.
The Fresnel-based system utilizes multiple imagers through the use of a Fresnel beam-splitter that parses the light to the various sensors. This system showed significant promise but does not work if the source of light to be viewed is polarized or if the system is in a harsh environment. In 2011, Tocci et al. showed how a system comprising a field programmable gate array (FPGA) and a pair of cameras could be used for the creation of real-time tone-mapped HDR video. By having the cameras observe the same field and selecting which related pixels from the camera pairs had the most accurate information, an aggregate of the selections could be used to display a welding process. In their demonstration, this method enabled the direct visualization of the weld, melt pool, and materials being welded.
In 2011, NASA used six high-speed cameras to video the STS-134 launch and five high-speed cameras and one thermal camera to visualize the STS-135 launch. NASA had a camera shelter with six high-speed cameras running at 200 fps set at six different exposure times. The cameras were oriented to form a 3×2 grid. The subjects being filmed were sufficiently far relative to the camera spacing that parallax error was not a problem, and the first high-speed HDR video known to the author was formed. However, NASA's system would be subject to parallax errors if it were to be used to film something closer to the lens.
High dynamic range (HDR) imaging was developed in response to the frequent breach of the upper limit of a single optical sensor's acquisition range. Through photo bleaching and pupil dilation, the human eye adapts to a scene's range of light to compensate for its own limitations. The adaptation mechanism extends the human visual system's functional range to about 12 orders of magnitude, with about a 10,000:1 range available for a given background radiance. A solution to the image acquisition problem was first presented in 1962, when Wyckoff proposed the use of a composite film composed of layers of varying speed photosensitive material with similar spectral sensitivity in order to capture a high dynamic range scene. Each layer of film would be printed to a different color and superimposed to form a pseudo-color image that would represent how bright the various regions were. In the 1980s, the invention of the charge-coupled device created a wave of interest in digital imaging, which suffered from an even smaller dynamic range than traditional film. In 1995, Mann and Picard introduced the concept of combining digital images of different exposures. They called each exposure a Wyckoff layer, which is analogous to an image taken using exposure bracketing. Mann and Picard began their image formation process by generating a camera sensitivity curve from a comparison of differently exposed, spatially aligned pixel values that were scaled by an unknown amount. The amalgamation of the unknown values' scaling parameters into a lookup table is known as the camera response function and can be used for the linearization of the acquired data. Mann and Picard introduced the concept of fitting the data to a power curve via parametric regression to complete the lookup table for all potential digital values. Mann and Picard also introduced the concept of using a weighted average as a method to form an HDR image from the series of images.
In 1997, Debevec and Malik expanded upon Mann and Picard's multi-image fusion approach to form a robust method to fit the image response function to a smoothed logarithmic curve. Debevec and Malik introduced a weighted average as the maximum likelihood estimator that can be used to combine the various images, via their image response curve, into one double precision representation of the linearly related sensor irradiance values. A weighted average is used because most forms of noise were assumed to be symmetrically distributed. The matrix of scaled irradiance values is referred to as an HDR image. In order to display the HDR image, the data must be tone-mapped into the display space of the output media, typically an 8-bit-per-color-channel image. In the subsequent years, several other methods, reviewed and compared by Akyüz, were developed to find the image response function that can be used in conjunction with the maximum likelihood estimator to form the HDR image.
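The weighted-average formation step can be illustrated for a single pixel location. The following is a minimal sketch and not the method of any of the cited authors: it assumes the pixel values have already been linearized with the camera response function, so that dividing a value by its exposure time gives a scaled irradiance estimate, and it averages in the linear domain. The function name, sample values, exposure times, and weights are all hypothetical.

```python
def merge_pixel(values, exposure_times, weights):
    """Weighted average of per-exposure irradiance estimates for one pixel.

    values         -- linearized pixel values, one per exposure (assumption)
    exposure_times -- integration time of each exposure, in seconds
    weights        -- confidence assigned to each sample (0 = ignore)
    """
    numerator = 0.0
    denominator = 0.0
    for z, t, w in zip(values, exposure_times, weights):
        numerator += w * (z / t)  # z / t is the scaled irradiance estimate
        denominator += w
    if denominator == 0.0:
        # Every sample carried zero weight, so no estimate can be formed.
        return None
    return numerator / denominator

# Hypothetical 8-bit samples of one pixel at three exposure times.
# The longest exposure is saturated (255) and is given zero weight.
values = [20, 80, 255]
exposure_times = [0.001, 0.004, 0.016]
weights = [20.0, 80.0, 0.0]

estimate = merge_pixel(values, exposure_times, weights)
```

Because the two unsaturated samples agree once divided by their exposure times, the saturated sample has no influence on the result, which is the purpose of assigning it zero weight.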
Over the past two decades, several weighting functions have been proposed to reduce the overall error on HDR image formation. They can be grouped into three categories: proportional to the acquired digital value, proportional to the slope of the camera response function for the given digital value, or proportional to a noise estimate on the estimated radiant exitance, which is the radiant flux emitted by a surface per unit area, measured in watts per square meter (W/m²). The first weighting function was introduced in 1995 by Mann and Picard, who defined a weighting function that would weigh the digital values in proportion to the rate of change of the logarithm of the camera response function. This was done to reduce the effect of quantization and make the error appear uniform. In 1997, Debevec and Malik introduced the first weighting function that was proportional to the acquired digital value; it takes the form of a hat (^) or caret function centered on the middle of the analog-to-digital conversion range. In 1999, Robertson et al. introduced a weighting function that was similar to a Gaussian curve, assuming the sensor sensitivity would be highest around the center of the camera response curve and the extremes would present little to no usable information. Also in 1999, Mitsunaga first introduced the concept of incorporating a camera error model by creating a weighting function that is a first order approximation of the signal-to-noise ratio.
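Two of the weighting shapes described above can be sketched for an 8-bit conversion range. This is an illustrative sketch only: the hat function rises linearly from the bottom of the range to the midpoint and falls linearly to the top, and the Gaussian-like function is centered on the midpoint. The 0 to 255 range, the function names, and in particular the width constant of the Gaussian-like curve are assumptions chosen for illustration.

```python
import math

Z_MIN = 0    # assumed bottom of an 8-bit conversion range
Z_MAX = 255  # assumed top of an 8-bit conversion range

def hat_weight(z, z_min=Z_MIN, z_max=Z_MAX):
    """Hat (caret) weight: largest at mid-range, zero at both extremes."""
    mid = 0.5 * (z_min + z_max)
    if z <= mid:
        return z - z_min
    return z_max - z

def gaussian_like_weight(z, z_min=Z_MIN, z_max=Z_MAX, width=4.0):
    """Gaussian-like weight centered on mid-range.

    The 'width' constant is a hypothetical choice; larger values
    suppress the extremes of the range more strongly.
    """
    mid = 0.5 * (z_min + z_max)
    half_range = 0.5 * (z_max - z_min)
    return math.exp(-width * ((z - mid) / half_range) ** 2)

# Weights for a dark, a mid-range, and a saturated 8-bit value.
hat_samples = [hat_weight(z) for z in (0, 127, 255)]
gauss_samples = [gaussian_like_weight(z) for z in (0, 127, 255)]
```

Both shapes express the same design choice: values near the noise floor or near saturation are trusted less than values in the middle of the conversion range.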
In 2001, Tsin et al. created the first statistical characterization to form a weighting function. In general, there were three error terms. The first term modeled the thermal noise, the second term incorporated quantization error and amplifier noise, and the third term modeled the contribution from shot noise. In 2005, Ward proposed a modification to Mitsunaga's weighting function by multiplying it with a broad hat model to reduce the relative importance of the pixel values at the extremes of the acquisition range. In 2006, Kirk introduced the first weighting function that is exactly the reciprocal of the square of the error, which is the weighting that minimizes the error on formation. In 2007, Akyüz proposed a modification to Ward's weighting function: instead of being a function of the digital value, the weighting function should be a function of the estimated radiant exitance. In 2010, Granados and Hasinoff introduced weighting functions designed to minimize the error on formation by accurately modeling the variance in the system. To accomplish this, a recursive scheme was developed to accurately model the shot noise. Granados takes into account fixed pattern noise, read noise, shot noise, and dark current noise. Hasinoff's weighting method includes shot noise, read noise, and quantization error.
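Weighting by the reciprocal of the square of the error corresponds to the standard inverse-variance weighted mean. The following is a minimal sketch of that general statistical technique, not a reproduction of any cited author's noise model: it assumes each exposure yields an independent estimate with a known standard deviation, and how those standard deviations are obtained (shot noise, read noise, and so on) is left outside the sketch. The function name and sample numbers are hypothetical.

```python
def inverse_variance_mean(estimates, sigmas):
    """Combine independent estimates using weights of 1 / sigma**2.

    Returns the combined estimate and its variance.
    """
    if any(s <= 0.0 for s in sigmas):
        raise ValueError("each standard deviation must be positive")
    weights = [1.0 / (s * s) for s in sigmas]
    total_weight = sum(weights)
    combined = sum(w * x for w, x in zip(weights, estimates)) / total_weight
    combined_variance = 1.0 / total_weight
    return combined, combined_variance

# Hypothetical: two estimates of the same quantity, the second twice as noisy.
combined, combined_variance = inverse_variance_mean([10.0, 12.0], [1.0, 2.0])
```

In this example the noisier estimate receives one quarter of the weight of the cleaner one, and the combined variance is lower than that of either input.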