One of the many surprises to an amateur photographer is the way a backlit scene looks on a photographic print. As best as he or she can recall, the original scene contained nothing like the huge brightness difference seen on the print between the highlight and the shadow on the subject's face. Our visual system apparently does not process a flat, two-dimensional photographic print the same way it processes a three-dimensional scene. For this reason, one has to process the image in a compensatory way before it is printed on paper in order to make it look like the original scene. Another problem with printing an image having a wide dynamic range in luminance is a limitation inherent in any reflection material: the useful dynamic range is narrow because it is limited by flare light.
Since a good camera/film system can easily record a density range of 1.6 with a gamma equal to 0.65, the recorded useful exposure range is about 300:1. However, the density range on a photographic print over which we can easily see image detail is from 0.12 to 1.8, which gives a luminance range of only about 50:1. There is thus more information on the film than can be effectively expressed on the print without some type of dynamic range compression. A common way to deal with this problem is the darkroom burning and dodging operation, in which the local exposure on the paper is changed by moving a piece of opaque material in front of it. This process essentially lightens or darkens a local area relative to its surround. It takes skill, patience, and time to produce a good print with such a process.
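The arithmetic above can be checked directly: the density range divided by gamma gives the log exposure range, and raising 10 to that power gives the linear ratio. A minimal sketch (the helper names are illustrative, not part of the system):

```python
def log_exposure_range(density_range, gamma):
    """Log10 exposure range recorded on film: density range / gamma."""
    return density_range / gamma

def luminance_ratio(log_range):
    """Convert a log10 range to a linear ratio."""
    return 10.0 ** log_range

# Film: density range 1.6 recorded at gamma 0.65.
film_log_range = log_exposure_range(1.6, 0.65)    # about 2.46
print(round(luminance_ratio(film_log_range)))     # 289, i.e. roughly 300:1

# Print: useful densities from 0.12 to 1.8.
print(round(luminance_ratio(1.8 - 0.12)))         # 48, i.e. roughly 50:1
```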
If one can scan the film image to convert it to digital form, one can perform similar manipulations faster and with more precise control with the help of a high speed computer. The present invention is an interactive dynamic range adjustment system with associated software, which has been implemented on a workstation with a keyboard and a mouse for user input. For an image with a wide dynamic range, one cannot simply compress the image signal proportionally at every point, because the processed image would then look very "flat" and visually unacceptable. The basic idea is instead to separate the image into its low frequency and high frequency components, and to perform the compression only on the low frequency component. The invention relies on an understanding of the image processing performed by the human visual system when it deals with a wide dynamic range scene.
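The basic idea can be sketched as follows, using a simple box blur as a stand-in for whatever low-pass filter an actual implementation would use; the function names, kernel size, and gain here are illustrative assumptions, not the system's actual design:

```python
import numpy as np

def box_blur(a, k):
    """1-D box blur with edge padding; k must be odd."""
    pad = k // 2
    padded = np.pad(a, pad, mode='edge')
    return np.convolve(padded, np.ones(k) / k, mode='valid')

def compress_dynamic_range(density, kernel=9, gain=0.5):
    """Blur the density image to get the low frequency base, keep the
    residual high frequency detail untouched, and compress only the
    base toward its mean."""
    low = np.apply_along_axis(box_blur, 1, density, kernel)   # blur rows
    low = np.apply_along_axis(box_blur, 0, low, kernel)       # blur columns
    high = density - low                                      # fine detail
    base = low.mean() + gain * (low - low.mean())             # compressed base
    return base + high

# A density image: a slow illumination ramp plus fine texture.
density = np.tile(np.linspace(0.0, 2.0, 64), (64, 1))
density += 0.05 * np.sin(np.arange(64) * 1.3)
result = compress_dynamic_range(density)
```

The overall density range shrinks by roughly the gain factor, while the fine texture passes through unchanged, so the result does not look uniformly "flat".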
The simple analogy that our eye functions like a camera overlooks the enormous image processing going on in the retina and the brain. Not only can the eye adapt to see over a luminance range of at least a billion to one, but it also can give a fairly constant perception of color and brightness despite the large variation in the illumination across the scene.
One of the visual adaptation processes occurs in the photoreceptors. There appear to be two mechanisms that operate during photoreceptor light adaptation, allowing the cell to continue to respond from very dim light to very bright light. First, when receptors are illuminated with continuous light, the membrane potential gradually and partially returns towards dark levels, bringing the receptor below its saturation level and making it capable of responding to brighter light. Second, if the steady background light intensity is increased, the receptor intensity response curve shifts toward the high intensity range. In FIG. 1, taken from FIG. 7.14c of The Retina: An Approachable Part of the Brain, by J. E. Dowling, Harvard University Press, 1987, the V-log I curves representing the peak responses of gecko photoreceptors are illustrated for a dark adapted state (DA) and first and second background intensities log I=-4.2 and -2.2. In photographic terms, the receptor essentially changes its film speed according to the background light intensity. If the background light is weak, it uses a high speed film. If the background is very bright, it changes to a very low speed film. However, this differs from photography in that the change of film speed occurs locally on the same image, i.e., the retina uses different speed films at different parts of an image. Applying the same idea to printing photographic images, one can change the effective paper speed at different parts of an image by computer. This is the central idea of the present invention.
There are two major questions one has to answer at this point:
(1) What size is the so-called background that the photoreceptors are adapting to?
(2) What characteristic of the background determines the adaptation state? Is it the average irradiance, the average log irradiance, or something else?
To answer the first question, one needs to understand roughly what adaptation does to the original scene irradiance image. Basically, the operation of adaptation is to take something out of the input stimulus, and what is taken out depends on what is being adapted to. If the adaptation is to the average irradiance of the surrounding background, i.e., the average irradiance is used to shift the intensity response curve of the photoreceptor without changing the curve shape, then the effect of adaptation is to reduce the incident light intensity by an amount which is a function f of the average irradiance of the adapting background. If f is a linear function, then the effect of adaptation is effectively a high pass filter: the magnitudes of the low frequency components of the input image are decreased. To be locally adaptive, the background cannot be the whole image. On the other hand, it has to cover a reasonably large area so that the cell will not "adapt out" the very fine image detail. In the extreme case, if a cell can adapt fully to a very tiny spot of light just covering its size, every cell will have an identical response, and there will be no image left. In terms of spatial frequency content, the smaller the area of the adapting background is, the higher the affected frequencies will be. Presumably, a compromise must be made so that the trade-off between the visible image detail and the adaptive dynamic range is optimal for the organism's survival. Any optical system is limited by diffraction and aberration on the high spatial frequency end. For visual perception, there is also good reason to be insensitive to the low frequency variation, so that slow changes in illumination and surface nonuniformity will not interfere with the perception of a physical object as a whole.
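If adaptation is modeled as subtracting a linear function of the local average from the input, the effect of the adapting-background size can be illustrated on a one-dimensional signal. This is a hypothetical sketch, not the system's implementation; note that in the extreme case of a one-sample surround, the response vanishes entirely, exactly as described above:

```python
import numpy as np

def adapted_response(density, surround):
    """High-pass model of adaptation: subtract the local average density
    over an adapting background of width `surround` (odd sample count)."""
    pad = surround // 2
    padded = np.pad(density, pad, mode='edge')
    kernel = np.ones(surround) / surround
    adapting_level = np.convolve(padded, kernel, mode='valid')
    return density - adapting_level

# A 1-D "density" signal: fine detail (a sine) on a slow illumination ramp.
signal = np.sin(np.linspace(0.0, 8.0 * np.pi, 256)) + np.linspace(0.0, 3.0, 256)

# One-sample surround: each cell adapts fully to itself; no image is left.
print(np.abs(adapted_response(signal, 1)).max())   # 0.0

# A larger surround removes only the slow ramp; the fine detail survives.
detail = adapted_response(signal, 61)
```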
If one assumes that the low frequency response of the human visual system is determined by local adaptation along the visual pathway, including stages beyond the photoreceptor (e.g., lateral inhibition in the neural pathway), then one can get a good measure of how large an equivalent area a receptor cell takes as its adapting background by looking at the data for the human visual contrast sensitivity function (CSF). For a human eye with a 2.5 mm pupil looking at scenes of high luminance, the peak sensitivity is at about 5-8 cycles/degree. At 2.25 cycles/degree the sensitivity is roughly half the peak value. It will be seen later from the data set out in the Description of the Preferred Embodiment that the optimal adapting field size for the dynamic range adjustment of the present system has a close relation to these numbers.
To answer the second question, one has to hypothesize the mechanism in the center and its interaction with the surround along the visual pathway. If the strength of mutual inhibition depends on the output from each of the interacting neurons, and the neuron's response is proportional to the logarithm of the incident light irradiance, then what determines the adaptation is more likely to be the average log irradiance than the average irradiance. The input response function of a neuron or a photoreceptor is usually nonlinear, and it seems that for a relatively large range of input the photoreceptor response is roughly proportional to the logarithm of the input intensity, rather than to the intensity itself. Based on this type of reasoning, the average density of the surround will be used to control the local adaptation in the dynamic range adjustment of the present system. However, it is not clear from a physical or a mathematical point of view whether average density is definitely a better choice than average exposure.
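The difference between adapting to the average irradiance and to the average log irradiance (density) can be seen with simple numbers: over a surround that is half shadow and half highlight, the linear average is dominated by the highlight, while the log average corresponds to the geometric mean. A small illustration (the luminance values are hypothetical):

```python
import numpy as np

# A surround half in deep shadow (luminance 10) and half in a highlight
# (luminance 1000); the values are hypothetical.
surround = np.array([10.0] * 50 + [1000.0] * 50)

avg_exposure = surround.mean()            # 505.0: dominated by the highlight
avg_density = np.log10(surround).mean()   # 2.0: the log-domain average

print(avg_exposure)                       # 505.0
print(10.0 ** avg_density)                # about 100, the geometric mean
```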
Density space as used in this invention refers to quantities which are linearly proportional to the logarithm of the scene radiance. For example, the characteristic curves of a photographic film are usually expressed as the film density as a function of the logarithm of film exposure. Expressing color values in the density space has the advantage of making the chrominance values independent of the intensity of the scene radiance, because the difference of the logarithms of two quantities remains unchanged when both quantities are multiplied by the same constant factor. Therefore, we do not have to correct for color changes when the luminance component is adjusted.
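This invariance is easy to verify numerically: scaling two channel radiances by the same factor leaves the difference of their logarithms unchanged. A minimal check (the radiance values are arbitrary):

```python
import math

# Red and green radiances of one pixel, and the same pixel under
# illumination four times as strong (values are arbitrary).
r, g = 20.0, 5.0
scale = 4.0

chroma = math.log10(r) - math.log10(g)
chroma_scaled = math.log10(scale * r) - math.log10(scale * g)

print(abs(chroma - chroma_scaled) < 1e-12)   # True: chrominance unchanged
```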