1. Field of the Invention
The present invention relates to a dynamic range compression apparatus, a dynamic range compression method, a program, an integrated circuit, and an imaging apparatus, and particularly relates to a dynamic range compression apparatus, dynamic range compression method, program, integrated circuit, and imaging apparatus in which an image signal with a variable dynamic range is inputted.
2. Description of the Related Art
Imaging apparatuses, such as digital cameras that shoot still images and digital video cameras that shoot videos, use an optical system to control exposure, and use a CCD image sensor, a CMOS image sensor, or the like to convert the image formed by the optical system into an electrical signal through photoelectric conversion and obtain an analog image signal. Conventional imaging apparatuses process the obtained analog image signal using a circuit that performs analog front-end processing and the like, and convert the signal into digital image data through an A/D converter. This digital image data is processed through gamma correction processing for video (for example, gamma correction processing where γ=0.45), knee processing, luminance/color difference conversion processing, and so on, and is then converted into a standardized format such as JPEG (Joint Photographic Experts Group) or the like, in the case of still image data, or MPEG (Moving Picture Experts Group), DV (Digital Video), or the like, in the case of video data. Having been converted into such a standardized format, the digital image data is recorded into various types of memory cards, hard disks, optical disks, magnetic tape, or the like.
Assuming the whitest point (in other words, the maximum brightness value when the image is displayed in a display device) set in the abovementioned standardized formats (image (video) formats) is 100%, a dynamic range (also called “D-range”) of 100% to 500% brightness is normally set for shooting in such a conventional imaging apparatus. Note that “a dynamic range of 100%” indicates that the range a signal value (for example, the brightness value) can take on is from 0% to 100%. In other words, “a dynamic range of 100%” means that the minimum signal value is 0% and the maximum signal value is 100%. Furthermore, the image sensor of the imaging apparatus is assumed to have a sensitivity (photosensitivity) capable of sufficiently handling the change in light intensity corresponding to a video (image) signal whose dynamic range is to be approximately 200% to 500% if photoelectric conversion is performed. The image sensor of the imaging apparatus is assumed, for example, to have a sensitivity (photosensitivity) sufficient for light converted into a high-luminance signal, such as with an image of the sky, clouds, or the like.
Users generally use such imaging apparatuses to shoot a variety of scenes, from somewhat dark indoor night shots to outdoor shots on clear days. Different peak values appear in the video (image) signals obtained by the image sensor of the imaging apparatus depending on the scene shot with the imaging apparatus. In other words, different peak values appear in the pixel values (the values of the pixels that form the image (values corresponding to the video (image) signal values)) within a single image (for example, the image in a single frame) formed by the video (image) signal obtained by the image sensor. For example, when a somewhat dark indoor night scene is shot using the imaging apparatus, the peak value of the pixel values in the captured image (the image obtained by the image sensor of the imaging apparatus) is low, whereas when an outdoor scene is shot on a clear day using the imaging apparatus, the peak value of the pixel values in the captured image is high. Video (image) signals obtained by the image sensor of the imaging apparatus, in which the peak values differ depending on the scene that was shot, are inputted into a signal processing unit in the imaging apparatus, located subsequent to the image sensor. That is, a video (image) signal with a variable D-range is inputted into the signal processing unit of the imaging apparatus.
The imaging apparatus uses the signal processing unit to compress video (image) signals with such variable and wide D-ranges into video (image) signals with a D-range of 100% or less, and output the resulting signals to a display device or record the signals in a recording medium. This type of compression processing, performed by an imaging apparatus, is called “D-range compression processing”. It should be noted that the imaging apparatus normally performs γ correction (for example, gamma correction processing where γ=0.45) prior to the D-range compression processing, and thus a video (image) signal with a D-range of, for example, 500% is converted into a video (image) signal with a D-range of approximately 200% through the γ processing where γ=0.45. Hereinafter, “D-range X %” (where X is an arbitrary number), or simply “X %” (where X is an arbitrary number) refers to the D-range of the video (image) signal following the γ correction processing.
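The relationship between the pre-correction and post-correction D-range stated above can be checked with a short calculation. The following is a minimal sketch, assuming a pure power-law γ correction normalized so that 100% maps to 100%; the function name gamma_correct is introduced here for illustration only, and actual video gamma curves may include a linear segment near black that is ignored here.

```python
def gamma_correct(level_percent, gamma=0.45, white_percent=100.0):
    """Apply a pure power-law gamma to a linear level given in percent of white.

    Assumption: the curve is normalized so that 100% in gives 100% out.
    """
    normalized = level_percent / white_percent
    return white_percent * normalized ** gamma


# Linear-light peak levels (percent) and their post-gamma D-range.
for linear_peak in (100.0, 200.0, 500.0):
    print(f"{linear_peak:.0f}% -> {gamma_correct(linear_peak):.1f}%")
```

Under this assumption, a 500% peak lands slightly above 200% after the correction, which agrees with the "approximately 200%" figure given in the text.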
Two types of conventional D-range compression processing, namely the well-known auto knee processing and the D-range compression processing performed by a visual processing apparatus as disclosed in Patent Documents 1 (JP 4126297B) and 2 (WO 2007/043460), will be described next.
<Auto Knee Processing>
FIG. 1 illustrates the input/output characteristics of auto knee processing.
Auto knee processing compresses the input D-range of input luminance signals Yin representing the luminance of each pixel in an input image (signals corresponding to the pixel values of each pixel in the input image) using a compression having input/output characteristics with a broken line form such as that shown in FIG. 1, and outputs output luminance signals Yout.
As shown in FIG. 1, the input/output characteristics of this broken line form include a low-luminance conversion portion LR (slope 1) (in the graph in FIG. 1, the region where Yin is 0% to 85% corresponds to this low-luminance conversion portion) and a high-luminance conversion portion HR (knee slope, or slope variance) (in the graph in FIG. 1, the region where Yin is greater than or equal to 85% corresponds to this high-luminance conversion portion HR), with a knee point (normally at approximately 85%) therebetween. Auto knee processing generally changes the slope of the high-luminance conversion portion HR in accordance with the peak input value so that the peak input value that varies depending on the shot scene (the peak value of the input luminance signal Yin) is always outputted as the maximum value of the D-range in the output luminance signal Yout. For example, when, as shown in FIG. 1, the peak input value Pin is A1, the characteristic curve (a straight line, in FIG. 1) of the high-luminance conversion portion HR in the input/output characteristics of the D-range compression processing is taken as the straight line L1. When the peak input value Pin is A2, the characteristic curve (a straight line, in FIG. 1) of the high-luminance conversion portion HR in the input/output characteristics of the D-range compression processing is taken as the straight line L2. Finally, when the peak input value Pin is A3, the characteristic curve (a straight line, in FIG. 1) of the high-luminance conversion portion HR in the input/output characteristics of the D-range compression processing is taken as the straight line L3. In this manner, auto knee processing changes the slope of the high-luminance conversion portion HR in accordance with the peak input value Pin.
With auto knee processing, an input luminance signal Yin of a medium-to-low luminance (0% to 85%), in which main subjects such as people exist, is converted by the characteristic curve (a straight line, in FIG. 1) of the low-luminance conversion portion LR (slope 1) into an output luminance signal Yout of a consistent brightness.
However, a high-luminance signal (85% to the peak input value) corresponding to regions such as the sky and clouds in a subject is compressed by the high-luminance conversion portion HR (which has a slope of less than 1) so that the D-range of the output luminance signal falls within a range of 15%, from 85% to 100%, which leads to a marked degradation in the tone of the image formed by the output luminance signal. For this reason, there is a problem in that the contrast of the sky, clouds, and so on in an image formed by a luminance signal on which auto knee processing has been performed will drop considerably, as shown in FIG. 5A.
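The broken-line characteristic of auto knee processing described above can be sketched as follows. This is an illustrative sketch only, not the implementation of any particular apparatus. The knee point of 85% and the output maximum of 100% come from the text. The function name auto_knee, the sample peak values, and the choice to apply no compression when the peak does not exceed 100% are assumptions made for illustration.

```python
def auto_knee(yin, peak_in, knee=85.0, out_max=100.0):
    """Broken-line D-range compression with a peak-dependent knee slope.

    Inputs and outputs are luminance levels in percent.
    """
    if yin <= knee:
        return yin  # low-luminance conversion portion LR (slope 1)
    if peak_in <= out_max:
        # Assumption: when the peak already fits in the output range,
        # no compression is applied.
        return min(yin, out_max)
    # High-luminance conversion portion HR: the slope is chosen so that
    # the peak input value maps to the output maximum.
    knee_slope = (out_max - knee) / (peak_in - knee)
    return min(knee + (yin - knee) * knee_slope, out_max)


# Hypothetical peak input values standing in for A1 to A3 in FIG. 1.
for peak in (150.0, 200.0, 300.0):
    slope = (100.0 - 85.0) / (peak - 85.0)
    print(f"peak {peak:.0f}%: knee slope {slope:.3f}, "
          f"Yout at peak {auto_knee(peak, peak):.1f}%")
```

The higher the peak input value, the smaller the knee slope becomes, because the entire high-luminance range must fit into the 15% between 85% and 100%. This is the source of the drop in contrast noted above.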
<D-Range Compression Processing by Visual Processing Apparatus Disclosed in Patent Documents 1 and 2 (Visual Knee Processing)>
D-range compression processing based on the visual characteristics of humans has therefore been disclosed, as in Patent Documents 1 and 2, in order to solve the problem of drops in contrast. This will be described using FIGS. 2 to 4.
First, FIG. 2 is a diagram illustrating the brightness contrast characteristic, which is one of the stated visual characteristics.
The small circles located within the large circles on the left and right are both of the same brightness, but the small circle in the center of the large circle on the left appears brighter because its surroundings are dark, whereas the small circle in the center of the large circle on the right appears darker because its surroundings are bright. Humans thus sense brightness and contrast based on an object's surroundings, rather than sensing brightness directly. This is called the brightness contrast characteristic.
Next, a visual processing apparatus 10 that performs a tone conversion process based on this brightness contrast characteristic, as disclosed in Patent Documents 1 and 2, will be described.
FIG. 3 is a block diagram illustrating the visual processing apparatus 10. The visual processing apparatus 10 is configured of a spatial processing unit 101 and a visual processing unit 102 achieved using a two-dimensional LUT.
First, the spatial processing unit 101 calculates a surrounding average luminance (signal) Yave for the input luminance (signal) Yin.
Here, “surrounding average luminance” refers to the average luminance value of pixels present in an image region of a predetermined area formed with a pixel of interest, which is the target of the processing, at its center, in an image formed by the input luminance signal Yin; for example, when the image size is 1920×1080 pixels, the average luminance value of pixels present in a region (image region) of approximately 400×240 pixels with the pixel of interest at the center corresponds to this “surrounding average luminance”.
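The surrounding average luminance defined above amounts to a box average centered on the pixel of interest. The following is a minimal sketch, assuming the image is given as a list of rows of luminance values and that the window is simply truncated at the image borders (the text does not state how borders are handled). The function name surrounding_average and its parameters are introduced here for illustration. A summed-area table is used so that the cost per pixel does not depend on the window size.

```python
def surrounding_average(image, half_w, half_h):
    """Return the box average around each pixel of a 2-D luminance image.

    image: list of rows of luminance values.
    half_w, half_h: half of the window width and height, in pixels.
    Assumption: the window is truncated at the image borders.
    """
    height, width = len(image), len(image[0])
    # Summed-area table with an extra leading row and column of zeros.
    sat = [[0.0] * (width + 1) for _ in range(height + 1)]
    for y in range(height):
        row_sum = 0.0
        for x in range(width):
            row_sum += image[y][x]
            sat[y + 1][x + 1] = sat[y][x + 1] + row_sum

    out = [[0.0] * width for _ in range(height)]
    for y in range(height):
        y0, y1 = max(0, y - half_h), min(height, y + half_h + 1)
        for x in range(width):
            x0, x1 = max(0, x - half_w), min(width, x + half_w + 1)
            total = sat[y1][x1] - sat[y0][x1] - sat[y1][x0] + sat[y0][x0]
            out[y][x] = total / ((y1 - y0) * (x1 - x0))
    return out


# Small example: one bright pixel in a dark 3x3 image, 3x3 window.
yave = surrounding_average([[0, 0, 0], [0, 90, 0], [0, 0, 0]], 1, 1)
```

For the 1920×1080 example in the text, half_w and half_h would be approximately 200 and 120, respectively.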
Next, multiple tone conversion curves (tone conversion characteristic curve data that determines the tone conversion characteristics) that differ for each surrounding average luminance (signal) Yave are stored in the visual processing unit 102, and the input luminance (signal) Yin undergoes tone conversion using the tone conversion curve that corresponds to that surrounding average luminance (signal) Yave based on a 2D-LUT (two-dimensional look-up table). The visual processing unit 102 then outputs the output luminance (signal) Yout obtained as a result of the tone conversion.
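The selection of a tone conversion curve by the surrounding average luminance Yave, followed by conversion of Yin, can be sketched as a two-dimensional table lookup. This is a sketch under stated assumptions: the nearest-bin quantization, the table size, and the function names are illustrative. The curve used to fill the table (toy_curve) is a placeholder chosen only so that a higher Yave yields a lower output. It is not the characteristic data disclosed in Patent Documents 1 and 2.

```python
def toy_curve(yin, yave, max_in=200.0, out_max=100.0):
    """Placeholder tone curve: output decreases as yave increases."""
    return min(out_max, yin * (1.0 - 0.5 * yave / max_in))


def build_lut(curve, n_bins, max_in):
    """Tabulate curve(yin, yave) as lut[yave_bin][yin_bin]."""
    step = max_in / (n_bins - 1)
    return [[curve(i * step, j * step) for i in range(n_bins)]
            for j in range(n_bins)]


def lookup(lut, yin, yave, max_in):
    """Nearest-bin lookup of the output luminance Yout."""
    n_bins = len(lut)

    def to_bin(value):
        clamped = min(max(value, 0.0), max_in)
        return round(clamped / max_in * (n_bins - 1))

    return lut[to_bin(yave)][to_bin(yin)]


lut = build_lut(toy_curve, n_bins=9, max_in=200.0)
dark_surround = lookup(lut, yin=100.0, yave=0.0, max_in=200.0)
bright_surround = lookup(lut, yin=100.0, yave=200.0, max_in=200.0)
```

With this placeholder table, the same input luminance is converted to a lower output value when its surroundings are bright than when they are dark.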
Making various changes to the input/output characteristics of the visual processing unit 102 (when the visual processing unit 102 is achieved using a 2D-LUT (two-dimensional look-up table), making various changes to the input/output characteristics data of that 2D-LUT) makes it possible for the visual processing apparatus 10 to perform D-range compression processing, dark region correction processing, and so on while maintaining the contrast, as well as various tone conversion processes, such as contrast enhancement processing that maintains the overall sense of brightness.
The case where the visual processing unit 102 is used in the D-range compression processing will now be described in detail. The D-range compression processing performed by the visual processing unit 102 is hereinafter called “visual knee processing”, referring to knee processing that is based on visual characteristics.
FIG. 4 illustrates the input/output characteristics of the visual processing unit 102 during visual knee processing.
In visual knee processing, the D-range of the input luminance (signal) Yin is compressed, based on the brightness contrast characteristic, using a tone conversion curve (selected from tone conversion curves C1 to Cn) that converts the tone of the input luminance signal Yin to a lower value the higher Yave is, and the output luminance (signal) Yout is outputted. Here, the tone conversion curve C1 represents a D-range compression curve selected when the surrounding average luminance Yave is less than 85%, whereas a tone conversion curve C2, a tone conversion curve Cm, and a tone conversion curve Cn represent D-range compression characteristic curves (tone conversion curves), where lower curves in the graph in FIG. 4 are selected the higher the surrounding average luminance Yave is. This visual knee processing will be compared to the auto knee processing using FIG. 5.
FIG. 5A is a diagram illustrating an image processed using auto knee processing. FIG. 5B, meanwhile, is a diagram illustrating an image processed using visual knee processing.
With auto knee processing, the input luminance (signal) Yin is processed using a single type of curve (broken line) (the broken line AK, in FIG. 4) (tone conversion curve), and the output luminance (signal) Yout is obtained. In other words, with auto knee processing, the entire input image (the entire image shown in FIG. 5A) is processed using a single type of tone conversion curve (broken line). In auto knee processing, higher luminance values (signal values) in the input luminance (signal) Yin (high-luminance signals) are more strongly compressed in order to maintain a sufficient brightness in the image formed by the output luminance signal Yout following the auto knee processing, with respect to image regions of a medium-to-low brightness in the image formed by the input luminance (signal) Yin, such as regions where people are present. Accordingly, as shown in FIG. 5A, the contrast drops in the sky portion of the image, which is an image region formed by high-luminance signals.
However, with visual knee processing, tone conversion processing is performed using different tone conversion curves depending on whether the image region is a bright region or a dark region according to the surrounding average luminance, as shown in FIG. 5B. In other words, with visual knee processing, different tone conversion curves are selected for each brightness region in the image. Portions (image regions) containing main subjects, such as people, that have medium-to-low luminance undergo tone conversion using the tone conversion curve C1. Through this, the post-visual knee processing image maintains the brightness of the portions (image regions) containing main subjects, such as people, that have medium-to-low luminance. Meanwhile, image regions formed by high-luminance signals, such as the sky, clouds, and so on, have a high surrounding average luminance, and therefore undergo tone conversion using, for example, the tone conversion curve Cm. In most of the regions in which the value of the input luminance signal Yin is high (that is, regions where Yin in FIG. 4 is 85% to 200%), the tone conversion curve Cm has a slope greater than that of the curve (in FIG. 4, a line) in the high-luminance conversion portion of the tone conversion curve used in the auto knee processing, and has a low post-tone conversion value (Yout value) (output value). For this reason, a high-luminance signal that has been inputted (for example, an input luminance signal Yin of 85% to 200%) is compressed to an output luminance signal Yout with a wider output D-range (for example, 50% to 100%). This suppresses a drop in contrast in images obtained through the visual knee processing.
For example, in visual knee processing, when the input luminance signal Yin of a pixel of interest is B1 and the luminance surrounding that pixel of interest is high, the input luminance signal Yin corresponding to that pixel of interest is converted into an output luminance signal Yout value D1 based on the tone conversion curve Cm. Meanwhile, in auto knee processing, assuming the same conditions, the input luminance signal Yin corresponding to the pixel of interest (that is, B1) is converted to an output luminance signal Yout value E1 based on the tone conversion curve AK represented by the broken line. In other words, in this case, the value obtained through conversion by the visual knee processing (the value of Yout) is smaller, and thus the visual knee processing can obtain an output luminance signal Yout with a wider output D-range than can be obtained through the auto knee processing. In addition, the slope of the tone conversion curve Cm in the portion indicated by R2 in FIG. 4 is greater than the slope of the broken line AK in the portion indicated by R1 in FIG. 4. For this reason, drops in contrast will be suppressed more in images obtained through the visual knee processing than in images obtained through the auto knee processing.
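The difference in contrast between the two methods can be illustrated with the example ranges given above: an input of 85% to 200% is mapped to 85% to 100% by auto knee processing, and to, for example, 50% to 100% by the tone conversion curve Cm. The following is a rough sketch that compares only the average slopes over that range. The actual shape of the curve Cm is not linear, and the function name average_slope is introduced here for illustration.

```python
def average_slope(in_lo, in_hi, out_lo, out_hi):
    """Average slope of a tone curve over an input interval (percent units)."""
    return (out_hi - out_lo) / (in_hi - in_lo)


# High-luminance input range of 85% to 200% (example figures from the text).
auto_knee_slope = average_slope(85.0, 200.0, 85.0, 100.0)
visual_knee_slope = average_slope(85.0, 200.0, 50.0, 100.0)

print(f"auto knee:   {auto_knee_slope:.3f}")
print(f"visual knee: {visual_knee_slope:.3f}")
print(f"ratio:       {visual_knee_slope / auto_knee_slope:.2f}")
```

With these example figures, the average slope in the high-luminance region is roughly three times greater for visual knee processing, which corresponds to the greater slope in the portion R2 compared with the portion R1 in FIG. 4.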
Thus, visual knee processing does not perform D-range compression processing (tone conversion processing) on the entire image using a single D-range compression curve (line) (tone conversion curve) as is the case with auto knee processing; rather, a predetermined output D-range compression curve (tone conversion curve) is selected from among multiple D-range compression curves (tone conversion curves) based on the surrounding average luminance of a pixel of interest, and D-range compression processing (tone conversion processing) is performed using that curve. For this reason, visual knee processing makes it possible to achieve D-range compression processing (tone conversion processing) that enables the independent control of brightness for each luminance region (image region) (for example, for each bright region, each dark region, and so on). In other words, with visual knee processing, it is possible to increase the slope of the tone conversion curve applied to high-luminance regions in which the surrounding average luminance is high while also maintaining the brightness of main subjects. This makes it possible to greatly improve the tone of high-luminance regions (high-luminance image regions) in images obtained through visual knee processing.
However, visual knee processing is a fixed process based on an LUT (look-up table), and does not have a function for linking the input/output characteristics with the peak input value as in auto knee processing (called an “auto knee function” hereinafter). This results in the following problems.
FIGS. 6A and 6B are diagrams illustrating problems that arise when there is no auto knee function.
These diagrams assume a maximum input D-range of 200% (post γ correction) for the 2D-LUT (two-dimensional look-up table) in the visual processing unit 102; FIGS. 6A and 6B illustrate operations performed by the visual processing apparatus 10 and the processed images in the case where the peak input value (peak value of the input luminance signal Yin) is greater than or equal to 200% and less than 200%, respectively.
(1) When the peak input value Pin≧200%, it is necessary to clip the input luminance (signal) Yin at 200% in advance, as shown in FIG. 6A. This means that any tones above 200% are lost, and thus there is a problem in that the sky portion (image region) is washed out, as shown in the processed image Img1 in FIG. 6A.
(2) However, when the peak input value Pin<200%, the peak input value Pin of the input luminance (signal) Yin is converted using one of the tone conversion curves C1 to Cn, as shown in FIG. 6B. Regardless of which tone conversion curve is used, the output (output peak value) Pout obtained after the tone conversion performed by the visual processing unit 102 on the peak input value Pin is a value less than 100%, and thus the output D-range cannot be used in its entirety. In other words, in this case, the high-luminance signal (the input luminance signal Yin corresponding to the high-luminance image region) is compressed more than necessary, and thus the tone of high-luminance portions (for example, the sky portion in FIG. 6B) is lost in the image formed by the output luminance signal Yout outputted from the visual processing unit 102. For this reason, there is a problem in that the sky portions of the image are dark (resulting in what is called a “subdued image”), as shown in the processed image Img2 in FIG. 6B.
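Problems (1) and (2) can be reproduced with any fixed curve designed for a maximum input D-range of 200%. The following is a minimal sketch in which a fixed broken line stands in for a single curve of the 2D-LUT. This stand-in, the function name fixed_curve, and the sample peak values are assumptions made for illustration and are not the characteristics of the visual processing unit 102.

```python
LUT_MAX_IN = 200.0  # maximum input D-range of the fixed table (percent)
KNEE = 85.0
OUT_MAX = 100.0


def fixed_curve(yin):
    """Stand-in fixed curve: slope 1 up to the knee, then a fixed slope
    that reaches 100% output only at 200% input."""
    yin = min(yin, LUT_MAX_IN)  # inputs above the table range are clipped
    if yin <= KNEE:
        return yin
    return KNEE + (yin - KNEE) * (OUT_MAX - KNEE) / (LUT_MAX_IN - KNEE)


# Problem (1): peak of 300%. Tones between 200% and 300% become identical.
print(fixed_curve(250.0), fixed_curve(300.0))

# Problem (2): peak of 150%. The output peak Pout stays below 100%.
print(fixed_curve(150.0))
```

In the first case, distinct input levels above 200% produce the same output, which corresponds to the washed-out sky. In the second case, part of the output D-range is left unused, which corresponds to the subdued image.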
There is a conceivable method for solving these problems, where, for example, multiple LUTs with maximum input D-ranges (equivalent to the maximum D-range of the luminance signal Yin inputted into the visual processing unit 102) of 100%, 200%, 400%, 800%, and so on are created and the LUTs are switched dynamically in accordance with the peak input value Pin. However, in such a case, a two-dimensional LUT circuit for the 800% setting is necessary, leading to problems in that the circuit scale significantly increases in size, problems with time lag during the LUT switches, and so on, and thus employing this method is not realistic.
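The LUT switching method mentioned above can be sketched as a selection of the smallest prepared table whose maximum input D-range covers the peak input value. This is an illustrative sketch only. The function names are hypothetical, and the size estimate assumes that the bin width is kept constant on both axes of the two-dimensional table, which the text does not state.

```python
from bisect import bisect_left

# Maximum input D-ranges (percent) of the prepared tables, from the text.
LUT_RANGES = [100.0, 200.0, 400.0, 800.0]


def select_lut_range(peak_in):
    """Return the smallest table range covering peak_in (largest if none does)."""
    index = bisect_left(LUT_RANGES, peak_in)
    return LUT_RANGES[min(index, len(LUT_RANGES) - 1)]


def relative_entries(lut_range, base_range=100.0):
    """Relative number of 2-D table entries.

    Assumption: constant bin width on both the Yin axis and the Yave axis.
    """
    return (lut_range / base_range) ** 2


for peak in (90.0, 150.0, 250.0, 1000.0):
    chosen = select_lut_range(peak)
    print(f"peak {peak:.0f}% -> {chosen:.0f}% table, "
          f"{relative_entries(chosen):.0f}x entries")
```

Under the constant bin width assumption, the table for the 800% setting requires 64 times as many entries as the table for the 100% setting, which gives an indication of why the circuit scale becomes a concern.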
The present invention solves these problems, and it is an object thereof to provide a dynamic range compression apparatus, a dynamic range compression method, a program, an integrated circuit, and an imaging apparatus capable of consistently obtaining an image signal compressed at the full output range based on the peak value in an image formed by an input image signal (that is, capable of performing dynamic D-range compression processing in accordance with the peak value in the image), even in the case where an image signal with a variable D-range is inputted, by providing an auto knee function in visual knee processing.
It is a further object of the present invention to achieve an auto knee function that maintains contrast by controlling the gain using a surrounding average luminance signal.