1. Field of the Invention
The present invention relates to an image pickup method and apparatus, adapted to synthesize a plurality of images having been acquired by picking up or sensing an object under different exposure conditions, respectively, to produce an image having an excellent gradation reproducibility, and more particularly, to an image pickup method and apparatus suitable for application to a video camera, still camera, monitor camera, on-vehicle camera, etc. Also, the present invention relates to an image processing method and apparatus, adapted to receive a plurality of images different in exposure from each other, acquired under different exposure conditions, and synthesize the plurality of images to produce a synthetic image having an excellent gradation reproducibility.
2. Description of the Related Art
For acquisition of a plurality of images different in exposure from each other under different exposure conditions, various exposure control methods have so far been proposed. One typical example is a time-shared exposure control method in which a CCD (charge coupled device) is used as an image sensing device and its electronic shutter is used to change the exposure time so as to sense a plurality of images in a time-sharing manner. The principle of this time-shared exposure control method will be described with reference to FIG. 1. In FIG. 1, the horizontal axis indicates elapsed time while the vertical axis indicates the charge stored in the image sensing device. In this exposure control method, as in ordinary image sensing, a charge is stored and read during a field interval, and the following vertical blanking interval is utilized to store and read a charge again. This time-shared exposure control thus provides two images that differ from each other in exposure time, one stored over a field interval and the other over the following vertical blanking interval.
Another typical example of the conventional exposure control methods for sensing a plurality of images different in exposure from each other is a space-shared exposure control method, which is shown in FIG. 2. As shown, in this method neutral density (ND) filters different in transmittance from each other are disposed on the respective pixels of an image sensing device to acquire a plurality of images with different exposures in a space-sharing manner. This space-shared exposure control also provides a plurality of images different in exposure from each other.
Still another typical example of the exposure control methods of acquiring a plurality of images different in exposure from each other by sensing an object with different exposures is a method of controlling the exposure by multiple image sensing devices as shown in FIG. 3. As shown, in this exposure control method a plurality of image sensing devices are used, and ND filters different in transmittance from each other are disposed on the incident faces of the respective image sensing devices to sense a plurality of images. By controlling exposure with the aid of multiple image sensing devices, it is possible to provide a plurality of images different in exposure from each other without any reduction of the spatial resolution of the images.
For synthesis of a plurality of images different in exposure from each other, it has been proposed to multiply each image by a factor corresponding to the ratio between the exposures of the images and then select between the images on the basis of a threshold. The principle of this synthesis method will be described with reference to FIG. 4. In FIG. 4, the horizontal axis indicates the incident quantity of light upon the image sensing device while the vertical axis indicates the level of the output signal from the image sensing device, that is, the pixel level of the sensed image. In FIG. 4, an image yL acquired by exposure for a long time is indicated by a steeply inclined straight line, and in a region where the incident quantity of light is above a certain level, the level of the output signal is constant because of saturation of the image sensing device. An image yS acquired by exposure for a short time is indicated by a gently inclined straight line, and its output signal saturates at a larger incident quantity of light than that of the image yL. In this image synthesis method, the output signal corresponding to the image yS acquired by exposure for the short time is first multiplied by a factor g so that the inclination of the straight line indicating the image yS coincides with that of the straight line indicating the image yL. Thereafter, reference is made to the output signal corresponding to the image yL. When the level of that output signal is not higher than a threshold TH, the output signal corresponding to the image yL is selected. When the level of that output signal is higher than the threshold TH, the output signal corresponding to the image yS, multiplied by the factor g, is selected. Thus, the plurality of images different in exposure from each other are synthesized to produce a synthetic image.
Suppose here that the level of the output signal corresponding to the synthetic image is y′. Then, the image synthesis is given by the following equation (1):
$$y' = \begin{cases} yL & \cdots\ yL \le TH \\ yS \times g & \cdots\ yL > TH \end{cases} \tag{1}$$
The factor g by which the output signal corresponding to the image yS is multiplied is a ratio between exposure times of the images, and it is given by the following equation (2):
$$g = \frac{T_{\mathrm{long}}}{T_{\mathrm{short}}} \tag{2}$$
where Tlong and Tshort indicate a long exposure time and a short exposure time, respectively. When the exposure time ratio is N, the dynamic range of the synthetic image will be multiplied by N.
Note that when there are images having been acquired with three or more kinds of exposure time, the image synthesis given by the equation (1) should be done first for the image whose exposure time is the longest, then for the image whose exposure time is the next longest, and so forth.
In the above, the synthesis of images whose exposure is controlled by changing the exposure time has been described with reference to FIG. 1. An image can also be produced in a similar manner by synthesizing images acquired with their exposure controlled by the method shown in FIG. 2 or 3.
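The two-image case of the equations (1) and (2) can be sketched as follows. This is only an illustration under stated assumptions: the function name `synthesize`, the representation of an image as a flat list of pixel levels, and the sample values (an 8-bit sensor saturating at 255, an exposure ratio of 8, a threshold of 200) are all hypothetical and do not appear in the related art described above.

```python
def synthesize(y_long, y_short, t_long, t_short, threshold):
    """Combine a long- and a short-exposure image per equations (1) and (2).

    Images are flat lists of pixel levels of equal length (an assumption
    made only to keep the sketch small).
    """
    g = t_long / t_short  # exposure-time ratio, equation (2)
    out = []
    for yl, ys in zip(y_long, y_short):
        # At or below TH the long exposure is used as it is; above TH the
        # long exposure is near saturation, so the short exposure, scaled
        # by g to the same slope, is used instead.
        out.append(yl if yl <= threshold else ys * g)
    return out


# Hypothetical sample data: the last two pixels saturate the long exposure.
y_long = [16, 120, 255, 255]
y_short = [2, 15, 60, 200]
result = synthesize(y_long, y_short, t_long=8.0, t_short=1.0, threshold=200)
print(result)
```

With these sample values the synthetic levels extend up to 8 times the sensor's saturation level, which corresponds to the statement that the dynamic range is multiplied by the exposure time ratio.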
For compressing a synthetic image produced as in the above, which has a wide dynamic range, to an extent depending upon the capability of a transmission system or display apparatus which will output the image, there has been proposed a method of converting the level of each pixel of an input image using a level conversion function having the input vs. output relation shown in FIG. 5. This compression will be referred to as “level conversion” hereinafter. In FIG. 5, the horizontal axis of the level conversion function indicates a pixel level l of an input image while the vertical axis indicates a pixel level T(l) of an output image having been converted in level. Also in FIG. 5, Linmax indicates the maximum level each pixel of the input image can take, and Loutmax indicates the maximum level each pixel of the output image can take. In this level conversion, the whole dynamic range is compressed with sufficient contrasts secured at low and middle input levels, respectively, at the cost of the contrast at high input levels, for example, levels higher than lk.
In addition to the above image compression, there has also been proposed an image compressing method in which the level conversion function is varied adaptively in accordance with the frequency distribution of pixel levels of an input image. One example of this method is called “histogram equalization”. The principle of this histogram equalization will be described with reference to FIG. 6. In FIG. 6, the horizontal axis indicates a pixel level l of the input image while the vertical axis indicates the frequency of that pixel level in the input image. Also, Fmax in FIG. 6 indicates the maximum cumulative frequency of pixel level of the input image, which is the total number of pixels used to calculate the frequency.
In this image compression method, first a frequency distribution H(l) of the pixel level l of an input image is produced, and then a cumulative frequency distribution C(l) is produced using the following equation (3):
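One possible shape for such a level conversion function is a piecewise-linear "knee" curve, sketched below. The exact curve of FIG. 5 is not reproduced here, so the unity slope below the knee level, the linear segment above it, the helper name `make_knee_conversion` and the numeric values are all assumptions chosen for illustration.

```python
def make_knee_conversion(l_k, lin_max, lout_max):
    """Return a level conversion function T(l) with a knee at l_k.

    Assumed shape: slope 1 up to l_k (contrast preserved at low and
    middle levels), then a reduced slope chosen so that lin_max maps
    exactly to lout_max (contrast sacrificed at high levels).
    """
    if not (0 < l_k < lout_max <= lin_max):
        raise ValueError("expected 0 < l_k < lout_max <= lin_max")
    high_slope = (lout_max - l_k) / (lin_max - l_k)

    def convert(l):
        if l <= l_k:
            return float(l)
        return l_k + (l - l_k) * high_slope

    return convert


# Hypothetical ranges: an 11-bit synthetic image compressed to 8 bits.
T = make_knee_conversion(l_k=192, lin_max=2047, lout_max=255)
for level in (100, 192, 1000, 2047):
    print(level, T(level))
```

Levels up to the knee pass through unchanged, while the whole range above the knee is squeezed into the remaining output levels.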
$$C(l) = \sum_{k=0}^{l} H(k) \tag{3}$$
In this image compression method, the following equation (4) is used to normalize the cumulative frequency distribution C(l) to the range of levels the output image can take, thereby providing a level conversion function T(l). Using the level conversion function T(l), the image compression method secures a sufficient contrast in a region defined by pixel levels whose frequency is high, namely, a region having a large area, while compressing the whole dynamic range.
$$T(l) = \frac{C(l)}{F_{\mathrm{max}}} \times L_{\mathrm{max}} \tag{4}$$
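The equations (3) and (4) can be sketched as follows. The function name `equalization_curve`, the flat list of integer pixel levels and the tiny sample image are hypothetical; `l_max` stands for the maximum output level written Lmax in the equation (4).

```python
def equalization_curve(pixels, lin_max, l_max):
    """Build the level conversion function T(l) of equations (3) and (4).

    pixels  : flat list of integer pixel levels in 0..lin_max
    lin_max : maximum level an input pixel can take
    l_max   : maximum level an output pixel can take
    Returns a list where entry l holds T(l).
    """
    if not pixels:
        raise ValueError("at least one pixel is required")
    hist = [0] * (lin_max + 1)        # frequency distribution H(l)
    for p in pixels:
        hist[p] += 1
    f_max = len(pixels)               # total pixel count, Fmax
    curve = []
    cumulative = 0                    # running sum C(l), equation (3)
    for l in range(lin_max + 1):
        cumulative += hist[l]
        curve.append(cumulative / f_max * l_max)  # equation (4)
    return curve


# Hypothetical 3-bit image in which level 1 occupies the largest area.
pixels = [0, 0, 1, 1, 1, 1, 7, 7]
curve = equalization_curve(pixels, lin_max=7, l_max=255)
print(curve)
```

In this sample, the step from level 0 to level 1 receives half of the output range because half of the pixels sit at level 1, whereas the unused levels 2 to 6 receive no additional output range.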
In case a color filter having a color layout as shown in FIG. 7 for example is disposed on the front of an image sensing device to sense or pick up a color image, the image sensing device will provide an output signal in which a frequency-modulated color signal is superposed on a brightness signal as shown in FIG. 8. A method of synthesizing a plurality of color images sensed by such an image sensing device to produce a synthetic image and compressing the synthetic image, will be described herebelow.
In this method, an image signal sensed with each exposure is separated into a brightness signal and color signal on the basis of the following equation (5):
$$\begin{aligned} y &= LPF_y(x) \\ c &= LPF_c(v_i \times x) \\ v_i &= \begin{cases} 1 & \cdots\ i:\ \text{even} \\ -1 & \cdots\ i:\ \text{odd} \end{cases} \end{aligned} \tag{5}$$
where x indicates an image signal in which a brightness signal and a color signal are mixed together, c indicates a separated color signal, LPFy indicates a low-pass filter to separate the brightness signal, and LPFc indicates a low-pass filter to separate the color signal.
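The separation of the equation (5) can be sketched on a single line of samples as follows. The characteristics of LPFy and LPFc are not specified above, so a two-tap moving average is used for both purely as a stand-in, and the test signal (a constant brightness with a color component alternating in sign from sample to sample) is a simplifying assumption.

```python
def low_pass(signal):
    """Two-tap moving average, standing in for LPFy and LPFc (assumption).

    The output is one sample shorter than the input.
    """
    return [(a + b) / 2 for a, b in zip(signal, signal[1:])]


def separate(x):
    """Split a mixed signal x into brightness y and color c per equation (5)."""
    # v_i is +1 for even i and -1 for odd i; multiplying by it shifts the
    # alternating color component down to a slowly varying signal.
    v = [1 if i % 2 == 0 else -1 for i in range(len(x))]
    y = low_pass(x)
    c = low_pass([vi * xi for vi, xi in zip(v, x)])
    return y, c


# Hypothetical line: brightness 100 with a color amplitude of 20 that
# changes sign on every other sample.
x = [100 + 20 * (1 if i % 2 == 0 else -1) for i in range(6)]
y, c = separate(x)
print(y)
print(c)
```

For this idealized input the averaging cancels the alternating component in `y` and the brightness in `c`; real filters and real color layouts would only approximate this.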
The separated brightness signals y are synthesized and compressed by any of the above methods. On the other hand, the separated color signals c are synthesized on the basis of the following equation (6) with reference to the level of the brightness signal acquired with a large exposure:
$$c' = \begin{cases} cL & \cdots\ yL \le TH \\ cS \times g & \cdots\ yL > TH \end{cases} \tag{6}$$
where yL and cL indicate brightness and color signals acquired with large exposures, cS indicates a color signal acquired with a small exposure, and g indicates a ratio between exposures as shown in the aforementioned equation (2).
The synthetic color signal is compressed on the basis of the following equation (7) in such a manner as not to vary in exposure ratio relative to the brightness signal:
$$c''(i, j) = \frac{y''(i, j)}{y'(i, j)} \times c'(i, j) \tag{7}$$
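The equations (6) and (7) can be sketched together as follows. It is assumed here that y′ is the synthetic brightness signal and y″ is the same signal after compression, which is how the equation (7) reads in context. The function names, the flat-list image representation, the sample values and the handling of a zero brightness level are hypothetical.

```python
def synthesize_color(c_long, c_short, y_long, g, threshold):
    """Equation (6): select the color signal by the long-exposure brightness."""
    return [cl if yl <= threshold else cs * g
            for cl, cs, yl in zip(c_long, c_short, y_long)]


def compress_color(c_syn, y_syn, y_comp):
    """Equation (7): scale color by the same ratio applied to brightness."""
    out = []
    for c, y1, y2 in zip(c_syn, y_syn, y_comp):
        # A zero brightness level would divide by zero; returning 0.0 in
        # that case is an assumption, not something the text specifies.
        out.append(0.0 if y1 == 0 else y2 / y1 * c)
    return out


# Hypothetical two-pixel example with exposure ratio g = 8 and TH = 200.
y_long = [100, 255]
c_long = [10, 0]
c_short = [1, 5]
c_syn = synthesize_color(c_long, c_short, y_long, g=8.0, threshold=200)

y_syn = [100, 480]    # synthetic brightness y' (assumed values)
y_comp = [100, 240]   # compressed brightness y'' (assumed values)
c_out = compress_color(c_syn, y_syn, y_comp)
print(c_syn, c_out)
```

In the second pixel the brightness is halved by the compression, so the color signal is halved as well and the ratio of color to brightness is preserved.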
An image xL acquired by exposure for a long time in an optimum condition, and an image xS acquired by exposure for a short time also in an optimum condition, are shown in FIG. 9A for example. As shown, the image xL acquired by exposure for the long time has a sufficient contrast in a low-level region R2 but its level is saturated in a high-level region R1. On the other hand, in the image xS acquired by exposure for the short time, the saturation level is not reached even in the high-level region R1 but a sufficient contrast cannot be secured in the low-level region R2.
For synthesis of two such images xL and xS and compression of the resulting synthetic image, the above-mentioned synthesis method is first applied. As shown in FIG. 9B, the image xL acquired by exposure for the long time is selected as it is in the low-level region R2, and the image xS acquired by exposure for the short time, multiplied by the exposure time ratio given by the equation (2), is selected in the high-level region R1. The synthetic image thus obtained is compressed by the method shown in FIG. 5, for example, to produce an image y having a sufficient contrast in each of the light and dark regions thereof, as shown in FIG. 9C.
An image signal sensed in practice contains additional components caused by light diffraction at the boundary of an object to be sensed and by light reflection and scattering in the optical system. Thus, the sensed image has an increased level in its entirety or in a dark region adjacent to a light region, giving the impression that the black level is raised, as shown in FIG. 10A.
The above phenomenon is called “flare”. An image sensed by exposure for a longer time will contain more flare spots. By synthesizing images containing such flare spots by the aforementioned method, a resultant image will be as shown in FIG. 10B. Since in the image synthesis, an image sensed by exposure for a short time will be amplified, a synthetic image produced by synthesis of such images will have flare spots nearly evenly distributed in the entirety thereof. For compressing the synthetic image by the aforementioned compression method, normally the higher level region of the image will be compressed more as shown in FIG. 5 for example, so that the compressed image will have more flare spots in the dark region thereof than in the light region as shown in FIG. 10C, thus the black level will be emphasized more.
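The effect described above can be illustrated numerically with a deliberately simplified model. Every element of this sketch is an assumption: flare is modeled as a constant additive level proportional to exposure time, the compression is a piecewise-linear knee curve, and all names and numeric values are hypothetical.

```python
def synthesize_pixel(scene, flare_long, g, threshold, saturation):
    """Synthesize one pixel per equation (1) with an additive flare model.

    scene and flare_long are expressed on the long-exposure scale. The
    short exposure is assumed to receive 1/g of both scene light and flare.
    """
    y_long = min(scene + flare_long, saturation)
    y_short = (scene + flare_long) / g
    return y_long if y_long <= threshold else y_short * g


def knee(level, l_k, high_slope):
    """Assumed compression curve: slope 1 up to l_k, high_slope above it."""
    return level if level <= l_k else l_k + (level - l_k) * high_slope


def residual_flare(scene, flare_long):
    """Output level added by flare after synthesis and compression."""
    args = dict(g=8.0, threshold=200.0, saturation=255.0)
    with_flare = knee(synthesize_pixel(scene, flare_long, **args), 192.0, 0.125)
    without = knee(synthesize_pixel(scene, 0.0, **args), 192.0, 0.125)
    return with_flare - without


flare_dark = residual_flare(scene=40.0, flare_long=16.0)     # dark region
flare_light = residual_flare(scene=1000.0, flare_long=16.0)  # light region
print(flare_dark, flare_light)
```

Under these assumptions the flare adds the same level to dark and light pixels of the synthetic image, but after compression its contribution in the dark region remains several times larger than in the light region, which corresponds to the raised black level described above.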
Needless to say, such a phenomenon will take place in a plurality of images acquired by controlling the exposure time as described with reference to FIG. 1 as well as in images acquired by any other exposure control method previously described with reference to FIG. 2 or 3.
As described in the foregoing, the conventional image pickup methods including the aforementioned exposure control method, image synthesis method and image compression method are disadvantageous in that synthetic images and compressed images are not natural because a plurality of original images to be synthesized are acquired in different conditions, respectively.
The conventional image pickup methods are also disadvantageous in that, since images acquired with larger exposure contain more flare spots, an image produced by synthesis of such sensed images has the level in the dark region thereof relatively increased by the flare spots, and thus the resultant image as a whole will be whitish.