1. Field of the Invention
The present invention relates to an image processing apparatus, an image processing method, and a computer program. More particularly, the present invention relates to an image processing apparatus, an image processing method, and a computer program that make it possible to reduce, when one output image is generated in a digital still camera or the like by interpolation performed by using plural input images, unnaturalness of the output image caused by a moving object, which is a moving subject, in an input image.
2. Description of the Related Art
For example, when a user photographs an image with a digital still camera or the like held by the hand, if an exposure time inevitably becomes long because an amount of light is insufficient, an image photographed by the digital still camera may be blurred because of hand shake. In order to prevent such a blurred image from being formed, there is a method of obtaining an image without a blur by, so to speak, superimposing plural dark images photographed with an exposure time short enough for preventing the image from being affected by hand shake (see, for example, JP-A-2005-38396).
In the method disclosed in JP-A-2005-38396, plural times of photographing are temporally continuously performed by a digital still camera to obtain temporally continuous plural photographed images as plural input images. With one of the plural photographed images set as a reference image, overall movements of the respective plural photographed images with respect to the reference image are calculated. Positioning of the plural photographed images is performed on the basis of the movements. One image (an output image) is obtained by superimposing (combining) the plural photographed images after the positioning.
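The positioning and superimposition described above can be sketched as follows. This is only an illustrative sketch, not the implementation of JP-A-2005-38396: the function name `superimpose`, the restriction to integer translations, and the use of a plain average are assumptions introduced here, and the movements with respect to the reference image are assumed to be already known.

```python
def superimpose(images, shifts):
    """Average images after translating each one by its (dy, dx) movement.

    images: equally sized 2D lists of pixel values.
    shifts: per-image integer movement (dy, dx) with respect to the
            reference image; (0, 0) for the reference image itself.
            The movements are assumed to be known in advance.
    """
    height, width = len(images[0]), len(images[0][0])
    output = [[0.0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            total, count = 0.0, 0
            for image, (dy, dx) in zip(images, shifts):
                sy, sx = y + dy, x + dx  # position of this output pixel in image
                if 0 <= sy < height and 0 <= sx < width:
                    total += image[sy][sx]
                    count += 1
            output[y][x] = total / count if count else 0.0
    return output


# The bright subject (value 9) sits one pixel further right in p2 than in p1.
p1 = [[1, 9, 1, 1]]
p2 = [[1, 1, 9, 1]]
pout = superimpose([p1, p2], [(0, 0), (0, 1)])
print(pout)
```

In this sketch, output pixels covered by only one image after the positioning are simply taken from that image.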
FIG. 1 shows a method of obtaining an output image in the case in which there are two photographed images.
In FIG. 1, two photographed images P1 and P2 are photographed images continuously photographed by a digital still camera. In the photographed images P1 and P2, positions of subjects deviate from each other because of hand shake or the like.
When, for example, one of the two photographed images P1 and P2 is set as a reference image, movements of the respective photographed images P1 and P2 with respect to the reference image are calculated. Positioning of the photographed images P1 and P2 is performed on the basis of the movements to superimpose the subjects in the two photographed images P1 and P2. An output image Pout is obtained by superimposing the photographed images P1 and P2 after the positioning.
In this case, the plural photographed images are photographed with a short exposure time. However, the plural photographed images may be photographed with proper exposure. When the plural photographed images are photographed with the proper exposure and superimposed as described above, it is possible to obtain an output image with a high S/N (signal-to-noise ratio).
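The S/N improvement obtained by superimposition can be illustrated with a small simulation. The sketch below assumes, purely for illustration, a flat subject with a pixel value of 100, independent Gaussian noise with a standard deviation of 10 in each photographed image, and superimposition by averaging four images; none of these figures comes from the method described above.

```python
import random
import statistics


def noisy_frame(signal, sigma, n_pixels, rng):
    """One photographed image of a flat subject with additive Gaussian noise."""
    return [signal + rng.gauss(0.0, sigma) for _ in range(n_pixels)]


def average_frames(frames):
    """Superimpose already positioned frames by averaging pixel by pixel."""
    return [sum(pixels) / len(pixels) for pixels in zip(*frames)]


rng = random.Random(0)  # fixed seed so that the run is repeatable
frames = [noisy_frame(100.0, 10.0, 2000, rng) for _ in range(4)]

single_noise = statistics.pstdev(frames[0])
averaged_noise = statistics.pstdev(average_frames(frames))
print(single_noise, averaged_noise)
```

Under the independence assumption, averaging N images reduces the noise standard deviation by a factor of about the square root of N, so the noise of the averaged image here is roughly half that of a single image.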
When the plural photographed images are superimposed to obtain one output image as described above, a moving object, which is a moving subject, may be in the plural photographed images.
When the moving object as the moving subject appears in the plural photographed images, if a difference between a pixel value of a moving object portion and a pixel value of a background portion that is intermittently visible as the moving object moves is large, granularity noise called zipper noise and false colors may appear in an output image. As a result, the output image may be an unnatural image. The background (portion) means a portion other than the portion where the moving object appears.
FIG. 2 shows an output image obtained by superimposing plural photographed images.
The plural photographed images used for obtaining the output image in FIG. 2 are photographed images obtained by continuously photographing a scene of a running car on a road plural times at fixed time intervals. The moving car appears in the output image in FIG. 2 at fixed intervals.
FIG. 3 shows an enlarged image obtained by enlarging a part of the output image in FIG. 2.
In the enlarged image in FIG. 3, zipper noise appears particularly conspicuously near edges.
The zipper noise (and the false colors) appears in the output image obtained by superimposing the plural photographed images including the moving object as described above because of a principle described below.
When the positioning of the plural photographed images is performed and the plural photographed images after the positioning are superimposed to generate one output image as described above, positions of pixels of the plural photographed images after the positioning do not always coincide with positions of pixels of the output image.
Therefore, if a pixel for which a pixel value is calculated among the pixels of the output image is referred to as a pixel of interest, superimposition of the plural photographed images after the positioning is performed by interpolating the pixel value of the pixel of interest using, among the pixels of the plural photographed images (hereinafter also referred to as photographed pixels as appropriate) after the positioning, pixel values of photographed pixels in positions near the position of the pixel of interest.
Examples of a method of the interpolation of the pixel value of the pixel of interest include a method of performing a simple addition for directly adding up pixel values of one or more photographed pixels in positions near the position of the pixel of interest and a method of performing interpolation using pixel values of one or more photographed pixels in positions near the position of the pixel of interest and an interpolation function.
The interpolation function is a function that changes according to a relative position of the photographed pixel used for the interpolation with respect to the pixel of interest (a distance between the pixel used for the interpolation and the pixel of interest). For example, a linear function represented by a first-order expression, a cubic function, and the like are used. The simple addition is equivalent to using a function with a value of 1 as the interpolation function regardless of the (relative) position of the photographed pixel used for the interpolation.
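The relation between the simple addition and interpolation with an interpolation function can be sketched as follows. The helper names `interpolate`, `linear_weight`, and `constant_weight` are hypothetical, and the optional division by the sum of the weights is an assumption added here so that the result stays within the range of the input pixel values; the text above does not state whether such normalization is performed.

```python
def constant_weight(z):
    """Interpolation function with a value of 1 everywhere (simple addition)."""
    return 1.0


def linear_weight(z):
    """A linear function of the distance that reaches 0 at a distance of 1."""
    return max(0.0, 1.0 - abs(z))


def interpolate(samples, weight, normalize=True):
    """Interpolate the pixel of interest from nearby photographed pixels.

    samples: list of (z, value), where z is the position of a photographed
             pixel relative to the pixel of interest.
    """
    weighted_sum = sum(weight(z) * value for z, value in samples)
    if not normalize:
        return weighted_sum
    weight_sum = sum(weight(z) for z, _ in samples)
    return weighted_sum / weight_sum if weight_sum else 0.0


samples = [(-0.5, 10.0), (0.25, 20.0)]
simple_addition = interpolate(samples, constant_weight, normalize=False)
linear_result = interpolate(samples, linear_weight)
print(simple_addition, linear_result)
```

With the constant function the two pixel values are added directly, whereas with the linear function the photographed pixel closer to the pixel of interest contributes more.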
When it is assumed that, for example, an imaging element of the Bayer array is adopted as an imaging element of the digital still camera used for photographing of plural photographed images, respective pixels of a photographed image obtained from the imaging element have pixel values shown in FIG. 4.
FIG. 4 shows a photographed image obtained from the imaging element of the Bayer array (hereinafter also referred to as Bayer image as appropriate).
In the Bayer image, the respective pixels have one color signal (color component) among an R (Red) signal, a G (Green) signal, and a B (Blue) signal as a pixel value. A certain pixel and pixels adjacent to the pixel in the four directions, above and below and the left and right, have different color signals as pixel values.
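The arrangement of color signals in a Bayer image can be sketched as follows. The function name `bayer_color` is hypothetical, and the phase of the array (a G pixel at the upper left, with R on even lines and B on odd lines) is an assumption, since the exact layout of FIG. 4 is not reproduced here.

```python
def bayer_color(row, col):
    """Color signal held by the pixel at (row, col) of a Bayer image.

    Assumed phase: G where row + col is even, R on even rows, B on odd rows.
    """
    if (row + col) % 2 == 0:
        return "G"
    return "R" if row % 2 == 0 else "B"


# Print a 4 x 4 patch of the array.
for row in range(4):
    print(" ".join(bayer_color(row, col) for col in range(4)))
```

In this layout every pixel differs in color from its four neighbors above, below, left, and right, and the G pixels appear every other pixel on each line.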
Here, it is assumed that two Bayer images are used as plural photographed images to generate an output image. Moreover, it is assumed that, as a result of performing positioning of the two photographed images as the Bayer images, pixels of one of the two photographed images after the positioning are in positions deviating by one pixel in the horizontal (left to right) direction and deviating by one pixel in the vertical (up to down) direction with respect to pixels of the other photographed image.
Moreover, it is assumed that, for example, a linear function represented by Equation (1) is adopted as an interpolation function.
\[
f(z) =
\begin{cases}
az + 1 & (-a < z \le 0) \\
-az + 1 & (0 \le z < a) \\
0 & (a \le |z|)
\end{cases}
\tag{1}
\]
In Equation (1), “z” indicates a position in the horizontal direction or the vertical direction of a photographed pixel with the position of the pixel of interest set as a reference (an x coordinate or a y coordinate of the photographed pixel in an xy coordinate system with the position of the pixel of interest set as an origin).
In Equation (1), “a” is a constant satisfying an expression 0<a. As the constant “a”, for example, an appropriate value is used out of values in a range satisfying an expression 0<a≦2. Here, in order to simplify the explanation, it is assumed that the constant “a” is 1.
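Equation (1) can be written directly as a small function. The sketch below follows the three cases of the equation as stated and evaluates it only for the constant “a” of 1 assumed in this explanation; the function name `f` and the default argument are chosen here for illustration.

```python
def f(z, a=1.0):
    """Linear interpolation function of Equation (1).

    z: position of the photographed pixel relative to the pixel of interest.
    a: constant of Equation (1); 1 in the explanation here.
    """
    if -a < z <= 0:
        return a * z + 1.0
    if 0 <= z < a:
        return -a * z + 1.0
    return 0.0  # a <= |z|


for z in (-1.0, -0.25, 0.0, 0.5, 1.0):
    print(z, f(z))
```

With the constant “a” of 1, the function takes the value 1 at the position of the pixel of interest and falls linearly to 0 at a distance of 1, so that a photographed pixel at a distance of exactly 1 does not contribute to the interpolation.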
FIGS. 5 and 6 show pixel values of pixels on one line after the positioning of the two photographed images as the Bayer images and pixel values of pixels of the two photographed images after the positioning and pixels of an output image generated by interpolation performed by using an interpolation function f(z) represented by Equation (1).
A first figure from the top of FIG. 5 shows pixel values of pixels of a first photographed image P1 among the two photographed images after the positioning. A second figure from the top of FIG. 5 shows pixel values of pixels of a second photographed image P2.
As described above, pixels of one photographed image of the two photographed images P1 and P2 after the positioning are in positions deviating by one pixel in the horizontal and the vertical directions, respectively, with respect to pixels of the other photographed image. Therefore, in FIG. 5, positions of the pixels of the second photographed image P2 after the positioning deviate by one pixel with respect to positions of the pixels of the first photographed image P1 after the positioning.
In FIGS. 5 and 6, for example, only the G signals among the three color signals of R, G, and B signals are shown as the pixel values. Pixels having the G signals among the pixels of the two photographed images P1 and P2 as the Bayer images are hereinafter referred to as G pixels as appropriate.
As shown in FIG. 4, the G pixels appear every other pixel in one line of the Bayer image. The positions of the pixels of the second photographed image P2 after the positioning deviate by one pixel with respect to the positions of the pixels of the first photographed image P1 after the positioning. Therefore, in the first and second figures from the top of FIG. 5, the G pixels are arranged every other pixel. Positions of the G pixels in the second figure from the top of FIG. 5 deviate by one pixel with respect to the G pixels in the first figure from the top of FIG. 5.
A third figure from the top of FIG. 5 shows pixel values of an output image Pout generated by interpolation performed by using the two photographed images P1 and P2 after the positioning and the interpolation function f(z) of Equation (1).
When a subject in the two photographed images P1 and P2 is not moving, the identical subject appears in the G pixels of the photographed image P1 after the positioning and the G pixels of the photographed image P2 after the positioning close to the G pixels. Thus, pixel values (G signals) of the G pixels of the photographed image P1 after the positioning and the G pixels of the photographed image P2 after the positioning take substantially the same values.
When attention is paid to only pixels on one line shown in FIG. 5 in the two photographed images P1 and P2, as described above, the constant “a” of the interpolation function f(z) of Equation (1) is 1. Thus, a photographed pixel used for interpolation of the G signal of the pixel of interest using such an interpolation function f(z) is a G pixel in a range of a horizontal direction distance of 1 or less from the position of the pixel of interest, i.e., one G pixel in a position nearest from the pixel of interest among the G pixels of the two photographed images P1 and P2 after the positioning.
As described above, the photographed pixel used for the interpolation of the G signal of the pixel of interest is only one G pixel in the position nearest from the pixel of interest among the G pixels of the two photographed images P1 and P2 after the positioning. However, in FIG. 5, the G signals of the G pixels of the photographed image P1 after the positioning and the G signals of the G pixels of the photographed image P2 after the positioning near pixels in a local small area “r” of the output image Pout take substantially the same values.
Therefore, interpolation for generating (obtaining) G signals of plural pixels in the local small area “r” of the output image Pout is performed, for any one of the plural pixels, using the G signals of substantially the same values among the G pixels of the photographed images P1 and P2 after the positioning. Thus, as shown in the third figure from the top of FIG. 5, all the G signals of the plural pixels in the local small area “r” of the output image Pout take substantially the same values.
On the other hand, FIG. 6 shows pixel values of pixels on one line of the two photographed images P1 and P2 after the positioning and pixel values of pixels of the output image generated by the interpolation performed by using the two photographed images P1 and P2 after the positioning and the interpolation function f(z) represented by Equation (1) in the case in which subjects in the two photographed images P1 and P2 are moving.
A first figure from the top of FIG. 6 shows pixel values of pixels of the first photographed image P1 of the two photographed images after the positioning. A second figure from the top of FIG. 6 shows pixel values of pixels of the second photographed image P2.
In FIG. 6, as in FIG. 5, in the first and second figures from the top of FIG. 6, G pixels are arranged every other pixel. Positions of the G pixels in the second figure from the top of FIG. 6 deviate by one pixel with respect to the G pixels in the first figure from the top of FIG. 6.
A third figure from the top of FIG. 6 shows pixel values of the output image Pout generated by interpolation performed by using the first photographed image P1 in the first figure from the top of FIG. 6, the second photographed image P2 in the second figure from the top of FIG. 6, and the interpolation function f(z) of Equation (1).
As described above, in FIG. 6, the subject in the two photographed images P1 and P2 is moving. Therefore, different subjects (e.g., a moving subject and a background seen because of the movement of the subject) appear in the G pixels of the photographed image P1 after the positioning and the G pixels of the photographed image P2 after the positioning close to the G pixels. As a result, the G signals of the G pixels of the photographed image P1 after the positioning and the G signals of the G pixels of the photographed image P2 after the positioning near the G pixels take values significantly different from each other.
As explained with reference to FIG. 5, the photographed pixel used for the interpolation of the G signal of the pixel of interest using the interpolation function f(z) with the constant “a” of 1 in Equation (1) is only the G pixel in the range of the horizontal direction distance of 1 or less from the position of the pixel of interest, i.e., one G pixel in the position nearest from the pixel of interest among the G pixels of the two photographed images P1 and P2 after the positioning.
As described above, the photographed pixel used for the interpolation of the G signal of the pixel of interest is only one G pixel in the position nearest from the pixel of interest among the G pixels of the two photographed images P1 and P2 after the positioning. In FIG. 6, the G signals of the G pixels of the photographed image P1 after the positioning and the G signals of the G pixels of the photographed image P2 after the positioning near the pixels in the local small area “r” of the output image Pout take values significantly different from each other.
Therefore, the interpolation for generating the G signals of the plural pixels in the local small area “r” of the output image Pout may be performed using the G signal of the G pixels of the photographed image P1 after the positioning or may be performed using the G signals of the G pixels of the photographed image P2 after the positioning having a value significantly different from that of the G pixels. Thus, as shown in the third figure from the top of FIG. 6, the G signals of the plural pixels in the local small area “r” of the output image Pout take a value significantly different depending on a value of the G signals used for the interpolation.
In this way, the pixel value of the plural pixels in the local small area “r” of the output image Pout becomes a value significantly different because the moving object (the moving subject) appears in the photographed images P1 and P2 used for the interpolation. Consequently, zipper noise and false colors occur.
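The difference between the cases of FIG. 5 and FIG. 6 can be reproduced with a short simulation of one line. The sketch below makes several assumptions for illustration only: the G pixels of the photographed image P1 lie at even positions and those of the photographed image P2 at odd positions after the positioning, the pixels of the output image lie at integer positions, the weighted sum is divided by the sum of the weights, and the G signal values (100, 101, and 20) are invented.

```python
def f(z, a=1.0):
    """Equation (1) written with |z|; 0 when a <= |z|."""
    return 1.0 - a * abs(z) if abs(z) < a else 0.0


def g_samples(g_p1, g_p2):
    """Pool the G pixels of one line: P1 at even positions, P2 at odd ones."""
    samples = [(2 * i, value) for i, value in enumerate(g_p1)]
    samples += [(2 * i + 1, value) for i, value in enumerate(g_p2)]
    return samples


def interpolate_line(samples, width, a=1.0):
    """Interpolate the G signal of each output pixel on the line."""
    line = []
    for x in range(width):
        weighted = [(f(pos - x, a), value) for pos, value in samples]
        weight_sum = sum(w for w, _ in weighted)
        value_sum = sum(w * value for w, value in weighted)
        line.append(value_sum / weight_sum if weight_sum else 0.0)
    return line


# Case of FIG. 5: the subject is not moving, so P1 and P2 nearly agree.
static_line = interpolate_line(g_samples([100] * 4, [101] * 4), 8)
# Case of FIG. 6: P1 sees the background, P2 sees the moving object.
moving_line = interpolate_line(g_samples([100] * 4, [20] * 4), 8)
print(static_line)
print(moving_line)
```

Because only the nearest G pixel has a nonzero weight when the constant “a” is 1, adjacent output pixels take their values alternately from P1 and P2. In the static case the line is nearly flat, whereas in the moving case it alternates between 100 and 20, which corresponds to the zipper noise described above.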
As shown in FIGS. 5 and 6, when the positions of the pixels of the first photographed image P1 after the positioning and the positions of the pixels of the second photographed image P2 after the positioning deviate from each other by one pixel in the horizontal direction and the vertical direction, respectively, zipper noise occurs most conspicuously.
As described above, when one output image is generated using the plural photographed images including the moving object, besides zipper noise and false colors, a lost area in which a background is lost may be generated in the output image.
For example, JP-A-2005-12660 discloses a method of generating an output image with a moving object removed using plural photographed images including the moving object.
FIG. 7 is a diagram for explaining the method of generating an output image with a moving object removed disclosed in JP-A-2005-12660.
Figures at the upper left and the upper right of FIG. 7 show a first photographed image P1 and a second photographed image P2 as the plural photographed images.
A moving object appears in both the first photographed image P1 and the second photographed image P2. However, positions where the moving object appears are different in the first photographed image P1 and the second photographed image P2.
A figure at the lower left of FIG. 7 shows an output image Pout generated using the photographed images P1 and P2.
The moving object is removed in the output image Pout.
In the method disclosed in JP-A-2005-12660, the moving object is detected from the photographed images P1 and P2 by some method. An area of the moving object is deleted from one of the photographed images P1 and P2. An output image including only a background with the moving object removed shown in the figure at the lower left of FIG. 7 is generated by filling the deleted area with the identical area of the other photographed image.
According to the method disclosed in JP-A-2005-12660, as shown in FIG. 7, when areas where the moving object appears do not overlap in the two photographed images P1 and P2, it is possible to obtain an output image without a lost area in which a background is lost.
However, as shown in FIG. 8, when areas in which the moving object appears overlap at least partially in the two photographed images P1 and P2, the overlapping area appears as a lost area in an output image.
FIG. 8 shows the two photographed images P1 and P2 as the plural photographed images and an output image generated using the two photographed images P1 and P2.
A figure at the upper left of FIG. 8 shows the first photographed image P1. A figure at the upper right of FIG. 8 shows the second photographed image P2. A figure at the lower left of FIG. 8 shows the output image Pout generated using the photographed images P1 and P2.
In FIG. 8, as in FIG. 7, the moving object appears in the first photographed image P1 and the second photographed image P2. However, a part of an area of the moving object in the first photographed image P1 and a part of an area of the moving object in the second photographed image P2 overlap each other.
In an overlapping area where the area of the moving object in the first photographed image P1 and the area of the moving object in the second photographed image P2 overlap each other, the background does not appear in either of the two photographed images P1 and P2.
Therefore, when the area of the moving object is deleted from one of the photographed images P1 and P2 and the deleted area is filled by the identical area of the other photographed image, an output image in which the moving object is displayed only in the overlapping area indicated by hatching in the figure at the lower left of FIG. 8, i.e., an output image in which the overlapping area is a lost area in which the background is lost, is obtained.
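The cases of FIG. 7 and FIG. 8 can be contrasted with a one-line sketch of the fill operation. This is not the implementation of JP-A-2005-12660: the function name `remove_moving_object` is hypothetical, the images are reduced to one line, the pixel values are invented, and the areas of the moving object are assumed to be given as masks because the detection method is not specified.

```python
BACKGROUND, OBJECT = 0, 9


def remove_moving_object(p1, mask1, p2, mask2):
    """Fill the moving-object area of p1 with the identical area of p2.

    Returns (output, lost), where lost marks pixels at which the moving
    object covers the background in both images (the lost area).
    """
    output = [v2 if m1 else v1 for v1, m1, v2 in zip(p1, mask1, p2)]
    lost = [m1 and m2 for m1, m2 in zip(mask1, mask2)]
    return output, lost


def object_mask(image):
    """Mask of the moving object, assumed here to be known exactly."""
    return [value == OBJECT for value in image]


p1 = [0, 9, 9, 0, 0, 0]
p2_apart = [0, 0, 0, 0, 9, 9]    # no overlap with the object in p1
p2_overlap = [0, 0, 9, 9, 0, 0]  # overlaps the object in p1 at one pixel

out_apart, lost_apart = remove_moving_object(
    p1, object_mask(p1), p2_apart, object_mask(p2_apart))
out_overlap, lost_overlap = remove_moving_object(
    p1, object_mask(p1), p2_overlap, object_mask(p2_overlap))
print(out_apart, lost_apart)
print(out_overlap, lost_overlap)
```

When the areas of the moving object do not overlap, the output contains only the background. When they overlap, the moving object remains in the output at the overlapping pixel, which is the lost area.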