1. Field of the Invention
The present invention relates to an image processing apparatus, an image processing method, and a computer program and, in particular, to an image processing apparatus, an image processing method, and a computer program for performing super-resolution processing for increasing the resolution of an image.
2. Description of the Related Art
Super-resolution processing has been used as a technique for generating a high-resolution image from a low-resolution image. In super-resolution processing, pixel values of pixels of one frame of a high-resolution image are obtained from multiple low-resolution images of the same object.
By using super-resolution processing, after images are captured with an image sensor, such as a charge coupled device (CCD) sensor or a complementary metal oxide semiconductor (CMOS) sensor, an image having a resolution higher than that of the image sensor can be reconstructed from images captured by the image sensor. More specifically, for example, super-resolution processing is used for generating a high-resolution satellite picture. Note that super-resolution processing is described in, for example, “Improving Resolution by Image Registration”, Michal IRANI and Shmuel PELEG, Department of Computer Science, The Hebrew University of Jerusalem, 91904 Jerusalem, Israel, Communicated by Rama Chellapa, Received Jun. 16, 1989; accepted May 25, 1990.
The principle of super-resolution is described below with reference to FIGS. 1A, 1B, and 2. Reference symbols a, b, c, d, e, and f shown in the upper sections of FIGS. 1A and 1B represent pixel values of a high-resolution image (a super-resolution (SR) image) to be generated from a low-resolution image (a low resolution (LR) image) obtained by capturing an object. That is, these reference symbols represent the pixel values of pixels generated when the object is pixelated with the same resolution as that of the SR image.
For example, when the width of one pixel of an image sensor is the same as the width of two pixels of the object, an image of the object cannot be captured with the original resolution. In such a case, as shown in FIG. 1A, the pixel at the left among three pixels of the image sensor acquires a pixel value A obtained by mixing the pixel values a and b. The pixel at the middle acquires a pixel value B obtained by mixing the pixel values c and d. In addition, the pixel at the right acquires a pixel value C obtained by mixing the pixel values e and f. Here, the pixel values A, B, and C represent the pixel values of pixels of a captured LR image.
As shown in FIG. 1B, if, due to, for example, camera shake, an image of the object is captured at a position shifted by 0.5 pixels from the position of the object shown in FIG. 1A, the pixel at the left among three pixels of the image sensor acquires a pixel value D obtained by mixing half the pixel value a, the whole pixel value b, and half the pixel value c. The pixel at the middle acquires a pixel value E obtained by mixing half the pixel value c, the whole pixel value d, and half the pixel value e. In addition, the pixel at the right acquires a pixel value F obtained by mixing half the pixel value e and the whole pixel value f. Here, the pixel values D, E, and F also represent the pixel values of pixels of a captured LR image.
As a result of the captured LR images, the following expression (1) can be obtained:
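\[
\begin{pmatrix}
1 & 1 & 0 & 0 & 0 & 0 \\
0 & 0 & 1 & 1 & 0 & 0 \\
0 & 0 & 0 & 0 & 1 & 1 \\
1/2 & 1 & 1/2 & 0 & 0 & 0 \\
0 & 0 & 1/2 & 1 & 1/2 & 0 \\
0 & 0 & 0 & 0 & 1/2 & 1
\end{pmatrix}
\begin{pmatrix} a \\ b \\ c \\ d \\ e \\ f \end{pmatrix}
=
\begin{pmatrix} A \\ B \\ C \\ D \\ E \\ F \end{pmatrix}
\qquad (1)
\]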
By computing a, b, c, d, e, and f using expression (1), an image having a resolution higher than that of the image sensor can be obtained.
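As a minimal sketch of this computation, the following Python snippet builds the 6x6 mixing matrix of expression (1) and solves it for the SR pixel values. The function name recover_sr_pixels and the sample values 1 to 6 are hypothetical and chosen only for illustration. The matrix of expression (1) happens to be invertible, so a direct solve is sufficient here; a practical system with noise would more likely call for a least-squares or iterative method.

```python
import numpy as np

# Mixing matrix of expression (1): rows 1-3 model the capture of FIG. 1A,
# rows 4-6 model the capture of FIG. 1B (shifted by 0.5 sensor pixels).
M = np.array([
    [1.0, 1.0, 0.0, 0.0, 0.0, 0.0],
    [0.0, 0.0, 1.0, 1.0, 0.0, 0.0],
    [0.0, 0.0, 0.0, 0.0, 1.0, 1.0],
    [0.5, 1.0, 0.5, 0.0, 0.0, 0.0],
    [0.0, 0.0, 0.5, 1.0, 0.5, 0.0],
    [0.0, 0.0, 0.0, 0.0, 0.5, 1.0],
])

def recover_sr_pixels(lr_values):
    """Solve expression (1) for (a, b, c, d, e, f) given (A, B, C, D, E, F)."""
    return np.linalg.solve(M, np.asarray(lr_values, dtype=float))

# Hypothetical SR pixel values a..f, used only to synthesize LR observations.
sr_true = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
lr_observed = M @ sr_true            # A, B, C, D, E, F
sr_recovered = recover_sr_pixels(lr_observed)
```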
A method called “Back Projection” is one of the super-resolution processing techniques of the related art. Processing using a back projection method is described in detail next with reference to FIG. 2. FIG. 2 is a block diagram of an image processing apparatus 1. For example, the image processing apparatus 1 is mounted in a digital camera. The image processing apparatus 1 processes a captured still image.
As shown in FIG. 2, the image processing apparatus 1 includes super-resolution processing units 11a to 11c, a summation processing unit 12, an addition processing unit 13, and an SR image buffer 14. For example, low-resolution (LR) images LR0, LR1, and LR2 are obtained through an image capturing operation. The image LR0 is input to the super-resolution processing unit 11a. The image LR1 is input to the super-resolution processing unit 11b. The image LR2 is input to the super-resolution processing unit 11c. The LR images LR0 to LR2 are successively captured images and have overlapping areas. In general, when images are successively captured, the areas of an object captured in the images are slightly shifted from one another due to, for example, camera shake. Thus, the images are not identical, but they partially overlap one another.
The super-resolution processing unit 11a generates a difference image representing the difference between the low-resolution image LR0 and a high-resolution SR image stored in the SR image buffer 14 and outputs a feedback value to the summation processing unit 12. The feedback value indicates a difference image having a resolution that is the same as that of the SR image.
Note that the SR image buffer 14 stores an SR image that is generated through the immediately previous super-resolution processing. At a time when processing is just started, no frames have been generated. In such a case, an image having a resolution that is the same as that of an SR image is obtained by upsampling the low-resolution image LR0. The obtained image is stored in the SR image buffer 14.
Similarly, the super-resolution processing unit 11b generates a difference image representing the difference between the low-resolution image LR1 of the next frame and a high-resolution SR image stored in the SR image buffer 14 and outputs a feedback value indicating the generated difference image to the summation processing unit 12.
Similarly, the super-resolution processing unit 11c generates a difference image representing the difference between the low-resolution image LR2 of a frame after the next frame and a high-resolution SR image stored in the SR image buffer 14 and outputs a feedback value indicating the generated difference image to the summation processing unit 12.
The summation processing unit 12 averages the feedback values supplied from the super-resolution processing units 11a to 11c so as to generate an image having a resolution the same as that of the SR image. The summation processing unit 12 then outputs the generated image to the addition processing unit 13. The addition processing unit 13 sums the SR image stored in the SR image buffer 14 and the image supplied from the summation processing unit 12 so as to generate and output a new SR image. The output of the addition processing unit 13 is supplied outside the image processing apparatus 1 as a result of the super-resolution processing. In addition, the output of the addition processing unit 13 is supplied to the SR image buffer 14 and is stored in the SR image buffer 14.
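The buffer initialization and the update performed by the summation processing unit 12 and the addition processing unit 13 can be sketched as follows. This is an illustrative outline only: the function names initial_sr and update_sr, the scale factor of 2, and the use of nearest-neighbor upsampling are assumptions that do not appear in the description above.

```python
import numpy as np

def initial_sr(lr0, scale=2):
    """Initial content of the SR image buffer: LR0 upsampled to SR resolution.

    Nearest-neighbor replication is an assumption; the text only states
    that LR0 is upsampled.
    """
    lr0 = np.asarray(lr0, dtype=float)
    return np.repeat(np.repeat(lr0, scale, axis=0), scale, axis=1)

def update_sr(sr, feedbacks):
    """One update of the SR image.

    The feedback values (difference images at SR resolution) are averaged,
    as in the summation processing unit 12, and the average is added to
    the buffered SR image, as in the addition processing unit 13.
    """
    averaged = np.mean(np.stack(feedbacks), axis=0)
    return sr + averaged
```

In use, the array returned by update_sr would be both output as the result and written back to the buffer for the next iteration.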
FIG. 3 is a block diagram of an exemplary configuration of a super-resolution processing unit 11 (one of the super-resolution processing units 11a to 11c). As shown in FIG. 3, the super-resolution processing unit 11 includes a motion vector detecting unit 21, a motion compensation processing unit 22, a downsampling processing unit 23, an addition processing unit 24, an upsampling processing unit 25, and an inverse motion compensation processing unit 26.
The high-resolution SR image is read out from the SR image buffer 14 and is input to the motion vector detecting unit 21 and the motion compensation processing unit 22. The captured low-resolution image LRn is input to the motion vector detecting unit 21 and the addition processing unit 24.
The motion vector detecting unit 21 detects a motion vector with respect to the SR image using the input high-resolution SR image and low-resolution image LRn. The motion vector detecting unit 21 then outputs the detected motion vector to the motion compensation processing unit 22 and the inverse motion compensation processing unit 26. For example, the motion vector detecting unit 21 performs block matching between an SR image generated on the basis of a previously captured image and the input image LRn so as to generate a vector indicating the destination of movement of each block of the SR image in the input image LRn.
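Block matching of this kind can be illustrated with an exhaustive search that minimizes the sum of absolute differences (SAD). The sketch below is a simplified illustration and not the method of the motion vector detecting unit 21 itself: it assumes that both images have already been brought to the same resolution, and the function name match_block, the block size, the search range, and the SAD criterion are all hypothetical choices.

```python
import numpy as np

def match_block(ref, target, top, left, block=4, search=3):
    """Return the (dy, dx) displacement of one block of `ref` inside `target`.

    Every displacement within +/- `search` pixels is tried, and the one with
    the smallest sum of absolute differences (SAD) is returned.
    """
    patch = ref[top:top + block, left:left + block]
    best, best_sad = (0, 0), np.inf
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            y, x = top + dy, left + dx
            # Skip candidate positions that fall outside the target image.
            if y < 0 or x < 0 or y + block > target.shape[0] or x + block > target.shape[1]:
                continue
            sad = np.abs(target[y:y + block, x:x + block] - patch).sum()
            if sad < best_sad:
                best, best_sad = (dy, dx), sad
    return best
```

Repeating the search for each block of the reference image yields one vector per block.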
The motion compensation processing unit 22 motion-compensates the high-resolution SR image on the basis of the motion vector supplied from the motion vector detecting unit 21 so as to generate a motion compensation (MC) image. The motion compensation processing unit 22 then outputs the generated motion compensation image (MC image) to the downsampling processing unit 23. As used herein, the term “motion compensation processing” refers to processing in which the positions of pixels of an SR image are moved in accordance with a motion vector, and an SR image having compensated positions of pixels corresponding to those of a newly input image LRn is generated. That is, by moving the positions of pixels of an SR image, the position of an object captured in the SR image is made coincident with the position of the object captured in the image LRn. In this way, a motion compensation image (MC image) can be generated.
The downsampling processing unit 23 downsamples the image supplied from the motion compensation processing unit 22 so as to generate an image having a resolution the same as that of the image LRn. The downsampling processing unit 23 then outputs the generated image to the addition processing unit 24. The operation for obtaining a motion vector from an SR image and an image LRn and making the resolution of an image motion-compensated using the obtained motion vector equal to the resolution of the LR image corresponds to a simulation of a captured image performed on the basis of the SR image stored in the SR image buffer 14.
The addition processing unit 24 generates a difference image indicating the difference between the image LRn and the image simulated in such a manner. The addition processing unit 24 then outputs the generated difference image to the upsampling processing unit 25.
The upsampling processing unit 25 upsamples the difference image supplied from the addition processing unit 24 so as to generate an image having a resolution the same as that of the SR image. The upsampling processing unit 25 then outputs the generated image to the inverse motion compensation processing unit 26. The inverse motion compensation processing unit 26 inverse motion-compensates the image supplied from the upsampling processing unit 25 using the motion vector supplied from the motion vector detecting unit 21. The inverse motion compensation processing unit 26 then outputs a feedback value representing the image obtained through the inverse motion compensation to the summation processing unit 12 shown in FIG. 2. The position of the object captured in the image obtained through the inverse motion compensation is close to the position of the object captured in the SR image stored in the SR image buffer 14.
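The chain of FIG. 3 (motion compensation, downsampling, difference, upsampling, and inverse motion compensation) can be sketched as follows. This is a simplified illustration under several assumptions not stated in the text: the motion is a single global integer shift (dy, dx) supplied by the caller rather than per-block vectors, the shift wraps around at the image borders, downsampling is block averaging, upsampling is nearest-neighbor replication, and the names downsample, upsample, and feedback_value are hypothetical.

```python
import numpy as np

def downsample(img, scale=2):
    """Average each scale x scale block (image size must be divisible by scale)."""
    h, w = img.shape
    return img.reshape(h // scale, scale, w // scale, scale).mean(axis=(1, 3))

def upsample(img, scale=2):
    """Nearest-neighbor upsampling by pixel replication."""
    return np.repeat(np.repeat(img, scale, axis=0), scale, axis=1)

def feedback_value(sr, lr, motion, scale=2):
    """Feedback value of one super-resolution processing unit.

    `motion` is an assumed global integer shift (dy, dx) in SR pixels.
    """
    dy, dx = motion
    mc = np.roll(sr, shift=(dy, dx), axis=(0, 1))        # motion compensation (unit 22)
    simulated = downsample(mc, scale)                    # simulated LR capture (unit 23)
    diff = np.asarray(lr, dtype=float) - simulated       # difference image (unit 24)
    up = upsample(diff, scale)                           # back to SR resolution (unit 25)
    return np.roll(up, shift=(-dy, -dx), axis=(0, 1))    # inverse motion compensation (unit 26)
```

When the buffered SR image already explains the image LRn exactly, the simulated image equals LRn and the feedback value is zero, so the SR image is left unchanged by that unit.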
FIG. 4 illustrates an exemplary configuration of an image processing apparatus 30 that performs such super-resolution processing. In an image quality adjusting unit 32, an image captured by an image capturing unit 31, such as a CCD or CMOS sensor, is adjusted through contrast control and aperture control (edge enhancement). Thereafter, in an image compression unit 33, the image is compressed using a predetermined compression algorithm, such as MPEG compression. Subsequently, the image is recorded on a recording medium 34, such as a digital versatile disc (DVD), a magnetic tape, or a flash memory.
Super-resolution processing is performed when the image stored on the recording medium 34 is decoded and played back. In an image decoding unit 35, decoding processing is performed on the image recorded on the recording medium 34. Thereafter, in a super-resolution processing unit 36, the super-resolution processing described with reference to FIGS. 1 to 3 is performed on the decoded image. Thus, a high-resolution image is generated, and the generated high-resolution image is displayed on a display unit 37.
An output image obtained through this super-resolution processing is not limited to a moving image. For example, the output image may be a still image. In the case of a moving image, a plurality of frame images are used. In contrast, in the case of a still image, continuously captured still images are used. In continuously captured still images, the areas of the captured images may be slightly and gradually shifted relative to one another due to, for example, camera shake. By applying the super-resolution processing described with reference to FIGS. 1 to 3 to continuously captured still images, a high-resolution image can be generated.
While the example shown in FIG. 4 has been described with reference to processing performed by, for example, a video camera or a still camera, the processing can be applied to broadcast image data, such as data used for digital broadcast. By applying the super-resolution processing to a received image in a receiver, a high-resolution image can be generated and output. For example, in an example configuration shown in FIG. 5, a data transmission apparatus 40 transmits a low-resolution image. A data receiving apparatus 50 receives data transmitted from the data transmission apparatus 40 and performs super-resolution processing on the received data so as to generate and display a high-resolution image.
An image quality adjusting unit 42 of the data transmission apparatus 40 controls the quality of an image captured by an image capturing unit 41, such as a CCD or CMOS device, through contrast control and aperture control (edge enhancement). Thereafter, an image compression unit 43 encodes the image using a predetermined compression algorithm, such as MPEG compression. Subsequently, a transmission unit 44 transmits the encoded image.
The data transmitted from the transmission unit 44 is received by a receiving unit 51 of the data receiving apparatus 50. An image decoding unit 52 decodes the received data. Subsequently, a super-resolution processing unit 53 performs the super-resolution processing described with reference to FIGS. 1 to 3 so as to generate a high-resolution image. The high-resolution image is then displayed on a display unit 54.
As noted above, super-resolution processing can be applied to an image captured by a camera or to communication image data. However, as shown in FIGS. 4 and 5, image data subjected to the super-resolution processing has already been subjected to image quality control, such as contrast adjustment and aperture control (edge enhancement). That is, the image quality is controlled so that subjective image quality is improved and, subsequently, the image is compressed. Accordingly, it is highly likely that the image includes block noise and ringing noise due to the compression.
As described with reference to FIGS. 1 to 3, super-resolution processing reconstructs a high-resolution image using the correlation among a plurality of images. If super-resolution processing is performed on image data subjected to the above-described processing, such as image quality adjustment and image compression, an image having an excessively emphasized high-frequency range or amplified compression noise may be generated, and therefore, the image quality may deteriorate.