(1) Field of the Invention
The present invention relates to a super-resolution processor that performs super-resolution processing on an input image to generate an output image, and a super-resolution processing method.
(2) Description of the Related Art
Recent years have seen improvements in the display resolution of display devices such as home television displays and PC (Personal Computer) displays. In particular, display devices capable of displaying full high-definition (1920×1080 pixels) or higher have emerged. Accordingly, when standard-definition (such as 720×480 pixels) video content of an existing DVD (Digital Versatile Disc) or the like is displayed in full screen on such a display device, it is necessary to perform resolution-increasing processing that raises the resolution of the image to the display resolution of the display device. The currently predominant technique for this is enlargement processing using a linear filter. Moreover, a method called super-resolution, which enables generation of high-resolution information not present in an input image, has been receiving attention in recent years.
As a conventional super-resolution processing method, there is training-based super-resolution disclosed in Non-patent Reference 1 (Freeman, W. T., Jones, T. R., and Pasztor, E. C., “Example-based super-resolution”, Computer Graphics and Applications, IEEE, March-April 2002). The method of Non-patent Reference 1 is described below.
(Structure and Operation 1 of a Super-Resolution Processor 900)
FIG. 21 is a block diagram of a super-resolution processor 900 in background art. The super-resolution processor 900 includes: an N enlargement unit 901 that generates an enlarged image 913 from an input image 911 of a low resolution; a high-pass filter unit 902 that generates a medium-frequency image 914 from the enlarged image 913; a patch extraction unit 903 that generates an estimated patch 917 from the medium-frequency image 914, a training medium-frequency patch 915, and a training high-frequency patch 916; an addition unit 904 that adds the estimated patch 917 to the enlarged image 913 to generate an output image 912; and a training database 905 that outputs the training medium-frequency patch 915 and the training high-frequency patch 916.
The N enlargement unit 901 enlarges the input image 911 N times in each of horizontal and vertical directions, where N is a factor for a desired resolution of super-resolution processing. The N enlargement unit 901 thus generates the enlarged image 913. For example, the N enlargement unit 901 enlarges the input image 911 using a pixel interpolation method such as bicubic interpolation or spline interpolation.
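The enlargement step can be illustrated with the following minimal sketch. It is not the implementation of the N enlargement unit 901: the function name `enlarge` is hypothetical, the image is assumed to be a grayscale list of rows, and bilinear interpolation is used only as a simpler stand-in for the bicubic or spline interpolation named above.

```python
def enlarge(image, n):
    """Enlarge a 2-D grayscale image (list of rows) n times per direction.

    Bilinear interpolation is an assumption made for brevity; it stands in
    for the bicubic or spline interpolation mentioned in the text.
    """
    h, w = len(image), len(image[0])
    out = []
    for y in range(h * n):
        # Map the centre of the output pixel back into input coordinates,
        # clamping so that border pixels are replicated.
        sy = min(max((y + 0.5) / n - 0.5, 0.0), h - 1.0)
        y0 = int(sy)
        y1 = min(y0 + 1, h - 1)
        fy = sy - y0
        row = []
        for x in range(w * n):
            sx = min(max((x + 0.5) / n - 0.5, 0.0), w - 1.0)
            x0 = int(sx)
            x1 = min(x0 + 1, w - 1)
            fx = sx - x0
            # Interpolate horizontally on the two rows, then vertically.
            top = image[y0][x0] * (1 - fx) + image[y0][x1] * fx
            bottom = image[y1][x0] * (1 - fx) + image[y1][x1] * fx
            row.append(top * (1 - fy) + bottom * fy)
        out.append(row)
    return out
```

For an input of 2×2 pixels and N = 2, `enlarge` returns a 4×4 image whose corner pixels equal the corner pixels of the input.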
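The high-pass filter unit 902 extracts a high-frequency component of the enlarged image 913 by linear filtering or the like, as the medium-frequency image 914. Because interpolation adds essentially no new detail, this component lies in a medium-frequency band relative to the resolution of the output image 912.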
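One simple linear high-pass filter subtracts a low-pass filtered copy of the image from the image itself. The sketch below assumes this construction with a 3×3 box blur and replicated borders; the text does not specify the filter actually used by the high-pass filter unit 902, and the names `box_blur` and `high_pass` are hypothetical.

```python
def box_blur(image, radius=1):
    """Average each pixel with its neighbours (borders are replicated)."""
    h, w = len(image), len(image[0])
    out = []
    for y in range(h):
        row = []
        for x in range(w):
            total, count = 0.0, 0
            for dy in range(-radius, radius + 1):
                for dx in range(-radius, radius + 1):
                    yy = min(max(y + dy, 0), h - 1)
                    xx = min(max(x + dx, 0), w - 1)
                    total += image[yy][xx]
                    count += 1
            row.append(total / count)
        out.append(row)
    return out


def high_pass(image, radius=1):
    """Return the image minus its blurred copy (assumed high-pass filter)."""
    low = box_blur(image, radius)
    return [[p - q for p, q in zip(row, low_row)]
            for row, low_row in zip(image, low)]
```

A flat image yields an all-zero result, while an isolated bright pixel yields a positive value at that pixel and negative values around it.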
The patch extraction unit 903 performs the following processing on the medium-frequency image 914, in units of fixed small blocks. The patch extraction unit 903 searches a large number of training medium-frequency patches 915 stored in the training database 905, for a training medium-frequency patch 915 most similar to a target image block in the medium-frequency image 914. A patch mentioned here is a block of data. The patch extraction unit 903 defines a distance between two patches by, for example, a sum of absolute differences or a sum of squared differences between pixels. The patch extraction unit 903 then determines similarity, according to how small the distance is. After the most similar training medium-frequency patch 915 is determined as a result of the search, the patch extraction unit 903 obtains a training high-frequency patch 916 paired with the determined training medium-frequency patch 915 in the training database 905, and outputs the obtained training high-frequency patch 916 as the estimated patch 917.
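The search described above amounts to a nearest-neighbour lookup over patch pairs. The following sketch assumes the training database is a list of (medium-frequency patch, high-frequency patch) tuples, each patch being a list of rows; this layout and the names `sad`, `ssd`, and `estimate_patch` are illustrative assumptions, not details taken from Non-patent Reference 1.

```python
def sad(a, b):
    """Sum of absolute differences between two equally sized patches."""
    return sum(abs(p - q) for ra, rb in zip(a, b) for p, q in zip(ra, rb))


def ssd(a, b):
    """Sum of squared differences between two equally sized patches."""
    return sum((p - q) ** 2 for ra, rb in zip(a, b) for p, q in zip(ra, rb))


def estimate_patch(target_block, database, distance=sad):
    """Return the high-frequency patch paired with the most similar
    medium-frequency patch, where a smaller distance means more similar."""
    _, best_high = min(database,
                       key=lambda pair: distance(target_block, pair[0]))
    return best_high
```

The exhaustive `min` over the whole database reflects the description literally; its cost grows linearly with the number of stored patch pairs.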
The addition unit 904 adds the estimated patch 917 to a patch at a target block position in the enlarged image 913 in units of pixels, and outputs an addition result as the output image 912.
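The pixel-wise addition can be sketched as follows, assuming the target block position is given by its top-left corner. The function name `add_patch` and the parameters `top` and `left` are hypothetical.

```python
def add_patch(enlarged, patch, top, left):
    """Add an estimated patch to the enlarged image, pixel by pixel, at the
    block whose top-left corner is (top, left). The input is not modified."""
    out = [row[:] for row in enlarged]
    for dy, patch_row in enumerate(patch):
        for dx, value in enumerate(patch_row):
            out[top + dy][left + dx] += value
    return out
```

Calling `add_patch` once per block position, with the patch estimated for that block, would produce the complete output image.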
The following describes a method of generating the training database 905 included in the super-resolution processor 900.
(Structure and Operation of a Training Database Generation Apparatus 950)
FIG. 22 is a block diagram of a training database generation apparatus 950 that generates the training database 905 in the background art. The training database generation apparatus 950 includes: a low-pass filter unit 951 that generates a training low-frequency image 962 from a training image 961 collected beforehand from, for example, actual images captured by a digital camera; a 1/N reduction unit 952 that generates a training low-resolution image 963 from the training low-frequency image 962; an N enlargement unit 953 that generates a training low-frequency image 964 from the training low-resolution image 963; a high-pass filter unit 954 that generates a training medium-frequency patch 915 from the training low-frequency image 964; a high-pass filter unit 955 that generates a training high-frequency patch 916 from the training image 961; and the training database 905 that stores the training medium-frequency patch 915 and the training high-frequency patch 916.
The low-pass filter unit 951 extracts a low-frequency component of the training image 961 by linear filtering or the like, as the training low-frequency image 962.
The 1/N reduction unit 952 reduces the training low-frequency image 962 to 1/N of its size in each of the horizontal and vertical directions, to generate the training low-resolution image 963.
The N enlargement unit 953 enlarges the training low-resolution image 963 by the factor N in each of the horizontal and vertical directions, to generate the training low-frequency image 964.
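The chain of low-pass filtering, 1/N reduction, and N enlargement can be sketched as below. The sketch rests on several assumptions that are not stated in the text: a 3×3 box blur as the low-pass filter, reduction by keeping every N-th pixel, enlargement by pixel replication in place of bicubic or spline interpolation, and image dimensions divisible by N. All function names are hypothetical.

```python
def low_pass(image):
    """3x3 box blur with replicated borders (assumed low-pass filter)."""
    h, w = len(image), len(image[0])

    def px(y, x):
        return image[min(max(y, 0), h - 1)][min(max(x, 0), w - 1)]

    return [[sum(px(y + dy, x + dx)
                 for dy in (-1, 0, 1) for dx in (-1, 0, 1)) / 9.0
             for x in range(w)] for y in range(h)]


def reduce_by(image, n):
    """Keep every n-th pixel in each direction (1/N reduction)."""
    return [row[::n] for row in image[::n]]


def replicate(image, n):
    """Repeat each pixel n times per direction (stand-in for interpolation)."""
    return [[v for v in row for _ in range(n)]
            for row in image for _ in range(n)]


def degrade(training_image, n):
    """Training image 961 -> image 962 -> image 963 -> image 964."""
    low_frequency = low_pass(training_image)
    low_resolution = reduce_by(low_frequency, n)
    return replicate(low_resolution, n)
```

The result has the same size as the training image but has lost its fine detail, which imitates how an enlarged low-resolution input differs from the original scene.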
The high-pass filter unit 954 extracts a high-frequency component of the training low-frequency image 964 by linear filtering or the like, and clips the extracted high-frequency component in units of the fixed blocks mentioned earlier, thereby generating a plurality of training medium-frequency patches 915.
The high-pass filter unit 955 extracts a high-frequency component of the training image 961 by linear filtering or the like, and clips the extracted high-frequency component in units of the fixed blocks mentioned earlier, thereby generating a plurality of training high-frequency patches 916.
The training database 905 associates a training medium-frequency patch 915 and a training high-frequency patch 916 generated from the same block position with each other, as one patch pair. The training database 905 stores data of both image patches and their correspondence relation.
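Clipping both filtered images at the same block positions and pairing the results can be sketched as follows. The sketch assumes non-overlapping square blocks and a database stored as a list of (medium-frequency patch, high-frequency patch) tuples; the block layout, the storage format, and the names `clip_patches` and `build_database` are illustrative assumptions.

```python
def clip_patches(image, size):
    """Cut an image into non-overlapping size x size blocks, keyed by the
    (top, left) position of each block."""
    h, w = len(image), len(image[0])
    patches = {}
    for top in range(0, h - size + 1, size):
        for left in range(0, w - size + 1, size):
            patches[(top, left)] = [row[left:left + size]
                                    for row in image[top:top + size]]
    return patches


def build_database(medium_image, high_image, size):
    """Pair the patches clipped from the same block position of the
    medium-frequency image and the high-frequency image."""
    medium = clip_patches(medium_image, size)
    high = clip_patches(high_image, size)
    return [(medium[pos], high[pos]) for pos in medium]
```

Both input images are assumed to have the same dimensions, so that every block position in one image has a counterpart in the other.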
(Operation 2 of the Super-Resolution Processor 900)
Thus, the training database 905 in the super-resolution processor 900 stores a large number of correspondence relations between actual medium-frequency images and high-frequency images, which are collected beforehand from, for example, actual images captured by a digital camera. This allows the super-resolution processor 900 to search for the high-frequency image patch that is most likely to correspond to a patch in the medium-frequency image 914. By adding the high-frequency image patch found as a result of the search to the enlarged image 913, a high-frequency component that is not present in the input image 911 can be added. Hence, the super-resolution processor 900 can generate a favorable output image 912.