1. Field
Apparatuses and methods consistent with the exemplary embodiments relate to an image processing apparatus and an image processing method which are capable of effectively generating a high resolution image from a low resolution image through learning, even in environments having various types of noise.
2. Description of the Related Art
Scaling, which enlarges or reduces the size of an image, is an important technique in the field of image display apparatuses. Recently, with the rapid increase in screen sizes and resolutions, scaling techniques have been developed to generate a high quality image, going beyond simple enlargement or reduction of an image.
The super-resolution (SR) technique is one of a variety of techniques for generating a high quality image. The SR technique is classified into multiple-frame SR, which extracts a single high resolution image from a plurality of low resolution images, and single-frame SR, which extracts a single high resolution image from a single low resolution image.
FIGS. 1A and 1B are diagrams illustrating the multiple-frame SR. In the case of the multiple-frame SR, a single high resolution image is generated through registration, etc. from a plurality of image frames of the same scene which are slightly different in phase from each other. More specifically, if a low resolution (LR) image is input, a plurality of pixels is extracted from each of a plurality of image frames of the same scene which are slightly different in phase from each other. For example, as shown in FIG. 1A, a plurality of pixels ∘, a plurality of pixels ⋄, a plurality of pixels Δ and a plurality of pixels ● are sampled, respectively. In this respect, the pixels ∘, the pixels ⋄, the pixels Δ and the pixels ● are extracted from different image frames, respectively.
Then, pixels for forming high resolution image frames are generated on the basis of the sampled pixels. For example, as shown in FIG. 1B, a plurality of pixels □ may be generated using the pixels ∘, ⋄, Δ and ●. High resolution (HR) image frames may be generated from the pixels □.
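The sampling and generation steps of FIGS. 1A and 1B can be illustrated with a minimal one-dimensional sketch. It assumes that the phase offset of each frame is already known (that is, registration has been done) and that the offsets exactly tile the high resolution grid, so the pixels □ are obtained by simple interleaving. The function names `sample_frame` and `fuse_frames` and the sample values are hypothetical and serve only as an illustration; practical multiple-frame SR must estimate the offsets and interpolate between them.

```python
def sample_frame(scene, phase, factor):
    """Simulate one low resolution frame: every `factor`-th sample from `phase`."""
    return scene[phase::factor]


def fuse_frames(frames, phases, factor):
    """Place the pixels of each frame on the high resolution grid.

    Assumes the phase of every frame is known and that the phases
    together cover every position of the grid exactly once.
    """
    length = sum(len(frame) for frame in frames)
    grid = [None] * length
    for frame, phase in zip(frames, phases):
        for i, value in enumerate(frame):
            grid[phase + i * factor] = value
    return grid


# Hypothetical 1-D "scene" and four frames shifted in phase from each other,
# playing the role of the four pixel sets sampled in FIG. 1A.
scene = [10, 12, 15, 19, 24, 30, 37, 45]
factor = 4
phases = [0, 1, 2, 3]
frames = [sample_frame(scene, p, factor) for p in phases]

restored = fuse_frames(frames, phases, factor)
print(frames)
print(restored)
```

Even in this idealized form, all four frames must be held in memory before any output pixel can be produced.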
This multiple-frame SR requires suitable motion estimation with respect to the plurality of image frames. Thus, the amount of computation is generally large, which makes real-time processing difficult. Further, a frame memory of considerable size is required for these operations, which makes practical realization difficult.
FIG. 2 is a diagram illustrating the single-frame SR. The single-frame SR is a learning-based technique which is used to overcome the problems of the multiple-frame SR. Referring to FIG. 2, in a learning process 210, pairs of blocks or patches having a predetermined size in consideration of image characteristics are generated using a variety of high resolution images and low resolution images corresponding to the high resolution images, and the generated pairs of blocks or patches are stored. In this respect, each pair of blocks or patches includes high resolution information and low resolution information.
For example, as shown in FIG. 2, in the learning process 210, the following operations are performed. First, low resolution images corresponding to a variety of high resolution images are extracted through low-pass filtering (LPF) and sub-sampling (212). Second, the low resolution images are scaled using predetermined interpolation such as cubic convolution (214). Third, low frequency components are removed from the original high resolution images and the scaled images using a Band-Pass Filter (BPF) or a High-Pass Filter (HPF). Then, examples of high frequency patches (HFP) having a predetermined size from which the low frequency components are removed and the corresponding scaled low frequency patches (LFP) are stored in a lookup table (LUT) (216).
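The learning process 210 can be sketched in one dimension as follows. This is only an illustration under several assumptions: a box filter stands in for the LPF, linear interpolation stands in for cubic convolution, the HPF is approximated by subtracting a 3-tap moving average, and patches do not overlap. All function names, the patch size and the training signals are hypothetical, and signal lengths are assumed to be multiples of both the scaling factor and the patch size.

```python
def downsample(signal, factor=2):
    """Box low-pass filter followed by sub-sampling (operation 212)."""
    return [sum(signal[i:i + factor]) / factor
            for i in range(0, len(signal) - factor + 1, factor)]


def upscale(signal, factor=2):
    """Scale back to the original length (operation 214).

    Linear interpolation is used here as a simple stand-in for
    cubic convolution.
    """
    out = []
    for i, value in enumerate(signal):
        nxt = signal[min(i + 1, len(signal) - 1)]
        for k in range(factor):
            out.append(value + (nxt - value) * k / factor)
    return out


def high_pass(signal):
    """Remove low frequency components by subtracting a 3-tap moving average."""
    n = len(signal)
    out = []
    for i in range(n):
        window = signal[max(i - 1, 0):min(i + 2, n)]
        out.append(signal[i] - sum(window) / len(window))
    return out


def build_lut(hr_signals, patch=4, factor=2):
    """Store (LFP, HFP) patch pairs in a lookup table (operation 216)."""
    lut = []
    for hr in hr_signals:
        scaled = upscale(downsample(hr, factor), factor)
        lfp_source = high_pass(scaled)  # filtered scaled image
        hfp_source = high_pass(hr)      # filtered original image
        for start in range(0, len(hr) - patch + 1, patch):
            lut.append((lfp_source[start:start + patch],
                        hfp_source[start:start + patch]))
    return lut


# Two hypothetical training signals: a step edge and a ramp.
training = [[0, 0, 0, 0, 8, 8, 8, 8], [0, 2, 4, 6, 8, 10, 12, 14]]
lut = build_lut(training)
for lfp, hfp in lut:
    print(lfp, hfp)
```

Each LUT entry pairs what the scaled image looks like (the LFP) with the detail that the original image contains at the same position (the HFP).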
In a synthesis or inference process (220), if an arbitrary low resolution image is input, the stored pairs are searched for a low resolution block matching each block of the input image, and the corresponding high resolution information is obtained. For example, as shown in FIG. 2, in the synthesis or inference process (220), the following operations are performed. First, a low resolution image is input (222). Second, the input low resolution image is scaled, and every LFP of the scaled image is compared with the LFPs in the LUT. Then, the HFP corresponding to the optimal LFP selected in the LUT is used as the high frequency component of the input patch (224). Third, the extracted high resolution image is output (226). In this respect, the matching may be performed so that each patch slightly overlaps a high frequency component in a causal region which has been previously obtained in the optimal matching (searching) process, to thereby provide smoothness with respect to surrounding regions.
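The matching of operation 224 can be sketched as an exhaustive nearest-neighbor search. The sketch below assumes a sum of squared differences as the matching criterion, which the description above does not specify, and it omits the overlap with the causal region. The LUT contents, the input values and the function names are hypothetical.

```python
def squared_distance(a, b):
    """Sum of squared differences between two patches (assumed criterion)."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def best_hfp(lfp, lut):
    """Compare one input LFP with every LFP in the LUT and return the
    HFP paired with the closest one."""
    _, hfp = min(lut, key=lambda pair: squared_distance(lfp, pair[0]))
    return hfp


def synthesize(scaled, lfp_signal, lut, patch=2):
    """Add the matched HFP to each patch of the scaled image (operation 224)."""
    out = []
    for start in range(0, len(scaled), patch):
        hfp = best_hfp(lfp_signal[start:start + patch], lut)
        out.extend(s + h for s, h in zip(scaled[start:start + patch], hfp))
    return out


# Hypothetical LUT of (LFP, HFP) pairs produced by a learning process.
lut = [
    ([0.0, 0.0], [0.0, 0.0]),    # flat region: no detail to add
    ([-1.0, 1.0], [-2.0, 2.0]),  # rising edge: sharpen upward
    ([1.0, -1.0], [2.0, -2.0]),  # falling edge: sharpen downward
]

scaled = [4, 4, 3, 5]                # scaled low resolution input
lfp_signal = [0.1, -0.1, -0.9, 1.2]  # its filtered version, used for matching

result = synthesize(scaled, lfp_signal, lut)
comparisons = (len(scaled) // 2) * len(lut)
print(result, comparisons)
```

In this sketch the number of patch comparisons equals the number of input patches multiplied by the number of LUT entries.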
The single-frame SR requires relatively less computation than the multiple-frame SR. However, even in the case of the single-frame SR, since every LFP must be compared with all the LFPs in the LUT, the amount of computation is still significantly large in real applications. Further, there is a problem in that scaling efficiency deteriorates in the presence of noise. Thus, a technique is needed which can effectively reduce the amount of computation while removing the influence of noise.
FIG. 3 illustrates a process of scaling an image mixed with noise. In order to scale the noise-mixed image, a cascade technique is typically used which first removes the noise and then interpolates the image. That is, referring to FIG. 3, if a low resolution image mixed with noise is input, the noise is first removed through a noise removing process (310), and new pixels are subsequently generated through an image interpolating process (320). In this way, a high resolution image may be output with the noise removed.
However, according to the cascade technique, noise may remain after the noise removing process, which may degrade the scaling result. Further, in the case of the learning-based single-frame SR, if a blurred image is input after noise removal, the SR process may be affected according to the degree of the blur.
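The cascade of FIG. 3 and the residual-noise problem can be illustrated with a minimal one-dimensional sketch. The specific filters are assumptions chosen for brevity and are not specified above: a 3-tap median filter stands in for the noise removing process (310), and midpoint insertion stands in for the image interpolating process (320). The function names and sample values are hypothetical.

```python
from statistics import median


def remove_noise(signal):
    """3-tap median filter (assumed stand-in for process 310).

    The first and last samples are left unchanged.
    """
    out = list(signal)
    for i in range(1, len(signal) - 1):
        out[i] = median(signal[i - 1:i + 2])
    return out


def interpolate(signal):
    """Insert the midpoint between neighboring samples
    (assumed stand-in for process 320)."""
    out = []
    for a, b in zip(signal, signal[1:]):
        out.extend([a, (a + b) / 2])
    out.append(signal[-1])
    return out


def cascade(signal):
    """Noise removal followed by interpolation, as in FIG. 3."""
    return interpolate(remove_noise(signal))


# A single noisy sample is removed before interpolation.
isolated = cascade([10, 10, 90, 10, 10])

# Two adjacent noisy samples survive the 3-tap median filter, and the
# interpolation then generates new pixels from the remaining noise.
clustered = cascade([10, 90, 90, 10, 10])

print(isolated)
print(clustered)
```

In the second case the noise that passes through the first stage is enlarged along with the image, which is the deterioration described above.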