1. Field of the Invention
The present invention relates to an image processing device and method, a learning device and method, a program, and a recording medium. More specifically, the present invention relates to an image processing device and method, a learning device and method, a program, and a recording medium which make it possible to generate an image with high perceived resolution while reducing perceived noise.
2. Description of the Related Art
For transmission, accumulation, or the like of image data, a compression/encoding process based on JPEG, MPEG, H.264/AVC, or the like is frequently employed. For example, in the discrete cosine transform (DCT) used in MPEG or JPEG, the screen is compressed, with an 8×8 pixel area as the minimum unit, by removing components that represent relatively little variation within the area. Thus, if the bit rate is insufficient, the original image is often not fully reconstructed upon decoding, and block boundaries become clearly visible, causing block noise.
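The following is a minimal sketch, not taken from any of the cited standards, of how coarse quantization of 8×8 DCT coefficients can produce block noise. The function names (dct_2d, idct_2d, quantize, code_block) and the single uniform quantization step are assumptions made for illustration; actual JPEG and MPEG encoders use per-coefficient quantization tables and entropy coding.

```python
import math

N = 8  # block size used by JPEG/MPEG-style DCT coding

# Cosine table: COS[x][u] = cos((2x + 1) * u * pi / (2N))
COS = [[math.cos((2 * x + 1) * u * math.pi / (2 * N)) for u in range(N)]
       for x in range(N)]


def scale(k):
    """Normalisation factor of the orthonormal DCT-II."""
    return math.sqrt(1.0 / N) if k == 0 else math.sqrt(2.0 / N)


def dct_2d(block):
    """Forward 2-D DCT of an N x N block given as a list of rows."""
    out = [[0.0] * N for _ in range(N)]
    for u in range(N):
        for v in range(N):
            s = 0.0
            for x in range(N):
                for y in range(N):
                    s += block[x][y] * COS[x][u] * COS[y][v]
            out[u][v] = scale(u) * scale(v) * s
    return out


def idct_2d(coeffs):
    """Inverse 2-D DCT, returning an N x N block of pixel values."""
    out = [[0.0] * N for _ in range(N)]
    for x in range(N):
        for y in range(N):
            s = 0.0
            for u in range(N):
                for v in range(N):
                    s += (scale(u) * scale(v) * coeffs[u][v]
                          * COS[x][u] * COS[y][v])
            out[x][y] = s
    return out


def quantize(coeffs, step):
    """Uniform quantisation (an assumption): small coefficients become 0."""
    return [[step * round(c / step) for c in row] for row in coeffs]


def code_block(block, step):
    """Encode and decode one block: DCT, quantise, inverse DCT."""
    return idct_2d(quantize(dct_2d(block), step))


# Two horizontally adjacent blocks carrying one continuous gentle ramp.
left = [[100 + y for y in range(N)] for _ in range(N)]
right = [[108 + y for y in range(N)] for _ in range(N)]
rec_left = code_block(left, 64)
rec_right = code_block(right, 64)
```

With the coarse step of 64 chosen here, every coefficient except the DC term of each block is quantized to zero, so each block is reconstructed as a flat area (104 and 112, respectively). The step of one level between neighboring pixels at the block boundary in the original thus becomes a step of eight levels, which is perceived as block noise.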
In the related art, as a scheme for reducing block noise produced in a digital image that has been encoded and then decoded, for example, a method of applying filtering between blocks (or across the entire screen), such as the H.264/AVC deblocking filter, is adopted.
For example, in H.264/AVC deblocking filter processing, whether or not block noise is easily visible is determined from the relationship between a block boundary and pixels in the vicinity of the boundary. Filtering is applied to the vicinity of the block boundary on the basis of the determination result, thereby smoothing the pixel values of the image so that the noise does not become conspicuous.
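The determination and smoothing described above can be sketched as follows for a single row of pixels crossing a block boundary. This is a simplified illustration and not the H.264/AVC filter itself: the comparison of pixel differences against two thresholds follows the general form of the test used in that standard, but the threshold values alpha and beta, the function names, and the smoothing rule (moving the two boundary pixels a quarter of the step toward each other) are assumptions chosen for brevity. The actual filter derives its thresholds from the quantization parameter and also takes a boundary strength into account.

```python
def should_filter(p1, p0, q0, q1, alpha, beta):
    """Decide whether the step across a block boundary looks like block noise.

    p1, p0 are the two pixels left of the boundary; q0, q1 the two right of it.
    A small step between otherwise flat neighbours is treated as an artifact,
    while a large step is assumed to be a real edge and is left alone.
    """
    return (abs(p0 - q0) < alpha
            and abs(p1 - p0) < beta
            and abs(q1 - q0) < beta)


def deblock_row(row, boundary, alpha=20, beta=6):
    """Smooth one row across the boundary; `boundary` is the index of q0.

    alpha and beta are hypothetical fixed thresholds.
    """
    p1, p0 = row[boundary - 2], row[boundary - 1]
    q0, q1 = row[boundary], row[boundary + 1]
    out = list(row)
    if should_filter(p1, p0, q0, q1, alpha, beta):
        # Hypothetical smoothing rule: pull p0 and q0 toward each other.
        delta = (q0 - p0) / 4.0
        out[boundary - 1] = p0 + delta
        out[boundary] = q0 - delta
    return out


# A small step between two flat blocks is smoothed ...
noisy = deblock_row([104] * 8 + [112] * 8, boundary=8)
# ... whereas a strong edge between blocks is preserved.
edge = deblock_row([50] * 8 + [200] * 8, boundary=8)
```

In this sketch the step of 8 between the two flat blocks is reduced to a step of 4 at the boundary, while the row containing the strong edge is returned unchanged.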
The H.264/AVC deblocking filter processing is disclosed in detail in Peter List, Anthony Joch, Jani Lainema, Gisle Bjontegaard, and Marta Karczewicz, "Adaptive Deblocking Filter," IEEE Transactions on Circuits and Systems for Video Technology, Vol. 13, No. 7, July 2003, and the like.
A scheme also exists in which, in deblocking filter processing, the filter strength is obtained with reference to additional information such as full screen motion, bit rate, or picture size.
For example, in an encoding process based on the MPEG2 scheme, in order to impart general versatility to the encoding process and enhance the efficiency of compression by encoding, additional information for a decoding process is transmitted together with encoded image data. The additional information is inserted into the header in an MPEG2 stream and transmitted to a decoding device.
The characteristics of an image signal obtained by decoding vary greatly depending on the encoding and decoding scheme applied. For example, the physical characteristics of a signal (frequency characteristic and the like) differ greatly depending on the signal kind, such as a luminance signal, a color difference signal, or a three-color signal, and this difference remains in a decoded signal that has undergone an encoding and decoding process. Generally, in an image encoding and decoding process, the number of pixels to be encoded is frequently reduced by introducing a spatio-temporal thinning process, and the characteristics of the spatio-temporal resolution of an image differ greatly depending on the thinning method. Further, even when the difference in spatio-temporal resolution characteristics is small, image quality characteristics such as S/N and encoding distortion vary greatly depending on the compression ratio (transmission rate) condition at the time of encoding.
The applicant has previously proposed classification adaptive processing. According to this processing, in a learning process, prediction coefficients are obtained for each class by using actual image signals (teacher and student signals) and accumulated in advance. In the actual image transformation process, a class is obtained from an input image signal, and output pixel values are obtained by prediction computation using the prediction coefficients corresponding to the class and a plurality of pixel values of the input image signal. The class is determined in accordance with the distribution or waveform of pixel values that are spatially or temporally adjacent to a pixel to be created. Because the prediction coefficients are computed for each class from actual image signals, various kinds of signal processing can be performed. For example, it is possible to perform processing such as a resolution creation process in which the spatio-temporal resolution is set equal to or higher than that of an input signal, interpolation of pixels thinned out by sub-sampling, noise reduction, and error correction.
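The learning process and the prediction computation of classification adaptive processing can be sketched as follows for a one-dimensional signal. This is an illustrative sketch under stated assumptions and not the applicant's implementation: the use of three adjacent pixels as taps, the one-bit-per-tap class code (in the manner of 1-bit ADRC), the least-squares fit with a small ridge term, the fallback to the center tap for a class not seen during learning, and all function names are assumptions introduced here.

```python
def class_code(taps):
    """Classify a tap waveform: one bit per tap, set if at or above mid-range."""
    mid = (min(taps) + max(taps)) / 2.0
    code = 0
    for t in taps:
        code = (code << 1) | (1 if t >= mid else 0)
    return code


def taps_at(signal, i):
    """Three spatially adjacent pixels centred on position i (an assumption)."""
    return [signal[i - 1], signal[i], signal[i + 1]]


def solve(a, b):
    """Solve a * x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[piv] = m[piv], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = m[r][n] - sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = s / m[r][r]
    return x


def learn(student, teacher, ridge=1e-6):
    """Learning process: least-squares prediction coefficients per class."""
    n = 3
    sums = {}  # class code -> (sum of x x^T, sum of x * teacher pixel)
    for i in range(1, len(student) - 1):
        x = taps_at(student, i)
        a, b = sums.setdefault(
            class_code(x), ([[0.0] * n for _ in range(n)], [0.0] * n))
        for r in range(n):
            b[r] += x[r] * teacher[i]
            for c in range(n):
                a[r][c] += x[r] * x[c]
    coeffs = {}
    for code, (a, b) in sums.items():
        for r in range(n):
            a[r][r] += ridge  # keeps the system solvable for sparse classes
        coeffs[code] = solve(a, b)
    return coeffs


def predict(signal, i, coeffs):
    """Prediction computation: weighted sum of taps with the class's coefficients."""
    x = taps_at(signal, i)
    w = coeffs.get(class_code(x))
    if w is None:
        return float(signal[i])  # class not seen in learning: keep centre tap
    return sum(wk * xk for wk, xk in zip(w, x))
```

In use, learn would be called once on a pair of signals in which the student signal is a degraded version of the teacher signal, and predict would then be applied at each position of a new input signal using the accumulated coefficients.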
There has also been proposed a technique related to digital signal processing which can enhance the accuracy of prediction by changing the extraction range or positions of a plurality of pieces of data used for classification or prediction computation, on the basis of additional information with respect to a digital information signal that has undergone an encoding and decoding process (see, for example, Japanese Unexamined Patent Application Publication No. 2001-285870).
A technique has also been proposed for obtaining an interpolated pixel value close to a true value. In this technique, in a block formed by a plurality of pixels neighboring and centered around a position to be interpolated, the flatness in the vicinity of the center is detected, neighboring pixels are selected in such a way that the number of neighboring pixels to be selected at the center increases as the flatness becomes larger, and pixels to be generated are classified in accordance with the pattern of level distribution of the neighboring pixels.