In the case where a picture of standard resolution or low resolution (hereinafter referred to as an SD picture) is to be converted to a picture of high resolution (hereinafter referred to as an HD picture) or in the case where a picture is to be enlarged, the pixel value of a missing pixel is interpolated (compensated) by a so-called interpolation filter.
However, since a component (high-frequency component) of the HD picture which is not included in the SD picture cannot be restored even by carrying out interpolation of a pixel by the interpolation filter, it has been difficult to provide a picture of high resolution.
Thus, the present Assignee has proposed a picture converting device (picture converting circuit) for converting an SD picture to an HD picture which also includes a high-frequency component not included in the SD picture.
In this picture converting device, adaptive processing for finding a prediction value of a pixel of the HD picture is carried out by linear combination of the SD picture and a predetermined prediction coefficient, thereby restoring the high-frequency component not included in the SD picture.
Specifically, it is now assumed that, for example, a prediction value E[y] of a pixel value y of a pixel constituting the HD picture (hereinafter referred to as an HD pixel) is to be found from a first-order linear combination model prescribed by linear combination of pixel values (hereinafter referred to as learning data) x.sub.1, x.sub.2, . . . of several pixels constituting the SD picture (hereinafter referred to as SD pixels) and predetermined prediction coefficients w.sub.1, w.sub.2, . . . In this case, the prediction value E[y] may be expressed by Equation 1. ##EQU1##
To generalize the model, if a matrix W consisting of a set of prediction coefficients w is defined by Equation 2, a matrix X consisting of a set of learning data is defined by Equation 3, and a matrix Y' consisting of a set of prediction values E[y] is defined by Equation 4, an observational equation like Equation 5 is obtained. ##EQU2##
Then, it is assumed that a prediction value E[y] close to a pixel value y of the HD pixel is to be found by applying a least squares method to the observational equation. In this case, if a matrix Y consisting of true pixel values y of the HD pixels to be teacher data is defined by Equation 6 while a matrix E consisting of residuals e of the prediction values E[y] with respect to the pixel values y of the HD pixels is defined by Equation 7, a residual equation like Equation 8 is obtained from Equation 5. ##EQU3##
In this case, a prediction coefficient w.sub.i for finding the prediction value E[y] close to the pixel value y of the HD pixel may be found by minimizing the square error expressed by Formula 9. ##EQU4##
Therefore, if the value obtained by differentiating the square error of Formula 9 with respect to the prediction coefficient w.sub.i is 0, that is, if Equation 10 is satisfied, the prediction coefficient w.sub.i is the optimum value for finding the prediction value E[y] close to the pixel value y of the HD pixel. ##EQU5##
Thus, by differentiating Equation 8 with respect to the prediction coefficient w.sub.i, Equation 11 is obtained. ##EQU6##
Equation 12 is obtained from Equations 10 and 11. ##EQU7##
In addition, in consideration of the relation between the learning data x, the prediction coefficient w, the teacher data y and the residual e in the residual equation of Equation 8, a normal equation like Equation 13 may be obtained from Equation 12. ##EQU8##
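The equation images themselves are not reproduced in this text. One reconstruction that is consistent with the definitions above may be sketched as follows; the sample count m, the coefficient count n, and the double index x.sub.ij (the j-th learning datum of the i-th sample) are notational assumptions and are not taken from the original equations.

```latex
\begin{align*}
E[y] &= w_1 x_1 + w_2 x_2 + \cdots \tag{1} \\
W &= \begin{pmatrix} w_1 \\ \vdots \\ w_n \end{pmatrix},\quad
X = \begin{pmatrix} x_{11} & \cdots & x_{1n} \\ \vdots & \ddots & \vdots \\ x_{m1} & \cdots & x_{mn} \end{pmatrix},\quad
Y' = \begin{pmatrix} E[y_1] \\ \vdots \\ E[y_m] \end{pmatrix} \tag{2--4} \\
XW &= Y' \tag{5} \\
Y &= \begin{pmatrix} y_1 \\ \vdots \\ y_m \end{pmatrix},\quad
E = \begin{pmatrix} e_1 \\ \vdots \\ e_m \end{pmatrix} \tag{6, 7} \\
XW &= Y + E \tag{8} \\
&\sum_{i=1}^{m} e_i^{2} \tag{9} \\
e_1 \frac{\partial e_1}{\partial w_i} + e_2 \frac{\partial e_2}{\partial w_i} + \cdots + e_m \frac{\partial e_m}{\partial w_i} &= 0
  \qquad (i = 1, \ldots, n) \tag{10} \\
\frac{\partial e_i}{\partial w_1} = x_{i1},\ \frac{\partial e_i}{\partial w_2} &= x_{i2},\ \ldots,\ \frac{\partial e_i}{\partial w_n} = x_{in}
  \qquad (i = 1, \ldots, m) \tag{11} \\
\sum_{i=1}^{m} e_i x_{i1} = 0,\ \sum_{i=1}^{m} e_i x_{i2} &= 0,\ \ldots,\ \sum_{i=1}^{m} e_i x_{in} = 0 \tag{12} \\
\sum_{k=1}^{n} \Bigl( \sum_{i=1}^{m} x_{ij} x_{ik} \Bigr) w_k &= \sum_{i=1}^{m} x_{ij} y_i
  \qquad (j = 1, \ldots, n) \tag{13}
\end{align*}
```

Under this reading, Equation 8 gives each residual as e.sub.i = x.sub.i1 w.sub.1 + . . . + x.sub.in w.sub.n - y.sub.i, and substituting this into Equation 12 yields the n simultaneous equations of Equation 13.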
As many normal equations of Equation 13 may be established as there are prediction coefficients w to be found. Therefore, the optimum prediction coefficients w may be found by solving Equation 13. (However, to solve Equation 13, the matrix consisting of the coefficients applied to the prediction coefficients w must be regular, that is, non-singular.) In solving Equation 13, for example, a sweep method (Gauss-Jordan elimination method) may be applied.
In the foregoing manner, the optimum prediction coefficients w are found. Then, by using these prediction coefficients w, the prediction value E[y] close to the pixel value y of the HD pixel is found by Equation 1. The foregoing processing is adaptive processing. (Adaptive processing includes processing to find the prediction coefficients w in advance and find the prediction value by using the prediction coefficients w.)
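As an illustration only, the solution of the normal equations by the sweep (Gauss-Jordan) method might be sketched as follows. The function name `solve_normal_equations`, the list-of-lists data layout, and the singularity tolerance `1e-12` are assumptions made for this sketch; partial pivoting is added for numerical safety and is not mentioned in the text.

```python
def solve_normal_equations(samples, targets):
    """Find w minimising sum((x . w - y)^2) via the normal equations.

    samples: list of learning-data rows x (each a list of n values)
    targets: list of teacher values y, one per row
    """
    n = len(samples[0])
    # Augmented matrix [X^T X | X^T y], one row per prediction coefficient.
    a = [
        [sum(x[i] * x[j] for x in samples) for j in range(n)]
        + [sum(x[i] * y for x, y in zip(samples, targets))]
        for i in range(n)
    ]
    for col in range(n):
        # Partial pivoting: bring the largest remaining entry to the diagonal.
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        if abs(a[pivot][col]) < 1e-12:
            raise ValueError("coefficient matrix is not regular (singular)")
        a[col], a[pivot] = a[pivot], a[col]
        p = a[col][col]
        a[col] = [v / p for v in a[col]]
        # Sweep the pivot column out of every other row.
        for r in range(n):
            if r != col:
                f = a[r][col]
                a[r] = [vr - f * vc for vr, vc in zip(a[r], a[col])]
    return [a[i][n] for i in range(n)]


# Usage: teacher data generated exactly by y = 2*x1 + 3*x2.
w = solve_normal_equations([[1, 0], [0, 1], [1, 1], [2, 1]], [2, 3, 5, 7])
```

When the learning data do not span all n directions (for example, every row is a multiple of one row), the matrix is not regular and the sketch raises an error instead of returning coefficients.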
Adaptive processing differs from interpolation processing in that a component included in the HD picture which is not included in the SD picture is reproduced. Specifically, though adaptive processing is equal to interpolation processing using the so-called interpolation filter as far as Equation 1 is concerned, the prediction coefficient w corresponding to the tap coefficient of the interpolation filter is found from so-called learning by using teacher data y, thus enabling reproduction of the component included in the HD picture. That is, in adaptive processing, a picture of high resolution may be easily obtained. In other words, it may be understood that adaptive processing is processing which has a picture creation effect.
FIG. 1 shows an example of the structure of a picture converting device (picture converting circuit) for converting an SD picture into an HD picture by adaptive processing as described above based on the characteristics (class) of the picture.
The SD picture is supplied to a classifying circuit 101 and a delay circuit 102. The classifying circuit 101 sequentially uses SD pixels constituting the SD picture as notable pixels, and classifies the notable pixels into predetermined classes.
Specifically, the classifying circuit 101 first forms a block (hereinafter referred to as a processing block) by collecting several SD pixels around a notable pixel, and supplies a value allocated in advance to the pattern of the pixel values of all the SD pixels constituting the processing block, as the class of the notable pixel, to an address terminal (AD) of a coefficient ROM 104.
For example, the classifying circuit 101 extracts a processing block made up of 5.times.5 SD pixels (indicated by .smallcircle. in FIG. 2) around a notable pixel from the SD picture, as indicated by a rectangle of dotted line in FIG. 2, and outputs a value corresponding to the pattern of the pixel values of these 25 SD pixels as the class of the notable pixel.
In the case where a large number of bits, such as eight bits, are allocated to express the pixel value of each SD pixel, the number of patterns of pixel values of the 25 SD pixels is extremely large, namely (2.sup.8).sup.25 patterns. Therefore, it is difficult to carry out the subsequent processing quickly.
Thus, as preprocessing prior to classification, processing for reducing the number of bits of the SD pixels constituting the processing block, for example, ADRC (adaptive dynamic range coding) processing, is carried out on the processing block.
In ADRC processing, first, an SD pixel having the maximum pixel value (hereinafter referred to as a maximum pixel) and an SD pixel having the minimum pixel value (hereinafter referred to as a minimum pixel) are detected from among the 25 SD pixels constituting the processing block. Then, the difference DR between the pixel value MAX of the maximum pixel and the pixel value MIN of the minimum pixel (=MAX-MIN) is calculated, and this DR is used as a local dynamic range of the processing block. On the basis of the dynamic range DR, the value of each pixel constituting the processing block is re-quantized to K bits, which is fewer than the original number of allocated bits. That is, the pixel value MIN of the minimum pixel is subtracted from the pixel value of each pixel constituting the processing block, and each subtraction value is divided by DR/2.sup.K.
As a result, the value of each pixel constituting the processing block is expressed by K bits. Therefore, if K=1, the number of patterns of pixel values of the 25 SD pixels is (2.sup.1).sup.25, which is much smaller than the number of patterns in the case where ADRC processing is not carried out. ADRC processing for expressing the pixel value by K bits is hereinafter referred to as K-bit ADRC processing.
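The K-bit ADRC step and the resulting class value might be sketched as follows. The function name `adrc_class`, the packing of the re-quantized codes into a single integer, the clamp of the maximum pixel to 2.sup.K -1 (plain division by DR/2.sup.K would give 2.sup.K for that pixel), and the handling of a flat block (DR=0) are all assumptions of this sketch, not details given in the text.

```python
def adrc_class(block, k=1):
    """Re-quantise each pixel of a processing block to k bits and pack
    the codes, in scan order, into one integer used as the class."""
    lo, hi = min(block), max(block)
    dr = hi - lo                      # local dynamic range DR = MAX - MIN
    levels = 1 << k                   # 2^K quantisation levels
    code = 0
    for p in block:
        if dr == 0:
            q = 0                     # flat block: assumed to map to code 0
        else:
            # (p - MIN) / (DR / 2^K), clamped so that MAX fits in k bits.
            q = min((p - lo) * levels // dr, levels - 1)
        code = (code << k) | q
    return code


# Pattern counts for a 25-pixel block: 8-bit pixels versus 1-bit ADRC.
patterns_raw = (2 ** 8) ** 25
patterns_adrc = (2 ** 1) ** 25
```

For the four-pixel block `[10, 20, 30, 40]`, 1-bit ADRC yields the bit pattern 0011, so the class is 3. For a 25-pixel block, 1-bit ADRC reduces the pattern count from 2.sup.200 to 2.sup.25 (33,554,432).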
The coefficient ROM 104 stores, for every class, a set of prediction coefficients found by learning in advance. When a class is supplied from the classifying circuit 101, the coefficient ROM 104 reads out a set of prediction coefficients stored at an address corresponding to the class, and supplies the read-out set of prediction coefficients to a prediction processing circuit 105.
Meanwhile, the delay circuit 102 delays the SD picture by the time necessary for the timing at which the set of prediction coefficients is supplied from the coefficient ROM 104 to the prediction processing circuit 105 to coincide with the timing at which a prediction tap is supplied from a prediction tap generating circuit 103, as later described. The delay circuit 102 then supplies the delayed SD picture to the prediction tap generating circuit 103.
The prediction tap generating circuit 103 extracts, from the SD picture supplied thereto, an SD pixel used for finding a prediction value of a predetermined HD pixel in the prediction processing circuit 105, and supplies the extracted SD pixel as a prediction tap to the prediction processing circuit 105. Specifically, the prediction tap generating circuit 103 extracts, from the SD picture, the same processing block as the processing block extracted by the classifying circuit 101, and supplies the SD pixels constituting the processing block as the prediction tap to the prediction processing circuit 105.
The prediction processing circuit 105 carries out arithmetic processing of Equation 1, that is, adaptive processing using the prediction coefficients w.sub.1, w.sub.2, . . . from the coefficient ROM 104 and the prediction taps x.sub.1, x.sub.2, . . . from the prediction tap generating circuit 103, thereby finding the prediction value E[y] of the notable pixel y. The prediction processing circuit 105 outputs this prediction value as the pixel value of the HD pixel.
For example, the prediction values of the 3.times.3 HD pixels (indicated by points .cndot. in FIG. 2) around the notable pixel, surrounded by a rectangle of solid line in FIG. 2, are found from one prediction tap. In this case, the prediction processing circuit 105 carries out arithmetic processing of Equation 1 with respect to each of the nine HD pixels. Therefore, the coefficient ROM 104 stores nine sets of prediction coefficients at an address corresponding to one class.
Similar processing is carried out by using the other SD pixels as notable pixels. Thus, the SD picture is converted to the HD picture.
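The flow through the classifying circuit, the coefficient ROM and the prediction processing circuit might be sketched as follows. This is a toy version under stated assumptions: the tap has three pixels instead of 25, two HD pixels are predicted instead of nine, the ROM is a plain dictionary, and the names `one_bit_adrc_class` and `convert` as well as the coefficient values are hypothetical.

```python
def one_bit_adrc_class(tap):
    """Class of a tap by 1-bit ADRC (codes packed in scan order)."""
    lo, hi = min(tap), max(tap)
    dr = hi - lo
    code = 0
    for p in tap:
        bit = 0 if dr == 0 else min((p - lo) * 2 // dr, 1)
        code = (code << 1) | bit
    return code


def convert(tap, coefficient_rom):
    """Predict the HD pixels around the notable pixel from one tap.

    coefficient_rom maps a class to a list of coefficient sets,
    one set per HD pixel to be predicted (Equation 1 applied per set).
    """
    cls = one_bit_adrc_class(tap)          # classification
    coefficient_sets = coefficient_rom[cls]  # read-out at the class address
    return [sum(w * x for w, x in zip(ws, tap)) for ws in coefficient_sets]


# Hypothetical ROM contents for the single class used in this example.
rom = {3: [[0.5, 0.5, 0.0], [0.0, 0.5, 0.5]]}
hd_pixels = convert([10, 20, 30], rom)
```

Here the tap `[10, 20, 30]` falls into class 3 (bit pattern 011), and the two coefficient sets stored for that class produce the two predicted HD pixel values.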
FIG. 3 shows an example of the structure of a learning device (learning circuit) for carrying out learning for calculating prediction coefficients to be stored in the coefficient ROM 104 of FIG. 1.
The HD picture to be teacher data y in learning is supplied to a thinning circuit 111 and a delay circuit 114. The thinning circuit 111 reduces the number of pixels of the HD picture by thinning, thus forming an SD picture. This SD picture is supplied to a classifying circuit 112 and a prediction tap generating circuit 113.
The classifying circuit 112 and the prediction tap generating circuit 113 carry out processing similar to the processing by the classifying circuit 101 and the prediction tap generating circuit 103 of FIG. 1, thus outputting the class of a notable pixel and a prediction tap, respectively. The class outputted by the classifying circuit 112 is supplied to address terminals (AD) of a prediction tap memory 115 and a teacher data memory 116. The prediction tap outputted by the prediction tap generating circuit 113 is supplied to the prediction tap memory 115.
The prediction tap memory 115 stores the prediction tap supplied from the prediction tap generating circuit 113, at an address corresponding to the class supplied from the classifying circuit 112.
Meanwhile, the delay circuit 114 delays the HD picture by the time required for the class corresponding to the notable pixel to be supplied from the classifying circuit 112 to the teacher data memory 116. The delay circuit 114 supplies, as teacher data, only the pixel values of the HD pixels located around the SD pixel serving as the notable pixel to the teacher data memory 116.
The teacher data memory 116 stores the teacher data supplied from the delay circuit 114, at an address corresponding to the class supplied from the classifying circuit 112.
Similar processing is repeated until all the HD pixels constituting the HD pictures prepared for learning are used as notable pixels.
Thus, at the same address in the prediction tap memory 115 and the teacher data memory 116, the pixel values of SD pixels having the same positional relation as the SD pixels indicated by .smallcircle. in FIG. 2 and the pixel values of HD pixels having the same positional relation as the HD pixels indicated by .cndot. are stored as learning data x and teacher data y, respectively.
In the prediction tap memory 115 and the teacher data memory 116, plural pieces of information may be stored at the same address. Therefore, at the same address, plural learning data x and teacher data y classified into the same class may be stored.
After that, an arithmetic circuit 117 reads out the prediction taps as the learning data and the pixel values of the HD pixels as the teacher data, stored at the same address in the prediction tap memory 115 and the teacher data memory 116, respectively, and calculates a set of prediction coefficients that minimizes the error between the prediction value and the teacher data by a least squares method using the read-out data. That is, the arithmetic circuit 117 establishes the normal equation of Equation 13 for every class and solves this equation to find a set of prediction coefficients for every class.
Thus, the set of prediction coefficients for every class found by the arithmetic circuit 117 is stored at an address corresponding to the class in the coefficient ROM 104 of FIG. 1.
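The learning flow of FIG. 3 might be sketched for a one-dimensional toy signal as follows. Everything specific here is an assumption for illustration: thinning keeps every second pixel, a tap is two neighbouring SD pixels, the teacher datum is the HD pixel between them, and the names `gauss_jordan` and `learn` and the `classify` callback are hypothetical.

```python
from collections import defaultdict


def gauss_jordan(a):
    """Solve an augmented n x (n+1) system; return None if singular."""
    n = len(a)
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(a[r][c]))
        if abs(a[p][c]) < 1e-12:
            return None
        a[c], a[p] = a[p], a[c]
        pivot = a[c][c]
        a[c] = [v / pivot for v in a[c]]
        for r in range(n):
            if r != c:
                f = a[r][c]
                a[r] = [vr - f * vc for vr, vc in zip(a[r], a[c])]
    return [row[n] for row in a]


def learn(hd_signals, classify):
    """Return {class: prediction coefficients} learned from HD signals."""
    tap_memory = defaultdict(list)      # role of prediction tap memory 115
    teacher_memory = defaultdict(list)  # role of teacher data memory 116
    for hd in hd_signals:
        sd = hd[::2]                    # thinning: keep every second pixel
        for i in range(len(sd) - 1):
            tap = (sd[i], sd[i + 1])
            cls = classify(tap)
            tap_memory[cls].append(tap)
            teacher_memory[cls].append(hd[2 * i + 1])
    coefficients = {}
    for cls, taps in tap_memory.items():
        ys = teacher_memory[cls]
        n = len(taps[0])
        # Normal equations (Equation 13) for this class.
        aug = [
            [sum(t[i] * t[j] for t in taps) for j in range(n)]
            + [sum(t[i] * y for t, y in zip(taps, ys))]
            for i in range(n)
        ]
        w = gauss_jordan(aug)
        if w is not None:               # classes with too little data are skipped
            coefficients[cls] = w
    return coefficients


# Two classes: rising taps (1) and falling taps (0).
table = learn([[0, 1, 2, 3, 4, 5, 6], [6, 5, 4]],
              classify=lambda tap: 1 if tap[1] >= tap[0] else 0)
```

In this usage the rising class collects three taps and obtains coefficients close to (0.5, 0.5), whereas the falling class collects a single tap, cannot form a regular system, and is therefore left without coefficients.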
In learning processing as described above, a class may arise for which the number of normal equations necessary for finding the prediction coefficients cannot be obtained. For such a class, a set of prediction coefficients obtained by establishing and solving normal equations while ignoring the classes (that is, without dividing the learning data by class) is used as a so-called default set of prediction coefficients.
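The default-coefficient fallback might be sketched as follows for the simplest case of a single prediction coefficient per class, where the normal equation reduces to w = (sum of x*y) / (sum of x*x). The names `fit_scalar` and `coefficients_with_default`, and the reading of "ignoring the class" as pooling the learning data of all classes, are assumptions of this sketch.

```python
def fit_scalar(pairs):
    """Least-squares w for y ~ w * x; None if no equation can be formed."""
    sxx = sum(x * x for x, _ in pairs)
    if sxx == 0:
        return None
    return sum(x * y for x, y in pairs) / sxx


def coefficients_with_default(samples_by_class, all_classes):
    """Per-class coefficients, falling back to a pooled default set."""
    # Default: solve the normal equation over all data, ignoring the class.
    pooled = [p for pairs in samples_by_class.values() for p in pairs]
    default = fit_scalar(pooled)
    table = {}
    for cls in all_classes:
        w = fit_scalar(samples_by_class.get(cls, []))
        table[cls] = default if w is None else w
    return table


# Class 2 has no learning data, so it receives the default coefficient.
table_with_default = coefficients_with_default(
    {0: [(1, 2), (2, 4)], 1: [(1, 4)]}, all_classes=[0, 1, 2])
```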
With the picture converting device of FIG. 1, from the SD picture obtained by reducing the number of pixels of the HD picture by thinning, the HD picture including high-frequency components not included in the SD picture may be obtained as described above. However, the proximity to the original HD picture is limited for the following reason. That is, it is considered that the pixel value of the pixel (SD pixel) of the SD picture obtained only by thinning the number of pixels of the HD picture is not optimum for restoring the original HD picture.
Thus, the present Assignee has proposed picture compression (coding) utilizing adaptive processing in order to obtain a decoded picture of quality close to that of the original HD picture (for example, in Japanese Patent Application No. Hei 8-206552).
Specifically, FIG. 4 shows an example of the structure of a picture signal encoding device for compressing (coding) an original HD picture into an optimum SD picture so as to obtain a decoded picture close to the original HD picture by adaptive processing.
The HD picture as an encoding target is supplied to a thinning section 121 and an error calculating section 124.
The thinning section 121 makes an SD picture simply by thinning the pixels of the HD picture, and supplies the SD picture to a correcting section 122. On receiving the SD picture from the thinning section 121, the correcting section 122, at first, directly outputs the SD picture to a local decode section 123. The local decode section 123 has a structure similar to that of the picture converting device of FIG. 1, for example. By carrying out adaptive processing as described above by using the SD picture from the correcting section 122, the local decode section 123 calculates a prediction value of the HD pixel and outputs the prediction value to the error calculating section 124. The error calculating section 124 calculates a prediction error of the prediction value of the HD pixel from the local decode section 123 with respect to the original HD pixel, and outputs the prediction error to a control section 125. The control section 125 controls the correcting section 122 in accordance with the prediction error from the error calculating section 124.
Thus, the correcting section 122 corrects the pixel value of the SD picture from the thinning section 121 under the control of the control section 125, and outputs the corrected pixel value to the local decode section 123. The local decode section 123 again finds a prediction value of the HD picture by using the corrected SD picture supplied from the correcting section 122.
Similar processing is repeated, for example, until the prediction error outputted from the error calculating section 124 reaches a predetermined value or less.
When the prediction error outputted from the error calculating section 124 reaches the predetermined value or less, the control section 125 controls the correcting section 122 so as to output the corrected SD picture obtained at that time as an optimum encoding result of the HD picture.
Thus, by carrying out adaptive processing on this corrected SD picture, an HD picture having a prediction error at the predetermined value or less may be obtained.
The SD picture thus outputted from the picture signal encoding device of FIG. 4 may be regarded as the optimum SD picture for obtaining a decoded picture proximate to the original HD picture. Therefore, the processing carried out in a system constituted by the correcting section 122, the local decode section 123, the error calculating section 124 and the control section 125 of the picture signal encoding device may be referred to as optimization processing.
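The optimization loop of FIG. 4 might be sketched for a one-dimensional toy signal as follows. The text does not state how the correcting section changes the pixel values, so the greedy trial of plus or minus one step per SD pixel used here is purely an assumption, as are the fixed (0.5, 0.5) coefficients standing in for adaptive processing in the local decode, the squared-error measure, and the names `local_decode`, `prediction_error` and `optimize_sd`.

```python
def local_decode(sd):
    """Stand-in for local decode section 123: SD pixels are kept, and each
    in-between HD pixel is predicted with fixed coefficients (0.5, 0.5)."""
    hd = []
    for i, p in enumerate(sd):
        hd.append(p)
        if i + 1 < len(sd):
            hd.append(0.5 * p + 0.5 * sd[i + 1])
    return hd


def prediction_error(hd, decoded):
    """Role of error calculating section 124 (sum of squared differences)."""
    return sum((a - b) ** 2 for a, b in zip(hd, decoded))


def optimize_sd(hd, threshold, step=1, max_passes=100):
    """Correct the thinned SD signal until the error is at or below threshold."""
    sd = list(hd[::2])                     # thinning section 121
    err = prediction_error(hd, local_decode(sd))
    for _ in range(max_passes):
        if err <= threshold:
            break
        improved = False
        for i in range(len(sd)):
            for delta in (step, -step):    # assumed correction rule
                trial = sd[:i] + [sd[i] + delta] + sd[i + 1:]
                e = prediction_error(hd, local_decode(trial))
                if e < err:                # keep a correction only if it helps
                    sd, err, improved = trial, e, True
                    break
        if not improved:
            break                          # no single-step correction helps
    return sd, err


original = [0, 4, 0, 4, 0]
thinned_error = prediction_error(original, local_decode(original[::2]))
corrected_sd, corrected_error = optimize_sd(original, threshold=0.0)
```

In this usage the corrected SD signal decodes with a smaller error than the simply thinned one, which is the effect the optimization processing aims at; the loop also stops when no further correction reduces the error, a safeguard not described in the text.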
Meanwhile, to obtain a prediction value more proximate to the pixel value of the original HD pixel, it is preferred that the prediction tap used in adaptive processing is constituted from a large number of SD pixels close to the HD pixel as the target for finding the prediction value.
However, if the prediction tap is constituted from a large number of SD pixels, SD pixels relatively far from the HD pixel as the target for finding the prediction value are included in the prediction tap. Therefore, in this case, SD pixels expressing an object different from the object expressed by the HD pixel as the target for finding the prediction value might be included in the prediction tap. Consequently, the precision of the prediction value is deteriorated, and a decoded picture formed by this prediction value is deteriorated.
Thus, one conceivable method is to reduce the number of pixels thinned from the HD picture in the thinning section 121 of the picture signal encoding device of FIG. 4, thereby increasing the number of SD pixels close to the HD pixel as the target for finding the prediction value. However, this deteriorates the coding efficiency.