In the case where a picture of standard resolution or low resolution (hereinafter referred to as an SD picture) is to be converted to a picture of high resolution (hereinafter referred to as an HD picture), or in the case where a picture is to be enlarged, the pixel value of a missing pixel is interpolated (compensated) by a so-called interpolation filter.
However, since a component (high-frequency component) of the HD picture which is not included in the SD picture cannot be restored even by carrying out interpolation of a pixel by the interpolation filter, it has been difficult to provide a picture of high resolution.
Thus, the present Assignee has proposed a picture converting device (picture converting circuit) for converting an SD picture to an HD picture which also includes a high-frequency component not included in the SD picture.
In this picture converting device, adaptive processing for finding a prediction value of a pixel of the HD picture is carried out by linear combination of the SD picture and a predetermined prediction coefficient, thereby restoring the high-frequency component not included in the SD picture.
Specifically, assume, for example, that a prediction value E[y] of a pixel value y of a pixel constituting the HD picture (hereinafter referred to as an HD pixel) is to be found from a first-order linear combination model prescribed by a linear combination of pixel values (hereinafter referred to as learning data) x.sub.1, x.sub.2, . . . of several SD pixels (pixels constituting the SD picture) and predetermined prediction coefficients w.sub.1, w.sub.2, . . . . In this case, the prediction value E[y] may be expressed by Equation 1.
E[y] = w.sub.1 x.sub.1 + w.sub.2 x.sub.2 + . . .   (Equation 1)
If a matrix W consisting of a set of prediction coefficients w is defined by Equation 2, and a matrix X consisting of a set of learning data is defined by Equation 3 while a matrix Y' consisting of a set of prediction values E[y] is defined by Equation 4, in order to generalize the model, an observational equation like Equation 5 is obtained. ##EQU1##
Then, assume that a prediction value E[y] proximate to a pixel value y of the HD pixel is to be found by applying the least squares method to the observational equation. In this case, if a matrix Y consisting of true pixel values y of the HD pixels to be teacher data is defined by Equation 6 while a matrix E consisting of residuals e of the prediction values E[y] with respect to the pixel values y of the HD pixels is defined by Equation 7, a residual equation like Equation 8 is obtained from Equation 5. ##EQU2##
XW = Y + E   (Equation 8)
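The matrices referenced as Equations 2 to 7 are not reproduced in this text. Assuming m sets of learning data, each consisting of n SD pixel values (m and n are notation introduced here for illustration), a form consistent with the definitions above and with Equations 1 and 8 would be:

```latex
W = \begin{pmatrix} w_1 \\ w_2 \\ \vdots \\ w_n \end{pmatrix}
\quad \text{(Equation 2)}
\qquad
X = \begin{pmatrix}
x_{11} & x_{12} & \cdots & x_{1n} \\
x_{21} & x_{22} & \cdots & x_{2n} \\
\vdots & \vdots & \ddots & \vdots \\
x_{m1} & x_{m2} & \cdots & x_{mn}
\end{pmatrix}
\quad \text{(Equation 3)}

Y' = \begin{pmatrix} E[y_1] \\ E[y_2] \\ \vdots \\ E[y_m] \end{pmatrix}
\quad \text{(Equation 4)}
\qquad
XW = Y' \quad \text{(Equation 5)}

Y = \begin{pmatrix} y_1 \\ y_2 \\ \vdots \\ y_m \end{pmatrix}
\quad \text{(Equation 6)}
\qquad
E = \begin{pmatrix} e_1 \\ e_2 \\ \vdots \\ e_m \end{pmatrix}
\quad \text{(Equation 7)}

XW = Y + E \quad \text{(Equation 8)}
```

Each row of X holds the learning data for one HD pixel, so row i of Equation 5 is Equation 1 applied to the i-th set of learning data.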
In this case, a prediction coefficient w.sub.i for finding the prediction value E[y] proximate to the pixel value y of the HD pixel may be found by minimizing the square error expressed by Formula 9. ##EQU3##
Therefore, if the value obtained by differentiating the square error of Formula 9 by the prediction coefficient w.sub.i is 0, the prediction coefficient w.sub.i satisfying Equation 10 is the optimum value for finding the prediction value E[y] proximate to the pixel value y of the HD pixel. ##EQU4##
Thus, by differentiating Equation 8 by the prediction coefficient w.sub.i, Equation 11 is obtained. ##EQU5##
Equation 12 is obtained from Equations 10 and 11. ##EQU6##
In addition, in consideration of the relation between the learning data x, the prediction coefficient w, the teacher data y and the residual e in the residual equation of Equation 8, a normal equation like Equation 13 may be obtained from Equation 12. ##EQU7##
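Formula 9 and Equations 10 to 13 are likewise not reproduced in this text. Under the same assumed notation (m sets of learning data, n prediction coefficients, residual e.sub.i for the i-th set), the derivation described above can be sketched as follows. The coefficient index is written j here to keep it distinct from the sample index i; this is a reconstruction from the surrounding text, not the original equation images.

```latex
% Formula 9: square error to be minimized
\sum_{i=1}^{m} e_i^{2}

% Equation 10: derivative of the square error with respect to w_j set to zero
e_1 \frac{\partial e_1}{\partial w_j} + e_2 \frac{\partial e_2}{\partial w_j}
  + \cdots + e_m \frac{\partial e_m}{\partial w_j} = 0
  \qquad (j = 1, 2, \ldots, n)

% Equation 11: from the residual equation XW = Y + E, i.e. e_i = \sum_k x_{ik} w_k - y_i
\frac{\partial e_i}{\partial w_1} = x_{i1}, \quad
\frac{\partial e_i}{\partial w_2} = x_{i2}, \quad \ldots, \quad
\frac{\partial e_i}{\partial w_n} = x_{in}
  \qquad (i = 1, 2, \ldots, m)

% Equation 12: substituting Equation 11 into Equation 10
\sum_{i=1}^{m} e_i x_{i1} = 0, \quad
\sum_{i=1}^{m} e_i x_{i2} = 0, \quad \ldots, \quad
\sum_{i=1}^{m} e_i x_{in} = 0

% Equation 13: normal equations, one for each j = 1, ..., n
\Bigl(\sum_{i=1}^{m} x_{ij} x_{i1}\Bigr) w_1
  + \Bigl(\sum_{i=1}^{m} x_{ij} x_{i2}\Bigr) w_2
  + \cdots
  + \Bigl(\sum_{i=1}^{m} x_{ij} x_{in}\Bigr) w_n
  = \sum_{i=1}^{m} x_{ij} y_i
```

In matrix form, Equation 13 corresponds to X.sup.T XW = X.sup.T Y.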
As many normal equations of Equation 13 may be established as there are prediction coefficients w to be found. Therefore, the optimum prediction coefficients w may be found by solving Equation 13. (However, to solve Equation 13, the matrix consisting of the coefficients applied to the prediction coefficients w must be regular, that is, non-singular.) In solving Equation 13, for example, a sweep method (Gauss-Jordan elimination method) may be applied.
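As a minimal sketch of this step, the following Python code builds the normal equations from learning data and teacher data and solves them by Gauss-Jordan elimination. The function names, the toy data, and the use of exact rational arithmetic are assumptions made for illustration; the text does not prescribe an implementation.

```python
from fractions import Fraction


def normal_equations(X, y):
    """Build A = X^T X and b = X^T y, the two sides of the normal equations."""
    n = len(X[0])
    A = [[sum(row[j] * row[k] for row in X) for k in range(n)] for j in range(n)]
    b = [sum(row[j] * t for row, t in zip(X, y)) for j in range(n)]
    return A, b


def gauss_jordan(A, b):
    """Solve A w = b by Gauss-Jordan elimination; raise if A is not regular."""
    n = len(A)
    # Augmented matrix [A | b], held as exact fractions (an illustrative choice).
    M = [[Fraction(v) for v in row] + [Fraction(rhs)] for row, rhs in zip(A, b)]
    for col in range(n):
        # Find a row at or below the diagonal with a non-zero entry in this column.
        pivot = next((r for r in range(col, n) if M[r][col] != 0), None)
        if pivot is None:
            raise ValueError("coefficient matrix is singular (not regular)")
        M[col], M[pivot] = M[pivot], M[col]
        p = M[col][col]
        M[col] = [v / p for v in M[col]]
        # Sweep the column out of every other row.
        for r in range(n):
            if r != col and M[r][col] != 0:
                f = M[r][col]
                M[r] = [a - f * c for a, c in zip(M[r], M[col])]
    return [M[r][n] for r in range(n)]


# Toy learning data: three samples with two SD pixel values each (hypothetical).
X = [[1, 2], [3, 4], [5, 7]]
true_w = [Fraction(1, 2), Fraction(1, 4)]
# Teacher data generated from known coefficients so the result can be checked.
y = [sum(w * x for w, x in zip(true_w, row)) for row in X]

A, b = normal_equations(X, y)
w = gauss_jordan(A, b)
```

Because the toy teacher data are generated exactly from `true_w`, the solved coefficients `w` reproduce them; with real pictures the residuals would not vanish and `w` would be the least squares fit.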
In the foregoing manner, the set of optimum prediction coefficients w is found. Then, by using this set of prediction coefficients w, the prediction value E[y] proximate to the pixel value y of the HD pixel is found by Equation 1. The foregoing processing is adaptive processing. (Adaptive processing includes processing to find the set of prediction coefficients w in advance and find the prediction value from the set of prediction coefficients w.)
Adaptive processing differs from interpolation processing in that a component included in the HD picture which is not included in the SD picture is reproduced. Specifically, though adaptive processing is equal to interpolation processing using the so-called interpolation filter as far as Equation 1 is concerned, the prediction coefficient w corresponding to the tap coefficient of the interpolation filter is found from so-called learning by using teacher data y, thus enabling reproduction of the component included in the HD picture. That is, a picture of high resolution may be easily obtained. This indicates that adaptive processing is processing which has a picture creation effect.
FIG. 22 shows an example of the structure of a picture converting device for converting an SD picture into an HD picture by adaptive processing as described above based on the characteristics (class) of the picture.
The SD picture is supplied to a classifying circuit 101 and a delay circuit 102. The classifying circuit 101 sequentially uses SD pixels constituting the SD picture as notable pixels, and classifies the notable pixels into predetermined classes.
The classifying circuit 101 first forms a block (hereinafter referred to as a processing block) by collecting several SD pixels around a notable pixel, and supplies a value allocated in advance to a pattern of pixel value of all the SD pixels constituting the processing block, as the class of the notable pixel, to an address terminal (AD) of a coefficient ROM 104.
Specifically, the classifying circuit 101 extracts, for example, a processing block made up of 5.times.5 SD pixels (indicated by .smallcircle. in FIG. 23) around a notable pixel from the SD picture, as indicated by a rectangle of dotted line, and outputs a value corresponding to a pattern of pixel value of these 25 SD pixels as the class of the notable pixel.
In the case where a large number of bits, such as eight bits, is allocated to express the pixel value of each SD pixel, the number of patterns of pixel values of the 25 SD pixels is extremely large, namely (2.sup.8).sup.25 patterns. Therefore, the subsequent processing cannot be carried out quickly.
Thus, as preprocessing prior to classification, processing for reducing the number of bits of the SD pixels constituting the processing block, for example, ADRC (Adaptive Dynamic Range Coding) processing, is carried out on the processing block.
In ADRC processing, first, an SD pixel having the maximum pixel value (hereinafter referred to as a maximum pixel) and an SD pixel having the minimum pixel value (hereinafter referred to as a minimum pixel) are detected from among the 25 SD pixels constituting the processing block. Then, the difference DR between the pixel value MAX of the maximum pixel and the pixel value MIN of the minimum pixel (DR=MAX-MIN) is calculated, and this DR is used as a local dynamic range of the processing block. On the basis of the dynamic range DR, the value of each pixel constituting the processing block is re-quantized to K bits, where K is smaller than the original number of allocated bits. That is, the pixel value MIN of the minimum pixel is subtracted from the pixel value of each pixel constituting the processing block, and each subtraction value is divided by DR/2.sup.K.
As a result, the value of each pixel constituting the processing block is expressed by K bits. Therefore, if K=1, the number of patterns of pixel values of the 25 SD pixels is (2.sup.1).sup.25, which is much smaller than the number of patterns in the case where ADRC processing is not carried out. ADRC processing for expressing the pixel value by K bits is hereinafter referred to as K-bit ADRC processing.
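The ADRC re-quantization described above can be sketched in Python as follows. The function name, the packing of the K-bit codes into one integer class value, the clamping of the maximum pixel to 2.sup.K -1, and the handling of a flat block (DR=0) are assumptions added for illustration; the text does not specify them.

```python
def adrc_class(block, k=1):
    """Re-quantize each pixel of a block to k bits and pack the codes into a class value."""
    mn, mx = min(block), max(block)
    dr = mx - mn  # local dynamic range DR = MAX - MIN
    levels = 1 << k  # 2^K quantization levels
    codes = []
    for p in block:
        if dr == 0:
            q = 0  # assumption: a flat block maps every pixel to code 0
        else:
            # (p - MIN) / (DR / 2^K), computed with integer arithmetic
            q = (p - mn) * levels // dr
            # assumption: the maximum pixel would give 2^K, so clamp it into k bits
            q = min(q, levels - 1)
        codes.append(q)
    # Concatenate the k-bit codes, first pixel in the most significant position.
    class_value = 0
    for q in codes:
        class_value = (class_value << k) | q
    return codes, class_value


# Toy block of four 8-bit pixel values (hypothetical; the text uses 25 pixels).
codes, class_value = adrc_class([10, 200, 105, 104], k=1)
# Number of possible classes for a 25-pixel block with 1-bit ADRC.
n_classes_25 = (2 ** 1) ** 25
```

With 1-bit ADRC each pixel is effectively compared against the midpoint of the block's range, so the class value captures the coarse light/dark pattern of the block rather than its absolute level.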
The coefficient ROM 104 stores, for every class, a set of prediction coefficients found by learning in advance. When a class is supplied from the classifying circuit 101, the coefficient ROM 104 reads out a set of prediction coefficients stored at an address corresponding to the class, and supplies the read-out set of prediction coefficients to a prediction processing circuit 105.
Meanwhile, the delay circuit 102 delays the SD picture only by a time necessary for causing a timing at which the set of prediction coefficients are supplied from the coefficient ROM 104 and a timing at which a prediction tap is supplied from a prediction tap generating circuit 103, as later described, to coincide with each other. The delay circuit 102 then supplies the delayed SD picture to the prediction tap generating circuit 103.
The prediction tap generating circuit 103 extracts, from the SD picture supplied thereto, an SD pixel used for finding a prediction value of a predetermined HD pixel in the prediction processing circuit 105, and supplies the extracted SD pixel as a prediction tap to the prediction processing circuit 105. That is, the prediction tap generating circuit 103 extracts, from the SD picture, the same processing block as the processing block extracted by the classifying circuit 101, and supplies the SD pixels constituting the processing block as the prediction tap to the prediction processing circuit 105.
The prediction processing circuit 105 carries out arithmetic processing of Equation 1, that is, adaptive processing using the prediction coefficients w.sub.1, w.sub.2, . . . and the prediction taps x.sub.1, x.sub.2, . . . , thereby finding the prediction value E[y] of the notable pixel y. The prediction processing circuit 105 outputs this prediction value as the pixel value of the HD pixel.
For example, the prediction values of the 3.times.3 HD pixels (indicated by points .circle-solid. in FIG. 23) around the notable pixel, surrounded by a rectangle of solid line in FIG. 23, are found from one prediction tap. In this case, the prediction processing circuit 105 carries out arithmetic processing of Equation 1 with respect to each of the nine HD pixels. Therefore, the coefficient ROM 104 stores nine sets of prediction coefficients at an address corresponding to one class.
Similar processing is carried out by using the other SD pixels as notable pixels. Thus, the SD picture is converted to the HD picture.
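The flow through the classifying circuit, the coefficient ROM and the prediction processing circuit can be sketched as below. This is a toy illustration only: the prediction tap has three pixels instead of 25, three HD pixels are predicted instead of nine, the dictionary `coeff_rom` and its coefficient values are hypothetical, and classification uses 1-bit ADRC with an assumed clamping rule.

```python
def one_bit_adrc_class(tap):
    """Classify a prediction tap by 1-bit ADRC (assumed clamping and flat-block rules)."""
    mn, mx = min(tap), max(tap)
    dr = mx - mn
    code = 0
    for p in tap:
        bit = 0 if dr == 0 else min((p - mn) * 2 // dr, 1)
        code = (code << 1) | bit
    return code


def predict_hd_pixels(tap, coeff_rom):
    """Look up the coefficient sets for the tap's class and apply Equation 1 to each."""
    cls = one_bit_adrc_class(tap)  # role of the classifying circuit
    coeff_sets = coeff_rom[cls]    # role of the coefficient ROM, addressed by class
    # One linear combination (Equation 1) per HD pixel to be predicted.
    return [sum(w * x for w, x in zip(ws, tap)) for ws in coeff_sets]


# Hypothetical ROM contents: class 0b011 holds three coefficient sets.
coeff_rom = {
    0b011: [
        [1, 0, 0],
        [0.5, 0.5, 0],
        [0, 1, 0],
    ],
}

tap = [10, 20, 30]  # toy prediction tap of three SD pixel values
predicted = predict_hd_pixels(tap, coeff_rom)
```

In this sketch the same pixels serve both for classification and as the prediction tap, matching the description in which the prediction tap generating circuit extracts the same processing block as the classifying circuit.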
FIG. 24 shows an example of the structure of a learning device for carrying out learning for calculating a set of prediction coefficients of every class which is to be stored in the coefficient ROM 104 of FIG. 22.
The HD picture to be teacher data y in learning is supplied to a thinning circuit 111 and a delay circuit 114. The thinning circuit 111 reduces the number of pixels of the HD picture by thinning, thus forming an SD picture. This SD picture is supplied to a classifying circuit 112 and a prediction tap generating circuit 113.
The classifying circuit 112 and the prediction tap generating circuit 113 carry out processing similar to the processing by the classifying circuit 101 and the prediction tap generating circuit 103 of FIG. 22, thus outputting the class of a notable pixel and a prediction tap, respectively. The class outputted by the classifying circuit 112 is supplied to address terminals (AD) of a prediction tap memory 115 and a teacher data memory 116. The prediction tap outputted by the prediction tap generating circuit 113 is supplied to the prediction tap memory 115.
The prediction tap memory 115 stores the prediction tap supplied from the prediction tap generating circuit 113, at an address corresponding to the class supplied from the classifying circuit 112.
Meanwhile, the delay circuit 114 delays the HD picture only by a time during which the class corresponding to the notable pixel is supplied from the classifying circuit 112 to the teacher data memory 116. The delay circuit 114 supplies only the pixel values of the HD pixels having the positional relation of FIG. 23 with respect to the prediction tap, as teacher data, to the teacher data memory 116.
The teacher data memory 116 stores the teacher data supplied from the delay circuit 114, at an address corresponding to the class supplied from the classifying circuit 112.
Similar processing is repeated until all the SD pixels constituting the SD pictures obtained from all the HD pictures prepared for learning are used as notable pixels.
Thus, at the same address in the prediction tap memory 115 or the teacher data memory 116, SD pixels having the same positional relation as the SD pixels indicated by .smallcircle. in FIG. 23 or HD pixels having the same positional relation as the HD pixels indicated by .circle-solid. are stored as learning data x or teacher data y, respectively.
In the prediction tap memory 115 and the teacher data memory 116, plural pieces of information may be stored at the same address. Therefore, at the same address, plural learning data x and teacher data y classified into the same class may be stored.
After that, the arithmetic circuit 117 reads out the prediction tap as the learning data or the pixel values of the HD pixels as the teacher data, stored at the same address in the prediction tap memory 115 or the teacher data memory 116, and calculates a set of prediction coefficients for minimizing an error between the prediction value and the teacher data by the least squares method using the read-out data. That is, the arithmetic circuit 117 establishes the normal equation of Equation 13 for every class and solves this equation to find a set of prediction coefficients for every class. Thus, the set of prediction coefficients for every class found by the arithmetic circuit 117 is stored at an address corresponding to the class in the coefficient ROM 104 of FIG. 22.
In learning processing as described above, a class may arise for which the number of normal equations necessary for finding a set of prediction coefficients cannot be obtained. For such a class, a set of prediction coefficients obtained by establishing and solving normal equations without regard to class (that is, ignoring the classification) is used as a so-called default set of prediction coefficients.
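A minimal Python sketch of this class-by-class learning, including the fallback to a default set, is given below. The function names, the toy samples, and the use of a singular normal-equation matrix as the test for "not enough normal equations" are assumptions for illustration only.

```python
from collections import defaultdict
from fractions import Fraction


def solve(A, b):
    """Gauss-Jordan elimination; returns None when A is singular."""
    n = len(A)
    M = [[Fraction(v) for v in row] + [Fraction(r)] for row, r in zip(A, b)]
    for c in range(n):
        piv = next((r for r in range(c, n) if M[r][c] != 0), None)
        if piv is None:
            return None
        M[c], M[piv] = M[piv], M[c]
        p = M[c][c]
        M[c] = [v / p for v in M[c]]
        for r in range(n):
            if r != c and M[r][c] != 0:
                f = M[r][c]
                M[r] = [a - f * d for a, d in zip(M[r], M[c])]
    return [M[r][n] for r in range(n)]


def fit(taps, teachers):
    """Establish and solve the normal equations for one group of samples."""
    n = len(taps[0])
    A = [[sum(t[j] * t[k] for t in taps) for k in range(n)] for j in range(n)]
    b = [sum(t[j] * y for t, y in zip(taps, teachers)) for j in range(n)]
    return solve(A, b)


def learn(samples):
    """samples: iterable of (class_value, prediction_tap, teacher_value)."""
    tap_memory = defaultdict(list)      # role of the prediction tap memory
    teacher_memory = defaultdict(list)  # role of the teacher data memory
    for cls, tap, teacher in samples:
        tap_memory[cls].append(tap)
        teacher_memory[cls].append(teacher)
    # Default set: solve once over all samples, ignoring the classification.
    all_taps = [t for taps in tap_memory.values() for t in taps]
    all_teachers = [y for ys in teacher_memory.values() for y in ys]
    default = fit(all_taps, all_teachers)
    coeffs = {}
    for cls in tap_memory:
        w = fit(tap_memory[cls], teacher_memory[cls])
        # assumption: a singular system means the class lacks enough equations
        coeffs[cls] = w if w is not None else default
    return coeffs, default


# Toy samples (hypothetical): class 0 has enough data, class 1 has a single sample.
samples = [
    (0, [1, 0], 2),
    (0, [0, 1], 3),
    (0, [1, 1], 5),
    (1, [2, 2], 12),
]
coeffs, default = learn(samples)
```

Here class 0 receives its own coefficients, while class 1, whose single sample cannot determine two coefficients, falls back to the default set computed over all four samples.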
With the picture converting device of FIG. 22, from the SD picture obtained by reducing the number of pixels of the HD picture by thinning, the HD picture including high-frequency components not included in the SD picture may be obtained as described above. However, the proximity to the original HD picture is limited for the following reason. That is, it is considered that the pixel value of the pixel (SD pixel) of the SD picture obtained only by thinning the number of pixels of the HD picture is not optimum for restoring the original HD picture.
Thus, the present Assignee has proposed picture compression (coding) utilizing adaptive processing in order to obtain a decoded picture of quality proximate to that of the original HD picture (for example, in Japanese Patent Application No. Hei 8-206552).
Specifically, FIG. 25 shows an example of the structure of a picture encoding device for compressing (coding) an original HD picture into an optimum SD picture so as to obtain a decoded picture proximate to the original HD picture by adaptive processing.
The HD picture as an encoding target is supplied to a thinning section 121 and an error calculating section 43.
The thinning section 121 makes an SD picture from the HD picture simply by thinning the HD picture, and supplies the SD picture to a correcting section 41. On receiving the SD picture from the thinning section 121, the correcting section 41, at first, directly outputs the SD picture to a local decode section 122. The local decode section 122 has a structure similar to that of the picture converting device of FIG. 22, for example. By carrying out adaptive processing as described above by using the SD picture from the correcting section 41, the local decode section 122 calculates a prediction value of the HD pixel and outputs the prediction value to the error calculating section 43. The error calculating section 43 calculates a prediction error (error information) of the prediction value of the HD pixel from the local decode section 122 with respect to the original HD pixel, and outputs the prediction error to a control section 44. The control section 44 controls the correcting section 41 in response to the prediction error from the error calculating section 43.
Thus, the correcting section 41 corrects the pixel value of the SD picture from the thinning section 121 under the control of the control section 44, and outputs the corrected pixel value to the local decode section 122. The local decode section 122 again finds a prediction value of the HD picture by using the corrected SD picture supplied from the correcting section 41.
Similar processing is repeated, for example, until the prediction error outputted from the error calculating section 43 reaches a predetermined value or less.
When the prediction error outputted from the error calculating section 43 reaches the predetermined value or less, the control section 44 controls the correcting section 41 so as to output the corrected SD picture at the time when the prediction error reaches the predetermined value or less, as an optimum encoding result of the HD picture.
Thus, by carrying out adaptive processing on this corrected SD picture, an HD picture having a prediction error at the predetermined value or less may be obtained.
The SD picture thus outputted from the picture encoding device of FIG. 25 may be regarded as the optimum SD picture for obtaining a decoded picture proximate to the original HD picture. Therefore, the processing carried out in a system constituted by the correcting section 41, the local decode section 122, the error calculating section 43 and the control section 44 of the picture encoding device may be referred to as optimization processing.
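The feedback loop formed by the correcting section, the local decode section, the error calculating section and the control section can be sketched as follows. The text does not state how the pixel values are corrected or how the local decoder is configured, so everything specific here is an assumption: the signal is one-dimensional, `local_decode` is a simple stand-in for adaptive processing, the error is a sum of squared differences, and the correction is a greedy trial of +1 or -1 on each SD pixel.

```python
def thin(hd):
    """Thinning section: keep every second HD pixel to form the SD picture."""
    return hd[::2]


def local_decode(sd):
    """Stand-in local decoder (assumption): repeat each SD pixel and average neighbors."""
    hd = []
    for i, p in enumerate(sd):
        nxt = sd[i + 1] if i + 1 < len(sd) else p
        hd.append(p)
        hd.append((p + nxt) / 2)
    return hd


def squared_error(hd, predicted):
    """Error calculating section: sum of squared prediction errors."""
    return sum((a - b) ** 2 for a, b in zip(hd, predicted))


def optimize(hd, threshold, step=1, max_passes=100):
    """Correct the SD pixels until the prediction error is at or below the threshold."""
    sd = list(thin(hd))
    err = squared_error(hd, local_decode(sd))
    for _ in range(max_passes):
        if err <= threshold:
            break  # control section: the corrected SD picture is output
        improved = False
        for i in range(len(sd)):
            for delta in (step, -step):
                # Correcting section: try a small correction on one SD pixel.
                trial = sd[:i] + [sd[i] + delta] + sd[i + 1:]
                e = squared_error(hd, local_decode(trial))
                if e < err:
                    sd, err, improved = trial, e, True
                    break
        if not improved:
            break  # assumption: stop when no single correction reduces the error
    return sd, err


# Toy one-dimensional "HD picture" (hypothetical values).
hd = [10, 14, 12, 20, 30, 28]
initial_err = squared_error(hd, local_decode(thin(hd)))
sd_opt, final_err = optimize(hd, threshold=1.0)
```

The sketch keeps only corrections that reduce the error, so the final error never exceeds the error of the simply thinned SD picture; whether it reaches the threshold depends on the decoder and the data.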
Meanwhile, adaptive processing is for constituting the prediction tap with SD pixels around the HD pixel and finding the prediction value of the HD pixel by using the prediction tap. The SD pixels used as the prediction tap are selected regardless of the picture.
That is, in the prediction tap generating circuit 103 of the picture converting device of FIG. 22 and in the local decode section 122 of FIG. 25, which is constituted similarly to the picture converting device, a prediction tap of a fixed pattern is always generated (formed).
However, in many cases, the characteristics of a picture differ locally. Since the characteristics differ, it is considered that adaptive processing should be carried out by using prediction taps corresponding to the different characteristics so as to obtain a decoded picture more proximate to the picture quality of the original HD picture.