1. Field of the Invention
This invention relates to a picture processing method and apparatus in which, in a device handling picture data, such as a computer handling picture data, an electronic still camera, a recording device, an editing equipment or a display equipment, high resolution picture data can be outputted from input low-resolution picture data.
2. Description of Related Art
As computers, digital cameras and networks become more popular, it is becoming common practice to deform or correct picture data, captured on a computer, by picture data processing software. In particular, picture enlargement and contraction are practiced routinely. In enlarging a picture, the method for enlargement, that is, the method for interpolation, is at issue, because the picture quality of the enlarged picture differs depending upon the enlargement method used. Representative interpolation methods are nearest neighbor, linear interpolation and cubic convolution.
The nearest neighbor method is equivalent to an order-zero hold, with the number of pixels increasing with picture enlargement. The value of the original pixel lying physically closest to an added pixel is used directly as the value of the added pixel.
The linear interpolation method uses the values of the original pixels on the upper, lower, left and right sides of an added pixel, averaged with weights that decrease linearly with distance.
The cubic convolution method uses, as the values of the added pixels, values obtained by linearly filtering a wider neighborhood of original pixels, including more distant ones.
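As a minimal sketch of the first two methods, the following assumes a grayscale picture held in a NumPy array and an integer enlargement factor q. The function names and the pixel-center sampling convention are illustrative choices, not taken from any particular implementation; cubic convolution is omitted for brevity.

```python
import numpy as np

def enlarge_nearest(img, q):
    """Order-zero hold: each original pixel is repeated q times along both axes."""
    return np.repeat(np.repeat(img, q, axis=0), q, axis=1)

def enlarge_linear(img, q):
    """Separable linear interpolation onto a grid q times denser (borders clamped)."""
    img = np.asarray(img, dtype=float)
    h, w = img.shape
    # Positions of the output samples, expressed in original-pixel coordinates.
    ys = np.clip((np.arange(h * q) + 0.5) / q - 0.5, 0, h - 1)
    xs = np.clip((np.arange(w * q) + 0.5) / q - 0.5, 0, w - 1)
    y0 = np.floor(ys).astype(int)
    x0 = np.floor(xs).astype(int)
    y1 = np.minimum(y0 + 1, h - 1)
    x1 = np.minimum(x0 + 1, w - 1)
    wy = (ys - y0)[:, None]   # vertical weight of the lower neighbor
    wx = (xs - x0)[None, :]   # horizontal weight of the right neighbor
    top = img[y0][:, x0] * (1 - wx) + img[y0][:, x1] * wx
    bottom = img[y1][:, x0] * (1 - wx) + img[y1][:, x1] * wx
    return top * (1 - wy) + bottom * wy
```

With nearest neighbor every added pixel copies one original value, which is what produces stepped oblique lines; with linear interpolation the added pixels are weighted averages, which is what softens edges.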
The above methods may be used in combination with a method of enhancing edges after interpolation to improve subjective picture quality.
The above-described interpolation methods suffer from a defect that, although the number of pixels can be increased, the spatial resolution cannot be improved beyond that of the original picture, and that, since the interpolation is not that by an ideal filter, aliasing tends to be produced.
The nearest neighbor method has the defect that the picture becomes blocky, owing in particular to aliasing, such that an oblique line is rendered as a stepped line.
The cubic convolution method is affected to a lesser extent by aliasing since the interpolation is close to that by an ideal filter. However, since the spatial resolution is not changed from that of the original picture, the picture gives an impression of a subjectively blurred picture.
The linear interpolation method is a compromise between the two methods, such that the picture produced is also a compromise between the pictures obtained with these methods. That is, the produced picture gives a blurred impression and also suffers from blocked distortion.
These inconveniences are particularly objectionable with a higher enlargement ratio, such that an enlarged picture appears to be non-optimum when seen at a shorter distance.
For improving the blurred impression, edge enhancement may be used in combination. The most routine method for edge enhancement is to find a waveform of an order-two differentiation and to add a moderate amount of the differentiated waveform to the original waveform. Although the blurred impression is thereby improved, pre-shooting or over-shooting is likely to be produced.
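The edge enhancement described above can be illustrated in one dimension. The sketch below assumes a discrete second difference as the order-two differentiation and a hypothetical `gain` parameter for the "moderate amount"; the differentiated waveform is added with a negative sign so that edges are steepened.

```python
import numpy as np

def enhance_edges_1d(signal, gain=0.5):
    """Add a scaled, sign-inverted second difference to the original waveform."""
    x = np.asarray(signal, dtype=float)
    padded = np.pad(x, 1, mode="edge")                 # replicate the end samples
    second_diff = padded[2:] - 2.0 * x + padded[:-2]   # x[i+1] - 2 x[i] + x[i-1]
    return x - gain * second_diff

step = np.array([0.0, 0.0, 0.0, 1.0, 1.0, 1.0])
sharpened = enhance_edges_1d(step, gain=0.5)
```

For this step edge the output dips below 0 just before the transition and rises above 1 just after it, which is the pre-shooting and over-shooting mentioned above.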
Recently, proposal has been made of a method of converting standard television signals into HD signals of the high-vision television grade. This technique is not simple interpolation. Specifically, HD and SD signals, previously prepared from the same source, are used as training data, and a database is produced with the HD and SD signals associated with each other. When the SD signals are inputted, the data pace is lowered by way of performing non-linear processing to output HD signals. However, this method is limited to the case of doubling the multiplication in the vertical and horizontal directions, such that the method is difficult to apply for higher multiplication.
Thus, as a method for suppressing blocked distortion even in enlargement by a larger factor and for producing as clear a picture as possible, a method known as MAP (maximum a posteriori) has been proposed (IEEE Transactions on Image Processing, vol. 3, no. 3, pp. 233 to 242, 1994). This method takes a picture enlarged by the nearest neighbor method as a starting picture and produces a target picture from it. Specifically, the values of respective pixels are updated so as to approach a natural picture, on the assumption that a natural picture is smooth everywhere. This processing is repeated several times. The degree of non-smoothness of the entire picture, termed smoothness, is used as an energy, and the picture is updated using the steepest descent method so that this energy is decreased. With this method, blocked distortion is not perceived, and the produced picture is clearer than a picture obtained with the cubic convolution method. However, this method has the drawback that the processing is slow because of the use of the steepest descent method.
In the MAP method, it is crucial how an energy function representing the above energy is to be determined. Representative of such energy functions is the Huber function, which is proportional to the square of the smoothness when the energy is small, that is, when the picture in its entirety is smooth, and proportional to the smoothness itself when the energy is large, that is, when the picture in its entirety is not smooth. The reason for using this form of function is that, since sharp edges are inherently contained in a picture, the picture must be prevented from becoming excessively smooth in order to protect the edges. That is, a state of high smoothness (non-smoothness) must not be treated as a state of excessively high energy, so that the energy is not lowered by the repetitive operation to the point of excessive smoothing.
However, with the Huber function, it is necessary to set a parameter that fixes the switching point between the power of two and the power of one. The optimum value of this parameter differs from picture to picture, and hence it is not advisable to fix it at a single value.
The known MAP method is applied to enlargement by an integer factor. Therefore, if enlargement by an arbitrary factor is desired, the MAP method can be combined with other methods. These other methods are known and hence are not explained here; only enlargement by an integer factor is explained.
First, enlargement of an input picture by a factor equal to q in the vertical and horizontal directions is considered. If a low resolution input picture of M×N pixels is Y, a high resolution output picture is X, a decimation matrix with a vertical and horizontal ratio equal to 1/q is T, and white Gaussian noise is n, the relationship of equation (1) holds:

$$Y = TX + n \qquad (1)$$

where T may be represented by the following equation (2):

$$T = \frac{1}{q^{2}}
\begin{pmatrix}
1 \cdots 1 & 0 \cdots 0 & \cdots & 0 \\
0 \cdots 0 & 1 \cdots 1 & 0 \cdots & 0 \\
\vdots & & \ddots & \vdots \\
0 & \cdots & \cdots\, 0 & 1 \cdots 1
\end{pmatrix}. \qquad (2)$$
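One way to read equations (1) and (2) is that each low resolution pixel is the average of a q×q block of high resolution pixels. The sketch below illustrates this reading under stated assumptions: the picture is a 2-D NumPy array whose sides are multiples of q, and the explicit matrix uses row-major vectorization, so the ones in each row are not contiguous as drawn in equation (2). The function names are hypothetical.

```python
import numpy as np

def decimate(X, q):
    """Apply T without forming it: average every q-by-q block of X."""
    M, N = X.shape[0] // q, X.shape[1] // q
    return X.reshape(M, q, N, q).mean(axis=(1, 3))

def decimation_matrix(M, N, q):
    """Dense T of shape (M*N, q*M*q*N), assuming row-major vectorization."""
    T = np.zeros((M * N, q * M * q * N))
    for i in range(M):
        for j in range(N):
            for di in range(q):
                for dj in range(q):
                    # Column index of high resolution pixel (i*q+di, j*q+dj).
                    col = (i * q + di) * (q * N) + (j * q + dj)
                    T[i * N + j, col] = 1.0 / q**2
    return T

X = np.arange(16.0).reshape(4, 4)
Y = decimate(X, 2)   # noise-free low resolution picture
```

Since T has fewer rows than columns, many different X give the same Y, which is why further suppositions about X are needed.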
Although it is desirable to find a high resolution output X from equation (1), infinitely many X satisfy it, so that X cannot be found algebraically. Therefore, the following suppositions are made:
    Supposition 1: A picture is assumed to be a Markov random field. That is, each pixel value depends only on the pixel values in its vicinity, without dependency upon the entire picture. This may be said to be a reasonable supposition for a natural picture.
    Supposition 2: The probability of a picture is given by the Gibbs density function shown by the following equation (3):

$$\Pr(X) = \frac{1}{C}\exp\!\left(-\frac{1}{\lambda}\sum_{v\in V} S_v(X)\right) \qquad (3)$$

where C is a constant for normalization and $S_v(X)$ is a function at a local point v in the picture representing a value of smoothness (degree of non-smoothness). V denotes the entire picture and λ is a temperature parameter in the Gibbs function. This parameter is a constant and has no particular meaning.
The supposition 2 postulates that the smaller the smoothness, the higher is the probability, that is that a natural picture is approximately smooth. Therefore, this supposition may also be said to be reasonable.
On the other hand, if the noise added in producing an input picture is thought to be Gaussian noise, the probability of the noise n may be represented by the following equation (4):

$$\Pr(n)\Big|_{n=Y-TX} = \frac{1}{(2\pi\sigma^{2})^{NM/2}}\exp\!\left(-\frac{\lVert Y-TX\rVert^{2}}{2\sigma^{2}}\right) \qquad (4)$$

where σ is the standard deviation.
From the above model, the input is assumed to be a low resolution picture Y, and an ideal high resolution picture $\hat{X}$ satisfying these suppositions is found. From the above suppositions, $\hat{X}$ maximizes Pr(X|Y). In general, Pr(X|Y) is represented by equation (5):

$$\Pr(X \mid Y) = \frac{\Pr(Y \mid X)\,\Pr(X)}{\Pr(Y)} \qquad (5)$$

where Y is the input picture and hence is known, so that Pr(Y)=1, while Pr(Y|X)=Pr(n). Therefore, Pr(X|Y) is represented by the following equation (6):

$$\Pr(X \mid Y) = \Pr(X)\,\Pr(n). \qquad (6)$$
That is, for finding $\hat{X}$, it suffices to maximize equation (6). It is obvious from equations (6), (3) and (4) that it is sufficient to minimize $\sum_{v\in V}\hat{S}_v(X)$, as may be seen from the following equation (7):

$$\sum_{v\in V}\hat{S}_v(X) = \sum_{v\in V} S_v(X) + \beta\,\lVert Y-TX\rVert^{2} \qquad (7)$$

where β is a constant determined by the trade-off between the smoothness $\sum_{v\in V} S_v(X)$ and the constraint $\lVert Y-TX\rVert$.
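The step from equations (3), (4) and (6) to equation (7) can be sketched by taking the negative logarithm of equation (6). This assumes that the exponent in equation (3) is $-\frac{1}{\lambda}\sum_{v\in V}S_v(X)$; the resulting identification of β is an inference from that assumption and is not stated in the text.

```latex
\begin{aligned}
-\ln\bigl[\Pr(X)\Pr(n)\bigr]
  &= \ln C + \frac{1}{\lambda}\sum_{v\in V} S_v(X)
     + \frac{NM}{2}\ln\bigl(2\pi\sigma^{2}\bigr)
     + \frac{\lVert Y-TX\rVert^{2}}{2\sigma^{2}} \\
-\lambda\,\ln\bigl[\Pr(X)\Pr(n)\bigr]
  &= \sum_{v\in V} S_v(X)
     + \frac{\lambda}{2\sigma^{2}}\,\lVert Y-TX\rVert^{2}
     + \text{const}
\end{aligned}
```

Dropping the terms that do not depend on X and writing $\beta = \lambda/(2\sigma^{2})$ gives the right side of equation (7), so maximizing equation (6) is equivalent to minimizing equation (7) under this reading.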
Now, the function $S_v(X)$, representing the smoothness, is defined. Since $S_v(X)$ is the local smoothness of a picture, the less smooth a picture, the larger the magnitude of the function must be. As a function that meets this condition, 3 vertically consecutive by 3 horizontally consecutive pixels, as shown in FIG. 1, are considered and $S_v(X)$ is defined as in the following equation (8):

$$S_v(X) = \sum_{k=0}^{3} h_k^{2}(X) \qquad (8)$$

where $h_k(X)$ is an order-two FIR filter represented by the following equation (9):

$$\begin{aligned}
h_0(X) &= X_{i+1,j} - 2X_{i,j} + X_{i-1,j} \\
h_1(X) &= X_{i+1,j-1} - 2X_{i,j} + X_{i-1,j+1} \\
h_2(X) &= X_{i,j-1} - 2X_{i,j} + X_{i,j+1} \\
h_3(X) &= X_{i-1,j-1} - 2X_{i,j} + X_{i+1,j+1}
\end{aligned} \qquad (9)$$
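The smoothness of equations (8) and (9) can be computed for every interior pixel at once. The sketch below assumes a 2-D NumPy array indexed as X[i, j] and takes h1 to be the anti-diagonal second difference $X_{i+1,j-1} - 2X_{i,j} + X_{i-1,j+1}$, which is an assumption about the intended filter; the function names are hypothetical.

```python
import numpy as np

def second_differences(X):
    """Return h0..h3 of equation (9) at every interior pixel (i, j)."""
    X = np.asarray(X, dtype=float)
    c = X[1:-1, 1:-1]                            # X[i, j]
    h0 = X[2:, 1:-1] - 2 * c + X[:-2, 1:-1]      # along i
    h1 = X[2:, :-2] - 2 * c + X[:-2, 2:]         # anti-diagonal (assumed form)
    h2 = X[1:-1, :-2] - 2 * c + X[1:-1, 2:]      # along j
    h3 = X[:-2, :-2] - 2 * c + X[2:, 2:]         # main diagonal
    return np.stack([h0, h1, h2, h3])

def smoothness_map(X):
    """S_v(X) of equation (8): sum of the squared filter outputs at each pixel."""
    return (second_differences(X) ** 2).sum(axis=0)
```

A picture whose values vary linearly gives zero everywhere, while an isolated bright pixel gives a large value at its position, which matches the requirement that the function grow as the picture becomes less smooth.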
If $\hat{S}_v(X)$ is defined in this manner, it is a scalar quantity, in contrast to the picture X, which is a vector quantity.
Therefore, equation (7) may be taken as an energy function determined by X. Next, equation (7), as the energy function, is minimized. To this end, the well-known steepest descent method may be used. Suppose that, by the m-th calculation, all pixel values are updated as shown by the following equation (10): $$X_{m+1} = X_m - \alpha_m D_m. \qquad (10)$$
In the steepest descent method, $D_m$ is the gradient of the energy function, shown by equation (11):

$$\nabla\!\left(\sum_{v\in V}\hat{S}_v(X)\right) = \nabla\!\left(\sum_{v\in V} S_v(X)\right) + \nabla\!\left(\beta\,\lVert Y-TX\rVert^{2}\right) \qquad (11)$$

so this gradient must be found.
On the other hand, when the right side of equation (10) is regarded as a function $z(\alpha_m)$ of the single variable $\alpha_m$, $\alpha_m$ is determined by finding the value of $\alpha_m$ which minimizes $\sum_{v\in V}\hat{S}_v(z(\alpha_m))$.
This processing is repeated until $\sum_{v\in V}\hat{S}_v(X_m)$ is substantially unchanged, to find the targeted $\hat{X}$.
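The iteration of equations (10) and (11) can be sketched on a one-dimensional signal, which keeps the matrices small. This is a simplified analogue under stated assumptions, not the two-dimensional procedure itself: the smoothness uses a single second difference, the energy is the quadratic form of equations (7) and (8), and the values of `beta`, `iters` and `tol` are arbitrary. For a quadratic energy the step size that minimizes the energy along the gradient has a closed form, which is used here in place of a numerical one-variable search.

```python
import numpy as np

def map_enlarge_1d(y, q, beta=10.0, iters=200, tol=1e-10):
    """Steepest descent on sum(h^2) + beta * ||y - T x||^2 for a 1-D signal."""
    y = np.asarray(y, dtype=float)
    n = y.size * q
    # T averages q consecutive samples (1-D analogue of equation (2)).
    T = np.kron(np.eye(y.size), np.full((1, q), 1.0 / q))
    # H applies x[i+1] - 2 x[i] + x[i-1] at every interior sample.
    H = np.zeros((n - 2, n))
    for i in range(n - 2):
        H[i, i:i + 3] = [1.0, -2.0, 1.0]
    A = 2.0 * (H.T @ H + beta * (T.T @ T))   # Hessian of the quadratic energy
    b = 2.0 * beta * (T.T @ y)

    def energy(x):
        return np.sum((H @ x) ** 2) + beta * np.sum((y - T @ x) ** 2)

    x = np.repeat(y, q)                      # nearest neighbor starting signal
    history = [energy(x)]
    for _ in range(iters):
        d = A @ x - b                        # gradient D_m, as in equation (11)
        denom = d @ (A @ d)
        if denom <= 0.0:                     # zero gradient: nothing left to do
            break
        alpha = (d @ d) / denom              # exact minimizer along -d
        x = x - alpha * d                    # update of equation (10)
        history.append(energy(x))
        if history[-2] - history[-1] < tol:  # energy substantially unchanged
            break
    return x, history

x_hat, history = map_enlarge_1d([0.0, 0.0, 1.0, 1.0], q=2)
```

The energy recorded in `history` never increases from one iteration to the next, and the loop stops once the decrease falls below `tol`. In the non-quadratic case the step size generally has no closed form and must be searched for at every iteration.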
Meanwhile, if this processing is repeated, the entire picture becomes smooth and approaches a natural picture. However, if a picture has abundant edges, these edges are also smoothed, giving the impression of a blurred picture as an undesirable secondary effect. Therefore, the Huber function, shown for a scalar argument x by the following equation (12):

$$\rho_T(x) = \begin{cases} x^{2}, & |x| \le T \\ T^{2} + 2T\,(|x| - T), & |x| > T \end{cases} \qquad (12)$$

is used. FIG. 2 shows the function in graphic form.
Using this function, $S_v(X)$ is re-defined as in the following equation (13):

$$S_v(X) = \sum_{k=0}^{3} \rho_T\bigl(h_k(X)\bigr). \qquad (13)$$
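A minimal sketch of the Huber function of equation (12) and its derivative follows, assuming NumPy arrays of filter outputs as input; the function names are illustrative.

```python
import numpy as np

def huber(x, T):
    """rho_T(x): quadratic for |x| <= T, linear with matching slope beyond T."""
    x = np.asarray(x, dtype=float)
    a = np.abs(x)
    return np.where(a <= T, x ** 2, T ** 2 + 2.0 * T * (a - T))

def huber_grad(x, T):
    """Derivative of rho_T: 2x inside the threshold, constant 2T*sign(x) outside."""
    x = np.asarray(x, dtype=float)
    return np.where(np.abs(x) <= T, 2.0 * x, 2.0 * T * np.sign(x))

values = huber([0.0, 1.0, 2.0, 3.0], T=2.0)
slopes = huber_grad([0.0, 1.0, 2.0, 3.0], T=2.0)
```

Beyond the threshold the derivative stops growing, so large filter outputs, which typically correspond to edges, are pulled toward smoothness less strongly than under the purely quadratic form.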
By so doing, the gradient in case of a large magnitude of Sv(X) becomes smaller than in the equation 7, so that the degree of decrease of the value of the equation 7 by the steepest descent method is decreased. That is, the smoothness (degree of non-smoothness) is retained to prevent the edges in an edgy portion from becoming blurred to more than a necessary extent.
The foregoing is the explanation of the high definition technique by MAP so far known in the art.
So, in the MAP method, it is crucial how the energy function representing the above-mentioned energy is to be determined. To this end, the Huber function is preferentially used. This function is set so that it is proportional to the square of the smoothness when the energy is small, that is, when the picture in its entirety is smooth, and so that it is proportional to the smoothness when the energy is large, that is, when the picture in its entirety is not smooth. The reason for doing this is that, since a picture inherently contains edges, excess smoothing needs to be prevented to protect the edges. Specifically, it is necessary to prevent the high smoothness state, that is, the non-smooth state, from being treated as an excessively high energy state, so that the energy is not lowered by repetitive processing to the point of excess smoothness.
With the conventional MAP method, the picture energy is expressed by the Huber function, and this energy is decreased by the steepest descent method. However, the calculation for finding α by the steepest descent method is expensive. Moreover, it is necessary with the Huber function to define a parameter T determining the switching point between a power of 2 and a power of 1. The optimum value of this parameter changes depending on the particular portion of a picture, and hence setting it to a single value cannot be said to be optimum.