1. Field of the Invention
The present invention relates to a video encoder, and more particularly to a method for controlling quantization. The present invention proposes a method of performing adaptive quantization control in consideration of the characteristics of the input picture and the capacity of a buffer in an MPEG2 encoder.
2. Background of the Related Art
Generally, digital TV provides superior picture quality on the TV receiver for viewers. As a result, a growing interest in digital TV broadcasting has prompted many efforts to compress video data for transmission and reproduction. Typically, the Moving Picture Experts Group 2 (MPEG2) algorithm is used to compress video signals. Having a high compression rate ranging from approximately 1/40 to 1/60, the MPEG2 algorithm enables high-quality digital data transmission through general broadcasting channels to viewers at home. The MPEG encoder classifies a picture into an Intra (I) frame, a Predictive (P) frame or a Bidirectional (B) frame, and a decoder decodes the picture based on the type of the frame. Each frame is also divided into macro block (MB) units of 16×16 pixels.
For purpose of the following illustration, some terms will first be defined as follows. The parameter for quantization of DCT coefficients in a MB is called a "quantizer_scale", the parameter determined by the buffer in a bit rate control of TM5 (the name of a test model used in establishment of MPEG2 standards) is called an "offset Qj", and the parameter obtained by multiplying the offset Qj with an activity measure in a macro block is called a quantization parameter "mquantj".
The quantization parameter is clipped to an integer ranging from 1 to 31 and is sent to a decoder as 5-bit header information. The quantizer scale for quantizing the DCT coefficients is determined by the q_scale_type in MPEG2. In particular, the relationship quantizer_scale=g(mquantj) is satisfied and the function g(·) is determined by the q_scale_type. There are two types of q_scale_type in MPEG2. If the q_scale_type is '0', g(·) is a linear function; otherwise, if the q_scale_type is '1', g(·) is a non-linear function.
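The mapping quantizer_scale=g(mquantj) can be illustrated with a short Python sketch. The function names are hypothetical. The linear factor of 2 and the non-linear table are not given in this text; they are reproduced from the MPEG2 video specification as recalled here and should be checked against the standard. Rounding to the nearest integer before clipping is likewise an assumption.

```python
# Sketch of quantizer_scale = g(mquant) for the two q_scale_type values.
# ASSUMPTION: the factor 2 and the table below come from the MPEG2 video
# specification (quantiser_scale_code mapping), not from this document.
NON_LINEAR_SCALE = [
    0,  # index 0 is not a valid code and is never returned
    1, 2, 3, 4, 5, 6, 7, 8,
    10, 12, 14, 16, 18, 20, 22, 24,
    28, 32, 36, 40, 44, 48, 52, 56,
    64, 72, 80, 88, 96, 104, 112,
]


def clip_mquant(value):
    """Clip a quantization parameter to an integer in the range 1..31."""
    # ASSUMPTION: round to nearest before clipping (Python rounds halves to even).
    return max(1, min(31, int(round(value))))


def quantizer_scale(mquant, q_scale_type):
    """Return g(mquant): linear for q_scale_type 0, table lookup for 1."""
    code = clip_mquant(mquant)
    if q_scale_type == 0:
        return 2 * code
    return NON_LINEAR_SCALE[code]
```

The non-linear table gives finer steps at small codes and a wider range at large codes, which is why the choice of q_scale_type changes how a given mquantj affects the DCT coefficients.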
A method for performing a bit rate control proposed in the MPEG2 TM5 will be described briefly. FIG. 1 is a block diagram of a video encoder in the background art comprising an image memory 101 storing the original image in units of field or frame; a subtractor 102 obtaining a residual signal between the original image and a reconstructed image; and a DCT section 103 performing a DCT conversion of the residual signal. The value obtained by the DCT conversion is quantized at a quantizer 104 and the quantized signal is inverse-quantized at an inverse quantizer 105 prior to an inverse DCT (IDCT) conversion through an IDCT section 106. The IDCT converted signal is added to a motion-compensated signal at an adder 107 and stored in a memory 108. The video data stored in the memory 108 is subjected to motion estimation and compensation at a motion estimation/compensation (E/C) section 109 and sent to the subtractor 102 and the adder 107.
A complexity calculator 110 calculates the spatial activity of the image stored in the image memory 101. The spatial activity is generally a measure of the picture complexity or level of detail of a macro block and will be further explained below. An encoder controller 111 is the bit rate controller and controls the quantization rate of the quantizer 104 in consideration of the calculated spatial activity and the capacity of a buffer 115. The VLC 112 applies variable length coding to the quantized data and the motion vector (MV) coding section codes the motion vector from the motion E/C section 109. The VLC encoded data and the encoded MV are input to the buffer 115 and are transmitted to a decoder in the form of bit stream data.
Particularly, the quantization is coarse for high spatial activity and less coarse for lower spatial activity. Thus, the spatial activity is utilized in the bit rate control for quantization. Also, a defined number of bits is allocated to a group of pictures (GOP) according to a transfer bit rate, and the bits are allocated to each picture according to the complexity of each picture type I, P, and B. The global complexity X of each picture is given by Equation 1 below,
\[ X_i = S_i Q_i, \quad X_p = S_p Q_p, \quad X_b = S_b Q_b \qquad \text{[Equation 1]} \]
where Si, Sp and Sb are bits generated after the previous I, P and B pictures are encoded, and Qi, Qp and Qb are averages of the quantization parameters mquantj used in all macro blocks.
The complexity of the previous I, P, and B pictures is used to obtain the bit allocation for the current picture of the same type and can be expressed by Equation 2 below.
\[ T_i = \max\left\{ \frac{R}{1 + \dfrac{N_p X_p}{X_i K_p} + \dfrac{N_b X_b}{X_i K_b}},\ \frac{\text{bit\_rate}}{8 \times \text{picture\_rate}} \right\} \]
\[ T_p = \max\left\{ \frac{R}{N_p + \dfrac{N_b K_p X_b}{X_p K_b}},\ \frac{\text{bit\_rate}}{8 \times \text{picture\_rate}} \right\} \]
\[ T_b = \max\left\{ \frac{R}{N_b + \dfrac{N_p K_b X_p}{X_b K_p}},\ \frac{\text{bit\_rate}}{8 \times \text{picture\_rate}} \right\} \qquad \text{[Equation 2]} \]
In Equation 2, Kp and Kb are constants depending on the quantization matrix, typically having values 1.0 and 1.4 in the TM5, respectively. R is the number of bits, out of those allocated to the GOP, that remain after the previous picture is encoded. The bit_rate is the channel transfer rate (bits/sec) and the picture_rate is the number of pictures decoded per second.
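Equations 1 and 2 can be sketched in Python as follows. The function and argument names are hypothetical; Np and Nb denote the numbers of P and B pictures still to be encoded in the current GOP, and the defaults for Kp and Kb are the typical TM5 values stated above. The sketch assumes non-zero complexities.

```python
def global_complexity(bits_generated, avg_mquant):
    """Equation 1: X = S * Q for the previously coded picture of one type."""
    return bits_generated * avg_mquant


def target_bits(R, Xi, Xp, Xb, Np, Nb, bit_rate, picture_rate, Kp=1.0, Kb=1.4):
    """Equation 2: target bit counts (Ti, Tp, Tb) for the next picture.

    R is the number of bits still available to the GOP. Each target is
    bounded below by bit_rate / (8 * picture_rate).
    """
    floor = bit_rate / (8.0 * picture_rate)
    Ti = max(R / (1 + Np * Xp / (Xi * Kp) + Nb * Xb / (Xi * Kb)), floor)
    Tp = max(R / (Np + Nb * Kp * Xb / (Kb * Xp)), floor)
    Tb = max(R / (Nb + Np * Kb * Xp / (Kp * Xb)), floor)
    return Ti, Tp, Tb


if __name__ == "__main__":
    # Illustrative numbers only; they are not taken from the specification.
    Xi = global_complexity(400_000, 8.0)
    Xp = global_complexity(150_000, 10.0)
    Xb = global_complexity(60_000, 12.0)
    print(target_bits(2_500_000, Xi, Xp, Xb, Np=4, Nb=10,
                      bit_rate=6_000_000, picture_rate=30))
```

Only the target matching the type of the picture about to be encoded is used; the lower bound keeps a picture from being starved when R is nearly exhausted.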
The value of R is adjusted when the bits are allocated to the pictures in the next GOP as in Equation 3,
\[ R \leftarrow G + R \qquad \text{[Equation 3]} \]
where G=bit_rate×N/picture_rate and N is the size of the GOP. Np and Nb are the numbers of P and B pictures remaining to be encoded within the current GOP.
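As a minimal sketch of Equation 3, the GOP budget and the carry-over of R can be written as below. The function names are hypothetical, and the example values in the closing note (a GOP of 15 pictures at 30 pictures per second) are assumptions chosen only for illustration.

```python
def gop_bit_budget(bit_rate, gop_size, picture_rate):
    """G = bit_rate * N / picture_rate: bits nominally available to one GOP."""
    return bit_rate * gop_size / picture_rate


def start_new_gop(remaining_bits, bit_rate, gop_size, picture_rate):
    """Equation 3: R <- G + R.

    remaining_bits is the R left over from the previous GOP. It may be
    negative when that GOP used more bits than it was allocated.
    """
    return gop_bit_budget(bit_rate, gop_size, picture_rate) + remaining_bits
```

For example, at 6 Mbits/sec with the assumed GOP of 15 pictures at 30 pictures per second, G is 3,000,000 bits, and a previous overshoot of 20,000 bits leaves R at 2,980,000 bits for the new GOP.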
The bit rate is controlled to encode the current picture at a rate adequate for the number of bits allocated, which is dependent upon the complexity of the picture. Also, assuming that a virtual buffer is assigned to each picture type, the quantization parameters are regulated according to the state of the buffer. The state of each buffer before macro block j is encoded may be expressed by Equation 4.
\[ d_j^i = d_0^i + B_{j-1} - \frac{T_i \times (j-1)}{\text{MB\_cnt}} \]
\[ d_j^p = d_0^p + B_{j-1} - \frac{T_p \times (j-1)}{\text{MB\_cnt}} \]
\[ d_j^b = d_0^b + B_{j-1} - \frac{T_b \times (j-1)}{\text{MB\_cnt}} \qquad \text{[Equation 4]} \]
The values di0, dp0 and db0 are the initial buffer values, which are actually the bit rate control difference of a picture from a previous picture of the same type. In other words, the initial buffer values are the differences between the number of bits allocated in coding the picture and the number of bits generated in coding a previous picture of the same type. MB_cnt is the total number of macro blocks for the image.
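Equation 4 has the same form for all three picture types, so one helper suffices in a sketch. The function and argument names are hypothetical. B_{j-1} is not defined in this text; it is taken here, following TM5, to be the number of bits generated by macro blocks 1 through j-1 of the current picture.

```python
def buffer_fullness(d0, bits_so_far, target_bits, j, mb_cnt):
    """Equation 4: virtual buffer state before macro block j (1-based).

    d0          -- initial buffer value for this picture type
    bits_so_far -- B_{j-1}; ASSUMPTION: bits produced by macro blocks 1..j-1
    target_bits -- Ti, Tp or Tb for the current picture
    mb_cnt      -- total number of macro blocks in the picture
    """
    return d0 + bits_so_far - target_bits * (j - 1) / mb_cnt
```

The subtracted term is the share of the target that would have been spent by macro block j-1 if bits were consumed uniformly, so the result grows when the encoder runs ahead of its budget and shrinks when it runs behind. A 720×480 picture, for instance, has 45×30 = 1350 macro blocks.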
An offset Qj of the jth macro block is calculated by the following expression using the status information of the buffer when coding the (j−1)th macro block,
\[ Q_j = \frac{31 \times d_j}{r} \qquad \text{[Equation 5]} \]
where dj is the buffer state of Equation 4 for the current picture type and r=2×bit_rate/picture_rate.
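A small sketch of Equation 5, with hypothetical function names. The numbers in the closing note are illustrative; only the 6 Mbits/sec rate appears elsewhere in this text, and the 30 pictures per second is an assumption.

```python
def reaction_parameter(bit_rate, picture_rate):
    """r = 2 * bit_rate / picture_rate."""
    return 2.0 * bit_rate / picture_rate


def offset_q(d_j, r):
    """Equation 5: Q_j = 31 * d_j / r, before activity scaling and clipping."""
    return 31.0 * d_j / r
```

At 6 Mbits/sec and an assumed 30 pictures per second, r is 400,000, so a buffer state of 100,000 bits gives Qj = 7.75. A fuller virtual buffer yields a larger offset and therefore coarser quantization.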
Adaptive quantization is a method of changing the offset Qj according to the complexity of the current macro block, used to enhance the subjective quality of the picture. Particularly, Qj is multiplied by a factor N_actj utilizing an actj value indicating the complexity of macro blocks. The factor N_actj may be expressed by Equation 6 below,
\[ N\_act_j = \frac{2 \times act_j + avg\_act}{act_j + 2 \times avg\_act} \qquad \text{[Equation 6]} \]
where actj represents the minimum of the variances in the subblocks of the macro block and avg_act is the average complexity of the current picture. In Equation 6, actj is smaller than the average complexity of the current picture for portions to which human sight is sensitive, and accordingly the N_actj factor is also small. However, the N_actj factor is large for complex portions to which sight is less sensitive, because actj is larger than the average complexity. The quantization parameter mquantj is calculated by Equation 7 and is clipped to an integer ranging from 1 to 31.
\[ mquant_j = Q_j \times N\_act_j \qquad \text{[Equation 7]} \]
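Equations 6 and 7 can be sketched as follows. The function names are hypothetical, sub-blocks are represented as flat lists of pixel values, and rounding to the nearest integer before clipping is an assumption, since the text only states that the result is clipped to 1..31.

```python
def variance(pixels):
    """Population variance of a flat list of pixel values."""
    n = len(pixels)
    mean = sum(pixels) / n
    return sum((p - mean) ** 2 for p in pixels) / n


def activity(subblocks):
    """act_j: minimum variance over the sub-blocks of one macro block."""
    return min(variance(block) for block in subblocks)


def normalized_activity(act_j, avg_act):
    """Equation 6. Undefined when act_j and avg_act are both zero."""
    return (2.0 * act_j + avg_act) / (act_j + 2.0 * avg_act)


def mquant(q_j, n_act_j):
    """Equation 7, followed by clipping to an integer in 1..31."""
    # ASSUMPTION: round to nearest before clipping.
    return max(1, min(31, int(round(q_j * n_act_j))))
```

With non-negative inputs, Equation 6 confines N_actj to the range 0.5 to 2: it approaches 0.5 as actj goes to zero, equals 1 when actj equals avg_act, and approaches 2 as actj grows large. Flat macro blocks are therefore quantized up to twice as finely as the buffer state alone would dictate, and busy ones up to twice as coarsely.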
The quantization control in video encoding significantly affects the quality of the image. However, the MPEG standard controls quantization using only the variance of macro blocks of the current image, and does not have effective measures to reduce blocking effects or to minimize the number of bits by taking into consideration the characteristics of the picture.
Noise in a recovered image is an inevitable problem which occurs during lossy coding of moving picture data. The blocking effect is the most significant noise appearing in the recovered image. Specifically, the visible discontinuity at the edges between macro blocks caused by block coding is known as the blocking effect. Because coding of blocks is performed independently, the correlation between the edges of adjacent blocks is not taken into consideration by the MPEG standard method. As a result, discontinuities across block boundaries of the recovered image occur and are noticeable as noise.
The extent of the blocking effect is dependent upon the type of coded picture. In an I picture, for example, the blocking effect is not significantly visible because the DC coefficients obtained by the DCT conversion are, in view of their significance, encoded separately and precisely. In practice, when 720×480 sequences of the original image are coded at a bit rate of 6 Mbits/sec and decoded, almost no blocking effect is shown in the I frame.
The blocking effect chiefly appears in the P or B pictures, particularly in an area where motion compensation is not accurately achieved. Referring to FIGS. 2(a) to (e), a blocking effect is shown in the block boundaries of the reconstructed frame as shown in part (e) when the error signal is coarsely quantized. This phenomenon especially occurs at the boundary of objects making large motion. For an object with large motion, the spatial activity would also be high around the boundary of the object, resulting in a coarse quantization of that portion. As a result, the residual signals for compensating the error of the prediction frame almost disappear, such that the blocking effect in the prediction frame still appears as shown in FIG. 2(b).
The original image may be accurately recovered from the prediction information if the motion compensation is well achieved, and although there may be significant damage on the residual information, the recovered image will have little blocking effect. Thus, problems may occur when the quantization of the error signal is controlled by considering only the spatial activity in the current original image as in the MPEG2 TM5 without considering the motion.
Moreover, the blocking effect is not visible to the naked eye in areas of the recovered image which are too dark or too bright. It is therefore efficient to quantize such portions more coarsely and use the bits saved there for other portions. However, the MPEG TM5 does not take this effect into consideration either, and performs quantization regardless of whether the image area is dark or bright.
Accordingly, an object of the present invention is to solve at least the problems and disadvantages of the related art.
An object of the present invention is to provide an adaptive quantization control method with improved image quality.
Another object of the present invention is to provide an efficient bit rate control for the quantization.
Additional advantages, objects, and features of the invention will be set forth in part in the description which follows and in part will become apparent to those having ordinary skill in the art upon examination of the following or may be learned from practice of the invention. The objects and advantages of the invention may be realized and attained as particularly pointed out in the appended claims.
To achieve the objects and in accordance with the purposes of the invention, as embodied and broadly described herein, an adaptive quantization method decomposes one image into a plurality of macro blocks, encodes the image in block units, and controls quantization by using the complexity of the blocks. The adaptive quantization includes obtaining a spatial activity by comparing the complexity of the current block to be encoded to that of all the blocks; obtaining a slope activity by comparing the degree of motion in the current block to that of all the blocks; and controlling the quantization by using the spatial activity if the spatial activity is below a reference value and using the slope activity if the spatial activity exceeds the reference value.
FIG. 3 is a diagram comparing the TM5 method to the present invention, in which the portion above the horizontal line shows the TM5 method and the portion below the horizontal line shows the method of the present invention. The vertical line indicates the different levels of spatial activity in the TM5. Particularly, if the spatial activity is below the threshold S0, the TM5 method would be adequate. However, if the spatial activity is higher than the threshold value S0, a slope activity is additionally utilized for a more precise quantization. The slope activity is used to search for and determine an area where the blocking effect may appear. Accordingly, the inventive concept of the present invention is to perform fine quantization of a macro block having a high possibility of blocking effect at the block boundary. The slope activity will be described below.
Moreover, according to the present invention, within the portion determined to be quantized finely using the slope activity, a texture region in which the blocking effect would not visibly appear to the naked eye is extracted. This texture region is quantized coarsely because a precise quantization is not necessary. Furthermore, after a reduction of the blocking effect, portions that are too bright or too dark are quantized coarsely in consideration of the luminance masking effect.
Therefore, as shown in FIG. 3, if the spatial activity is below a first threshold S0, the TM5 method is directly used to control quantization. If the spatial activity is between the threshold values S0 and S1, spatial activity and slope activity are taken into consideration to control quantization. If the spatial activity is above the threshold value S1, the slope activity and texture extraction are taken into consideration to control quantization. After these considerations, the brightness masking is further taken into consideration to control the quantization.
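The threshold logic of FIG. 3 can be summarized in a short sketch. The function name and the string labels are hypothetical, and the handling of values exactly equal to S0 or S1 is an assumption because the text does not specify it. Brightness masking is read here as applying in every case, after the other considerations.

```python
def quantization_considerations(spatial_activity, s0, s1):
    """Return the measures used to control quantization for one macro block.

    ASSUMPTION: a value equal to s0 or s1 falls in the middle range.
    """
    if spatial_activity < s0:
        measures = ["tm5"]
    elif spatial_activity <= s1:
        measures = ["spatial_activity", "slope_activity"]
    else:
        measures = ["slope_activity", "texture_extraction"]
    # Brightness masking is considered last, whichever range applies.
    measures.append("brightness_masking")
    return measures
```

This only selects which measures apply; how each measure modifies the quantization parameter is the subject of the detailed description.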