1. Field of the Invention
The present invention relates in general to video communication, and more particularly to a bit allocation method for controlling the data transmission rate through a channel in an MPEG2 video encoder, which implements a video compression method for reducing the bandwidth required for data transmission.
2. Description of the Prior Art
It is anticipated that video information will be a very important information source in the future information society, and that the demand for communications using digital video compression and storage media will increase explosively. In particular, the MPEG2 video algorithm can be widely applied to video compression methods for digital video communication media, storage media, multimedia and high definition television (HDTV).
Transmission rate control is one of the most important MPEG2-related techniques because it directly affects the picture quality of the restored video signal.
Generally, expressing color motion video data digitally requires a very large amount of data. For this reason, transmitting or storing digital color motion video data directly wastes the capacity of the transmission channel or the storage device. Therefore, video compression and expansion methods are used to reduce the amount of data to be transmitted or stored.
Such a video compression method exploits the high redundancy of video data. Namely, the video compression method reduces the amount of data to be transmitted or stored by removing the redundancy of the video data.
In JTC1/SC29/WG11 as an affiliated organization of ISO/IEC, the standard MPEG1 of a motion video compression coding method for a storage medium in the 1.15 Mbps class was established in the year 1991 by a Moving Picture Experts Group (MPEG). Also, the standard MPEG2 of a motion video compression coding method for communication/storage/broadcasting in the 4 Mbps or more class is in progress as of the year 1993.
The MPEG2 in progress is a DCT/DPCM composite coding algorithm which utilizes the discrete cosine transform (DCT) and motion compensation to remove the spatial and temporal redundancies of an input video signal, respectively. The MPEG2 also utilizes a variable length code (VLC) to remove the statistical redundancy of the input video signal. For this reason, in a video encoder performing the video compression, the amount of generated data varies greatly according to the characteristics of the input video signal.
For the purpose of transmitting the variable output data from the video encoder through a channel of a fixed bandwidth, buffers are required in both the encoder and the decoder. In the encoder, the data to be transmitted through the channel must be prevented from overflowing the buffer or being exhausted therein. Also, the amount of data must be controlled to match the capacity of the transmission channel; this is called transmission rate control.
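The buffer constraint described above can be illustrated with a minimal sketch. It assumes a simplified model in which the encoder writes a variable number of bits per picture period and the channel removes a constant number of bits per period; the function name, the parameters and the clamping behaviour are illustrative assumptions, not part of any encoder described here.

```python
def simulate_encoder_buffer(bits_per_picture, channel_bits_per_picture, buffer_size):
    """Track encoder buffer fullness under a simplified, assumed model.

    Each picture period the encoder adds a variable number of bits and the
    fixed-rate channel drains a constant number of bits.
    Returns the final fullness and a list of (picture_index, event) tuples.
    """
    fullness = 0
    events = []
    for index, produced in enumerate(bits_per_picture):
        fullness += produced
        if fullness > buffer_size:
            events.append((index, "overflow"))
            fullness = buffer_size      # bits beyond the buffer size are lost
        fullness -= channel_bits_per_picture
        if fullness < 0:
            events.append((index, "underflow"))
            fullness = 0                # the channel had nothing left to send
    return fullness, events
```

Under this model, a picture that produces far more bits than the channel drains triggers an overflow event, and a run of very small pictures triggers an underflow event; rate control exists to avoid both.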
The transmission rate control is performed by adjusting the step size of a quantizer. Because quantization is a lossy coding process, the picture quality of the decoded video signal from the decoder depends greatly on the transmission rate control.
In other words, when the step size of the quantizer is made large, the amount of data transmitted to the buffer is reduced, but the accuracy of the quantized data is degraded. For this reason, the picture quality of the decoded video signal is degraded as compared with that of the original video signal. Conversely, when the step size of the quantizer is made small, the accuracy of the quantized data is increased, although the amount of data transmitted to the buffer is also increased. In this case, the picture quality of the decoded video signal is substantially the same as that of the original video signal.
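The trade-off between step size, data amount and accuracy can be sketched as follows. This assumes a plain uniform quantizer with rounding to the nearest level, which is simpler than the matrix-weighted quantization used in MPEG2; the function names and the sample coefficients are hypothetical.

```python
def quantize(coefficients, step):
    """Uniform quantization (assumed model): nearest integer level per coefficient."""
    return [round(c / step) for c in coefficients]


def dequantize(levels, step):
    """Inverse quantization: scale each level back by the step size."""
    return [level * step for level in levels]


def nonzero_count(coefficients, step):
    """Number of nonzero levels, a rough proxy for the amount of coded data."""
    return sum(1 for level in quantize(coefficients, step) if level != 0)


def max_abs_error(coefficients, step):
    """Largest reconstruction error introduced by quantizing with this step."""
    restored = dequantize(quantize(coefficients, step), step)
    return max(abs(c - r) for c, r in zip(coefficients, restored))


# Hypothetical DCT coefficients of one block, largest first.
sample = [121.0, -37.0, 13.0, 5.0, -3.0, 1.0]
```

With these sample values, a step size of 16 leaves fewer nonzero levels than a step size of 4 but produces a larger reconstruction error, which is the behaviour the paragraph above describes.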
The MPEG standard does not specify the transmission rate control because it defines only the syntax of the bit stream input to the decoding process. Therefore, the transmission rate control is part of the know-how of the encoder, whose aim is to maintain the maximum picture quality at a given bandwidth.
The MPEG2 algorithm has three coding modes, an intra mode (referred to hereinafter as I mode), a predicted mode (referred to hereinafter as P mode) and a bidirectionally predicted mode (referred to hereinafter as B mode), which will hereinafter be described in detail with reference to FIG. 1 which is a block diagram of an MPEG2 video encoder.
The I mode performs the coding using the video signal itself. The I mode is performed periodically to allow random access to a coded bit stream. Also, the I mode is required to prevent a degradation in the picture quality due to an error of a storage medium. A video signal of the I mode is used as a reference for video motion prediction in the P and B modes.
The P mode is to perform the coding by compensating for the motion from a previous video signal of the I or P mode.
The B mode is to perform the coding by compensating for the motion from the previous video signal of the I or P mode or the subsequent video signal of the I or P mode.
The video coding method in the I mode will hereinafter be described in detail.
In the video encoder of FIG. 1, a DCT unit 11 partitions the pixels of a video frame into N×N blocks and performs the DCT on every block to remove the spatial redundancy of the video data. A quantizer 12 quantizes the DCT coefficients from the DCT unit 11 using a desired step size. The quantized DCT coefficients from the quantizer 12 are converted into a one-dimensional array by zig-zag scanning. The quantized DCT coefficients of the one-dimensional array are converted into a sequence of 0s and 1s by a variable length coder (VLC) and then transmitted through the buffer to the transmission channel.
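The zig-zag conversion of a quantized N×N block into a one-dimensional array can be sketched as below. This shows only the conventional zig-zag order, traversing anti-diagonals in alternating directions; the function names are illustrative, and the DCT and VLC stages are left out.

```python
def zigzag_order(n):
    """Return the (row, col) positions of an n x n block in zig-zag order."""
    order = []
    for s in range(2 * n - 1):              # s = row + col selects one anti-diagonal
        rows = range(max(0, s - n + 1), min(s, n - 1) + 1)
        if s % 2 == 0:
            rows = reversed(rows)           # even diagonals run from bottom-left upward
        order.extend((r, s - r) for r in rows)
    return order


def zigzag_scan(block):
    """Flatten a square block (list of rows) into a zig-zag ordered list."""
    return [block[r][c] for r, c in zigzag_order(len(block))]
```

Because the scan visits low-frequency coefficients first, the nonzero values of a typical quantized block cluster at the start of the array and the trailing zeros can be coded compactly.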
Before the conversion into the one-dimensional array, the quantized DCT coefficients are decoded by an inverse quantizer 13 and an inverse DCT unit 14 and then stored into a first frame memory 15. At this time, the video data stored in the first frame memory 15 has a quantization error because it is reproduced through the quantization and the inverse quantization.
The video coding method in the P mode reduces the bit rate by removing the temporal redundancy of the video information. In the P mode, the DCT, inverse DCT, quantization and inverse quantization operations are the same as those in the I mode. The motion estimation and the motion compensation are performed in units of macro blocks of 16×16 pixels. A motion estimator 18 compares the video data of the input block with the unquantized video data of a previous frame held in a second frame memory 17. As a result of the comparison, the motion estimator 18 finds a motion vector that locates, in the second frame memory 17, the data most similar to the video data of the input block. A motion compensator 16 extracts the video data block corresponding to the motion vector from a previous video frame stored in the first frame memory 15. Then, the extracted video data block is subtracted from the input block data and the resultant data is passed through the DCT unit 11 and the quantizer 12 to the transmission channel in the above-mentioned manner.
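The motion estimation step can be illustrated with a minimal block-matching sketch. The text does not state the matching criterion or the search strategy, so the sum of absolute differences (SAD) and the exhaustive search used here are assumptions, as are all function and parameter names.

```python
def sad(current_block, reference, top, left):
    """Sum of absolute differences between a block and a reference-frame region."""
    size = len(current_block)
    return sum(
        abs(current_block[y][x] - reference[top + y][left + x])
        for y in range(size)
        for x in range(size)
    )


def full_search(current_block, reference, block_top, block_left, search_range):
    """Exhaustive block matching (assumed strategy).

    Returns ((dy, dx), cost) for the displacement with the lowest SAD.
    """
    size = len(current_block)
    height, width = len(reference), len(reference[0])
    best_vector, best_cost = None, None
    for dy in range(-search_range, search_range + 1):
        for dx in range(-search_range, search_range + 1):
            top, left = block_top + dy, block_left + dx
            if top < 0 or left < 0 or top + size > height or left + size > width:
                continue                    # candidate lies outside the frame
            cost = sad(current_block, reference, top, left)
            if best_cost is None or cost < best_cost:
                best_vector, best_cost = (dy, dx), cost
    return best_vector, best_cost
```

The block that the returned vector points to would then be subtracted from the input block, and only the difference would be passed on to the DCT and the quantizer.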
The video data and the motion vector transmitted in the P mode are combined with the previous video data by the decoder, resulting in production of a full picture.
The video coding method in the B mode is a compression method of reducing a prediction error by compensating for the motion using the previous and subsequent reference frames.
The operation of the encoder associated with the I, P and B modes is controlled by a mode selector 19.
The I, P and B-coded video data are combined to form a group of pictures (GOP) which can be directly accessed for decoding with no previous video information. The first coded video data of a GOP must be I-coded video data, which usually marks the boundary between GOPs. It is preferred that the GOP has a small number of pictures so that it can support random access, a fast forward operation, and fast and normal reverse operations.
The present invention relates to a method of allocating a bit amount properly to the I, P and B-coded video data forming the GOP. In accordance with the present invention, a scene variation is sensed and the bit amount of the I and P-coded video data is adjusted in accordance with the sensed result, so that the picture quality of the decoded video data can be enhanced.
A conventional bit allocation method, by contrast, estimates in advance the bit amount that the coding will generate, in consideration of the input video characteristics, the coding mode, the transmission capacity and the required picture quality.
FIG. 2 is a view illustrating conventional coding modes for an input picture string in the MPEG2 video encoder. Three P mode pictures are inserted between two I mode pictures, and two B mode pictures are inserted between I and P mode pictures or between two P mode pictures. As a result, 12 pictures constitute one group. Each of the arrows indicates an operational relation between the respective mode pictures.
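The picture arrangement described for FIG. 2 can be written out with a short sketch. It lists the coding modes in display order for a group of 12 pictures with two B mode pictures between successive I or P mode pictures; the function and parameter names are illustrative.

```python
def gop_picture_types(gop_length=12, anchor_spacing=3):
    """Coding mode of each picture in one group, in display order.

    anchor_spacing is the distance between successive I or P mode pictures,
    so anchor_spacing - 1 B mode pictures sit between them.
    """
    types = []
    for index in range(gop_length):
        if index == 0:
            types.append("I")               # every group starts with an I picture
        elif index % anchor_spacing == 0:
            types.append("P")
        else:
            types.append("B")
    return types
```

With the default arguments the sequence is I B B P B B P B B P B B, that is, one I mode picture, three P mode pictures and eight B mode pictures per group.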
Now, a bit allocation method for the transmission rate control in a conventional MPEG2 test model 5 (TM5) with the construction as shown in FIG. 2 will be described in detail.
In the MPEG2 TM5, the bit amount is allocated differently according to the coding mode. Generally, in order to obtain the same picture quality, the largest bit amount must be allocated to the I mode pictures, followed by the P mode pictures and then the B mode pictures.
First, complexities $X_I$, $X_P$ and $X_B$ of the I, P and B mode pictures are defined as follows:

$$X_I = S_I \cdot Q_I \qquad (1)$$
$$X_P = S_P \cdot Q_P \qquad (2)$$
$$X_B = S_B \cdot Q_B \qquad (3)$$
$S_I$, $S_P$ and $S_B$ are the numbers of bits of the most recently coded video data of the respective modes, and $Q_I$, $Q_P$ and $Q_B$ are the average step sizes of the quantizer used in those cases, respectively. Namely, the coding complexity is defined as the product of the number of bits and the average quantizer step size in coding the previous picture.
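Allocated bit amounts $T_I$, $T_P$ and $T_B$ of the I, P and B mode pictures are defined in the TM5 as follows:

$$T_I = \max\left\{ \frac{R}{1 + \dfrac{N_P X_P}{X_I K_P} + \dfrac{N_B X_B}{X_I K_B}},\ \frac{\text{channel transmission rate}}{8 \cdot \text{input pictures per sec}} \right\} \qquad (4)$$
$$T_P = \max\left\{ \frac{R}{N_P + \dfrac{N_B K_P X_B}{K_B X_P}},\ \frac{\text{channel transmission rate}}{8 \cdot \text{input pictures per sec}} \right\} \qquad (5)$$
$$T_B = \max\left\{ \frac{R}{N_B + \dfrac{N_P K_B X_P}{K_P X_B}},\ \frac{\text{channel transmission rate}}{8 \cdot \text{input pictures per sec}} \right\} \qquad (6)$$

$K_P$ and $K_B$ have different values according to the quantization matrices, and are 1.0 and 1.4 in the TM5, respectively. $R$ is the bit amount remaining, among the bits allocated to the entire picture group, after coding up to the immediately preceding picture. $N_P$ is the number of P mode pictures remaining in the picture group, excluding the coded pictures, and $N_B$ is the number of B mode pictures remaining in the picture group, excluding the coded pictures. Channel transmission rate/(8·input pictures per sec) indicates the minimum number of bits to be allocated to each picture.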
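The allocation can be sketched in code as follows. The sketch assumes the target-bit formulas in the form published for MPEG2 Test Model 5, with the constants $K_P$ = 1.0 and $K_B$ = 1.4 stated above; the function names, the dictionary layout and the numeric values in the usage note are hypothetical.

```python
def complexity(bits_used, average_step_size):
    """Equations (1) to (3): X = S * Q for the last coded picture of a mode."""
    return bits_used * average_step_size


def target_bits(picture_type, R, X, N_P, N_B, bit_rate, picture_rate,
                K_P=1.0, K_B=1.4):
    """Target bit amount for the next picture (TM5 form, assumed).

    X maps "I", "P" and "B" to the complexities of the last coded pictures.
    N_P and N_B are the remaining P and B pictures in the group.
    """
    minimum = bit_rate / (8 * picture_rate)   # lower bound for any picture
    if picture_type == "I":
        denominator = (1 + N_P * X["P"] / (X["I"] * K_P)
                       + N_B * X["B"] / (X["I"] * K_B))
    elif picture_type == "P":
        denominator = N_P + N_B * K_P * X["B"] / (K_B * X["P"])
    elif picture_type == "B":
        denominator = N_B + N_P * K_B * X["P"] / (K_P * X["B"])
    else:
        raise ValueError("picture_type must be 'I', 'P' or 'B'")
    return max(R / denominator, minimum)
```

For example, with hypothetical complexities of 160, 60 and 42 for the I, P and B modes, 3 P and 8 B mode pictures remaining and R = 3,625,000 bits, the I mode target is about 1,000,000 bits, while the P and B mode targets are about 517,857 and 258,929 bits. The targets depend only on the complexities of previously coded pictures, so they cannot anticipate a change in the content of the picture about to be coded.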
The above equations (4) to (6), used for the bit allocation in the MPEG2 TM5, signify that the bit amount is allocated to the respective I, P and B mode pictures according to the complexities $X_I$, $X_P$ and $X_B$ of the previously coded pictures of the same modes. However, when the input picture has a different characteristic due to a scene change, the complexities $X_I$, $X_P$ and $X_B$ of the previously coded pictures lose their significance. In particular, when a scene change occurs in a P mode picture, most of the macro blocks of the P mode picture must be coded in the I mode because motion compensation is impossible. Generally, the number of bits required for I mode coding is at least twice that required for P mode coding. For this reason, the picture quality is degraded by the shortage of bits for the P mode picture containing the scene change. Also, the quality degradation of the P mode picture may adversely affect the subsequent B or P mode pictures, since the P mode picture is used as a reference for motion compensation.