1. Field of the Invention
The present invention relates to a digital TV or digital video appliance, and more particularly, to a video transcoding apparatus that converts an MPEG (moving picture experts group) bit stream having a specific bit rate into a bit stream having a different bit rate for transmission.
2. Background of the Related Art
Recently, coding schemes such as MPEG have been used to reduce the storage and transmission capacity required for digital video and audio. In particular, various applications such as video search, picture-in-picture (PIP), video coupling, video editing, and transport bit rate conversion are required, and these applications demand video transcoding methods that convert an MPEG bit stream having a specific bit rate into a bit stream having a different bit rate. For instance, a JPEG (joint photographic experts group) bit stream is converted into an MPEG bit stream, a DV (digital video) format, which is the digital output of a digital camcorder, is converted into an MPEG bit stream, or an HD (high definition) MPEG bit stream is converted into an SD (standard definition) MPEG bit stream.
FIG. 1 illustrates a block diagram of a general video transcoding apparatus.
Referring to FIG. 1, a video transcoding apparatus includes a decoding unit 10, a frame memory 20 storing an output of the decoding unit 10 for a video transcoding, an encoding unit 30 converting a bit rate of a video stored in the frame memory 20 into a different bit rate, and a bit rate control unit 50 controlling a bit rate of the encoding unit 30.
Namely, a variable length decoding (VLD) unit 11 of the decoding unit 10 variable-length-decodes an inputted video bit stream so as to separate it into a motion vector, a quantization value, and DCT (discrete cosine transform) coefficients, and then outputs the motion vector MV to a motion compensation unit 16 and the quantization value and the DCT coefficients to an inverse quantizing (IQ) unit 12. The IQ unit 12 inverse-quantizes the DCT coefficients in accordance with the quantization value, and then outputs the inverse-quantized coefficients to an IDCT (inverse discrete cosine transform) unit 13. The IDCT unit 13 carries out IDCT on the inverse-quantized DCT coefficients and outputs the IDCT value to an adder 14. If the decoding unit 10 is a general MPEG-2 video decoder, the IDCT unit 13 carries out the IDCT in units of 8*8 blocks, as required by the MPEG-2 video syntax.
In this case, the picture types standardized by MPEG include I, P, and B pictures. When the data restored by the IDCT unit 13 belong to an I picture, they form a complete picture that can be displayed as it is. The data of a B or P picture form an incomplete picture that requires motion compensation through the motion compensation unit 16.
Namely, for an I picture, which serves as a reference, the motion vector is regarded as ‘0’. For a B or P picture, the original image should be restored using the previous picture stored in a memory 15. In this case, the motion vector is a 2-dimensional vector used for the motion compensation, representing the offset from a coordinate in the current picture or field to a coordinate in the reference frame or field.
Therefore, the motion vector outputted from the VLD unit 11 is inputted to the motion compensation unit 16. The motion compensation unit 16 carries out the motion compensation for the present pixel values using the motion vector and the previous frame stored in the memory 15, and then outputs the result to the adder 14. Namely, the motion compensation unit 16 performs unidirectional or bidirectional prediction using the motion vector of the present B or P picture outputted from the VLD unit 11 and the previous picture stored in the memory 15, thereby restoring the B or P picture into a complete video.
The adder 14 restores the complete video as final pixel values by adding the IDCT value to the motion compensation value, and then stores the result in the memory 15 for the motion compensation and in the frame memory 20 for the video transcoding. Namely, for the I picture, the IQ/IDCT result is stored directly in the memories 15 and 20. For the P or B picture, however, the motion compensation data and the IDCT result are added together by the adder 14, and the sum is then stored in the memories 15 and 20.
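The IQ and IDCT stages described above can be illustrated with a minimal sketch. It assumes a single uniform quantizing step (an actual MPEG-2 decoder also applies a weighting matrix and handles the intra DC coefficient separately), and the function names `inverse_quantize` and `idct_2d` are hypothetical names chosen for illustration only.

```python
import math

N = 8  # block size used by the MPEG-2 video syntax

def inverse_quantize(levels, step):
    """Scale quantized levels back to DCT coefficients (uniform step, an assumption)."""
    return [[level * step for level in row] for row in levels]

def idct_2d(coeffs):
    """Direct 8x8 inverse DCT, written for clarity rather than speed."""
    def c(k):
        return 1 / math.sqrt(2) if k == 0 else 1.0

    pixels = [[0.0] * N for _ in range(N)]
    for y in range(N):
        for x in range(N):
            total = 0.0
            for v in range(N):
                for u in range(N):
                    total += (c(u) * c(v) * coeffs[v][u]
                              * math.cos((2 * x + 1) * u * math.pi / (2 * N))
                              * math.cos((2 * y + 1) * v * math.pi / (2 * N)))
            pixels[y][x] = total / 4
    return pixels

# A block whose only non-zero level is the DC term decodes to a flat block.
levels = [[0] * N for _ in range(N)]
levels[0][0] = 50
block = idct_2d(inverse_quantize(levels, step=16))
```

With a DC level of 50 and a step of 16, every restored pixel is approximately 100, which shows how a single coefficient can describe a whole flat block.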
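The role of the motion compensation unit 16 and the adder 14 can be sketched as follows. This is a simplified illustration, not the decoder's actual implementation: it assumes integer-pel motion vectors, a single reference picture (the bidirectional averaging used for B pictures is omitted), and border clamping; `fetch_prediction` and `reconstruct` are hypothetical names.

```python
def clip_pixel(value):
    """Round and clamp a value to the 8-bit pixel range."""
    return max(0, min(255, int(round(value))))

def fetch_prediction(reference, x0, y0, mv, size=8):
    """Read the block of `reference` displaced by mv = (dx, dy) from (x0, y0)."""
    dx, dy = mv
    height, width = len(reference), len(reference[0])
    block = []
    for y in range(y0 + dy, y0 + dy + size):
        yc = max(0, min(height - 1, y))  # clamp at picture borders (an assumption)
        row = [reference[yc][max(0, min(width - 1, x))]
               for x in range(x0 + dx, x0 + dx + size)]
        block.append(row)
    return block

def reconstruct(idct_block, picture_type, prediction=None):
    """I picture: the IDCT output is the picture itself.
    P or B picture: the IDCT output is a residual added to the prediction."""
    if picture_type == "I":
        return [[clip_pixel(v) for v in row] for row in idct_block]
    return [[clip_pixel(p + r) for p, r in zip(p_row, r_row)]
            for p_row, r_row in zip(prediction, idct_block)]

reference = [[10 * r + c for c in range(4)] for r in range(4)]
prediction = fetch_prediction(reference, 0, 0, mv=(1, 1), size=2)
restored = reconstruct([[1, -1], [300, 0]], "P", prediction)
```

The restored block is what would be written both to the memory 15 (for later prediction) and to the frame memory 20 (for re-encoding).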
In this case, in order to convert the video stored in the memory 20 into a bit stream having a low transport bit rate and store the bit stream in a storage device such as a hard disk, a video encoder such as the encoding unit 30 is used.
Namely, if the data outputted from the frame memory 20 belong to the I picture, a subtracter 31 in the encoding unit 30 outputs the data to a DCT unit 32 as they are. If the data outputted from the frame memory 20 belong to the P or B picture, however, the subtracter 31 outputs differential data to the DCT unit 32. The differential data are the difference between the data and the motion-compensated data supplied by a motion compensation unit 39. The DCT unit 32 then carries out DCT on the inputted data, and outputs the DCT data to a quantizing unit 33 for quantization.
In such a procedure, the DCT unit 32 removes the correlation of the data through a 2-dimensional axis transformation, in which a picture is divided into blocks and the axes of each block are transformed in accordance with the DCT method. The axis-transformed data tend to be concentrated in one direction (toward the low-frequency side). The quantizing unit 33 quantizes the concentrated data with a predetermined quantizing interval, and then outputs the quantized data to a VLC (variable length coding) unit 34. The VLC unit 34 represents a frequent value by a small number of bits and a rare value by a large number of bits, thereby reducing the total number of bits.
In this case, the data on which VLC is carried out in the VLC unit 34 are outputted to a buffer 40. The buffer 40 stores the VLC data temporarily, outputs the VLC data to the storage device such as a hard disk at a constant speed, and calculates the fullness of the buffer 40 so as to output the fullness to the bit rate control unit 50.
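The concentration of energy toward the low-frequency side, and the effect of quantization on it, can be seen in a small sketch. It assumes a single uniform quantizing interval without a weighting matrix, and the names `dct_2d` and `quantize` are hypothetical.

```python
import math

N = 8

def dct_2d(pixels):
    """Direct 8x8 forward DCT, written for clarity rather than speed."""
    def c(k):
        return 1 / math.sqrt(2) if k == 0 else 1.0

    coeffs = [[0.0] * N for _ in range(N)]
    for v in range(N):
        for u in range(N):
            total = 0.0
            for y in range(N):
                for x in range(N):
                    total += (pixels[y][x]
                              * math.cos((2 * x + 1) * u * math.pi / (2 * N))
                              * math.cos((2 * y + 1) * v * math.pi / (2 * N)))
            coeffs[v][u] = c(u) * c(v) * total / 4
    return coeffs

def quantize(coeffs, step):
    """Uniform quantization with one step size (an assumption)."""
    return [[int(round(value / step)) for value in row] for row in coeffs]

# A smooth horizontal ramp: neighbouring pixels are highly correlated.
block = [[100 + 4 * x for x in range(N)] for _ in range(N)]
levels = quantize(dct_2d(block), step=16)
nonzero = sum(1 for row in levels for value in row if value != 0)
```

For this smooth block only a few low-frequency levels in the first row survive quantization, so the VLC stage has mostly zeros to encode.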
Namely, the MPEG bit stream of a specific bit rate is transformed into that of a different bit rate such as a low transport bit rate using the decoding and encoding units 10 and 30, and then stored in the storage device.
Moreover, the DCT coefficients quantized by the quantizing unit 33 are inputted to the IQ unit 35 again for inverse quantization, and then outputted to the IDCT unit 36. The IDCT unit 36 carries out IDCT on the inverse-quantized DCT coefficients, and then outputs the IDCT value to the adder 37. The adder 37 adds the IDCT value to the motion-compensated value so as to restore a complete video as final pixel values, and then stores the added value in a memory 38 for the motion compensation. The motion compensation unit 39 carries out the motion compensation using the previous frame read from the memory 38, and then outputs the motion-compensated value to the subtracter 31 and the adder 37.
As mentioned in the above explanation of FIG. 1, a specific bit rate of the MPEG bit stream is converted into a different bit rate such as a low transport bit rate using the decoding and encoding units 10 and 30, and the result is stored in the storage device such as a hard disk.
The bandwidth of an HDTV transmission channel is fixed, whereas the amount of generated data varies with time since the video data are finally variable-length-coded (VLCed). The bit rate control unit 50 is required in order to match the amount of generated data to a given transmission rate. The bit rate control unit 50 controls the amount of generated data mainly by varying the step size of the quantizing unit 33 in accordance with the fullness of the buffer 40. Namely, if the number of generated bits is higher than a reference, the amount of data filling the buffer 40 increases, and the number of subsequent bits is therefore reduced by increasing the quantizing step size. If the amount of generated data is lower than the reference, the quantizing step size is reduced so as to increase the number of generated bits. Thus, the state of the buffer 40 is controlled so as to stay near a predetermined value overall.
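The feedback between buffer fullness and quantizing step size can be sketched with a simple proportional rule. This is only an illustration of the direction of the adjustment; the proportional form, the `gain` value, and the step range of 1 to 31 are assumptions made for the example, and `next_quant_step` is a hypothetical name.

```python
def next_quant_step(step, fullness, reference, gain=0.5, min_step=1, max_step=31):
    """Raise the step when the buffer is fuller than the reference, lower it otherwise."""
    error = (fullness - reference) / reference  # relative deviation from the reference
    proposed = step * (1 + gain * error)
    return max(min_step, min(max_step, round(proposed)))

coarser = next_quant_step(10, fullness=150, reference=100)  # buffer too full
finer = next_quant_step(10, fullness=50, reference=100)     # buffer draining
```

A coarser step produces fewer bits for the following data, which lets the buffer drain back toward the reference level; a finer step does the opposite.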
In this case, according to the MPEG-2 documents (test model 5, file No. AVC-491) prepared in the course of standardization by ISO/IEC JTC1/SC29/WG11, a working group under ISO (international organization for standardization), the bit rate control unit 50 carries out the following three steps.
A first step estimates a complexity and allocates target bits. Namely, a predetermined number of bits is allocated per GOP (group of pictures) in accordance with the transport bit rate, and the bits allocated to the GOP are distributed in accordance with the complexity of each of the pictures (I, P, and B pictures). In this case, the complexity X of each of the I, P, and B pictures after encoding is obtained by the following formula 1.
[Formula 1]
$$X_i = S_i Q_i, \qquad X_p = S_p Q_p, \qquad X_b = S_b Q_b$$
where Si, Sp, and Sb are the bit amounts generated after the I, P, and B pictures are encoded, and Qi, Qp, and Qb are the average values of the quantizing parameters used for encoding all the macro blocks of the respective pictures. The initial complexities are given as Xi=160*bit_rate/115, Xp=60*bit_rate/115, and Xb=42*bit_rate/115, where bit_rate is expressed in bits per second.
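Formula 1 and the initial complexities can be written as a short sketch. The function names and the dictionary keys "I", "P", and "B" are chosen for illustration; the constants 160, 60, 42, and 115 are those given in the text.

```python
def initial_complexities(bit_rate):
    """Initial Xi, Xp, Xb for a bit rate in bits per second."""
    return {
        "I": 160 * bit_rate / 115,
        "P": 60 * bit_rate / 115,
        "B": 42 * bit_rate / 115,
    }

def picture_complexity(bits_generated, average_q):
    """Formula 1: X = S * Q for the picture that was just encoded."""
    return bits_generated * average_q

X = initial_complexities(1_150_000)
X["P"] = picture_complexity(bits_generated=90_000, average_q=8.5)
```

After each picture is encoded, only the complexity of that picture type is replaced, so the estimate tracks the content as it changes.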
Namely, the target bits Ti, Tp, and Tb of the I, P, and B pictures to be encoded are allocated by the following formula 2 in accordance with the bit rate of the transcoded video.
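[Formula 2]
$$T_i = \max\left\{ \frac{R}{1 + \dfrac{N_p X_p}{X_i K_p} + \dfrac{N_b X_b}{X_i K_b}},\ \frac{\mathrm{bit\_rate}}{8 \times \mathrm{picture\_rate}} \right\}$$
$$T_p = \max\left\{ \frac{R}{N_p + \dfrac{N_b K_p X_b}{K_b X_p}},\ \frac{\mathrm{bit\_rate}}{8 \times \mathrm{picture\_rate}} \right\}$$
$$T_b = \max\left\{ \frac{R}{N_b + \dfrac{N_p K_b X_p}{K_p X_b}},\ \frac{\mathrm{bit\_rate}}{8 \times \mathrm{picture\_rate}} \right\}$$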
In Formula 2, Kp and Kb are constants dependent on the quantizing matrix, where Kp=1.0 and Kb=1.4, and R is the number of allocated bits remaining after the previous picture is encoded. bit_rate is the channel transmission rate (bits/sec), and picture_rate is the number of pictures decoded per second. R is initially given as ‘0’ before the first GOP is encoded.
At the start of every GOP, R becomes R+G, where G is the GOP target, and R is then updated by subtracting from R the bit amount generated for each encoded picture.
In this case, G=bit_rate*N/picture_rate, N is the size of the GOP, and Np and Nb are the numbers of P and B pictures, respectively, to be encoded in the present GOP.
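The target bit allocation can be sketched as below. The expressions follow Formula 2 as it is defined in test model 5 (with Kp and Kb in the I-picture denominator); the function names, argument names, and the example figures are assumptions made for illustration.

```python
K_P = 1.0
K_B = 1.4

def gop_target(bit_rate, gop_size, picture_rate):
    """G = bit_rate * N / picture_rate, added to R at the start of each GOP."""
    return bit_rate * gop_size / picture_rate

def target_bits(picture_type, R, n_p, n_b, X, bit_rate, picture_rate):
    """Target bits for the next picture; never below bit_rate / (8 * picture_rate)."""
    floor = bit_rate / (8 * picture_rate)
    if picture_type == "I":
        denom = 1 + n_p * X["P"] / (X["I"] * K_P) + n_b * X["B"] / (X["I"] * K_B)
    elif picture_type == "P":
        denom = n_p + n_b * K_P * X["B"] / (K_B * X["P"])
    else:  # "B"
        denom = n_b + n_p * K_B * X["P"] / (K_P * X["B"])
    return max(R / denom, floor)

bit_rate, picture_rate = 4_000_000, 25
X = {"I": 160 * bit_rate / 115, "P": 60 * bit_rate / 115, "B": 42 * bit_rate / 115}

R = 0 + gop_target(bit_rate, gop_size=15, picture_rate=picture_rate)
t_i = target_bits("I", R, n_p=4, n_b=10, X=X, bit_rate=bit_rate, picture_rate=picture_rate)
R -= 520_000  # hypothetical number of bits actually generated by the I picture
```

With the initial complexities, the I picture receives a much larger share of the GOP budget than a P or B picture, and the remainder R shrinks as each picture is encoded.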
The second step controls the transmission rate, i.e., the bit rate, in which a reference quantizing parameter for each macro block is calculated in accordance with the fullness of the virtual buffer 40. The bit rate is adjusted so that each picture is encoded with a number of bits matching the bits allocated in the first step.
In this case, it is assumed that each picture has its own virtual buffer, and the quantizing parameter is adjusted in accordance with the status of that buffer.
The third step is an adaptive quantizing step. In the third step, the activity of the macro block to be encoded currently is found and normalized, and the quantizing parameter actually used for the quantization is found by multiplying the reference quantizing parameter of the second step by the normalized activity. Namely, the adaptive quantization improves the subjective image quality by varying the reference quantizing parameter in accordance with the complexity of the current macro block.
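The second step can be sketched with the virtual-buffer rule of test model 5. The formulas used here (the reaction parameter r = 2*bit_rate/picture_rate, the initial fullness 10*r/31, and Q = 31*d/r) are not stated in this document and are given as recalled from test model 5, so they should be treated as assumptions; `reference_q` and its argument names are hypothetical.

```python
def reaction_parameter(bit_rate, picture_rate):
    """r = 2 * bit_rate / picture_rate (as in test model 5, an assumption here)."""
    return 2 * bit_rate / picture_rate

def reference_q(d0, bits_so_far, picture_target, mbs_coded, mb_count, r):
    """Reference quantizing parameter for the next macro block.

    d0             virtual buffer fullness at the start of the picture
    bits_so_far    bits generated by the macro blocks already coded
    picture_target target bits for this picture from the first step
    """
    fullness = d0 + bits_so_far - picture_target * mbs_coded / mb_count
    q = fullness * 31 / r
    return max(1, min(31, round(q))), fullness

r = reaction_parameter(4_000_000, 25)
d0 = 10 * r / 31  # initial fullness for I pictures in test model 5

q_start, _ = reference_q(d0, 0, 400_000, 0, 1000, r)
q_over, _ = reference_q(d0, 100_000, 400_000, 100, 1000, r)
```

In the second call the first 100 macro blocks used 100,000 bits where 40,000 were budgeted, so the virtual buffer fills and the reference quantizing parameter rises.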
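The third step can be sketched as follows. The activity measure (one plus the smallest sub-block variance) and the normalization formula are given as recalled from test model 5 and are not stated in this document, so they are assumptions; the function names are hypothetical.

```python
def block_variance(samples):
    """Population variance of the luminance samples of one sub-block."""
    mean = sum(samples) / len(samples)
    return sum((s - mean) ** 2 for s in samples) / len(samples)

def activity(sub_blocks):
    """Spatial activity of a macro block: 1 + the minimum sub-block variance."""
    return 1 + min(block_variance(b) for b in sub_blocks)

def normalized_activity(act, avg_act):
    """Maps the activity into the range (0.5, 2) around the picture average."""
    return (2 * act + avg_act) / (act + 2 * avg_act)

def mquant(q_ref, n_act):
    """Quantizing parameter actually used, clipped to 1..31."""
    return max(1, min(31, round(q_ref * n_act)))

flat = [[128] * 64 for _ in range(4)]   # a smooth macro block
q_flat = mquant(16, normalized_activity(activity(flat), avg_act=400))
q_busy = mquant(16, normalized_activity(10_000, avg_act=400))
```

A flat macro block, where quantization noise is easily seen, receives a finer quantizing parameter than the reference, while a busy macro block receives a coarser one.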
Namely, the first and second steps, which calculate the bit allocation and the fullness of buffer, are performed through the buffer 40 and the reference quantizing parameter calculating unit 51, and the third step of carrying out the adaptive quantization is performed by an activity calculating unit 52 and a quantizing parameter generating unit 53.
In this case, the bit rate control unit 50 carries out the bit allocation and the bit rate control effectively only when the numbers and structure of the I, P, and B pictures inside the GOP are known to the encoder 30.
A real-time video transcoder needs to encode the inputted bit stream immediately, so that only information on the currently encoded picture is available. The real-time video transcoder therefore cannot know the GOP structure or the picture_coding_type of the following pictures. Hence, if the order or number of the P pictures or the B pictures changes irregularly, it is difficult to control the bit rate of the encoder 30, which degrades the video quality.
Moreover, the video transcoder shown in FIG. 1 brings about a loss of the video quality in the process of making a transmission rate lower than the transmitted bit rate.
In order to reduce the video quality loss, methods are used such as ‘bit amount reduction’, which removes an RF AC coefficient in an MPEG decoder, ‘bit rate variance’, which changes the bit rate through re-quantization in an MPEG decoder, and ‘cascaded transcoding’, which simply connects an MPEG decoder and an MPEG encoder to each other.
Yet, the ‘bit amount reduction’ removing the RF AC coefficient merely parses the bit string to find the boundaries between code lengths and signs, and removes the DCT coefficients located beyond the position at which the target bit amount, adjusted per macro block, is exceeded. Therefore, the ‘bit amount reduction’ has simply structured hardware. However, removing the DCT coefficients generates a drift error. Hence, the ‘bit amount reduction’ degrades the video quality as well.
The method using re-quantization carries out inverse quantization after VLD and then quantizes again with a wider quantizing step. Therefore, such a method has a video quality superior to that of the ‘bit amount reduction’, but increases the complexity of the hardware.
And, the ‘cascaded transcoding’ is excellent in video quality, but the cascaded transcoder has a built-in MPEG-2 encoder. Therefore, the ‘cascaded transcoding’ has complicated hardware and requires a large amount of calculation.
Namely, a storage device for high-speed play or long-time recording in a digital VCR or the like has a relatively small recording space, so that a considerable portion of the data is cut from the received MPEG bit stream before recording. Moreover, if the recording time of a VCR is to be doubled, the bit rate of the bit stream should be reduced to half. Home applications prefer simple hardware, even at some cost in quality, to complicated and expensive hardware. Therefore, home applications use the method of removing the RF AC coefficient, the method using re-quantization, or the like. On the other hand, the cascaded transcoder removes the drift error through a motion compensation circuit so as to maintain a good video quality. Therefore, the ‘cascaded transcoding’ is used for a VOD (video on demand) server, a broadcasting station, or the like.
However, such a method requires massive calculation for determining a new macro block mode, a new motion compensation mode, and the like.
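The coefficient-removal idea can be sketched as a truncation of the variable-length codes of one block against a bit budget. The representation of each code as a (bit length, symbol) pair and the name `truncate_codes` are assumptions made for illustration; an actual implementation parses the bit stream itself.

```python
def truncate_codes(codes, bit_budget):
    """Keep codes in scan order until the next one would exceed the budget.

    codes      list of (bit_length, symbol) pairs in zig-zag scan order
    bit_budget target number of bits for this block or macro block
    """
    kept, used = [], 0
    for bit_length, symbol in codes:
        if used + bit_length > bit_budget:
            break  # every later (higher-frequency) coefficient is dropped
        kept.append((bit_length, symbol))
        used += bit_length
    return kept, used

# Hypothetical codes: the DC term followed by (run, level) pairs.
codes = [(6, "DC"), (3, (0, 5)), (4, (0, -2)), (5, (1, 1)), (7, (3, 1))]
kept, used = truncate_codes(codes, bit_budget=14)
```

No inverse quantization or IDCT is needed, which is why the hardware is simple; the dropped coefficients are also the source of the drift error, since the encoder's original predictions assumed they were present.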
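Re-quantization can be sketched as an inverse quantization with the original step followed by a quantization with a wider step. The sketch assumes uniform quantizers without a weighting matrix and rounds halves away from zero; `requantize` is a hypothetical name.

```python
def round_half_away(value):
    """Round to the nearest integer, with halves rounded away from zero."""
    magnitude = int(abs(value) + 0.5)
    return magnitude if value >= 0 else -magnitude

def requantize(levels, step_in, step_out):
    """Inverse-quantize with step_in, then quantize again with the wider step_out."""
    return [round_half_away(level * step_in / step_out) for level in levels]

levels_in = [12, -5, 3, 1, 0]
levels_out = requantize(levels_in, step_in=4, step_out=10)
```

The wider step shrinks the levels and turns small ones into zeros, so fewer bits are produced by the subsequent variable length coding.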