1. Field of the Invention
The present invention relates to the art of video image information compression and transmission, and particularly to a transcoding method and transcoding apparatus for converting coded video image data into coded video image data complying with different data rates.
2. Background Art
Multimedia, which expresses different types of information such as text, pictures, audio and video as digital data and combines these media so that they can be handled integrally, has garnered much attention in recent years. Audio-video coding formats which support multimedia include the ISO/IEC's MPEG (Moving Picture Experts Group) 1 and the like, and various video image coding and transmission systems which are compliant therewith have been offered.
FIGS. 24A and 24B show general structures for this type of video image coding and transmission system, FIG. 24A being a block diagram showing the structure of a transmission side encoder and FIG. 24B being a block diagram showing the structure of a reception side decoder.
The transmission side encoder, as shown in FIG. 24A, comprises a subtractor 101, a DCT (Discrete Cosine Transform) portion 102, a quantizer 103, a dequantizer 104, an inverse DCT portion 105, a motion prediction and compensation portion 106, variable-length encoders 107 and 108, and a multiplexing device 109.
The reception side decoder, as shown in FIG. 24B, comprises a demultiplexing device 201A, variable-length decoders 201 and 206, a dequantizer 202, an inverse DCT portion 203, an adder 204 and a motion compensation portion 205. The various components of the encoder and decoder listed above can be constructed of dedicated hardware, or else constructed from a DSP (digital signal processor) or the like executing predetermined programs.
In the structures shown in FIGS. 24A and 24B, image information of a picture group composed of a predetermined number of frames is supplied sequentially to the encoder. The encoder codes the image information in units of these picture groups. One I frame must be contained in each picture group, in addition to which one or more P frames or B frames may be included. Here, the I frame is a frame which is the object of so-called intraframe coding. Additionally, a P frame is a frame in which coding and decoding are performed by referencing the image of a frame before that frame, and a B frame is a frame in which coding and decoding are performed by referencing the frames before and after that frame.
FIG. 25 shows examples of operations of the encoder in a case wherein the frames constituting a picture group are sequentially supplied. In FIG. 25, for simplicity of explanation, an example is given of a case where an I frame followed by the P1 frame and the P2 frame is inputted to the encoder. Herebelow, the operations of the encoder shall be explained with reference to FIG. 25.
First, when image information corresponding to an I frame is inputted, processing is not performed by the various constituent elements of the encoder in FIG. 24A, but intraframe coding of image information of the current image (I frame) is performed by means of a predetermined coding algorithm, and the resulting coded data are transmitted to the reception side. Additionally, in the encoder, the image information of the I frame is decoded from the coded data in accordance with a decoding algorithm corresponding to the above-described coding algorithm, and stored in a memory (not shown) in the motion prediction and compensation portion 106 as a reference image.
Next, when the P1 frame is inputted, the current image (P1 frame) is divided into a plurality of macroblocks MBij (i=1-M, j=1-N) in the encoder. Here, each macroblock is composed of 2×2=4 blocks, and each block is composed of 8×8=64 pixels. Then, in the encoder, the macroblocks MBij are processed in the following way.
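The division described above can be sketched as follows. This is a minimal illustration, not the patent's implementation: it assumes a frame is a list of pixel rows whose dimensions are multiples of 16, that the index i counts macroblock rows and j counts macroblock columns, and that only a single pixel plane is handled. The function names are hypothetical.

```python
MB_SIZE = 16    # a macroblock is 2x2 blocks, i.e. 16x16 pixels
BLOCK_SIZE = 8  # a block is 8x8 = 64 pixels


def split_into_macroblocks(frame):
    """Map (i, j), with 1-based indices, to the 16x16 macroblock MBij.

    Assumption: i is the macroblock row and j the macroblock column.
    """
    height, width = len(frame), len(frame[0])
    if height % MB_SIZE or width % MB_SIZE:
        raise ValueError("frame dimensions must be multiples of 16 in this sketch")
    macroblocks = {}
    for i in range(height // MB_SIZE):
        for j in range(width // MB_SIZE):
            rows = frame[i * MB_SIZE:(i + 1) * MB_SIZE]
            macroblocks[(i + 1, j + 1)] = [
                row[j * MB_SIZE:(j + 1) * MB_SIZE] for row in rows
            ]
    return macroblocks


def split_into_blocks(macroblock):
    """Return the four 8x8 blocks of one macroblock in raster order."""
    blocks = []
    for by in (0, BLOCK_SIZE):
        for bx in (0, BLOCK_SIZE):
            blocks.append(
                [row[bx:bx + BLOCK_SIZE] for row in macroblock[by:by + BLOCK_SIZE]]
            )
    return blocks
```

With a 32×48 frame, for example, this yields M=2 by N=3 macroblocks, each of which splits into four blocks.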
First, the motion prediction and compensation portion 106 searches the reference image (in this case, the I frame) for a reference macroblock MBij′ which is of the same size as the macroblock MBij of the current image and which matches it most closely. It assumes that this reference macroblock MBij′ has moved to form the macroblock MBij. The motion prediction and compensation portion 106 outputs motion information V which represents the spatial distance and direction of this movement. Here, the motion information V is converted by the variable-length encoder 108 into a variable-length code.
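One common way to carry out such a search is exhaustive block matching with the sum of absolute differences (SAD) as the matching cost. The text does not state which search strategy or cost the motion prediction and compensation portion 106 uses, so the full search, the SAD cost, the `search_range` parameter and the function names below are all assumptions made for illustration. The vector V is taken to point from the reference macroblock MBij′ to the macroblock MBij, in line with the description of the motion information.

```python
def sad(current, cur_top, cur_left, reference, ref_top, ref_left, size):
    """Sum of absolute differences between two size x size regions."""
    return sum(
        abs(current[cur_top + r][cur_left + c] - reference[ref_top + r][ref_left + c])
        for r in range(size)
        for c in range(size)
    )


def find_motion_vector(current, reference, top, left, size=16, search_range=4):
    """Full search for the best-matching reference block.

    Returns ((dy, dx), cost). The vector points from the reference block
    to the current block, so the reference block sits at
    (top - dy, left - dx). Assumption: SAD cost, exhaustive search.
    """
    height, width = len(reference), len(reference[0])
    best_vector, best_cost = None, None
    for dy in range(-search_range, search_range + 1):
        for dx in range(-search_range, search_range + 1):
            ref_top, ref_left = top - dy, left - dx
            # Skip candidates that fall outside the reference image.
            if ref_top < 0 or ref_left < 0:
                continue
            if ref_top + size > height or ref_left + size > width:
                continue
            cost = sad(current, top, left, reference, ref_top, ref_left, size)
            if best_cost is None or cost < best_cost:
                best_vector, best_cost = (dy, dx), cost
    return best_vector, best_cost
```

A practical encoder would usually restrict or accelerate this search, since the full search evaluates every candidate offset for every macroblock.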
The subtractor 101 subtracts the image information of the reference macroblock MBij′ from the image information of the macroblock MBij to determine the difference between the images, and the DCT portion 102 performs a DCT which is a type of orthogonal transform on this difference.
The quantizer 103 quantizes the DCT coefficient of the difference image obtained from the DCT portion 102, and the variable-length encoder 107 converts the data obtained from this quantization to a variable-length code.
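The transform and quantization steps can be sketched for a single 8×8 block as follows. This is an illustrative sketch under stated assumptions: the DCT uses orthonormal scaling, and the quantizer is a plain uniform quantizer with one step size for all coefficients, whereas the text does not specify the quantizer design. The names `dct_2d`, `quantize` and `step` are hypothetical.

```python
import math

N = 8  # block size: 8x8 pixels


def dct_2d(block):
    """Forward 8x8 DCT-II with orthonormal scaling (direct, unoptimized form)."""
    def scale(k):
        return math.sqrt(1 / N) if k == 0 else math.sqrt(2 / N)

    coeffs = [[0.0] * N for _ in range(N)]
    for u in range(N):
        for v in range(N):
            total = 0.0
            for y in range(N):
                for x in range(N):
                    total += (
                        block[y][x]
                        * math.cos((2 * y + 1) * u * math.pi / (2 * N))
                        * math.cos((2 * x + 1) * v * math.pi / (2 * N))
                    )
            coeffs[u][v] = scale(u) * scale(v) * total
    return coeffs


def quantize(coeffs, step):
    """Uniform quantization: map each coefficient to an integer level.

    Assumption: a single step size for every coefficient.
    """
    return [[round(value / step) for value in row] for row in coeffs]
```

A larger `step` maps more coefficients to zero, which shortens the variable-length code at the cost of a larger quantization error.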
The variable-length code of the quantized DCT coefficient and the variable-length code of the above-described motion information V are multiplexed by the multiplexing device 109, and transmitted to the reception side as coded data corresponding to the macroblock MBij.
On the other hand, the output data from the quantizer 103 are dequantized by the dequantizer 104, then the output data of the dequantizer 104 are inputted to the inverse DCT portion 105. As a result, a difference image Δ is outputted from the inverse DCT portion 105. While this difference image Δ is image information corresponding to the difference between the macroblock MBij of the current image (P1 frame) and the reference macroblock MBij′, it is generated through the processes of DCT, quantization, dequantization and inverse DCT and as such includes errors associated therewith.
The motion prediction and compensation portion 106 restores the image information of the macroblock MBij in the current image (frame P1) by means of methods such as adding the difference image Δ obtained from the inverse DCT portion 105 to the reference macroblock MBij′, then stores this in the memory as a reference image for the coding of subsequent frames.
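The local decoding path (dequantization, inverse DCT, and addition to the reference) can be sketched for one 8×8 block as follows. The sketch assumes a uniform quantizer with a single step size, an orthonormal inverse DCT, and 8-bit pixels clipped to the range 0 to 255; none of these details are stated in the text, and the function names are hypothetical.

```python
import math

N = 8  # block size: 8x8 pixels


def dequantize(levels, step):
    """Map integer levels back to coefficient values (uniform step assumed)."""
    return [[level * step for level in row] for row in levels]


def idct_2d(coeffs):
    """Inverse 8x8 DCT with orthonormal scaling (direct, unoptimized form)."""
    def scale(k):
        return math.sqrt(1 / N) if k == 0 else math.sqrt(2 / N)

    block = [[0.0] * N for _ in range(N)]
    for y in range(N):
        for x in range(N):
            total = 0.0
            for u in range(N):
                for v in range(N):
                    total += (
                        scale(u) * scale(v) * coeffs[u][v]
                        * math.cos((2 * y + 1) * u * math.pi / (2 * N))
                        * math.cos((2 * x + 1) * v * math.pi / (2 * N))
                    )
            block[y][x] = total
    return block


def reconstruct_block(levels, step, reference_block):
    """Restore a block as reference + difference image, as the encoder does
    when it builds the reference image for the next frame.

    Assumption: 8-bit pixels, so results are clipped to 0..255.
    """
    delta = idct_2d(dequantize(levels, step))
    return [
        [min(255, max(0, round(ref + d))) for ref, d in zip(ref_row, d_row)]
        for ref_row, d_row in zip(reference_block, delta)
    ]
```

Because the levels were rounded during quantization, the restored block generally differs slightly from the original input block. Storing this restored block, rather than the original, keeps the encoder's reference image identical to the one the decoder will hold.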
The above-described processes are performed on all of the macroblocks MBij (i=1-M, j=1-N) constituting the current image (P1 frame).
Then, when the next frame P2 is inputted, the reference image (image of the frame P1) stored in the memory of the motion prediction and compensation portion 106 is referenced to perform a coding process similar to that described above. The same applies to all frames subsequent to the frame P2.
FIG. 26 gives examples of the operations of the decoder upon receiving coded data of the I frame, P1 frame, P2 frame, and so on, transmitted from the encoder as described above. Herebelow, the operations of the decoder shall be explained with reference to FIG. 26.
First, when the intraframe coded data of the I frame is received, the decoder of FIG. 24B does not perform processing by the various constituent elements that are shown, but decodes the intraframe coded data in accordance with a decoding algorithm corresponding to the intraframe coding algorithm on the encoder side. As a result, the same I frame image information as that stored in the memory of the motion prediction and compensation portion 106 on the encoder side is decoded, and this is stored as a reference image in a memory (not shown) in the motion compensation portion 205 in the decoder.
Next, the coded data of the P1 frame is inputted to the decoder. This coded data contains the following information corresponding to each of the plurality of macroblocks MBij (i=1-M, j=1-N) obtained by dividing the image of the P1 frame.
a. A variable-length code obtained by DCT, quantization and variable-length coding of the difference between the relevant macroblock MBij and the reference macroblock MBij′ in the reference image (frame I) which is similar thereto.
b. A variable-length code of the motion information V indicating the motion vector from the reference macroblock MBij′ to the relevant macroblock MBij.
In the decoder, the variable-length codes of the above-mentioned a and b are separated by the demultiplexing device 201A, and returned to actual numerical values by means of the variable-length decoders 201 and 206. Then, in accordance with this information, the following processes are performed for each macroblock MBij.
First, a difference image Δ between the macroblock MBij and the reference macroblock MBij′ in the reference image (frame I) is restored from the actual numerical values obtained from the variable-length code of the above-mentioned a, by means of the dequantizer 202 and the inverse DCT portion 203.
Additionally, in the motion compensation portion 205, the location of the reference macroblock MBij′ in the reference image (frame I) corresponding to the relevant macroblock MBij is determined in accordance with the motion information V described in b above, and the image information of this reference macroblock MBij′ is read from the internal memory (not shown). Then, the image information of the reference macroblock MBij′ and the above-described difference image Δ are added by the adder 204, and the image information of the macroblock MBij is restored.
The above procedures are performed in all of the macroblocks MBij (i=1-M, j=1-N) to restore the entire image of the frame P1. The restored image of this frame P1 is stored in the memory of the motion compensation portion 205 as a reference image.
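The per-macroblock restoration in the decoder can be sketched as follows, taking the already recovered difference image Δ and motion information V as inputs. This is a simplified illustration: the data layout (a dictionary keyed by (i, j) holding a vector and a 16×16 difference image), the whole-pixel vectors, and the function name are assumptions, and V is taken to point from the reference macroblock MBij′ to the macroblock MBij.

```python
MB_SIZE = 16  # macroblock size in pixels


def decode_p_frame(reference, macroblock_data):
    """Rebuild a P frame from a reference image.

    macroblock_data maps (i, j), with 1-based indices, to a pair
    (motion_vector, delta), where motion_vector is (dy, dx) and delta is
    the 16x16 difference image. Assumption: i is the macroblock row and
    j the macroblock column.
    """
    height, width = len(reference), len(reference[0])
    frame = [[0] * width for _ in range(height)]
    for (i, j), (motion_vector, delta) in macroblock_data.items():
        top, left = (i - 1) * MB_SIZE, (j - 1) * MB_SIZE
        dy, dx = motion_vector
        # Locate the reference macroblock MBij' from the motion information V.
        ref_top, ref_left = top - dy, left - dx
        if not (0 <= ref_top <= height - MB_SIZE and 0 <= ref_left <= width - MB_SIZE):
            raise ValueError("motion vector points outside the reference image")
        # Adder 204: reference macroblock plus difference image.
        for r in range(MB_SIZE):
            for c in range(MB_SIZE):
                frame[top + r][left + c] = (
                    reference[ref_top + r][ref_left + c] + delta[r][c]
                )
    return frame
```

The returned frame would then serve as the reference image when the next frame is decoded.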
Then, when the coded data corresponding to the next frame P2 is received, the reference image (image of frame P1) stored in the memory of the motion compensation portion 205 is referenced in performing the same decoding processes as described above. The same applies for cases of receiving coded data corresponding to other frames subsequent to the frame P2.
Recently, the coded transmission of video images in various types of communication systems has been considered. Coded video image data are generated under the assumption of transmission at a certain transmission rate, but there may be cases in which the coded data must be transmitted at a transmission rate different from that originally planned.
In this type of case, the number of frames per picture group must be reduced to decrease the data rate of the coded data. As a technology for achieving this, there is transcoding, which is the conversion of the format of coded data. FIG. 27A shows a method for transcoding, and FIG. 27B is a block diagram showing the structure of a conventional transcoding apparatus.
As shown in FIG. 27B, the conventional transcoding apparatus has a structure combining a decoder 100 with the same structure as that shown in FIG. 24B, and an encoder 200 with the same structure as that shown in FIG. 24A.
With this transcoding apparatus, as shown in FIG. 27A, the coded data generated by a first coding method is decoded by the decoder 100, and the image obtained by this decoding is coded according to a second coding method by the encoder 200. By recoding in this way, it is possible to generate coded data having a data rate different from the original coded data.
However, since the conventional transcoding method described above decodes an original image from coded data and generates coded data with a different data rate by recoding this image, it requires a large amount of processing and is inefficient. In addition, there is the problem of the picture quality falling due to conversion errors associated with the decoding and recoding of coded data.
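The conversion error of the decode-then-recode path can be illustrated with a deliberately simplified scalar model. The model below is an assumption made for illustration only: it treats a single coefficient, ignores the DCT and motion compensation, and lowers the data rate by recoding with a coarser quantization step, which is only one possible way for the second coding method to differ from the first. The function names and step sizes are hypothetical.

```python
def quantize(value, step):
    """Uniform quantization of one value to an integer level."""
    return round(value / step)


def dequantize(level, step):
    """Reconstruct a value from its level."""
    return level * step


def cascade_transcode(level_in, step_in, step_out):
    """Decode with the first step size, then recode with the second.

    Mirrors the decoder 100 followed by the encoder 200, reduced to one
    coefficient. Assumption: the rate is changed by changing the step.
    """
    decoded = dequantize(level_in, step_in)  # decoder 100 side
    return quantize(decoded, step_out)       # encoder 200 side


def reconstruction_errors(original, step_in, step_out):
    """Compare recoding the decoded value with coding the original directly."""
    level_in = quantize(original, step_in)
    cascaded = dequantize(cascade_transcode(level_in, step_in, step_out), step_out)
    direct = dequantize(quantize(original, step_out), step_out)
    return abs(original - cascaded), abs(original - direct)
```

For an original value of 23 with step sizes 10 and then 15, the cascaded path reconstructs 15 while direct coding at step 15 reconstructs 30, so the cascaded error (8) exceeds the direct error (7). The recoder starts from an already degraded value, so its error adds to the error of the first coding rather than replacing it.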