1. Field of the Invention
This invention relates to a picture data processor, a picture data decoder and a picture data encoder, and methods thereof, and more particularly, is suitably applied to decoding and reencoding picture data encoded according to the moving picture experts group (MPEG) standard.
2. Description of the Related Art
In a picture data encoder for encoding a motion picture based on, for example, the moving picture experts group (MPEG) standard, encoding is performed with, for example, fifteen frames of motion picture data as one processing unit called a group of pictures (GOP).
In one GOP, there are coding types for each frame, called I-picture (intra-picture: intra-frame coded picture), P-picture (predictive-picture: inter-frame forward direction predictive-encoded picture) and B-picture (bidirectional predictive-picture: bidirectionally predictive-coded picture).
Specifically, as shown in FIG. 1, the I-picture (picture I0) serves to keep the GOP independent, and is encoded within the picture itself. The P-pictures (pictures P3, P6) are predictive-encoded in the forward direction from the preceding I-picture or P-picture. In this connection, the I-picture and P-pictures are encoded in the same sequence as in the original picture.
The B-pictures (pictures B1, B2, B4, B5) are bidirectionally predictive-encoded from the I-picture or P-picture. Accordingly, when compressively-encoded picture data is decoded, the I-picture can be decoded by itself, but the P-picture and B-picture cannot be decoded from their own picture data alone.
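The dependencies among the three picture types can be sketched as follows. This is only an illustration: the function name `reference_pictures` and the use of text labels such as "I0" and "B1" for pictures are assumptions made here, not part of the apparatus described.

```python
def reference_pictures(display):
    """Map each picture label to the pictures it is predicted from.

    `display` lists picture labels in displaying order; the first
    character of each label is the picture type (I, P or B).
    """
    refs = {}
    for i, pic in enumerate(display):
        # Nearest I- or P-picture before and after this picture.
        past = next((p for p in reversed(display[:i]) if p[0] in "IP"), None)
        future = next((p for p in display[i + 1:] if p[0] in "IP"), None)
        if pic[0] == "I":
            refs[pic] = []                 # intra-coded: no reference
        elif pic[0] == "P":
            refs[pic] = [past]             # forward prediction only
        else:
            refs[pic] = [past, future]     # bidirectional prediction
    return refs


gop = ["I0", "B1", "B2", "P3", "B4", "B5", "P6"]
for pic, refs in reference_pictures(gop).items():
    print(pic, "<-", refs)
```

Only the picture with an empty reference list can be decoded without any other picture having been decoded first.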
FIG. 2A shows the input order of a motion picture signal to an encoder (that is, the displaying order of the motion picture signal). The first frame picture signal (picture I0) is a picture signal before being encoded as I-picture, and the following second frame picture signal (picture B1) is a picture signal before being encoded as B-picture. In this connection, a number added to each picture type (I, P, B) represents the displaying order.
The motion picture signal successively inputted to the encoder in this manner is encoded according to each picture type. In this case, since the B-pictures (pictures B1 and B2) are generated referring to the I-picture (picture I0) and P-picture (picture P3), the B-pictures (pictures B1 and B2) are stored in a frame memory until the P-picture (picture P3) to be a reference is encoded.
As described above, the encoder needs as many frame memories as there are B-pictures between the I-picture (or P-picture) and the P-picture. In the case of FIGS. 2A to 2D, there is a space of three pictures between the I-picture (or P-picture) and the P-picture. If this is represented as M=3, the number of B-pictures between the I-picture (or P-picture) and the P-picture is M-1.
Furthermore, when encoding for each picture type is performed in the encoder, the pictures are outputted in the sequence shown in FIG. 2B. In this case, since the B-pictures (pictures B1 and B2) are temporarily stored in the frame memory before being encoded and the P-picture (picture P3) is encoded ahead of them, the coded data (bit stream) is outputted at a timing delayed from the input of the picture signal by three frames, that is, the span from the picture I0 to the picture P3 of the input picture signal (FIG. 2A).
Therefore, the encoder outputs the inputted motion picture signal in the state where its frame sequence (the displaying order) has been rearranged into the order of encoding.
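The rearrangement from displaying order to encoding order can be sketched as follows. The function name `display_to_coding_order` and the picture labels are assumptions for illustration; the sketch only models the order of pictures, not the encoding itself.

```python
def display_to_coding_order(display):
    """Reorder picture labels from displaying order to encoding order.

    Each B-picture is held back until the next I- or P-picture, which
    serves as its future reference, has been emitted.
    """
    coded, pending_b = [], []
    for pic in display:
        if pic[0] == "B":
            pending_b.append(pic)      # wait for the future reference
        else:
            coded.append(pic)          # the reference picture goes first
            coded.extend(pending_b)    # then the B-pictures that needed it
            pending_b = []
    coded.extend(pending_b)            # trailing B-pictures, if any
    return coded


print(display_to_coding_order(["I0", "B1", "B2", "P3", "B4", "B5", "P6"]))
# ['I0', 'P3', 'B1', 'B2', 'P6', 'B4', 'B5']
```

With M=3, at most M-1=2 labels are ever held in `pending_b`, which corresponds to the number of B-pictures the encoder has to store.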
The bit stream thus coded (FIG. 2B) is inputted to a decoder through a prescribed transmission line and decoded. In this case, as shown in FIG. 2C, the pictures are inputted in the order in which they were outputted from the encoder, as described above with reference to FIG. 2B.
The decoder rearranges the inputted bit stream in order of display to obtain the motion picture signal shown in FIG. 2D, and outputs this as a decoded picture signal. In this case, since the inputted bit stream (FIG. 2C) is not in order of display, the decoder needs a process to rearrange it in order of display. Accordingly, in the decoder, a delay time of one frame is generated from when the bit stream is inputted until it is outputted as the decoded picture signal.
By the way, when, for example, the bit rate of a bit stream once encoded is to be changed, it is necessary to temporarily decode the encoded bit stream and reencode the decoded picture signal at a different bit rate. As a method of reencoding the picture signal thus decoded, a decoding and encoding apparatus 1 as shown in FIG. 3 can be considered.
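The rearrangement on the decoder side can be sketched in the same way. Holding back one reference picture at a time is an assumption made here to model the one-frame delay; the function name `coding_to_display_order` is likewise hypothetical.

```python
def coding_to_display_order(coded):
    """Reorder picture labels from encoding order back to displaying order.

    B-pictures are passed through as they arrive, while each I- or
    P-picture is held until the next I- or P-picture arrives.
    """
    out, held_ref = [], None
    for pic in coded:
        if pic[0] == "B":
            out.append(pic)            # displayed before the held reference
        else:
            if held_ref is not None:
                out.append(held_ref)   # the previous reference is now due
            held_ref = pic             # hold the new reference picture
    if held_ref is not None:
        out.append(held_ref)           # flush the last reference
    return out


print(coding_to_display_order(["I0", "P3", "B1", "B2", "P6", "B4", "B5"]))
# ['I0', 'B1', 'B2', 'P3', 'B4', 'B5', 'P6']
```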
Referring to FIG. 3, the decoding and encoding apparatus 1 temporarily decodes an encoded bit stream D1 with a decoder 2, and reencodes a decoded picture signal D10 thus decoded with an encoder 20, to obtain a bit stream D34.
As shown in FIG. 4, this decoder 2 inputs the inputted bit stream D1 to a variable-length decoding part 4 via a buffer 3. The variable-length decoding part 4 performs variable-length decoding on the bit stream D1 and supplies this to an inverse quantizing part 5. The inverse quantizing part 5 performs inverse quantizing on the output of the variable-length decoding part 4 to restore a discrete cosine transform (DCT) coefficient sequence D5. This is subjected to inverse DCT processing in an inverse DCT part 6. Thus, difference data according to the picture type is outputted from the inverse DCT part 6 to an arithmetic part 7.
Here, when decoding the I-picture first inputted as the bit stream D1, since the I-picture is data intra-frame-encoded, picture data for one frame is outputted from the inverse DCT part 6. This picture data is supplied to the following picture sequence rearranging part 10 as decoded picture data D7 as well as being stored in a frame memory 8 as reference picture data.
A motion compensating part 9 performs motion compensating on the reference picture data stored in the frame memory 8 based on motion vector information (not shown in figure) transmitted from the encoder along with the bit stream D1, and supplies this to the arithmetic part 7 as predictive picture data D9.
The arithmetic part 7 adds the difference data D6 outputted from the inverse DCT part 6 to the predictive picture data D9 to obtain the decoded picture data D7 of a new frame (picture). This decoded picture data D7 is supplied to the picture sequence rearranging part 10, and at the same time, it is stored in the frame memory 8 as reference picture data to be a reference picture for the following frame (picture).
In this connection, when the bit stream D1 shown in FIG. 5A is inputted to the decoder 2, the picture P3 is restored with the previously-decoded picture I0 as a reference picture, and the pictures B1 and B2 are restored with the previously-decoded pictures I0 and P3 as reference pictures.
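The role of the arithmetic part 7 can be sketched as follows. The function name `reconstruct`, the four-sample "pictures" and the sample values are assumptions for illustration, and motion compensation is omitted (zero motion is assumed), so the stored reference picture is used directly as the predictive picture data.

```python
def reconstruct(difference, prediction=None):
    """Add the predictive picture data to the difference data.

    For an intra-coded picture there is no prediction, so the output of
    the inverse DCT is already the picture data and passes through.
    """
    if prediction is None:
        return list(difference)
    return [d + p for d, p in zip(difference, prediction)]


frame_memory = {}

# I-picture: the inverse DCT output is picture data for one frame.
frame_memory["I0"] = reconstruct([100, 102, 98, 101])

# P-picture: difference data D6 plus predictive picture data D9.
frame_memory["P3"] = reconstruct([2, -1, 0, 3], frame_memory["I0"])

print(frame_memory["P3"])  # [102, 101, 98, 104]
```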
Thus, each of pictures forming the bit stream D1 is decoded and then rearranged in order of display in the picture sequence rearranging part 10, so that the decoded picture signal D10 shown in FIG. 5B can be obtained. This decoded picture signal D10 is outputted from the picture sequence rearranging part 10 of the decoder 2, and supplied to the encoder 20 (FIG. 3).
In this connection, in the variable-length decoding part 4 of the decoder 2, the picture type (I-picture, P-picture, B-picture) of each of the pictures forming the inputted bit stream D1 is read out from header information corresponding to each picture, and this is supplied to the encoder 20 (FIG. 3) as picture coding type information D40. The encoder 20 encodes the decoded picture signal D10 into the same picture type as the picture type encoded in the bit stream D1 based on the picture coding type information D40.
FIG. 6 shows the configuration of the encoder 20. The decoded picture signal D10 outputted from the decoder 2 (FIG. 4) is stored in a frame memory 21 of the encoder 20. A motion predicting part 22 detects motion vector information D21 between two frames (pictures) stored in the frame memory 21, and supplies this to a motion compensating part 31.
The motion compensating part 31 performs motion compensating processing on the reference pictures stored in a frame memory 30 at this time using the motion vector information D21 to generate predictive picture data D31, and supplies this to an arithmetic part 24.
On the other hand, the picture data of each picture of the decoded picture signal D10 in which the motion vector has been detected is supplied to a picture sequence rearranging part 23. The picture sequence rearranging part 23 rearranges pictures of the decoded picture signal D10 shown in FIG. 5C, for example, so as to form such a picture sequence that the I-picture (picture I0) and P-picture (picture P3) being the reference pictures are encoded before the B-pictures (pictures B1 and B2).
The decoded picture signal D23 in which the pictures have been rearranged in this manner is supplied to the arithmetic part 24, and a difference from the predictive picture data D31 supplied from the motion compensating part 31 is calculated. This difference data D24 is supplied to a DCT part 25 to be subjected to discrete cosine transform (DCT) processing. The DCT part 25 generates a DCT coefficient sequence D25 by the DCT processing, and supplies this to a quantizing part 26. The quantizing part 26 quantizes the DCT coefficient sequence D25 to generate quantized data D26, and supplies this to a variable-length encoding part 33 and an inverse quantizing part 27.
The inverse quantizing part 27 performs inverse quantizing on the quantized data D26 to restore a DCT coefficient sequence D27. The DCT coefficient sequence D27 is supplied to an inverse DCT part 28 to be subjected to inverse DCT processing. Thus, the inverse DCT part 28 restores difference data D28 according to the picture type, and outputs this to an arithmetic part 29.
The arithmetic part 29 adds the predictive picture data D31 outputted from the motion compensating part 31 to the difference data D28 to generate reference picture data D29, and stores this in a frame memory 30.
In this manner, the difference data quantized via the DCT part 25 and quantizing part 26 is restored by the inverse quantizing part 27 and the inverse DCT part 28 as the difference data D28. This is added to the predictive picture data D31 in the arithmetic part 29 to become the reference picture data D29. Thereby, a reference picture for the following frame (picture) is prepared in the frame memory 30.
Here, the encoder 20 inputs the picture coding type information D40 from the decoder 2 (FIG. 4), and encodes each frame (picture) of the decoded picture signal D10 according to the picture type specified by the picture coding type information D40. That is, each frame of the decoded picture signal D10 outputted from the decoder 2 was encoded as one of the I-picture, P-picture and B-picture before being decoded in the decoder 2. This picture type is inputted to the encoder 20 as the picture coding type information D40, along with a corresponding frame number, in the decoding processing in the decoder 2.
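The local decoding loop described above can be sketched as follows. Several simplifications are assumed here and are not taken from the description: a one-dimensional 8-sample block stands in for a two-dimensional block, the quantizer is a uniform quantizer with a single step size, and the names `dct`, `idct` and `encode_block` are hypothetical.

```python
import math


def _scale(k, n):
    # Orthonormal DCT-II scaling factor.
    return math.sqrt(1 / n) if k == 0 else math.sqrt(2 / n)


def dct(x):
    n = len(x)
    return [_scale(k, n) * sum(x[i] * math.cos(math.pi * (2 * i + 1) * k / (2 * n))
                               for i in range(n))
            for k in range(n)]


def idct(coeffs):
    n = len(coeffs)
    return [sum(_scale(k, n) * coeffs[k] * math.cos(math.pi * (2 * i + 1) * k / (2 * n))
                for k in range(n))
            for i in range(n)]


def encode_block(current, prediction, step):
    """Return (quantized data, reference data) for one block."""
    difference = [c - p for c, p in zip(current, prediction)]    # arithmetic part 24 -> D24
    levels = [round(c / step) for c in dct(difference)]          # DCT part 25, quantizing part 26 -> D26
    restored = idct([lv * step for lv in levels])                # parts 27 and 28 -> D28
    reference = [p + r for p, r in zip(prediction, restored)]    # arithmetic part 29 -> D29
    return levels, reference


current = [52, 55, 61, 66, 70, 61, 64, 73]
prediction = [50] * 8
levels, reference = encode_block(current, prediction, step=4)
print(levels)
print([round(v, 1) for v in reference])
```

Because the reference is rebuilt from the quantized data rather than from the input picture, it already contains the quantization loss; it approximates `current` but generally does not equal it.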
Thus, the encoder 20 obtains the picture type information corresponding to the frame of the inputted decoded picture signal D10, and performs encoding according to the picture type. For example, if the picture type specified to the frame of the inputted decoded picture signal D10 is the I-picture, the control part of the encoder 20 (not shown in figure) performs encoding processing on each of macro blocks forming the frame of the decoded picture signal D10 in an intramode.
Specifically, in the intramode, the arithmetic part 24 transmits the decoded picture signal D23 outputted from the picture sequence rearranging part 23 as it is. Accordingly, the difference data D28 outputted via the DCT part 25, quantizing part 26, inverse quantizing part 27 and inverse DCT part 28 becomes picture data for one frame (picture). This is added to the predictive picture data D31 supplied from the motion compensating part 31, and stored in the frame memory 30 as reference picture data.
At this time, the quantized data D26 of the I-picture outputted from the quantizing part 26 is supplied to the variable-length encoding part 33. The variable-length encoding part 33 performs the variable-length encoding processing on the quantized data D26 using a prescribed conversion table to generate variable-length-coded data D33, and supplies this to a buffer 34. The buffer 34 outputs the variable-length-coded data D33 at a prescribed bit rate.
Thus, a bit stream D34 at the prescribed bit rate is outputted from the encoder 20. As a result, the decoded picture signal D10 inputted to the encoder 20 is reencoded into the picture type (I-picture) before being decoded in the decoder 2, provided prior to the encoder 20, and the reencoded data is outputted.
On the other hand, if the picture type specified to the decoded picture signal D10 to be inputted to the encoder 20 is the P-picture, the control part of the encoder 20 (not shown in figure) performs the encoding processing on each of the macro blocks forming the frame of the decoded picture signal D10 in a forward-directional predictive mode.
In the forward-directional predictive mode, the arithmetic part 24 performs subtraction processing on the decoded picture signal D23 outputted from the picture sequence rearranging part 23, using the forward-directional predictive picture data D31 supplied from the motion compensating part 31. The forward-directional predictive picture data D31 is calculated by subjecting the reference picture data composed of the I-picture or P-picture stored in the frame memory 30 to the motion compensating based on the motion vector information D21. In this manner, the arithmetic part 24 calculates a difference between the forward-directional predictive picture data D31 generated based on the I-picture (or P-picture) stored in the frame memory 30 and the decoded picture signal D23 to generate the difference data forming the P-picture, and transmits this to the DCT part 25.
At this time, the quantized data D26 of the P-picture outputted from the quantizing part 26 is outputted via the variable-length encoding part 33 and the buffer 34 as the bit stream D34. Thereby, the decoded picture signal D10 to be inputted to the encoder 20 is reencoded into the picture type (P-picture) before being decoded in the decoder 2 provided prior to the encoder 20, and the reencoded data is outputted.
On the other hand, if the picture type specified to the decoded picture signal D10 to be inputted to the encoder 20 is the B-picture, the control part of the encoder 20 (not shown in figure) performs the encoding processing on each of the macro blocks forming the frame of the decoded picture signal D10 in a bidirectional predictive mode.
In the bidirectional predictive mode, the arithmetic part 24 performs subtraction processing on the decoded picture signal D23 outputted from the picture sequence rearranging part 23, using the bidirectional predictive picture data D31 supplied from the motion compensating part 31. The bidirectional predictive picture data D31 is calculated by defining pictures in the past and future (I-picture or P-picture) with respect to a frame to be encoded at this time as reference pictures and performing the motion compensating on the reference picture data stored in the frame memory 30 based on the motion vector information D21. Then, the arithmetic part 24 subtracts, from the decoded picture signal D23, the mean value of the predictive picture data in the forward direction and in the reverse direction, generated based on the past I-picture or P-picture and the future I-picture or P-picture stored in the frame memory 30, to obtain the difference data D24 as a predictive residual, and transmits this to the DCT part 25.
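The subtraction in the bidirectional predictive mode can be sketched as follows. The function name `bidirectional_residual` and the sample values are assumptions for illustration; a plain arithmetic mean is used, and any integer rounding applied by an actual encoder is not modeled.

```python
def bidirectional_residual(current, forward_pred, backward_pred):
    """Subtract the mean of the two predictions from the current picture."""
    mean_pred = [(f + b) / 2 for f, b in zip(forward_pred, backward_pred)]
    return [c - m for c, m in zip(current, mean_pred)]


# Three samples of the picture to be encoded and of its two predictions.
print(bidirectional_residual([10, 20, 30], [8, 20, 34], [12, 22, 30]))
# [0.0, -1.0, -2.0]
```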
At this time, the quantized data D26 of the B-picture outputted from the quantizing part 26 is outputted via the variable-length encoding part 33 and buffer 34 as the bit stream D34. Thereby, the decoded picture signal D10 inputted to the encoder 20 is reencoded into the picture type (B-picture) before being decoded in the decoder 2 provided prior to the encoder 20, and the reencoded data is outputted.
In this manner, when each of the pictures forming the bit stream D1 (FIG. 4) is temporarily decoded in the decoder 2 and then reencoded in the encoder 20, it is reencoded as the same picture type, so that the deterioration of image quality owing to reencoding can be avoided.
By the way, in the decoding and encoding apparatus 1 shown in FIG. 3, for example, if the bit stream D1 shown in FIG. 5A is inputted to the decoder 2, all the spaces between the I-picture (or P-picture) and the P-picture of this bit stream D1 are uniformly M=3 in order of display. Therefore, the decoded picture signal D10 in the displaying order obtained by decoding the bit stream D1 has a structure in which two B-pictures (pictures B1 and B2) are inserted between the picture I0 and the picture P3, as shown in FIG. 5B.
If the decoded picture signal D10 having such structure is inputted to the encoder 20 as shown in FIG. 5C, the encoder 20 encodes this and outputs the bit stream D34 delayed for three frames based on M=3, as shown in FIG. 5D. This bit stream D34 is encoded based on the picture coding type information D40, so that it has the same picture structure as the bit stream D1 inputted to the decoder 2.
On the other hand, if the bit stream D1 inputted to the decoder 2 of the decoding and encoding apparatus 1 shown in FIG. 3 is a bit stream whose GOP structure has been changed by editing, or one in which two bit streams different in GOP structure have been connected, the space between the I-picture (or P-picture) and the P-picture of the inputted bit stream D1 (i.e., the value of M) may change on the way.
That is, if the edited bit stream D1 shown in FIG. 7A is inputted to the decoder 2, the decoder 2 outputs a decoded picture signal D10 shown in FIG. 7B. In this decoded picture signal D10, the space between the I-picture (or P-picture) and the P-picture (i.e., the value of M) among the pictures aligned in order of display has been changed from M=3 into M=5 on the way, by prescribed editing performed before the bit stream was inputted to the decoder 2.
If such a decoded picture signal D10 is inputted to the encoder 20 (FIG. 6) as shown in FIG. 7C, the encoder 20 encodes each picture of the decoded picture signal D10 as one of the I-picture, P-picture and B-picture based on the picture coding type information D40 supplied from the decoder 2, as described above with reference to FIGS. 4 and 6.
As a result, the bit stream D34 shown in FIG. 7D is outputted from the encoder 20 at a timing delayed by three frames from the input signal (decoded picture signal D10). At this time, in the encoder 20 designed to encode a picture signal having the GOP structure with M=3, there is a problem that the capacity of the memory (picture sequence rearranging part 23) for storing the B-pictures becomes insufficient when the GOP structure of the picture signal is changed into M=5.
That is, in the case of FIG. 7D, the encoder 20 needs to rearrange each picture so that the P-picture (picture P8) follows two B-pictures (pictures B1 and B2) with respect to the inputted decoded picture signal D10, encode the P-picture (picture P8) which is to be the reference of four B-pictures (pictures B4, B5, B6 and B7) in M=5 prior to these four B-pictures (pictures B4, B5, B6 and B7), and store them in the frame memory 30 (FIG. 6) as reference pictures.
As described above, when performing such rearranging that the P-picture (picture P8) is moved before the four B-pictures (pictures B4, B5, B6 and B7), the picture sequence rearranging part 23, whose memory is insufficient to store the four B-pictures, breaks down.
As one method of solving the above problem, increasing the memory capacity of the picture sequence rearranging part 23 can be considered. In this case, however, if the header of the GOP of the decoded picture signal D10 (FIG. 7C) is M=3 but M is changed into 5 on the way, the encoder 20 cannot predict the change of the GOP structure. Therefore, the output of the bit stream D34 (FIG. 7D) is started with a delay of three frames from the input of the decoded picture signal D10. However, there is a problem that when the encoder 20 tries to output the picture P8 following the output of the picture B2 based on the picture coding type information D40 (FIG. 6) supplied from the decoder 2, the encoder 20 cannot encode and output the picture P8, because the output of the bit stream D34 has already been started with the three-frame delay.
Moreover, according to the reencoding method described above with reference to FIGS. 4 and 6, there is a problem that the delay time owing to reencoding becomes longer, because the total delay is the sum of the delay time in the decoder 2 (one frame) and the delay time in the encoder 20 (three frames in the case of M=3).
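The memory shortage can be illustrated with a sketch in which the rearranging memory holds at most a fixed number of B-pictures. The function name `reorder_with_capacity`, the use of an exception to signal the failure, and the exact picture sequence are assumptions for illustration; the sequence is inferred from the picture labels mentioned in the text, not copied from FIG. 7.

```python
def reorder_with_capacity(display, capacity):
    """Reorder to encoding order with room for only `capacity` B-pictures."""
    coded, pending_b = [], []
    for pic in display:
        if pic[0] == "B":
            if len(pending_b) == capacity:
                raise OverflowError(
                    f"no room to store {pic}; already holding {pending_b}")
            pending_b.append(pic)
        else:
            coded.append(pic)          # the reference picture goes first
            coded.extend(pending_b)    # then the stored B-pictures
            pending_b = []
    coded.extend(pending_b)
    return coded


# M=3 up to picture P3, then M=5 up to picture P8 (assumed sequence).
edited = ["I0", "B1", "B2", "P3", "B4", "B5", "B6", "B7", "P8"]

try:
    reorder_with_capacity(edited, capacity=2)   # sized for M=3 (M-1 = 2)
except OverflowError as err:
    print("rearranging failed:", err)

print(reorder_with_capacity(edited, capacity=4))  # sized for M=5 (M-1 = 4)
# ['I0', 'P3', 'B1', 'B2', 'P8', 'B4', 'B5', 'B6', 'B7']
```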
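The timing problem and the accumulated delay described above can be illustrated with a simple model. The model is an assumption made here: one picture is inputted and one picture is outputted per frame period, output starts a fixed number of frame periods after input starts, and a picture can be outputted no earlier than one frame period after it was inputted. The function name `first_late_picture` and the picture sequences are likewise hypothetical.

```python
def first_late_picture(display, coded, encoder_delay):
    """Return the first picture that is due for output before it can be
    encoded, or None if every picture is available in time."""
    arrival = {pic: t for t, pic in enumerate(display)}   # input frame period
    for slot, pic in enumerate(coded):
        output_time = slot + encoder_delay
        if arrival[pic] + 1 > output_time:                # not yet encodable
            return pic
    return None


uniform_display = ["I0", "B1", "B2", "P3", "B4", "B5", "P6"]
uniform_coded = ["I0", "P3", "B1", "B2", "P6", "B4", "B5"]

edited_display = ["I0", "B1", "B2", "P3", "B4", "B5", "B6", "B7", "P8"]
edited_coded = ["I0", "P3", "B1", "B2", "P8", "B4", "B5", "B6", "B7"]

print(first_late_picture(uniform_display, uniform_coded, 3))  # None
print(first_late_picture(edited_display, edited_coded, 3))    # P8
print(first_late_picture(edited_display, edited_coded, 5))    # None

# Total reencoding delay for M=3: decoder delay plus encoder delay.
DECODER_DELAY = 1
print(DECODER_DELAY + 3)  # 4
```

Under this model a three-frame delay is sufficient while M=3, but the picture P8 would have to be outputted before it has been inputted once M changes to 5; a five-frame delay chosen from the start would avoid this, at the cost of a longer total delay.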