In recent years, a distribution method generally called streaming has started to spread as a method of distributing motion picture data by using the Internet, wherein motion picture data which is obtained while an object is shot with a video camera or the like is sent to a user's personal computer or the like via the Internet so as to show motion picture based on the motion picture data in real time.
In streaming distribution of this kind, because the data transfer rate of the Internet is relatively low, a motion picture encoding apparatus to which a compression encoding method such as MPEG2 (Moving Picture Experts Group phase 2) is applied is provided on the sender side.
Now, the MPEG2 standard was established by organizations such as ISO/IEC JTC1/SC29/WG11 (International Organization for Standardization/International Electrotechnical Commission Joint Technical Committee 1/Sub Committee 29/Working Group 11), and adopts a hybrid encoding method which is a combination of motion compensation predictive encoding and discrete cosine transformation (DCT).
And the MPEG2 standard prescribes three picture types, that is, an in-frame encoded image (intra-encoded image) called an I (Intra)-picture, an inter-frame forward predictive encoded image called a P (Predictive)-picture, and a bidirectionally predictive encoded image called a B (Bidirectionally predictive)-picture, so as to sequentially assign any of the I-picture, P-picture and B-picture to frame image data constituting motion picture data in a predetermined order and then perform compression encoding.
Actually, the MPEG2 standard prescribes four types of predictive modes, that is, in-frame encoding, forward predictive encoding, backward predictive encoding and bidirectionally predictive encoding, where it is prescribed that a frame image to which the I-picture is assigned is compression-encoded by the in-frame encoding on a unit of a macro-block of 16 pixels×16 lines basis, a frame image to which the P-picture is assigned is compression-encoded by one of the in-frame encoding or the forward predictive encoding on the macro-block basis, and furthermore, a frame image to which the B-picture is assigned is compression-encoded by any one of the in-frame encoding, the forward predictive encoding, the backward predictive encoding and the bidirectionally predictive encoding on the macro-block basis.
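The relation between the picture types and the predictive modes described above can be summarized as a small lookup table. The following is only an illustrative sketch; the names `ALLOWED_MODES` and `is_mode_allowed` and the mode labels are hypothetical and are not taken from the MPEG2 standard.

```python
# Hypothetical labels; the mapping itself follows the description above.
ALLOWED_MODES = {
    "I": {"intra"},
    "P": {"intra", "forward"},
    "B": {"intra", "forward", "backward", "bidirectional"},
}


def is_mode_allowed(picture_type, mode):
    """Return True if a macro-block of this picture type may use this mode."""
    return mode in ALLOWED_MODES[picture_type]
```

For example, `is_mode_allowed("P", "backward")` is false, since a P-picture macro-block is limited to the in-frame encoding or the forward predictive encoding.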
Incidentally, as shown in FIG. 26, a motion picture encoding apparatus 1 to which the MPEG2 standard is applied captures the motion picture data supplied from the outside on the frame image data basis into a frame memory for inputting 2 having recording capacity of a plurality of frames, and sequentially assigns any of the I-picture, P-picture and B-picture to the frame image data captured into the frame memory for inputting 2 in a predetermined order, and also records picture type information representing the I-picture, P-picture and B-picture by associating it with the frame image data in the frame memory for inputting 2.
An operator 3 sequentially reads the frame image data to which the I-picture has been assigned in the frame memory for inputting 2 (hereafter, referred to as first frame image data) as data in a unit of the macro-block (hereafter, referred to as first macro-block data).
Every time the operator 3 reads the first macro-block data from the frame memory for inputting 2, a motion vector detector 4 reads the picture type information (that is, representing the I-picture) corresponding to the first macro-block data, and generates predictive mode data representing that the first macro-block data is compression-encoded by the in-frame encoding, based on that picture type information, and then sends it to a motion compensator 5 and a variable length coder 6.
The motion compensator 5 thereby stops a motion compensation process for the corresponding first macro-block data based on the predictive mode data (representing the in-frame encoding) given from the motion vector detector 4.
Accordingly, the operator 3 reads the first macro-block data from the frame memory for inputting 2, and sends the first macro-block data as it is to a discrete cosine transformer 7 since no data is given from the motion compensator 5 at this point.
The discrete cosine transformer 7 performs discrete cosine transformation on the first macro-block data given from the operator 3, and sends the obtained discrete cosine transformation coefficient to a quantizer 8.
The quantizer 8 detects the amount of the encoded data accumulated in a buffer 9 provided on an output stage (hereafter, referred to as the amount of accumulated data) in a predetermined cycle, and selects a quantization step according to the detected amount of accumulated data.
The quantizer 8 thereby quantizes the discrete cosine transformation coefficient given from the discrete cosine transformer 7, based on a corresponding quantization step, and sends the obtained quantization coefficient to the variable length coder 6 and a dequantizer 10 together with the quantization step.
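The description above does not state how the quantization step is derived from the amount of accumulated data. As a rough sketch only, the following assumes a linear mapping from buffer fullness to a step between 1 and 31, and a quantization that truncates toward zero; the function names, the step range and the rounding rule are all assumptions made for illustration.

```python
def select_quantization_step(accumulated, capacity, min_step=1, max_step=31):
    """Assumed rule: the fuller the buffer 9, the coarser the step."""
    fullness = min(max(accumulated / capacity, 0.0), 1.0)
    return min_step + round(fullness * (max_step - min_step))


def quantize(coefficients, step):
    """Divide each DCT coefficient by the step, truncating toward zero."""
    return [int(c / step) for c in coefficients]
```

With a half-full buffer, `select_quantization_step(500, 1000)` yields a step of 16 under this assumed rule, so small coefficients are quantized to zero and the encoded data amount falls.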
The variable length coder 6 performs the variable length coding (VLC) on the quantization coefficient given from the quantizer 8, with a Huffman code or the like, and also performs the variable length coding on the quantization step given from the quantizer 8 and the predictive mode data given from the motion vector detector 4, and then outputs the obtained encoded data to the outside via the buffer 9.
Thus, the motion picture encoding apparatus 1 sequentially compression-encodes the first frame image data in the frame memory for inputting 2 on the first macro-block data basis by the in-frame encoding, and outputs the obtained encoded data to the outside.
In addition, the dequantizer 10 dequantizes the quantization coefficient given from the quantizer 8, based on the quantization step likewise given from the quantizer 8, and sends the obtained discrete cosine transformation coefficient to an inverse-discrete cosine transformer 11.
The inverse-discrete cosine transformer 11 performs the inverse discrete cosine transformation (IDCT) on the discrete cosine transformation coefficient given from the dequantizer 10, and sends the obtained first macro-block data to an adder 12.
The adder 12, when the first macro-block data is given from the inverse-discrete cosine transformer 11, sends the first macro-block data as it is to a frame memory for reference 13 having recording capacity of a plurality of frames to store it therein, since no data is given from the motion compensator 5 at this point, and thus the first frame image data is reconstructed in the frame memory for reference 13.
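The local decoding path described above (quantizer 8, dequantizer 10 and inverse-discrete cosine transformer 11) can be sketched as follows. This is a simplified illustration, not the apparatus itself: it uses a one-dimensional orthonormal DCT on a single row of samples instead of a two-dimensional transform on a block, and the function names are hypothetical.

```python
import math


def _scale(k, n):
    """Normalization factor of the orthonormal DCT-II."""
    return math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)


def dct_1d(samples):
    n = len(samples)
    return [
        _scale(k, n) * sum(
            samples[i] * math.cos(math.pi * (2 * i + 1) * k / (2 * n))
            for i in range(n)
        )
        for k in range(n)
    ]


def idct_1d(coefficients):
    n = len(coefficients)
    return [
        sum(
            _scale(k, n) * coefficients[k]
            * math.cos(math.pi * (2 * i + 1) * k / (2 * n))
            for k in range(n)
        )
        for i in range(n)
    ]


def encode_and_reconstruct(samples, step):
    """Return the quantized levels and the locally decoded samples."""
    coefficients = dct_1d(samples)
    levels = [round(c / step) for c in coefficients]   # quantizer 8
    dequantized = [level * step for level in levels]   # dequantizer 10
    reconstructed = idct_1d(dequantized)               # inverse DCT 11
    return levels, reconstructed
```

The reconstructed samples differ from the input only by the quantization error, which is why the frame memory for reference 13 holds the same image a decoder would obtain rather than the original input image.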
On the other hand, the operator 3 sequentially reads the frame image data (hereafter, referred to as second frame image data) to which the P-picture is assigned in the frame memory for inputting 2 as data in a unit of the macro-block (hereafter, referred to as second macro-block data).
In this case, every time the second macro-block data is read from the frame memory for inputting 2 by the operator 3, the motion vector detector 4 reads the same second macro-block data and the picture type information corresponding thereto (that is, representing the P-picture) from the frame memory for inputting 2, and also reads the first or second frame image data on a more forward side (in the past time-wise) than the second macro-block data for reference purposes in forward prediction, based on that picture type information.
And while the motion vector detector 4 sequentially associates the second macro-block data with a plurality of block data for comparison by a block matching method in the first or second frame image data, it calculates a sum of absolute values of differences between the pixel values of the pixels in the second macro-block data and the pixel values of the pixels of the block data for comparison corresponding thereto respectively (hereafter, referred to as a predictive error).
Thus, the motion vector detector 4 selects the predictive error having the smallest value (hereafter, especially referred to as a minimum predictive error) out of the predictive errors sequentially calculated between the second macro-block data and the respectively corresponding block data for comparison, and also detects the block data for comparison which was used when the minimum predictive error is obtained (hereafter, referred to as forward approximate block data), as the best match data with the second macro-block data, and then detects a forward motion vector of the second macro-block data based on the amount of motion between the detected forward approximate block data and the second macro-block data.
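The block matching described above can be sketched as an exhaustive search over a small window. The sketch below is illustrative only: the search range, the function names, and the convention that the motion vector is the offset (dx, dy) from the position of the macro-block to the position of the best match in the reference frame are assumptions.

```python
def sad(block_a, block_b):
    """Sum of absolute differences between two equally sized blocks."""
    return sum(
        abs(a - b)
        for row_a, row_b in zip(block_a, block_b)
        for a, b in zip(row_a, row_b)
    )


def extract_block(frame, top, left, size):
    return [row[left:left + size] for row in frame[top:top + size]]


def find_motion_vector(current_block, reference_frame, top, left, size,
                       search_range):
    """Return (minimum predictive error, (dx, dy)) over the search window."""
    height, width = len(reference_frame), len(reference_frame[0])
    best = None
    for dy in range(-search_range, search_range + 1):
        for dx in range(-search_range, search_range + 1):
            y, x = top + dy, left + dx
            if y < 0 or x < 0 or y + size > height or x + size > width:
                continue  # candidate block would fall outside the frame
            error = sad(current_block,
                        extract_block(reference_frame, y, x, size))
            if best is None or error < best[0]:
                best = (error, (dx, dy))
    return best
```

Every candidate position costs one full sum of absolute differences, so the amount of operation grows with the square of the search range.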
In addition, the motion vector detector 4 calculates an average of the pixel values of the pixels in the second macro-block data, and then calculates the sum of absolute values of differences between the calculated average and the pixel values (hereafter, referred to as a distribution value), and then compares the calculated distribution value to the minimum predictive error.
As a result of this, the motion vector detector 4 determines that, if the distribution value is smaller than the minimum predictive error, distribution of the pixels (variation in pixel values) is small as to the second macro-block data, and so the data amount of the encoded data (hereafter, referred to as an encoded data amount) could be comparatively small even if the second macro-block data is compression-encoded as it is, so that it generates the predictive mode data representing that the second macro-block data is compression-encoded by the in-frame encoding, and then sends it to the motion compensator 5 and the variable length coder 6.
As opposed to this, the motion vector detector 4 determines that, if the distribution value is larger than the minimum predictive error, the distribution of the pixels (variation in pixel values) is large as to the second macro-block data, and so the encoded data amount could hardly be rendered small unless the second macro-block data is compression-encoded by the forward predictive encoding, so that it generates the predictive mode data representing that the second macro-block data is compression-encoded by the forward predictive encoding, and then sends it together with the motion vector of the second macro-block data to the motion compensator 5 and the variable length coder 6.
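The comparison between the distribution value and the minimum predictive error can be sketched as below. The function names are hypothetical, and because the description does not say what happens when the two values are equal, the sketch assumes that a tie selects the forward predictive encoding.

```python
def distribution_value(block):
    """Sum of absolute differences between each pixel and the block average."""
    pixels = [p for row in block for p in row]
    mean = sum(pixels) / len(pixels)
    return sum(abs(p - mean) for p in pixels)


def choose_p_mode(block, minimum_predictive_error):
    """Select the predictive mode for a macro-block of a P-picture."""
    if distribution_value(block) < minimum_predictive_error:
        return "intra"
    return "forward"  # assumption: a tie also selects forward prediction
```

A flat block has a distribution value of zero, so it is encoded by the in-frame encoding whenever the best match still leaves a non-zero predictive error.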
Then, the motion compensator 5 stops the motion compensation process for the second macro-block data when the predictive mode data representing that the in-frame encoding is applied to the second macro-block data is given from the motion vector detector 4.
In addition, when the motion vector to the second macro-block data and the predictive mode data representing the forward predictive encoding are given from the motion vector detector 4, the motion compensator 5 performs the motion compensation process and reads the first or second frame image data on the more forward side (in the past time-wise) than the second macro-block data, for reference purposes from the frame memory for reference 13.
And then, the motion compensator 5 extracts the block data for operation which is the best match with the second macro-block data, from the first or second frame image data based on the motion vector, and then sends it to the operator 3 and the adder 12.
When the in-frame encoding is selected as the predictive mode for the second macro-block data read from the frame memory for inputting 2, the operator 3 sends the second macro-block data as it is to the discrete cosine transformer 7 since no block data for operation is given from the motion compensator 5.
Thus, when the in-frame encoding is selected as the predictive mode for the second macro-block data, the motion picture encoding apparatus 1 has each of the discrete cosine transformer 7, the quantizer 8, the variable length coder 6, the buffer 9, the dequantizer 10, the inverse-discrete cosine transformer 11, the adder 12 and the frame memory for reference 13 operate just as in the case of compression-encoding the above-mentioned first macro-block data.
Thus, the motion picture encoding apparatus 1 performs the variable length coding on the second macro-block data together with the quantization step and the predictive mode data, and then outputs the obtained encoded data to the outside, and also decodes the compressed second macro-block data and stores it in the frame memory for reference 13.
In addition, when the forward predictive encoding is selected as the predictive mode for the second macro-block data read from the frame memory for inputting 2, the operator 3 subtracts the block data for operation given from the motion compensator 5, from the second macro-block data, and then sends the obtained difference data to the discrete cosine transformer 7.
In this case, the discrete cosine transformer 7 performs the discrete cosine transformation on the difference data given from the operator 3, and sends the obtained discrete cosine transformation coefficient to the quantizer 8.
In addition, the quantizer 8 quantizes the discrete cosine transformation coefficient based on the corresponding quantization step selected just as in the above-mentioned case of processing the first macro-block data, and sends the obtained quantization coefficient together with the quantization step to the variable length coder 6 and the dequantizer 10.
And then, the variable length coder 6 performs the variable length coding on that quantization coefficient with the Huffman code or the like, and also performs the variable length coding on the corresponding quantization step, the predictive mode data (representing the forward predictive encoding) and the motion vector, and then outputs the encoded data thus obtained to the outside via the buffer 9.
At this point, the dequantizer 10 dequantizes the quantization coefficient given from the quantizer 8, based on the quantization step given likewise from the quantizer 8, and sends the obtained discrete cosine transformation coefficient to the inverse-discrete cosine transformer 11.
In addition, the inverse-discrete cosine transformer 11 performs the inverse-discrete cosine transformation on the discrete cosine transformation coefficient given from the dequantizer 10, and sends the obtained difference data to the adder 12.
The adder 12 adds the difference data given from the inverse-discrete cosine transformer 11 and the block data for operation given from the motion compensator 5 at this point, and sends the obtained second macro-block data to the frame memory for reference 13 to store it therein.
Thus, the motion picture encoding apparatus 1 also reconstructs the second frame image data in the frame memory for reference 13 when sequentially compression-encoding the second frame image data on the second macro-block data basis.
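The roles of the operator 3 and the adder 12 in the forward predictive encoding can be illustrated with a small numeric sketch. To keep it short, the discrete cosine transformation is omitted and the difference data is quantized directly; that simplification, the function names and the sample values are assumptions made for illustration.

```python
def subtract_blocks(current, prediction):
    """Operator 3: difference data = macro-block data - block data for operation."""
    return [[c - p for c, p in zip(cr, pr)]
            for cr, pr in zip(current, prediction)]


def add_blocks(difference, prediction):
    """Adder 12: reconstructed data = difference data + block data for operation."""
    return [[d + p for d, p in zip(dr, pr)]
            for dr, pr in zip(difference, prediction)]


def quantize_and_dequantize(difference, step):
    """Quantize then dequantize (DCT omitted in this sketch)."""
    return [[round(d / step) * step for d in row] for row in difference]


current = [[10, 20], [30, 40]]
prediction = [[7, 25], [30, 36]]
difference = subtract_blocks(current, prediction)
reconstructed = add_blocks(quantize_and_dequantize(difference, 4), prediction)
```

Here `reconstructed` is `[[11, 21], [30, 40]]`, close to but not identical to `current`; storing this lossy result, rather than the input, keeps the reference frame consistent with what a decoder reconstructs.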
In addition, the operator 3 sequentially reads the frame image data to which the B-picture is assigned in the frame memory for inputting 2 (hereafter, referred to as third frame image data) as data in a unit of the macro-block (hereafter, referred to as third macro-block data).
In this case, every time the third macro-block data is read from the frame memory for inputting 2 by the operator 3, the motion vector detector 4 reads the same third macro-block data and the picture type information corresponding thereto (that is, representing the B-picture) from the frame memory for inputting 2, and also reads the first or second frame image data on the more forward side (in the past time-wise) and the first or second frame image data on the more backward side (in the future time-wise) than the third macro-block data for reference purposes in the forward prediction, backward prediction and bidirectional prediction, based on that picture type information.
And the motion vector detector 4 detects the forward approximate block data having the minimum predictive error (hereinafter, especially referred to as the forward minimum predictive error) in the first or second frame image data on the forward side by the block matching method and thereby detects the forward motion vector to the third macro-block data, as with the above-mentioned second macro-block data.
Likewise, the motion vector detector 4 detects the block data for comparison (hereinafter, referred to as backward approximate block data) having the minimum predictive error (hereinafter, especially referred to as backward minimum predictive error) in the first or second frame image data on the backward side by the block matching method and then detects a backward motion vector to the third macro-block data.
Furthermore, the motion vector detector 4 generates average approximate block data by averaging the forward approximate block data and backward approximate block data thus detected, so as to then calculate the predictive error between the generated average approximate block data and the third macro-block data (hereafter, referred to as bidirectional predictive error).
Thus, the motion vector detector 4 selects whichever of the forward minimum predictive error, the backward minimum predictive error and the bidirectional predictive error has the smallest value (hereafter, especially referred to as selected predictive error), and also calculates the distribution value as to the third macro-block data, as with the above-mentioned second macro-block data, and then compares the calculated distribution value to the selected predictive error.
As a result of this, the motion vector detector 4 determines that, if the distribution value is smaller than the selected predictive error, distribution of the pixels (variation) is small as to the third macro-block data, and so the encoded data amount could be relatively small even if the third macro-block data is compression-encoded as it is, so that it generates the predictive mode data representing that the third macro-block data is compression-encoded by the in-frame encoding, and then sends it to the motion compensator 5 and the variable length coder 6.
As opposed to this, the motion vector detector 4 determines that, if the distribution value is larger than the selected predictive error, the distribution of the pixels (variation) is large as to the third macro-block data, and so the encoded data amount could hardly be rendered small unless the third macro-block data is compression-encoded by a predictive mode other than the in-frame encoding.
In this case, when the selected predictive error is the forward minimum predictive error, the motion vector detector 4 generates predictive mode data representing that the third macro-block data is compression-encoded by the forward predictive encoding, and then sends it together with the forward motion vector of the third macro-block data to the motion compensator 5 and the variable length coder 6.
In addition, when the selected predictive error is the backward minimum predictive error, the motion vector detector 4 generates predictive mode data representing that the third macro-block data is compression-encoded by the backward predictive encoding, and then sends it together with the backward motion vector of the third macro-block data to the motion compensator 5 and the variable length coder 6.
Furthermore, when the selected predictive error is the bidirectional predictive error, the motion vector detector 4 generates predictive mode data representing that the third macro-block data is compression-encoded by the bidirectional predictive encoding, and then sends it together with both the forward and backward motion vectors of the third macro-block data to the motion compensator 5 and the variable length coder 6.
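The mode selection for the third macro-block data can be sketched as follows. The function name is hypothetical, and the handling of ties (the first of forward, backward and bidirectional wins, and an equal distribution value does not select the in-frame encoding) is an assumption, since the description does not address equal values.

```python
def choose_b_mode(distribution, forward_error, backward_error,
                  bidirectional_error):
    """Select the predictive mode for a macro-block of a B-picture."""
    candidates = [
        ("forward", forward_error),
        ("backward", backward_error),
        ("bidirectional", bidirectional_error),
    ]
    # The smallest of the three errors becomes the selected predictive error.
    mode, selected_error = min(candidates, key=lambda item: item[1])
    if distribution < selected_error:
        return "intra"
    return mode
```

For instance, with errors of 50, 30 and 40 and a distribution value of 100, the backward predictive encoding is selected; with a distribution value of 10, the in-frame encoding is selected instead.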
The motion compensator 5 stops the motion compensation process for the third macro-block data when the predictive mode data representing that the in-frame encoding is applied to the third macro-block data is given from the motion vector detector 4.
In addition, when the forward motion vector to the third macro-block data and the predictive mode data representing the forward predictive encoding are given from the motion vector detector 4, the motion compensator 5 performs the motion compensation process and reads the first or second frame image data on the more forward side (in the past time-wise) than the third macro-block data, for reference purposes from the frame memory for reference 13, and extracts the block data for operation which is the best match with the third macro-block data, from the read first or second frame image data, based on the forward motion vector, and then sends it to the operator 3 and the adder 12.
Furthermore, when the backward motion vector to the third macro-block data and the predictive mode data representing the backward predictive encoding are given from the motion vector detector 4, the motion compensator 5 also performs the motion compensation process and reads the first or second frame image data on the more backward side (in the future time-wise) than the third macro-block data, for reference purposes from the frame memory for reference 13, and extracts the block data for operation which is the best match with the third macro-block data, from the read first or second frame image data based on the backward motion vector, and then sends it to the operator 3 and the adder 12.
In addition to this, when both the forward and backward motion vectors to the third macro-block data and the predictive mode data representing the bidirectional predictive encoding are given from the motion vector detector 4, the motion compensator 5 also performs the motion compensation process and reads the first or second frame image data on the more forward side (in the past time-wise) and the first or second frame image data on the more backward side (in the future time-wise) than the third macro-block data, for reference purposes from the frame memory for reference 13.
And then, the motion compensator 5 extracts the block data for operation which is the best match with the third macro-block data, from the first or second frame image data on the forward side, based on the forward motion vector and also extracts the block data for operation which is the best match with the third macro-block data, from the first or second frame image data on the backward side, based on the backward motion vector, and then generates the average block data for operation by averaging the extracted two pieces of block data for operation, and sends it to the operator 3 and the adder 12.
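The averaging of the two pieces of block data for operation can be sketched as a per-pixel mean. The function name is hypothetical, and rounding a half upward for non-negative pixel values is an assumption of this sketch, since the description does not specify the rounding.

```python
def average_blocks(forward_block, backward_block):
    """Per-pixel average of two blocks, rounding halves upward."""
    return [
        [(f + b + 1) // 2 for f, b in zip(forward_row, backward_row)]
        for forward_row, backward_row in zip(forward_block, backward_block)
    ]
```

For example, averaging `[[10, 20], [1, 2]]` with `[[20, 21], [2, 2]]` gives `[[15, 21], [2, 2]]`.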
When the in-frame encoding is selected as the predictive mode for the third macro-block data read from the frame memory for inputting 2, the operator 3 sends the third macro-block data as it is to the discrete cosine transformer 7 since no data is given from the motion compensator 5.
Thus, when the in-frame encoding is selected as the predictive mode for the third macro-block data, the motion picture encoding apparatus 1 has each of the discrete cosine transformer 7, the quantizer 8, the variable length coder 6, the buffer 9, the dequantizer 10, the inverse-discrete cosine transformer 11, the adder 12 and the frame memory for reference 13 operate, just as when the above-mentioned first macro-block data is compression-encoded, and thus performs the variable length coding on the third macro-block data together with the quantization step and the predictive mode data, and then outputs the obtained encoded data to the outside, and also decodes the compressed third macro-block data and stores it in the frame memory for reference 13.
In addition, when any one of the forward predictive encoding, the backward predictive encoding and the bidirectional predictive encoding is selected as the predictive mode for the third macro-block data read from the frame memory for inputting 2, the operator 3 subtracts the block data for operation or the average block data for operation given from the motion compensator 5, from the third macro-block data, and then sends the obtained difference data to the discrete cosine transformer 7.
In this case, the discrete cosine transformer 7 performs the discrete cosine transformation on the difference data given from the operator 3, and sends the obtained discrete cosine transformation coefficient to the quantizer 8.
The quantizer 8 quantizes the discrete cosine transformation coefficient based on the corresponding quantization step selected just as in the above-mentioned case of processing the first macro-block data, and sends the obtained quantization coefficient together with the quantization step to the variable length coder 6 and the dequantizer 10.
And when the forward predictive encoding is selected as the predictive mode of the third macro-block data which is to be a basis of the quantization coefficient, the variable length coder 6 performs the variable length coding on that quantization coefficient with the Huffman code or the like, and also performs the variable length coding on the corresponding quantization step, the predictive mode data (representing the forward predictive encoding) and the forward motion vector, and then outputs the encoded data thus obtained to the outside via the buffer 9.
In addition, when the backward predictive encoding is selected as the predictive mode for the third macro-block data which is to be the basis of the quantization coefficient, the variable length coder 6 performs the variable length coding on the quantization coefficient with the Huffman code or the like, and also performs the variable length coding on the corresponding quantization step, the predictive mode data (representing the backward predictive encoding) and the backward motion vector, and then outputs the encoded data thus obtained to the outside via the buffer 9.
Furthermore, when the bidirectional predictive encoding is selected as the predictive mode for the third macro-block data which is to be the basis of the quantization coefficient, the variable length coder 6 performs the variable length coding on the quantization coefficient with the Huffman code or the like, and also performs the variable length coding on the corresponding quantization step, the predictive mode data (representing the bidirectional predictive encoding) and both the forward and backward motion vectors, and then outputs the encoded data thus obtained to the outside via the buffer 9.
At this time, the dequantizer 10 dequantizes the quantization coefficient given from the quantizer 8, based on the quantization step given likewise from the quantizer 8, and sends the obtained discrete cosine transformation coefficient to the inverse-discrete cosine transformer 11.
In addition, the inverse-discrete cosine transformer 11 performs the inverse-discrete cosine transformation on the discrete cosine transformation coefficient given from the dequantizer 10, and sends the obtained difference data to the adder 12.
Then, the adder 12 adds the difference data given from the inverse-discrete cosine transformer 11 and the block data for operation or the average block data for operation given from the motion compensator 5 at this point, and sends the obtained third macro-block data to the frame memory for reference 13 to store it therein.
Thus, the motion picture encoding apparatus 1 also reconstructs the third frame image data in the frame memory for reference 13 when sequentially compression-encoding the third frame image data on the third macro-block data basis.
Thus, the motion picture encoding apparatus 1 sequentially compression-encodes the motion picture data on the frame image data basis by repeating the order of the I-picture, the P-picture, and the B-picture located between the I-picture and P-picture or between two P-pictures, and then outputs the obtained encoded data to the outside.
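One possible predetermined order of picture types can be sketched as below. The description does not give the actual order, so the parameters `gop_size` (interval between I-pictures) and `anchor_interval` (interval between I- or P-pictures) and their default values are hypothetical. The sketch lists the types in input order and does not model any reordering of frames for encoding.

```python
def assign_picture_types(frame_count, gop_size=9, anchor_interval=3):
    """Assign I, P or B to each frame in input order (assumed pattern)."""
    types = []
    for index in range(frame_count):
        position = index % gop_size
        if position == 0:
            types.append("I")
        elif position % anchor_interval == 0:
            types.append("P")
        else:
            types.append("B")
    return types
```

With the default values, ten frames are assigned `I B B P B B P B B I`, so each B-picture lies between an I-picture and a P-picture or between two P-pictures.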
Incidentally, in such distribution of motion picture data by using the motion picture encoding apparatus 1, the motion picture data is compression-encoded at relatively high compressibility in order to comply with the data transfer rate of the Internet, and so the image quality (a degree representing whether or not there is noise) of the motion picture provided to a user deteriorates. As a result, requests for higher image quality of the distributed motion picture are increasingly voiced.
Thus, for such distribution of the motion picture data, a method has been proposed in which frame image data is excluded in advance, at predetermined intervals, from the motion picture data to be provided to the motion picture encoding apparatus 1, thereby changing the frame rate (that is, the number of frame images in the motion picture per unit time), and the remaining frame image data is then compression-encoded.
According to this method, it is considered that, as the number of pieces of the frame image data to be compression-encoded per unit time is reduced by lowering the frame rate of the motion picture data, the remaining frame image data can be sequentially compression-encoded at relatively low compressibility and thus the image quality of the motion picture provided to the user can be made higher.
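The reasoning above can be illustrated with simple arithmetic: at a fixed data transfer rate, halving the frame rate doubles the average amount of encoded data available per frame. The sketch below is illustrative only; the function names and the figures of 300,000 bits per second and 30 frames per second are assumed example values, not values from the description.

```python
def drop_frames(frames, keep_every):
    """Keep one frame out of every `keep_every` frames, at fixed intervals."""
    return frames[::keep_every]


def bits_per_frame(bit_rate, frame_rate):
    """Average encoded data amount available per frame."""
    return bit_rate / frame_rate


full_rate_budget = bits_per_frame(300_000, 30)   # 10000.0 bits per frame
half_rate_budget = bits_per_frame(300_000, 15)   # 20000.0 bits per frame
```

Note that `drop_frames` selects frames by position alone and never looks at the picture content.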
According to this method, however, the frame image data is merely excluded from the motion picture data at the predetermined intervals, irrespective of change in the picture of the motion picture, which has a problem that, if the picture of the motion picture remarkably changes between the frame image data remaining after the exclusion, the compressibility changes accordingly and so the image quality of the motion picture provided to the user consequently changes.
Further, in such distribution of motion picture data, a method has been proposed in which the motion picture encoding apparatus 1 sequentially traces and extracts the data of an image in an arbitrary shape from successive frame image data of the motion picture data and compression-encodes the extracted data of the image (hereinafter, referred to as extract image data).
By this method, because successive extract image data is extracted from the motion picture data, the amount of data to be compression-encoded can be reduced, and so the successive extract image data can be sequentially compression-encoded at relatively low compressibility. As a result, the successive extract images, which are a part of the motion picture, can be provided to users with higher image quality.
In this method, the motion vector of each piece of macro-block data is detected every frame image data, and the extract image data in the arbitrary shape is sequentially traced in the frame image data by using the detected motion vector.
Further, in this method, the motion vector of each piece of macro-block data is detected every frame image data, and the detected motion vector is compression-encoded together with the extract image data sequentially extracted from the frame image data based on the results of tracing the extract image data.
Thus, this method has a problem in that, because the motion vector of the macro-block data is detected separately for each of the tracing and the compression encoding of the extract image data, the amount of operation for detecting the motion vector increases and, as a result, the compression encoding of the extract image data needs a lot of processing time.