A description will be given of examples of an encoder and a decoder used in a conventional video coding transmission system with reference to FIGS. 1 and 2. FIG. 1 is a block diagram showing a schematic configuration of the encoder used in the conventional video coding transmission system. FIG. 2 is a block diagram showing a schematic configuration of the decoder used in the conventional video coding transmission system.
The encoder 120 shown in FIG. 1 and the decoder 140 shown in FIG. 2 are a digital video encoder and a digital video decoder, respectively, which are compliant with the H.263 coding system described in ITU-T Recommendation H.263 “Video coding for low bit rate communication”.
The encoder 120 reduces temporal redundancy by motion compensated inter-frame prediction and reduces spatial redundancy by orthogonal transformation (for example, DCT: Discrete Cosine Transform), so as to compress and encode an input video signal 2, which is a digital video.
An inputting section 121 receives the input video signal 2, namely, a time sequence of frame images. Herein, in the encoder 120, a frame image which is now being encoded is referred to as a “current frame”.
The inputting section 121 divides the “current frame” into square regions (“macro blocks (first image blocks)”) of 16×16 pixels and sequentially sends the “macro blocks” to a motion estimating section 122 and a subtracting section 124. Herein, a “macro block” which is now being encoded is referred to as a “current macro block”.
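The division performed by the inputting section 121 can be sketched as follows. This is a minimal illustration rather than the encoder's actual implementation: the function name `split_into_macro_blocks` and the representation of a frame as a list of rows of luminance samples are assumptions, and the frame dimensions are assumed to be multiples of 16.

```python
MB_SIZE = 16  # macro block edge length in pixels, as stated in the text


def split_into_macro_blocks(frame):
    """Yield (mb_row, mb_col, block) for each 16x16 macro block in raster order.

    `frame` is a list of rows of samples (hypothetical representation);
    its height and width are assumed to be multiples of 16.
    """
    height, width = len(frame), len(frame[0])
    for top in range(0, height, MB_SIZE):
        for left in range(0, width, MB_SIZE):
            # Cut a 16x16 window out of the frame at (top, left).
            block = [row[left:left + MB_SIZE] for row in frame[top:top + MB_SIZE]]
            yield top // MB_SIZE, left // MB_SIZE, block
```

Each yielded block plays the role of the “current macro block” in turn.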
The motion estimating section 122 estimates “motion vectors” and determines “macro block modes (described later)” on a macro block basis.
The motion estimating section 122 finds a portion (“motion prediction data”) similar to the current macro block in a predetermined search area of a frame image (referred to as a “reference frame”) which was encoded in the past and stored in a frame memory 132, and estimates an amount of two-dimensional spatial motion from the current macro block to the “motion prediction data” as a “motion vector”.
For example, the motion estimating section 122 can estimate the aforementioned “motion vectors” using “block matching”. Specifically, the motion estimating section 122 sets the search area around a spatial position of the current macro block in the “reference frame” within the frame memory 132, and calculates the “sum of squares of differences” or the “sum of absolute differences” between image data within the search area and the current macro block. The motion estimating section 122 then obtains the image data that minimizes the calculated “sum of squares of differences” or “sum of absolute differences” within the search area, as the “motion prediction data”. The motion estimating section 122 then estimates the amount of two-dimensional spatial motion from the current macro block to the “motion prediction data” as the “motion vector”.
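A minimal sketch of such block matching, using the “sum of absolute differences” and an exhaustive search, is given below. The function names, the `search_range` parameter, and the restriction to integer-pixel displacements are assumptions made for illustration only; the text does not specify the search strategy or the size of the search area.

```python
def sad(frame, top, left, block):
    """Sum of absolute differences between `block` and the region of
    `frame` of the same size whose top-left corner is (top, left)."""
    return sum(
        abs(frame[top + y][left + x] - block[y][x])
        for y in range(len(block))
        for x in range(len(block[0]))
    )


def estimate_motion_vector(reference, block, top, left, search_range=7):
    """Full-search block matching (hypothetical helper).

    (top, left) is the position of the current block. Returns
    ((dx, dy), cost), where (dx, dy) is the displacement from the current
    block to the best match in `reference`, and cost is its SAD.
    """
    size_y, size_x = len(block), len(block[0])
    height, width = len(reference), len(reference[0])
    best_mv, best_cost = None, None
    for dy in range(-search_range, search_range + 1):
        for dx in range(-search_range, search_range + 1):
            y, x = top + dy, left + dx
            # Skip candidates that fall outside the reference frame.
            if y < 0 or x < 0 or y + size_y > height or x + size_x > width:
                continue
            cost = sad(reference, y, x, block)
            if best_cost is None or cost < best_cost:
                best_mv, best_cost = (dx, dy), cost
    return best_mv, best_cost
```

Replacing `abs(...)` with a squared difference would give the “sum of squares of differences” variant. The region of the reference frame at the returned displacement corresponds to the “motion prediction data”.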
The motion estimating section 122 sends the estimated “motion vector” to a motion compensating section 123 and a variable length encoding section 127.
The motion estimating section 122 determines a “macro block mode” applied to the current macro block. Herein, the “macro block mode” indicates a method (“prediction mode”, the number of motion vectors, etc.) of generating a “predictive residual signal (described later)” for the current macro block. As shown in FIG. 3, the macro block mode includes the “INTRA mode (the intra-frame prediction mode is applied)”, the “INTER mode (the inter-frame prediction mode is applied)”, and the “INTER 4V mode (the inter-frame prediction mode with four motion vectors is applied)”.
The “prediction mode” indicates application of the “inter-frame prediction mode” which reduces the temporal redundancy or application of the “intra-frame prediction mode” which reduces the spatial redundancy for the current macro block.
Specifically, the motion estimating section 122 selects a “macro block mode” that minimizes the power of the predictive residual signal (described later) among the “INTRA mode”, “INTER mode”, and “INTER 4V mode” based on the estimated “motion vector”.
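The selection rule can be sketched as follows, assuming that a candidate residual has already been computed for each mode and that “power” means the sum of squared sample values. Both the function names and the dictionary-based interface are hypothetical; how the candidate residuals are obtained is not shown here.

```python
def power(residual):
    """Signal power of a residual block, taken here as the sum of squares."""
    return sum(v * v for row in residual for v in row)


def select_macro_block_mode(residuals_by_mode):
    """Return the mode whose candidate residual has the smallest power.

    `residuals_by_mode` maps a mode name ("INTRA", "INTER", "INTER4V")
    to the candidate residual block for that mode (assumed input).
    """
    return min(residuals_by_mode, key=lambda mode: power(residuals_by_mode[mode]))
```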
The motion estimating section 122 sends the determined “macro block mode” to the motion compensating section 123 and the variable length encoding section 127.
The motion compensating section 123 sends “control information” obtained based on the “macro block mode” and the “motion vector” sent from the motion estimating section 122, to the subtracting section 124.
For example, when the “macro block mode” sent from the motion estimating section 122 is the “INTRA mode”, the motion compensating section 123 notifies the subtracting section 124 of only the received “macro block mode (INTRA mode)” as the “control information” without forming a “predicted image block (described later)”, namely, without performing the motion compensated inter-frame prediction for the “current macro block”.
When the “macro block mode” received from the motion estimating section 122 is the “INTER mode” or “INTER 4V mode”, the motion compensating section 123 performs the motion compensated inter-frame prediction for the current macro block using the “motion vector” sent from the motion estimating section 122 and the reference frame stored in the frame memory 132, so as to form the “predicted image block”.
Herein, in the “INTER mode”, one motion vector is assigned to a macro block of 16×16 pixels. In the “INTER 4V mode”, one motion vector is assigned to each sub-block of 8×8 pixels.
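The difference between the two inter-frame modes can be illustrated by the following sketch of forming the “predicted image block”. It assumes integer-pixel motion vectors that stay inside the reference frame, and assumes that the four vectors of the “INTER 4V mode” are given in raster order of the 8×8 sub-blocks; the function names and the mode strings are hypothetical.

```python
def fetch(reference, top, left, size):
    """Copy a size x size region of the reference frame."""
    return [row[left:left + size] for row in reference[top:top + size]]


def predict_macro_block(reference, top, left, mode, motion_vectors):
    """Form the predicted image block for the macro block at (top, left).

    INTER:   motion_vectors holds one (dx, dy) for the whole 16x16 block.
    INTER4V: motion_vectors holds four (dx, dy), one per 8x8 sub-block,
             in raster order (assumed ordering).
    """
    if mode == "INTER":
        dx, dy = motion_vectors[0]
        return fetch(reference, top + dy, left + dx, 16)
    if mode == "INTER4V":
        predicted = [[0] * 16 for _ in range(16)]
        for index, (dx, dy) in enumerate(motion_vectors):
            off_y, off_x = (index // 2) * 8, (index % 2) * 8
            sub = fetch(reference, top + off_y + dy, left + off_x + dx, 8)
            for y in range(8):
                predicted[off_y + y][off_x:off_x + 8] = sub[y]
        return predicted
    raise ValueError("no predicted image block is formed in the INTRA mode")
```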
The motion compensating section 123 sends the “predicted image block”, the “macro block mode”, and the “motion vector” to the subtracting section 124 as the “control information”. Moreover, the motion compensating section 123 sends the “predicted image block” to an adding section 131.
The subtracting section 124 sends predetermined information to an orthogonal transforming section 125 according to the “control information” sent by the motion compensating section 123.
Specifically, when the “macro block mode” is the “INTER mode” or “INTER 4V mode”, the subtracting section 124 reduces the temporal redundancy between temporally consecutive macro blocks, by obtaining a difference between the “current macro block” sent from the inputting section 121 and the “predicted image block” sent from the motion compensating section 123.
Herein, the difference obtained by the subtracting section 124 is referred to as the “predictive residual signal”. The subtracting section 124 sends this “predictive residual signal” to the orthogonal transforming section 125.
When the “macro block mode” is the “INTRA mode”, the subtracting section 124 sends the “current macro block” sent from the inputting section 121 to the orthogonal transforming section 125 as it is, because the “predicted image block” is not sent from the motion compensating section 123.
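The behavior of the subtracting section 124 in the two cases can be summarized by the following sketch. The function name and the use of mode strings as the “control information” are assumptions for illustration.

```python
def subtracting_section(current_block, mode, predicted_block=None):
    """Return the block passed on to the orthogonal transform.

    INTER / INTER4V: the predictive residual signal, i.e. the sample-wise
    difference between the current macro block and the predicted block.
    INTRA: the current macro block itself, since no prediction exists.
    """
    if mode == "INTRA":
        return [row[:] for row in current_block]
    return [
        [c - p for c, p in zip(current_row, predicted_row)]
        for current_row, predicted_row in zip(current_block, predicted_block)
    ]
```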
The orthogonal transforming section 125 reduces the spatial redundancy within the “predictive residual signal”, by performing orthogonal transformation (for example, DCT) in sub-blocks of 8×8 pixels for the “predictive residual signal” sent from the subtracting section 124.
The orthogonal transforming section 125 sends “orthogonal transformation coefficients (for example, DCT coefficients)” obtained by the orthogonal transformation, to a quantizing section 126.
The quantizing section 126 quantizes the “orthogonal transformation coefficients” sent from the orthogonal transforming section 125. The quantizing section 126 then sends the “quantized orthogonal transformation coefficients” obtained by the quantization, to the variable length encoding section 127 and a dequantizing section 129.
The variable length encoding section 127 performs variable length encoding for the “quantized orthogonal transformation coefficients” sent by the quantizing section 126, and the “motion vector” and “macro block mode” sent from the motion estimating section 122, and multiplexes the results into a compressed bit stream 3. The variable length encoding section 127 sends the compressed bit stream 3 to an outputting section 128.
The outputting section 128 transmits the compressed bit stream 3 constituting one or a plurality of frame images sent from the variable length encoding section 127, to a network 1.
The dequantizing section 129 dequantizes the “quantized orthogonal transformation coefficients” sent by the quantizing section 126, and sends the obtained “orthogonal transformation coefficients” to an inverse orthogonal transforming section 130.
The inverse orthogonal transforming section 130 performs inverse orthogonal transformation (for example, inverse DCT) for the “orthogonal transformation coefficients” sent by the dequantizing section 129, and sends the “predictive residual signal” obtained by the inverse orthogonal transformation, to the adding section 131.
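The path through the orthogonal transforming section 125 and the quantizing section 126, and back through the dequantizing section 129 and the inverse orthogonal transforming section 130, can be sketched as follows. The sketch uses an orthonormal 8×8 DCT and, as a simplifying assumption, a uniform quantizer with a single step size; the quantizer actually defined by H.263 differs in detail, and all function names are hypothetical.

```python
import math

N = 8  # edge length of the sub-blocks handled by the orthogonal transform

# Orthonormal DCT-II basis: C[k][n] = a(k) * cos((2n + 1) * k * pi / (2N)),
# with a(0) = sqrt(1/N) and a(k) = sqrt(2/N) for k > 0.
C = [
    [
        (math.sqrt(1 / N) if k == 0 else math.sqrt(2 / N))
        * math.cos((2 * n + 1) * k * math.pi / (2 * N))
        for n in range(N)
    ]
    for k in range(N)
]


def matmul(a, b):
    """Product of two N x N matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(N)) for j in range(N)] for i in range(N)]


def transpose(m):
    return [list(col) for col in zip(*m)]


def dct2(block):
    """Forward 2-D DCT of an 8x8 block: C * block * C^T."""
    return matmul(matmul(C, block), transpose(C))


def idct2(coeffs):
    """Inverse 2-D DCT of an 8x8 coefficient block: C^T * coeffs * C."""
    return matmul(matmul(transpose(C), coeffs), C)


def quantize(coeffs, step):
    """Uniform quantization (assumed): map each coefficient to an integer level."""
    return [[round(c / step) for c in row] for row in coeffs]


def dequantize(levels, step):
    """Inverse of `quantize`, up to the rounding error introduced there."""
    return [[q * step for q in row] for row in levels]
```

Quantization is the only lossy step in this chain: `idct2(dct2(block))` reproduces the block up to floating-point error, whereas inserting `quantize` and `dequantize` in between introduces an error that grows with `step`.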
The adding section 131 sends the result of adding up the “predicted image block” sent by the motion compensating section 123 and the “predictive residual signal” sent by the inverse orthogonal transforming section 130, to the frame memory 132.
When the “INTRA mode” is selected as the “macro block mode”, the adding section 131 sends the “predictive residual signal” sent by the inverse orthogonal transforming section 130 (which corresponds to the current macro block sent by the inputting section 121) to the frame memory 132 as it is, because the “predicted image block” is not generated by the motion compensating section 123 (motion compensated inter-frame prediction is not performed).
The frame memory 132 constructs and stores the “reference frame” based on the information sent by the adding section 131 (the current macro block). The frame memory 132 sends the “reference frame” to the motion estimating section 122 and the motion compensating section 123.
The decoder 140 shown in FIG. 2 reproduces an output video signal 4 from the compressed bit stream 3 sent by the encoder 120.
An inputting section 141 receives the compressed bit stream 3, and sends the same to a variable length decoding section 142.
The variable length decoding section 142 decodes the “quantized orthogonal transformation coefficients”, the “motion vector”, and the “macro block mode” for each macro block starting from the head of each frame image in the compressed bit stream 3 sent by the inputting section 141.
The variable length decoding section 142 sends the decoded “quantized orthogonal transformation coefficients” to a dequantizing section 143. When the “macro block mode” is the “INTER mode” or “INTER 4V mode”, the variable length decoding section 142 sends the one or several decoded “motion vectors” and “macro block mode” to a motion compensating section 145.
The dequantizing section 143 dequantizes the “quantized orthogonal transformation coefficients” sent by the variable length decoding section 142, so as to obtain the “orthogonal transformation coefficients” and to send the obtained “orthogonal transformation coefficients” to an inverse orthogonal transforming section 144.
The inverse orthogonal transforming section 144 performs inverse orthogonal transformation for the “orthogonal transformation coefficients” sent by the dequantizing section 143, so as to obtain the “predictive residual signal” and to send the obtained “predictive residual signal” to an adding section 146.
The motion compensating section 145 generates the “predicted image block” based on the reference frame stored in a frame memory 147 and the “motion vector” and “macro block mode” sent by the variable length decoding section 142, and sends the generated “predicted image block” to the adding section 146.
The adding section 146 adds up the “predictive residual signal” sent by the inverse orthogonal transforming section 144 and the “predicted image block” sent by the motion compensating section 145, so as to generate a macro block constituting the output video signal 4 and to send the generated macro block to an outputting section 148.
However, when the “macro block mode” is the “INTRA mode”, the adding section 146 sends the “predictive residual signal” sent by the inverse orthogonal transforming section 144 to the outputting section 148 as the macro block constituting the output video signal 4, because the “predicted image block” is not sent by the motion compensating section 145.
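The reconstruction performed by the adding section 146, which mirrors that of the adding section 131 in the encoder 120, can be sketched as follows. The function name and the mode strings are hypothetical.

```python
def adding_section(residual, mode, predicted_block=None):
    """Reconstruct a macro block from the decoded residual.

    INTER / INTER4V: sample-wise sum of the predictive residual signal
    and the predicted image block.
    INTRA: the decoded signal is already the macro block, since no
    predicted image block is supplied.
    """
    if mode == "INTRA":
        return [row[:] for row in residual]
    return [
        [r + p for r, p in zip(residual_row, predicted_row)]
        for residual_row, predicted_row in zip(residual, predicted_block)
    ]
```

Because the encoder and the decoder apply the same reconstruction to the same dequantized data, the “reference frames” held in the frame memory 132 and the frame memory 147 remain identical.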
The frame memory 147 constructs and stores the “reference frame” based on the information sent by the adding section 146 (the macro block). The frame memory 147 sends the “reference frame” to the motion compensating section 145.
The outputting section 148 constructs the output video signal 4 based on the information sent by the adding section 146 (the macro block), and outputs the output video signal 4 to a display device (not shown) at a predetermined timing of display.
As described above, in the conventional video coding transmission system, the “macro block mode” is determined for each macro block, and the “coding information” (motion vector, quantization parameter, etc.) set for each macro block is shared in the coding process, thus increasing the coding efficiency.
However, the conventional video coding transmission system cannot set a plurality of “macro block modes” in one macro block. Accordingly, there was a problem in that efficient coding could not be performed when one macro block includes both a portion (bird portion) which should be subjected to coding with the “intra-frame prediction mode” and a portion (cloud portion) which should be subjected to coding with the “inter-frame prediction mode”, as shown in FIG. 4.
In order to solve this problem, there is a method in which the macro blocks are made smaller, so that the unit for switching between the “macro block modes” is reduced. However, this method increases the number of macro blocks and thus increases the number of times the coding information necessary for coding each macro block is transmitted, which reduces the coding efficiency.
Therefore, the present invention was made in the light of the aforementioned problems, and an object thereof is to switch the “macro block modes”, so as to allow the portion which is subjected to coding with the “intra-frame prediction mode” and the portion which is subjected to coding with the “inter-frame prediction mode” to be mixed in one macro block without changing the size and framework of macro blocks.