1. Field of the Invention
The present invention relates to a coding apparatus and a coding method for applying coding to an image signal for output, and a decoding apparatus and a decoding method for decoding a coded image signal for output.
2. Description of the Related Art
H.264/MPEG4 AVC (hereinafter, referred to as AVC) has been standardized as a system that realizes a coding efficiency nearly twice that of image coding schemes of the related art such as MPEG2 and MPEG4. The AVC standard is similar in processing to the above-mentioned image coding schemes of the related art in that an image signal is coded by using an orthogonal transform process and a motion compensation process. However, as opposed to image coding schemes of the related art, the AVC standard realizes a relatively high coding efficiency due to the high degree of freedom in terms of coding tools used when coding each element constituting a coding process (see Japanese Unexamined Patent Application Publication No. 2006-94081).
A coding process conforming to the above AVC standard is realized by, for example, a coding apparatus shown in FIG. 20.
A coding apparatus 8 includes a subtraction section 81, an orthogonal transform section 82, a quantization section 83, an inverse quantization section 84, an inverse orthogonal transform section 85, an addition section 86, a pixel storage memory 87, an ME/MD processing section 88, a parameter storage memory 89, an intra prediction section 90, an inter prediction section 91, and a coding section 92.
The subtraction section 81 subtracts a predicted pixel generated by the intra prediction section 90 or the inter prediction section 91 described later from an inputted pixel, and supplies the resulting difference pixel to the orthogonal transform section 82.
The orthogonal transform section 82 applies an orthogonal transform to the difference pixel supplied from the subtraction section 81, in units of a macroblock made up of a plurality of pixels. Then, the orthogonal transform section 82 supplies orthogonal transform coefficients within the orthogonal transformed macroblock to the quantization section 83.
The quantization section 83 quantizes, in accordance with quantization parameters, the orthogonal transform coefficients within each macroblock supplied from the orthogonal transform section 82, and supplies the quantized orthogonal transform coefficients to the coding section 92. Then, the coding section 92 outputs a bit stream obtained by applying variable-length coding or the like to the quantized orthogonal transform coefficients supplied from the quantization section 83. The AVC standard specifies that coding be carried out through entropy coding such as Context-based Adaptive Variable-Length Coding (CAVLC) or Context-based Adaptive Binary Arithmetic Coding (CABAC).
The quantization section 83 supplies the quantized orthogonal transform coefficients to the coding section 92, and also to the inverse quantization section 84, the ME/MD processing section 88, and the parameter storage memory 89.
The inverse quantization section 84 inverse quantizes the quantized orthogonal transform coefficients in accordance with the quantization parameters used in the quantization section 83, and supplies the inverse quantized orthogonal transform coefficients to the inverse orthogonal transform section 85.
The inverse orthogonal transform section 85 converts each of the orthogonal transform coefficients within each macroblock supplied from the inverse quantization section 84, into a difference pixel constituting this macroblock, and supplies the difference pixel to the addition section 86.
The addition section 86 adds the difference pixel supplied from the inverse orthogonal transform section 85 and a predicted pixel supplied from the intra prediction section 90 or the inter prediction section 91 described later together to thereby generate a reference pixel, and supplies the reference pixel to the pixel storage memory 87.
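The path from the subtraction section 81 through the addition section 86 can be illustrated with a minimal sketch. It is not the AVC process itself: the orthogonal transform is omitted, a uniform scalar quantizer stands in for the quantization section, and all function and parameter names (`quantize`, `code_and_reconstruct`, `qstep`) are assumptions introduced only for illustration.

```python
def quantize(value, qstep):
    """Uniform scalar quantizer with round-half-away-from-zero (assumed)."""
    magnitude = (abs(value) + qstep // 2) // qstep
    return magnitude if value >= 0 else -magnitude


def code_and_reconstruct(input_pixels, predicted_pixels, qstep):
    """Toy local decoding loop; the orthogonal transform step is omitted."""
    # Subtraction section: difference pixel = input pixel - predicted pixel.
    diff = [x - p for x, p in zip(input_pixels, predicted_pixels)]
    # Quantization section: these levels would go to the coding section.
    levels = [quantize(d, qstep) for d in diff]
    # Inverse quantization section: rebuild the (lossy) difference pixels.
    recon_diff = [level * qstep for level in levels]
    # Addition section: reference pixel = predicted pixel + rebuilt difference.
    reference = [p + r for p, r in zip(predicted_pixels, recon_diff)]
    return levels, reference


levels, reference = code_and_reconstruct([100, 104, 97, 90], [98, 98, 98, 98], 4)
```

The reference pixels are built from the quantized data rather than from the input pixels, so the coding side predicts from the same pixels a decoder would reconstruct.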
The pixel storage memory 87 stores the reference pixel supplied from the addition section 86 on a picture-by-picture basis. Letting the current picture to be coded be P(N) (N is a natural number), and a picture that was coded M (M is a natural number) pictures prior be P(N-M), the pixel storage memory 87 has already stored reference pixels corresponding to P(N-1), P(N-2), . . . , P(N-M). Since these reference pixels are used in the ME/MD processing section 88 and the inter prediction section 91 described later, the pixel storage memory 87 stores these reference pixels for a fixed period of time.
The ME/MD processing section 88 carries out a motion estimation (ME) process and a mode decision (MD) process with respect to the current macroblock being coded, specifically by referring to coding information stored in the parameter storage memory 89 as described later. Then, in accordance with the results of the ME process and MD process, the ME/MD processing section 88 decides the best prediction mode for the macroblock being currently processed. Further, the ME/MD processing section 88 supplies information necessary for intra prediction to the intra prediction section 90 upon selecting intra prediction as the best prediction mode, and supplies information necessary for inter prediction to the inter prediction section 91 upon selecting inter prediction as the best prediction mode.
The parameter storage memory 89 stores information related to previously coded pictures P(N-1), P(N-2), . . . , P(N-M), which is required for the ME/MD processing section 88 to code the current macroblock to be coded in skip/direct mode described later, from among pieces of information supplied from the quantization section 83.
The intra prediction section 90 generates predicted pixels from already coded pixels within a picture in which the current macroblock to be coded exists, in accordance with the results of decision by the ME/MD processing section 88, and supplies the predicted pixels to the subtraction section 81 and the addition section 86.
The inter prediction section 91 generates predicted pixels by using reference pixels stored in the pixel storage memory 87, in accordance with the results of decision by the ME/MD processing section 88, and supplies the predicted pixels to the subtraction section 81 and the addition section 86.
Next, the specific configuration of the ME/MD processing section 88 and its operation will be described with reference to FIG. 21.
The ME/MD processing section 88 includes an intra search processing section 101 that selects the best mode from among intra prediction modes, a skip/direct search processing section 102 that refers to information stored in the parameter storage memory 89 to obtain motion information in skip/direct prediction mode, an L0/L1/Bi search processing section 103 that obtains the best mode and motion information related to other prediction modes, and an MD processing section 104 that selects the best mode from among the modes selected by the respective search processing sections.
The intra search processing section 101 selects the best mode from among intra prediction modes in accordance with the image signal of an input pixel to be coded, and supplies information related to a coding process corresponding to the selected mode, to the MD processing section 104.
The skip/direct search processing section 102 includes an MV deriving section 105 that derives motion vectors in spatial direct mode and motion vectors in temporal direct mode, from among skip/direct prediction modes described later, with respect to a macroblock of which the slice type is B slice, and a post-processing section 106 that selects the best prediction mode in accordance with the motion vectors derived by the MV deriving section 105.
The MV deriving section 105 refers to information stored in the parameter storage memory 89 to derive motion vectors in spatial direct mode and motion vectors in temporal direct mode as described later, and supplies the derived motion vectors to the post-processing section 106.
The post-processing section 106 selects the best mode from among skip/direct prediction modes by using the motion vectors derived by the MV deriving section 105, and supplies to the MD processing section 104 information related to a coding process corresponding to the selected mode.
The L0/L1/Bi search processing section 103 selects the best mode from among a prediction (L0 prediction) using pictures located forward of the current picture to be coded in display order, a prediction (L1 prediction) using pictures located backward of the current picture to be coded in display order, and a prediction (Bi prediction) using pictures located both forward and backward of the current picture to be coded in display order. The L0/L1/Bi search processing section 103 then supplies information related to a coding process corresponding to the selected mode, to the MD processing section 104.
In the following, the description will be focused on the processing of the MV deriving section 105, which derives motion vectors in skip/direct prediction mode, of the processing sections included in the ME/MD processing section 88 described above.
The MV deriving section 105 derives motion vectors by different processes between spatial direct mode and temporal direct mode.
First, a motion vector derivation process in spatial direct mode will be described with reference to a flowchart shown in FIG. 22.
In step S101, the MV deriving section 105 derives, from coding information of macroblocks adjacent to a macroblock to be coded, reference picture indices refIdxL0/refIdxL1 indicating reference pictures for coding the macroblock to be coded by L0 prediction and L1 prediction, and a flag directZeroPredictionFlag indicating whether or not a motion vector in direct mode is zero, and the processing proceeds to step S102.
In step S102, the MV deriving section 105 derives the motion vector mvCol of an anchor block, and the reference picture index refIdxCol of the anchor block by inputting mbPartIdx indicating the position of an 8×8 pixel block within a macroblock to be coded, and subMbPartIdx indicating the position of a 4×4 pixel block within a sub-macroblock, and proceeds to step S103. Here, an anchor picture refers to a picture with the lowest valued reference picture index in L1 prediction. Further, an anchor block refers to a block of an anchor picture which is at the same spatial position as the block to be coded. Normally, the closest reference picture located backward of the picture to be coded in display order is selected as the anchor picture.
In step S103, the MV deriving section 105 determines whether or not the information related to the anchor picture derived in step S102 satisfies all of the first to third conditions described below.
That is, as the first condition, the MV deriving section 105 determines whether or not the values of mvCol[0] and mvCol[1], respectively indicating the horizontal and vertical components of the motion vector of an anchor block, are both within the range of ±1.
Further, as the second condition, the MV deriving section 105 determines whether or not the reference picture index refIdxCol of a reference picture for an anchor block is 0, that is, whether or not an anchor picture is the closest reference picture located backward of the picture to be coded in display order.
Further, as the third condition, the MV deriving section 105 determines whether or not the picture type of RefPicList1[0], the picture with the lowest valued reference picture index of L1 prediction, is a short-term reference picture.
The MV deriving section 105 determines whether or not all of the first to third conditions described above are satisfied. Then, the MV deriving section 105 proceeds to step S104 if all of the conditions are satisfied, and proceeds to step S105 if not all of the conditions are satisfied.
In step S104, the MV deriving section 105 sets a flag ColZeroFlag to 1 and proceeds to step S106.
In step S105, the MV deriving section 105 sets the flag ColZeroFlag to 0 and proceeds to step S106.
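The decision made in steps S103 to S105 can be sketched as a single function. The function name and the boolean parameter `l1_ref0_is_short_term` are illustrative assumptions; the three conditions are those listed above.

```python
def col_zero_flag(mv_col, ref_idx_col, l1_ref0_is_short_term):
    """Return ColZeroFlag (1 or 0) from the anchor block information."""
    # First condition: both components of mvCol lie within the range -1..+1.
    small_motion = all(-1 <= component <= 1 for component in mv_col)
    # Second condition: the anchor block refers to reference picture index 0.
    refers_to_index_zero = ref_idx_col == 0
    # Third condition: RefPicList1[0] is a short-term reference picture.
    # Step S104 (flag = 1) only when all three hold, otherwise step S105 (flag = 0).
    if small_motion and refers_to_index_zero and l1_ref0_is_short_term:
        return 1
    return 0
```

For example, an anchor block with mvCol = (1, -1) and refIdxCol = 0 yields 1 when RefPicList1[0] is a short-term reference picture, while mvCol = (2, 0) yields 0 regardless of the other two conditions.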
In step S106, the MV deriving section 105 derives the vector X representing the horizontal direction and vertical direction of the motion vector of the macroblock to be coded, and proceeds to step S107.
In step S107, the MV deriving section 105 makes a condition determination on the basis of the motion information of adjacent macroblocks refIdxL0, refIdxL1, and directZeroPredictionFlag, and the flag ColZeroFlag set in step S104 or step S105. If a predetermined condition is satisfied, the MV deriving section 105 proceeds to step S108, and if the predetermined condition is not satisfied, the MV deriving section 105 proceeds to step S109.
In step S108, the MV deriving section 105 sets both the vertical and horizontal values of the motion vector mvLX of the macroblock to be coded in spatial direct mode to 0, and terminates the procedure.
In step S109, the MV deriving section 105 derives the motion vector mvLX of the macroblock to be coded in spatial direct mode by using motion information of adjacent macroblocks, and terminates the procedure.
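Steps S107 to S109 can be sketched as follows. The text above leaves the "predetermined condition" unspecified; the condition used here is the one commonly described for spatial direct mode in H.264 and should be treated as an assumption to be checked against the standard. The argument `neighbour_mv_pred` is a hypothetical stand-in for the vector obtained from the adjacent macroblocks in step S109.

```python
def spatial_direct_mv(ref_idx_lx, direct_zero_prediction_flag,
                      col_zero, neighbour_mv_pred):
    """Return mvLX for one prediction direction (L0 or L1)."""
    # Assumed form of the step S107 condition.
    use_zero_vector = (
        direct_zero_prediction_flag == 1
        or ref_idx_lx < 0
        or (ref_idx_lx == 0 and col_zero == 1)
    )
    if use_zero_vector:
        # Step S108: both components of mvLX are set to 0.
        return (0, 0)
    # Step S109: use the vector predicted from adjacent macroblocks.
    return neighbour_mv_pred
```

The function is called once per direction, with refIdxL0 for mvL0 and refIdxL1 for mvL1.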
Further, the procedure for deriving motion information of an anchor block in step S102 is carried out by the MV deriving section 105 in accordance with a flowchart as shown in FIG. 23.
In step S201, the MV deriving section 105 derives information related to an anchor picture, and proceeds to step S202.
In step S202, the MV deriving section 105 derives the following pieces of information related to an anchor block from the information derived in step S201, by referring to information stored in the parameter storage memory 89.
That is, the MV deriving section 105 derives the macroblock address mbAddrCol of the anchor block, the block index mbPartIdxCol indicating an 8×8 pixel block within a macroblock of the anchor block, the block index subMbPartIdxCol indicating a 4×4 pixel block within a sub-macroblock of the anchor block, the flag predFlagL0Col indicating the prediction mode of L0 prediction of the anchor block, the flag predFlagL1Col indicating the prediction mode of L1 prediction of the anchor block, the macroblock type mb_type of the anchor block, the motion vector MvL0[mbPartIdxCol][subMbPartIdxCol] of L0 prediction of the anchor block, the reference picture index RefIdxL0[mbPartIdxCol] of L0 prediction of the anchor block, the motion vector MvL1[mbPartIdxCol][subMbPartIdxCol] of L1 prediction of the anchor block, the reference picture index RefIdxL1[mbPartIdxCol] of L1 prediction of the anchor block, and the ratio vertMvScale between vertical components of the motion vector of the anchor block and motion vector of the macroblock to be coded. Then, the MV deriving section 105 proceeds to step S203.
In step S203, the MV deriving section 105 determines the prediction mode of the anchor block, from the macroblock type of the anchor block derived in step S202 and its prediction mode, and proceeds to step S204 or step S205 depending on the result of the determination.
In step S204, the MV deriving section 105 sets both the horizontal and vertical components of the motion vector mvCol of the anchor block to 0 and also sets the reference picture index refIdxCol of the anchor block to −1, and then proceeds to step S103.
In step S205, the MV deriving section 105 determines whether or not the value of the flag predFlagL0Col indicating the prediction mode of L0 prediction of the anchor block is 1. The MV deriving section 105 proceeds to step S206 if the value is 1, and proceeds to step S207 if the value is not 1.
In step S206, the MV deriving section 105 sets the motion vector mvCol of the anchor block to the motion vector MvL0[mbPartIdxCol][subMbPartIdxCol] of L0 prediction of the anchor block, sets the reference picture index refIdxCol of the anchor block to the reference picture index RefIdxL0[mbPartIdxCol] of L0 prediction of the anchor block, and proceeds to step S103.
In step S207, the MV deriving section 105 sets the motion vector mvCol of the anchor block to the motion vector MvL1[mbPartIdxCol][subMbPartIdxCol] of L1 prediction of the anchor block, sets the reference picture index refIdxCol of the anchor block to the reference picture index RefIdxL1[mbPartIdxCol] of L1 prediction of the anchor block, and proceeds to step S103.
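The branch structure of steps S203 to S207 can be sketched as below. The exact test applied in step S203 is not spelled out above; this sketch assumes that step S204 is taken when the anchor block is intra coded or uses neither L0 nor L1 prediction. The function name and the `is_intra` parameter are illustrative.

```python
def derive_col_motion(is_intra, pred_flag_l0_col, pred_flag_l1_col,
                      mv_l0, ref_idx_l0, mv_l1, ref_idx_l1):
    """Return (mvCol, refIdxCol) for the anchor block."""
    # Step S203 (assumed test): no usable motion information in the anchor block.
    if is_intra or (pred_flag_l0_col == 0 and pred_flag_l1_col == 0):
        # Step S204: zero vector and reference picture index -1.
        return (0, 0), -1
    # Step S205: prefer the L0 motion information when predFlagL0Col is 1.
    if pred_flag_l0_col == 1:
        # Step S206: take the L0 motion vector and reference index.
        return mv_l0, ref_idx_l0
    # Step S207: otherwise take the L1 motion vector and reference index.
    return mv_l1, ref_idx_l1
```

Returning refIdxCol = -1 in the intra case means the second condition of step S103 (refIdxCol equal to 0) cannot be satisfied, so ColZeroFlag becomes 0 for such blocks.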
Next, a motion vector derivation process in temporal direct mode will be described with reference to a flowchart shown in FIG. 24.
In step S301, the MV deriving section 105 derives, as information of the anchor block, information of an anchor picture colPic, the macroblock address mbAddrCol of the anchor block, the motion vector mvCol of the anchor block, the reference picture index refIdxCol of the anchor block, and the ratio vertMvScale between vertical components of the motion vector of the anchor block and motion vector of the macroblock to be coded, by inputting mbPartIdx indicating the position of an 8×8 pixel block within the macroblock to be coded, and subMbPartIdx indicating the position of a 4×4 pixel block within a sub-macroblock. Then, the MV deriving section 105 proceeds to step S302.
In step S302, the MV deriving section 105 derives the reference picture index refIdxL0 of L0 prediction and the reference picture index refIdxL1 of L1 prediction from the information derived in step S301, and proceeds to step S303.
In step S303, the MV deriving section 105 modifies the motion vector mvCol of the anchor block and proceeds to step S304.
In step S304, the MV deriving section 105 derives the current picture to be coded currPicOrField, a picture pic0 with the lowest index of reference pictures of L0 prediction, and a picture pic1 with the lowest index of reference pictures of L1 prediction, and proceeds to step S305.
In step S305, the MV deriving section 105 derives the motion vector mvL0 of L0 prediction of the anchor block and the motion vector mvL1 of L1 prediction of the anchor block, from the relationship in display order between currPicOrField, pic0, and pic1 derived in step S304.
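The scaling by display-order distance in step S305 can be sketched as follows. The constants and rounding follow the temporal direct derivation of H.264 as it is commonly described and should be verified against the standard. Display order is represented here by picture order counts passed as plain integers, which is an assumption of this sketch, as are the function names. Long-term reference handling and the field/frame adjustment of step S303 are omitted.

```python
def clip3(low, high, value):
    """Clamp value to the inclusive range [low, high]."""
    return max(low, min(high, value))


def div_trunc(a, b):
    """Integer division truncating toward zero, as in C."""
    quotient = abs(a) // abs(b)
    return quotient if (a >= 0) == (b >= 0) else -quotient


def temporal_direct_mvs(mv_col, poc_curr, poc_pic0, poc_pic1):
    """Return (mvL0, mvL1) scaled from mvCol by display-order distance."""
    tb = clip3(-128, 127, poc_curr - poc_pic0)   # current picture to pic0
    td = clip3(-128, 127, poc_pic1 - poc_pic0)   # pic1 to pic0
    if td == 0:
        # No temporal distance to scale by: mvCol is used unchanged.
        return tuple(mv_col), (0, 0)
    tx = div_trunc(16384 + abs(td) // 2, td)
    dist_scale_factor = clip3(-1024, 1023, (tb * tx + 32) >> 6)
    mv_l0 = tuple((dist_scale_factor * c + 128) >> 8 for c in mv_col)
    mv_l1 = tuple(a - c for a, c in zip(mv_l0, mv_col))
    return mv_l0, mv_l1


mv_l0, mv_l1 = temporal_direct_mvs((8, -4), poc_curr=2, poc_pic0=0, poc_pic1=4)
```

With the current picture midway between pic0 and pic1, mvL0 comes out as half of mvCol and mvL1 as the remaining half with opposite sign.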
In this way, the MV deriving section 105 derives motion vectors in spatial direct mode and temporal direct mode. At this time, to derive these motion vectors, coding information of one of previously coded pictures P(N-M), . . . , P(N-2), P(N-1) is required. Therefore, in the parameter storage memory 89, coding information of pictures needs to be stored for a period of time during which the pictures can serve as reference images.
For example, according to the AVC standard, coding information of a maximum of 32 pictures needs to be stored in the parameter storage memory 89. Therefore, the parameter storage memory 89 needs to temporarily store the amount of data represented by the expression below in a predetermined memory space.
(Coding information per each reference macroblock according to direct prediction mode)×(the number of macroblocks forming one picture)×(the maximum number of reference pictures)
Therefore, in a case where the parameter storage memory 89 is provided inside the coding apparatus 8, it is necessary to provide an expensive SRAM inside the apparatus. Further, in a case where the parameter storage memory 89 is provided outside the coding apparatus 8, an external memory with a relatively large capacity is required, which makes it necessary to secure a wide bandwidth for communication of data between the coding apparatus 8 and this external memory.
For example, the size of coding information per reference macroblock according to direct prediction mode is 136 [Byte] when the coding size of each parameter is byte aligned. Hence, to apply coding to an image signal with an image size of 1920×1080, since one picture is made up of 8160 macroblocks, a memory space of 1109760 (=136 [Byte]×8160) [Byte], or approximately 1.1 [MByte], is required for the parameter storage memory 89. Further, in the case of a configuration where the parameter storage memory 89 is provided outside of the coding apparatus 8, if the frame rate of an image signal to be coded is 30 [fps], a maximum of 528 (=1.1 [MByte]×8 [bit/Byte]×30 [fps]×2 [read/write]) [Mbps] is required as the bandwidth of the bus connection between an external memory and the coding apparatus 8.
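The storage and bandwidth figures above can be checked with a short calculation. The sketch assumes a 1920×1080 picture padded to whole 16×16 macroblocks (120×68 = 8160 macroblocks) and a per-macroblock record of 136 bytes, which is the value that reproduces the 1109760-byte total. The function and parameter names are illustrative.

```python
import math

MB_SIZE = 16  # a macroblock covers 16x16 pixels


def macroblocks_per_picture(width, height):
    """Number of macroblocks, rounding each dimension up to whole macroblocks."""
    return math.ceil(width / MB_SIZE) * math.ceil(height / MB_SIZE)


def parameter_memory_bytes(bytes_per_mb, width, height, num_ref_pictures=1):
    """(info per macroblock) x (macroblocks per picture) x (reference pictures)."""
    return bytes_per_mb * macroblocks_per_picture(width, height) * num_ref_pictures


def bus_bandwidth_bps(bytes_per_picture, fps, accesses_per_picture=2):
    """Bits per second when each picture's data is written once and read once."""
    return bytes_per_picture * 8 * fps * accesses_per_picture


mbs = macroblocks_per_picture(1920, 1080)
per_picture = parameter_memory_bytes(136, 1920, 1080)
all_refs = parameter_memory_bytes(136, 1920, 1080, num_ref_pictures=32)
rounded_bw = bus_bandwidth_bps(1_100_000, 30)   # using the rounded 1.1 MByte
exact_bw = bus_bandwidth_bps(per_picture, 30)   # using the unrounded byte count
```

The rounded figure gives 528 Mbps, and the unrounded byte count gives about 533 Mbps. Applying the expression with the maximum of 32 reference pictures gives roughly 35.5 MByte of coding information.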