1. Field of the Invention
The present invention relates to a motion picture encoding device and associated motion picture encoding processing program which compresses motion images in view of human visual characteristics.
2. Description of the Related Art
In applications such as digital television broadcasting, Internet video streaming and DVD, a coding technique with a high compression ratio is required because transmission bandwidth and storage capacity are limited. For instance, the H.264 standard is known as a high compression coding technique which meets such a requirement. Hereinafter, an example of a motion picture encoding device based on the H.264 standard will be described with reference to FIG. 9 through FIG. 15.
FIG. 9 is a block diagram showing an outline configuration of a motion picture encoding device. A subtracter 10 generates a prediction error signal indicating luminance difference by subtracting the prediction block pixel value from the current block pixel value. A quantization/transform section 11 applies integer DCT (Discrete Cosine Transform) to the prediction error signal outputted from the subtracter 10 to obtain a transform coefficient. Furthermore, this transform coefficient is quantized with a predetermined quantization width to generate coefficient data. An entropy encoder section 12 performs entropy encoding of the coefficient data generated by the quantization/transform section 11, using either Variable Length Codes (VLC) such as Exponential-Golomb codes or CABAC (Context-based Adaptive Binary Arithmetic Coding).
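The subtraction and transform stages described above can be sketched as follows for a single 4×4 block. This is a minimal illustration and not the device's implementation: the matrix CF is the 4×4 forward core transform of H.264, while the function names and the uniform quantizer `quantize` are hypothetical simplifications (the actual H.264 quantizer also applies per-position scaling factors).

```python
# Forward 4x4 core transform matrix of H.264 (scaling is left to the quantizer).
CF = [
    [1, 1, 1, 1],
    [2, 1, -1, -2],
    [1, -1, -1, 1],
    [1, -2, 2, -1],
]


def matmul(a, b):
    """Multiply two 4x4 matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]


def transpose(m):
    """Return the transpose of a matrix given as a list of rows."""
    return [list(row) for row in zip(*m)]


def prediction_error(cur, pred):
    """Subtracter 10: current block minus prediction block, pixel by pixel."""
    return [[cur[i][j] - pred[i][j] for j in range(4)] for i in range(4)]


def forward_transform(residual):
    """Integer transform W = CF * X * CF^T, without scaling."""
    return matmul(matmul(CF, residual), transpose(CF))


def quantize(coeffs, qstep):
    """Hypothetical uniform quantizer with quantization width qstep."""
    return [[int(round(c / qstep)) for c in row] for row in coeffs]
```

For a flat prediction error (every pixel differing by the same amount), only the top-left (DC) coefficient is non-zero, which is why smooth residuals compress well.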
An inverse quantization/inverse transform section 13, an adder 14, a loop filter 15 and a frame memory 16 form a local decoding portion. The local decoding portion applies inverse quantization and inverse integer DCT to the coefficient data generated in the quantization/transform section 11, adds the previous prediction block pixel value and generates a decoded image. Further, the local decoding portion reduces block noise by performing loop filtering on the generated decoded image, which is then temporarily stored in the frame memory 16. An intra-frame prediction section 17 calculates the intra-frame prediction block value using the decoded image read out from the frame memory 16.
A motion detection section 18 detects the motion vector of the current block. A motion compensation section 19 calculates the inter-frame prediction block value by performing motion compensation on the reference frame (decoded image read out from the frame memory 16) according to the motion vector detected by the motion detection section 18. A selector 20 selects either the intra-frame prediction block value calculated by the intra-frame prediction section 17 or the inter-frame prediction block value calculated by the motion compensation section 19 according to instructions of a determination section 21 and provides the selected value to the subtracter 10. The determination section 21 estimates the amount of coded data for intra-frame predictive coding as well as the amount of coded data for inter-frame predictive coding and directs the selector 20 to select the coding mode with the smaller amount of coded data.
Next, the operations of the encoding determination processing of the motion picture encoding device according to the above-mentioned configuration will be explained with reference to FIGS. 10 through 15. In the following, the operations of the “encoding determination processing” are explained first, and then the separate operations of the “inter-prediction processing”, the “intra-prediction processing” and the “inter D&Q processing”, which make up the encoding determination processing, are outlined.
(1) Operations of the Encoding Determination Processing
FIG. 10 is a flow chart showing operations of the encoding determination processing executed for each input macroblock. This processing is initiated by inputting a 16×16 macroblock image (hereinafter, denoted as “input macroblock”). In Step SF1, each section of the device is initialized. Next, in Step SF2, the amount of coded data for inter-frame predictive coding is estimated by the execution of inter-prediction processing. Next, in Step SF3, the amount of coded data for intra-frame predictive coding is estimated and further intra-frame coding (integer DCT, quantization, inverse quantization and inverse integer DCT) is performed by the execution of intra-prediction processing. Then, in Step SF4, it is judged whether or not the minimum Sum of Absolute Differences SADinter obtained in the above-mentioned Step SF2 (this is equivalent to the amount of coded data at inter-prediction) is greater than the minimum Sum of Absolute Differences SADintra obtained in the above-mentioned Step SF3 (this is equivalent to the amount of coded data at intra-prediction).
When the amount of coded data at inter-prediction is greater than the amount of coded data at intra-prediction, the judgment result becomes “YES” and this processing is completed. Accordingly, in this case the motion picture encoding device performs video compression with the intra-frame coding executed in the above-mentioned Step SF3.
Conversely, when the amount of coded data at the time of inter-prediction is less than the amount of coded data at the time of intra-prediction, the judgment result of the above-mentioned Step SF4 becomes “NO” and the flow advances to Step SF5, which executes inter D&Q processing to perform inter-frame coding (integer DCT, quantization, inverse quantization and inverse integer DCT). Accordingly, in this case the motion picture encoding device performs video compression with inter-frame coding.
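The flow of Steps SF2 through SF5 can be sketched as follows. This is a minimal illustration under stated assumptions: the function names and the three callables passed to `encode_macroblock` are hypothetical stand-ins for the inter-prediction, intra-prediction and inter D&Q processing. A tie is treated as “NO” because Step SF4 only tests whether SADinter is greater than SADintra.

```python
def choose_coding_mode(sad_inter, sad_intra):
    """Step SF4: 'intra' when SADinter > SADintra ("YES"), otherwise 'inter' ("NO")."""
    return "intra" if sad_inter > sad_intra else "inter"


def encode_macroblock(macroblock, inter_predict, intra_predict_and_code, inter_dq):
    """Run the encoding determination for one input macroblock.

    inter_predict(mb)          -> SADinter                      (Step SF2)
    intra_predict_and_code(mb) -> (SADintra, intra-coded data)  (Step SF3)
    inter_dq(mb)               -> inter-coded data              (Step SF5)
    """
    sad_inter = inter_predict(macroblock)                      # Step SF2
    sad_intra, intra_data = intra_predict_and_code(macroblock)  # Step SF3
    if choose_coding_mode(sad_inter, sad_intra) == "intra":    # Step SF4
        return "intra", intra_data   # intra-frame coding was already done in SF3
    return "inter", inter_dq(macroblock)                       # Step SF5
```

Note that the intra-frame coding of Step SF3 is always carried out in this flow, even when its result is discarded in favor of inter-frame coding.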
(2) Operations of the Inter-Prediction Processing
Next, the operations of the inter-prediction processing will be explained with reference to FIGS. 11 through 13. When processing has been executed via the above-mentioned Step SF2 (refer to FIG. 10), the flow advances to Step SG1 shown in FIG. 11 and judges whether or not processing has been completed for all of the 4×4 pixel blocks of a 16×16 pixel macroblock partitioned into 16 sub-macroblocks. When processing for all the blocks has been completed, the judgment result becomes “YES” and this processing is completed. Otherwise, the judgment result becomes “NO” and the flow advances to Step SG2.
Additionally, with regard to the association between a 16×16 pixel macroblock and the 4×4 pixel blocks, as illustrated in FIG. 13, the 4×4 pixel blocks within the macroblock are denoted by the block number n.
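The partitioning of a macroblock into numbered 4×4 pixel blocks can be sketched as follows. The raster-order numbering used here (left to right, then top to bottom) is an assumption made only for illustration; the numbering actually shown in FIG. 13 may differ, and the function name is hypothetical.

```python
def split_macroblock(mb):
    """Split a 16x16 macroblock (list of 16 rows) into 16 blocks of 4x4 pixels.

    The returned list is indexed by block number n = 0..15, assuming raster order.
    """
    blocks = []
    for top in range(0, 16, 4):
        for left in range(0, 16, 4):
            blocks.append([row[left:left + 4] for row in mb[top:top + 4]])
    return blocks
```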
Next, in Step SG2, the pixel value Org of the 4×4 pixel block to be processed (hereinafter, denoted as the “current block”) is calculated. Subsequently, in Step SG3, MV search processing is executed which searches for a motion vector (MV). In the MV search processing, as shown in Steps SH1 to SH3 of FIG. 12, the correlation with the current block is calculated while shifting the center of the reference block pixel by pixel within a search region of the reference frame. Amongst those pixel positions, the position with the highest correlation (similarity) is extracted as the best motion vector.
MV search processing estimates the correlation of the reference block and the current block with the Sum of Absolute Differences SAD between both blocks. Accordingly, when the pixel position of the highest correlation is extracted as the motion vector, the Sum of Absolute Differences SAD takes its minimum value. The minimum Sum of Absolute Differences SADinter is used in Step SF4 (refer to FIG. 10) for judging whether or not to execute inter D&Q processing.
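An exhaustive version of this search can be sketched as follows. This is a minimal sketch under stated assumptions: the function names, the default search range of ±2 pixels and the skipping of candidates that fall outside the reference frame are illustrative choices, not details taken from FIG. 12.

```python
def sad(block_a, block_b):
    """Sum of Absolute Differences between two equally sized pixel blocks."""
    return sum(abs(a - b)
               for row_a, row_b in zip(block_a, block_b)
               for a, b in zip(row_a, row_b))


def get_block(frame, top, left, size=4):
    """Cut a size x size block out of a frame (list of pixel rows)."""
    return [row[left:left + size] for row in frame[top:top + size]]


def mv_search(cur_block, ref_frame, top, left, search_range=2, size=4):
    """Full search around (top, left); returns ((dy, dx), SADinter)."""
    height, width = len(ref_frame), len(ref_frame[0])
    best_mv, best_sad = None, None
    for dy in range(-search_range, search_range + 1):
        for dx in range(-search_range, search_range + 1):
            y, x = top + dy, left + dx
            if y < 0 or x < 0 or y + size > height or x + size > width:
                continue  # candidate lies outside the reference frame
            cost = sad(cur_block, get_block(ref_frame, y, x, size))
            if best_sad is None or cost < best_sad:
                best_mv, best_sad = (dy, dx), cost
    return best_mv, best_sad
```

The position with the smallest SAD is the one with the highest correlation, so the returned cost corresponds to the minimum Sum of Absolute Differences SADinter for that block.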
(3) Operations of the Intra-Prediction Processing
Next, the operations of the intra-prediction processing will be explained with reference to FIG. 14. When intra-prediction processing is executed via Step SF3 mentioned above (refer to FIG. 10), the flow advances to Step SJ1 shown in FIG. 14. In Step SJ1, it is judged whether or not processing has been completed for all 4×4 pixel blocks of a 16×16 pixel macroblock partitioned into 16 sub-macroblocks. When processing for all the blocks has been completed, the judgment result becomes “YES” and this processing is completed. Otherwise, the judgment result becomes “NO” and the flow advances to Step SJ2. In Step SJ2, for example in intra 4×4 mode, a prediction block value is calculated for each of a total of nine optional prediction modes, which are referred to as mode0 to mode8.
Subsequently, in Steps SJ3 to SJ5, the Sum of Absolute Differences SAD between the current block and the prediction block value is calculated for each of the above-mentioned modes, and the minimum Sum of Absolute Differences SADintra is obtained from these results. The minimum Sum of Absolute Differences SADintra is used in Step SF4 (refer to FIG. 10) for judging whether or not to execute inter D&Q processing.
When the minimum Sum of Absolute Differences SAD has been determined, the judgment result in Step SJ3 becomes “YES” and the flow advances to Step SJ6. In Step SJ6, intra-frame coding of the current block is performed using the mode which produces the minimum Sum of Absolute Differences SADintra. Thereafter, the above-mentioned Steps SJ1 to SJ6 are repeated until processing for all blocks is completed.
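The mode selection for one 4×4 block can be sketched as follows. For brevity this sketch implements only three of the nine intra 4×4 modes of H.264 (mode0 vertical, mode1 horizontal, mode2 DC) and assumes that the neighbouring pixels above and to the left are available; the remaining six directional modes are omitted, and the function names are hypothetical.

```python
def sad(block_a, block_b):
    """Sum of Absolute Differences between two 4x4 blocks."""
    return sum(abs(a - b)
               for row_a, row_b in zip(block_a, block_b)
               for a, b in zip(row_a, row_b))


def predict(mode, above, left):
    """Build a 4x4 prediction block from 4 pixels above and 4 pixels to the left."""
    if mode == 0:   # vertical: repeat the row above in every row
        return [list(above) for _ in range(4)]
    if mode == 1:   # horizontal: repeat each left pixel across its row
        return [[left[i]] * 4 for i in range(4)]
    if mode == 2:   # DC: rounded mean of the eight neighbouring pixels
        dc = (sum(above) + sum(left) + 4) >> 3
        return [[dc] * 4 for _ in range(4)]
    raise ValueError("mode not implemented in this sketch")


def best_intra_mode(cur, above, left, modes=(0, 1, 2)):
    """Return (mode, SADintra) for the mode with the smallest SAD."""
    best_mode, best_sad = None, None
    for mode in modes:
        cost = sad(cur, predict(mode, above, left))
        if best_sad is None or cost < best_sad:
            best_mode, best_sad = mode, cost
    return best_mode, best_sad
```

Every candidate mode requires its own prediction block and its own SAD computation, so the cost of this loop grows in proportion to the number of modes evaluated.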
(4) Operations of the Inter D&Q Processing
Next, the operations of the inter D&Q processing will be explained with reference to FIG. 15. When processing has been executed via Step SF5 mentioned above (refer to FIG. 10), the flow advances to Step SK1 shown in FIG. 15 and whether or not processing has been completed for all of the 4×4 pixel blocks of a 16×16 pixel macroblock partitioned into 16 sub-macroblocks is judged. If processing has not been completed for all the blocks, the judgment result becomes “NO” and the flow advances to Step SK2.
In Steps SK2 to SK6, the inter-frame prediction block value ref(i,j) is subtracted from the current block pixel value Org(i,j), and transform processing (integer DCT), quantization processing Q, inverse quantization processing Q⁻¹ and inverse transform processing (inverse integer DCT, DCT⁻¹) are executed on the resulting prediction error signal. Then, the flow advances to Step SK7, in which the inter-frame prediction block value ref(i,j) is stepped, and processing returns to the above-mentioned Step SK1. Thereafter, the above-mentioned Steps SK1 to SK7 are repeated until processing for all the blocks is completed.
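The chain of subtraction, transform, quantization, inverse quantization and inverse transform can be sketched as follows. This is a simplified illustration: CF is the H.264 forward core transform, but the uniform quantizer with step `qstep` and the floating-point exact inverse are assumptions made for clarity (the standard uses integer scaling tables instead), and all function names are hypothetical.

```python
CF = [[1, 1, 1, 1], [2, 1, -1, -2], [1, -1, -1, 1], [1, -2, 2, -1]]
NORM = [4, 10, 4, 10]  # squared row norms: CF * CF^T = diag(4, 10, 4, 10)


def matmul(a, b):
    """Multiply two 4x4 matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]


def transpose(m):
    return [list(row) for row in zip(*m)]


def transform(x):
    """Forward integer transform W = CF * X * CF^T."""
    return matmul(matmul(CF, x), transpose(CF))


def inverse_transform(w):
    """Exact inverse X = CF^T * E^-1 * W * E^-1 * CF, with E = diag(NORM)."""
    scaled = [[w[i][j] / (NORM[i] * NORM[j]) for j in range(4)] for i in range(4)]
    return matmul(matmul(transpose(CF), scaled), CF)


def inter_dq_block(org, ref, qstep):
    """Process one 4x4 block and return the locally decoded block."""
    residual = [[org[i][j] - ref[i][j] for j in range(4)] for i in range(4)]
    coeffs = transform(residual)                                  # integer DCT
    levels = [[round(c / qstep) for c in row] for row in coeffs]  # Q
    dequant = [[lv * qstep for lv in row] for row in levels]      # Q^-1
    rec = inverse_transform(dequant)                              # inverse DCT
    return [[ref[i][j] + round(rec[i][j]) for j in range(4)] for i in range(4)]


def inter_dq_macroblock(org_blocks, ref_blocks, qstep):
    """Repeat the block processing for all 16 blocks of a macroblock."""
    return [inter_dq_block(o, r, qstep) for o, r in zip(org_blocks, ref_blocks)]
```

With `qstep` equal to 1 the block is reconstructed exactly; as `qstep` grows, more coefficients are rounded to zero and the decoded block approaches the prediction block, which illustrates how quantization error enters the locally decoded image.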
As described above, the motion picture encoding device performs video compression by selectively using either intra-frame coding by correlation in a spatial domain or inter-frame coding by correlation in a temporal domain. A magnitude comparison is executed for each input macroblock between the minimum Sum of Absolute Differences SADinter that is equivalent to the amount of coded data at the time of inter-prediction and the minimum Sum of Absolute Differences SADintra that is equivalent to the amount of coded data at the time of intra-prediction. The coding mode with the smaller Sum of Absolute Differences SAD is selected and compression encoding is performed.
Apart from that, in order to obtain the minimum Sum of Absolute Differences SADintra equivalent to the amount of coded data at the time of intra-prediction, for example, when intra 4×4 mode is selected, it is necessary to calculate the prediction block value for each of a total of nine optional prediction modes, which are referred to as mode0 to mode8. This substantially increases the calculation amount, since every mode must be evaluated for every block. Furthermore, the coding technique of the H.264 standard has been described in the prior art, for example, by Iain E. G. Richardson, “H.264 and MPEG-4 Video Compression: Video Coding for Next-generation Multimedia”, John Wiley & Sons, Ltd. (December 2003, First Edition), “6. H.264/MPEG-4 Part 10” (pp. 159-223) (ISBN: 0470869607).
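The scale of this cost can be estimated by simple counting. The sketch below counts only the prediction blocks and absolute differences needed for the SAD comparison in intra 4×4 mode; it ignores the cost of forming each prediction and of the subsequent intra-frame coding, and the names are hypothetical.

```python
BLOCKS_PER_MACROBLOCK = 16  # 4x4 pixel blocks in a 16x16 macroblock
PIXELS_PER_BLOCK = 16       # pixels in one 4x4 block
INTRA_4X4_MODES = 9         # mode0 to mode8


def intra_sad_cost(modes=INTRA_4X4_MODES):
    """Return (prediction blocks, absolute differences) per macroblock."""
    predictions = BLOCKS_PER_MACROBLOCK * modes
    abs_diffs = predictions * PIXELS_PER_BLOCK
    return predictions, abs_diffs
```

Under these assumptions, evaluating all nine modes requires 144 prediction blocks and 2304 absolute differences per macroblock, compared with 16 and 256 for a single mode.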
There are the following issues concerning motion picture encoding devices which perform video compression by selectively using intra-frame coding and inter-frame coding mentioned above.
(a) In many cases intra-frame coding is not selected, except where there is a noticeable difference between an inputted image and a reference image, for example, when a scene changes, when a motion vector has not been detected properly or when there is a substantial luminance variation due to a camera flash, etc. For this reason, there is a problem in that the complexity and number of calculations required for intra-prediction processing (refer to FIG. 14) are excessive relative to how rarely intra-frame coding is actually selected.
(b) Because the coding determination selects either intra-frame coding or inter-frame coding solely by a magnitude comparison of the amount of coded data, human visual characteristics are not taken into consideration. These characteristics are commonly referred to as the Human Visual System (HVS), the system by which the human eye and brain perceive and interpret visual images. Therefore, image quality may be significantly reduced, readily resulting in an image with noticeable noise (luma and chroma noise). This can occur, for example, when quantization error propagation arises because a fast search algorithm becomes ‘trapped’ in a local minimum during a motion vector search and gives a suboptimal result, or during coding at a low bit rate, where a larger quantization value increases distortion. A local minimum is a minimum value within a defined section (area) of the search region. It may not be the true global minimum value and is commonly referred to as a false minimum value (false minima).
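The difference between a local minimum and the global minimum can be illustrated with a one-dimensional sketch. The greedy neighbour search below is a hypothetical stand-in for a fast search algorithm, and the cost values are invented for illustration only; neither is taken from the device described above.

```python
def greedy_search(costs, start):
    """Move to the cheaper neighbouring position until no neighbour improves."""
    pos = start
    while True:
        neighbours = [p for p in (pos - 1, pos + 1) if 0 <= p < len(costs)]
        if not neighbours:
            return pos
        best = min(neighbours, key=lambda p: costs[p])
        if costs[best] >= costs[pos]:
            return pos  # no neighbour is better: a (possibly false) minimum
        pos = best


def full_search(costs):
    """Examine every position and return the one with the smallest cost."""
    return min(range(len(costs)), key=lambda p: costs[p])


# Invented SAD values at seven candidate positions along one search axis.
example_costs = [90, 40, 60, 80, 70, 10, 50]
```

Starting from position 2, `greedy_search(example_costs, 2)` stops at position 1 (cost 40) because both of its neighbours are worse, whereas `full_search(example_costs)` finds position 5 (cost 10). The greedy result is the kind of false minimum described above.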
Consequently, the present invention has been made in view of the above-described circumstances with the purpose of providing a motion picture encoding device and associated motion picture encoding processing program, wherein wasteful calculations will not be performed and coding modes can be determined at a faster speed in view of human visual characteristics.
Furthermore, the present invention aims at providing a motion picture encoding device and motion picture encoding processing program which can rapidly stop quantization error propagation when a search becomes ‘trapped’ in a local minimum, as well as avoid reduced image quality.