1. Field of the Invention
The present invention relates to a digital image signal encoding device, a digital image signal decoding device, a digital image signal encoding method, and a digital image signal decoding method used for an image compression encoding technology or a compressed image data transmission technology.
2. Description of the Related Art
International standard video encoding systems such as MPEG and ITU-T H.26x (e.g., “Information Technology Coding of Audio-Visual Objects Part 10: Advanced Video Coding”, ISO/IEC 14496-10, 2003 (hereinafter referred to as Non-Patent Document 1)) have conventionally been premised on the use of a standardized input signal format called the 4:2:0 format. The 4:2:0 format is a format in which a color moving image signal such as RGB is transformed into a luminance component (Y) and two chrominance components (Cb, Cr), and the number of chrominance component samples is reduced to half that of the luminance component in both the horizontal and vertical directions (FIG. 23). The chrominance components are inferior to the luminance component in visibility. Accordingly, the conventional international standard video encoding systems have been based on the premise that the amount of original information to be encoded is reduced by downsampling the chrominance components before encoding is executed, as mentioned above. In video encoding for business purposes such as broadcast material video, a 4:2:2 format may be used, in which the Cb and Cr components are downsampled so that the number of their samples is reduced to half that of the luminance component only in the horizontal direction. Thus, the color resolution in the vertical direction becomes equal to that of luminance, thereby increasing color reproducibility compared with the 4:2:0 format. On the other hand, recent increases in the resolution and gradation of video displays have been accompanied by studies on systems that perform encoding without downsampling the chrominance components, maintaining the same number of samples as the luminance component. A format in which the numbers of luminance and chrominance component samples are completely equal is called a 4:4:4 format. The conventional 4:2:0 format has been limited to the Y, Cb, and Cr color space definitions because it is premised on downsampling of the chrominance components.
In the case of the 4:4:4 format, however, because there is no distinction in sample ratio between the color components, R, G, and B can be used directly in addition to Y, Cb, and Cr, and a plurality of color space definitions can be used. An example of a video encoding system targeting the 4:4:4 format is described in Woo-Shik Kim, Dae-Sung Cho, and Hyun Mun Kim, “INTER-PLANE PREDICTION FOR RGB VIDEO CODING”, ICIP 2004, October 2004 (hereinafter referred to as Non-Patent Document 2).
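The sample-count relationships among the 4:2:0, 4:2:2, and 4:4:4 formats described above can be illustrated by the following minimal sketch. The names CHROMA_DIVISORS, plane_sizes, and total_samples are hypothetical and are introduced only for illustration; they are not taken from Non-Patent Document 1 or from any standard API. The sketch assumes a frame whose width and height are even.

```python
# Illustrative sketch only; all names here are hypothetical.
# Each entry gives the (horizontal, vertical) factor by which the
# chrominance planes are reduced relative to the luminance plane.
CHROMA_DIVISORS = {
    "4:2:0": (2, 2),  # halved horizontally and vertically
    "4:2:2": (2, 1),  # halved horizontally only
    "4:4:4": (1, 1),  # no downsampling
}


def plane_sizes(width, height, fmt):
    """Return ((luma_w, luma_h), (chroma_w, chroma_h)) for one frame.

    Assumes width and height are even, so the divisions are exact.
    """
    dx, dy = CHROMA_DIVISORS[fmt]
    return (width, height), (width // dx, height // dy)


def total_samples(width, height, fmt):
    """Count the samples of one frame: one luma plane plus two chroma planes."""
    (lw, lh), (cw, ch) = plane_sizes(width, height, fmt)
    return lw * lh + 2 * cw * ch


if __name__ == "__main__":
    for fmt in CHROMA_DIVISORS:
        print(fmt, plane_sizes(1920, 1080, fmt), total_samples(1920, 1080, fmt))
```

For a frame of 1920×1080 luminance samples, this gives 960×540 samples per chrominance plane in the 4:2:0 format, 960×1080 in the 4:2:2 format, and 1920×1080 in the 4:4:4 format, so that a 4:4:4 frame holds twice as many samples in total as a 4:2:0 frame.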
In the high 4:2:0 profile of AVC of Non-Patent Document 1, which encodes the 4:2:0 format, in a macroblock area composed of 16×16 luminance pixels, the corresponding chrominance components are 8×8 pixel blocks for both Cb and Cr. In motion compensation prediction of the high 4:2:0 profile, block size information serving as the unit of motion compensation prediction, reference image information used for prediction, and motion vector information of each block are multiplexed only for the luminance component, and motion compensation prediction is carried out for the chrominance components using the same information as that of the luminance component. The 4:2:0 format has the characteristics, as a color space definition, that almost all of the structure information (texture) of an image is concentrated in the luminance component, that distortion is less visible in the chrominance components than in the luminance component, and that the chrominance components make only a small contribution to video reproducibility; prediction and encoding in the high 4:2:0 profile are based on these characteristics of the 4:2:0 format. On the other hand, in the case of the 4:4:4 format, the three color components hold texture information equally. A system that performs motion compensation prediction based on an inter prediction mode, reference image information, and motion vector information that depend on only one component is not necessarily optimal in the 4:4:4 format, where the color components make equal contributions to representing the structure of an image signal. Thus, the encoding system targeting the 4:2:0 format performs signal processing different from that of the encoding system targeting the 4:4:4 format in order to execute optimal encoding, and the definitions of the pieces of information multiplexed in the encoded bit stream also differ.
As a result, to construct a decoding device capable of decoding compressed video data of a plurality of different formats, a configuration in which the bit streams for the signals of the respective formats are interpreted individually must be employed, which makes the device configuration inefficient.
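The macroblock geometry and the sharing of motion information described above for the high 4:2:0 profile can be sketched as follows. This is a simplified illustration under stated assumptions, not the syntax of Non-Patent Document 1: the names MotionInfo, chroma_block_size, and shared_motion_info are hypothetical, and the sketch models only the fact that one set of motion information, multiplexed for the luminance component, is reused for Cb and Cr.

```python
from dataclasses import dataclass

# Illustrative sketch only; all names here are hypothetical.
MB_SIZE = 16  # a macroblock covers 16x16 luminance pixels

# (horizontal, vertical) chroma reduction factor for each format
CHROMA_DIVISORS = {
    "4:2:0": (2, 2),
    "4:2:2": (2, 1),
    "4:4:4": (1, 1),
}


@dataclass(frozen=True)
class MotionInfo:
    """Motion compensation information carried for one block."""
    block_size: tuple      # (width, height) of the prediction unit, in luma pixels
    ref_index: int         # which reference image is used for prediction
    motion_vector: tuple   # (dx, dy) displacement


def chroma_block_size(fmt):
    """Size (width, height) of the Cb or Cr block matching one macroblock."""
    dx, dy = CHROMA_DIVISORS[fmt]
    return (MB_SIZE // dx, MB_SIZE // dy)


def shared_motion_info(luma_info):
    """Model the 4:2:0 approach: the information multiplexed for the
    luminance component is applied unchanged to both chrominance components."""
    return {"Y": luma_info, "Cb": luma_info, "Cr": luma_info}


if __name__ == "__main__":
    info = MotionInfo(block_size=(16, 16), ref_index=0, motion_vector=(3, -2))
    print(chroma_block_size("4:2:0"), chroma_block_size("4:4:4"))
    print(shared_motion_info(info))
```

In the 4:2:0 case the chrominance blocks are 8×8, whereas in the 4:4:4 case each of the three components occupies a full 16×16 block, which is why motion information that depends on only one component is not necessarily optimal for the 4:4:4 format.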