Conventional international standard video coding methods, such as ISO/IEC 14496-10 | ITU-T H.264 (referred to as AVC/H.264 from here on), compress image data in units of block data referred to as macroblocks. Each macroblock combines a luminance signal of 16×16 pixels with the two corresponding color difference signals of 8×8 pixels, and is compressed by using a motion-compensated prediction technology together with an orthogonal transformation and transform coefficient quantization technology. In motion-compensated prediction, a motion vector search and the generation of a prediction image are carried out in units of a macroblock, using an already-coded forward or backward picture as a reference image. A picture on which inter-frame prediction coding is carried out by referring to only a single picture is called a P picture, and a picture on which inter-frame prediction coding is carried out by simultaneously referring to two pictures is called a B picture.
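The macroblock-unit motion vector search and prediction image generation described above can be illustrated with a minimal sketch. It is not the AVC/H.264 procedure itself: the standard does not prescribe how an encoder searches, so the exhaustive integer-pixel search and the sum of absolute differences (SAD) cost used here are assumptions chosen for simplicity. The function names (`sad`, `search_motion_vector`, `predict_block`, `predict_bi`) and the `search_range` parameter are hypothetical, frames are plain lists of rows holding luminance samples only, and sub-pixel interpolation and the color difference signals are omitted.

```python
MB = 16  # luminance macroblock size in pixels (16x16)


def sad(cur, ref, bx, by, dx, dy, size=MB):
    """Sum of absolute differences between the current block at (bx, by)
    and the reference block displaced by the candidate vector (dx, dy)."""
    total = 0
    for y in range(size):
        for x in range(size):
            total += abs(cur[by + y][bx + x] - ref[by + y + dy][bx + x + dx])
    return total


def search_motion_vector(cur, ref, bx, by, search_range=4, size=MB):
    """Exhaustive integer-pixel search (an assumed strategy) for the
    displacement that minimizes SAD. Returns ((dx, dy), cost)."""
    height, width = len(ref), len(ref[0])
    best_mv, best_cost = None, None
    for dy in range(-search_range, search_range + 1):
        for dx in range(-search_range, search_range + 1):
            # Skip candidates whose block would fall outside the reference.
            if by + dy < 0 or by + dy + size > height:
                continue
            if bx + dx < 0 or bx + dx + size > width:
                continue
            cost = sad(cur, ref, bx, by, dx, dy, size)
            if best_cost is None or cost < best_cost:
                best_mv, best_cost = (dx, dy), cost
    return best_mv, best_cost


def predict_block(ref, bx, by, mv, size=MB):
    """Prediction image for one macroblock from a single reference
    picture, as in a P picture."""
    dx, dy = mv
    return [row[bx + dx:bx + dx + size]
            for row in ref[by + dy:by + dy + size]]


def predict_bi(ref0, mv0, ref1, mv1, bx, by, size=MB):
    """Prediction image from two reference pictures, as in a B picture,
    formed here as a simple rounded average of the two predictions."""
    p0 = predict_block(ref0, bx, by, mv0, size)
    p1 = predict_block(ref1, bx, by, mv1, size)
    return [[(a + b + 1) >> 1 for a, b in zip(r0, r1)]
            for r0, r1 in zip(p0, p1)]
```

In this sketch, an encoder would call `search_motion_vector` once per macroblock, build the prediction with `predict_block` (or `predict_bi` when two reference pictures are used), and pass the difference between the current block and the prediction to the transformation and quantization stage.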