(1) Field of the Invention
The present invention relates to a picture coding method for efficiently compressing moving pictures and to a picture decoding method for correctly decoding the compressed moving pictures.
(2) Description of the Related Art
In the age of multimedia, which handles audio, video and other information in an integrated manner, existing information media (e.g., newspapers, magazines, television, radio and telephone) through which information is conveyed to people have recently come to be included in the scope of multimedia. Generally, multimedia refers to a representation in which not only characters but also graphics, audio and especially pictures are associated with one another. However, in order to include the aforementioned existing information media in the scope of multimedia, it is necessary to represent such information in digital form.
However, when the amount of information contained in each of the aforementioned information media is estimated as an amount of digital information, the amount per character is 1 to 2 bytes, whereas audio requires 64 Kbits per second (telephone quality) and moving pictures require 100 Mbits per second (current television reception quality). Therefore, it is not realistic for the aforementioned information media to handle such an enormous amount of information directly in digital form. For example, although video phones are already in practical use over the Integrated Services Digital Network (ISDN), which offers a transmission speed of 64 Kbits/s to 1.5 Mbits/s, it is not possible to transmit television or camera video directly over ISDN.
Under these circumstances, information compression techniques have become necessary. For example, moving picture compression techniques compliant with the H.261 and H.263 standards recommended by ITU-T (International Telecommunication Union-Telecommunication Standardization Sector) are employed for video phones. Moreover, with information compression techniques compliant with the MPEG-1 standard, it is possible to store picture information together with audio information on an ordinary music CD (compact disc).
Here, MPEG (Moving Picture Experts Group) is an international standard for the compression of moving picture signals standardized by ISO/IEC (International Organization for Standardization, International Electrotechnical Commission), and MPEG-1 is a standard for compressing television signal information to approximately one hundredth, so that moving picture signals can be transmitted at a rate of 1.5 Mbit/s. Furthermore, since the quality targeted by the MPEG-1 standard is a medium quality achievable at a transmission speed of about 1.5 Mbit/s, MPEG-2, which was standardized to satisfy requirements for further improved picture quality, allows data transmission equivalent in quality to television broadcasting, with moving picture signals transmitted at a rate of 2 to 15 Mbit/s. Moreover, MPEG-4 was standardized by the working group (ISO/IEC JTC1/SC29/WG11) that had promoted the standardization of MPEG-1 and MPEG-2. MPEG-4, which provides a higher compression ratio than MPEG-1 and MPEG-2 and enables object-based coding, decoding and manipulation, is capable of providing new functionality required in this age of multimedia. At the beginning of standardization, MPEG-4 was aimed at providing a low bit rate coding method, but it has been extended into a more general coding standard that handles interlaced images as well as high bit rate coding. More recently, ISO/IEC and ITU-T have worked jointly to standardize MPEG-4 AVC and ITU-T H.264 as next-generation picture coding methods that offer a higher compression ratio. As of June 2004, these have been approved as international standards.
In picture coding, coding is generally performed block by block, and in most cases a block is 16 pixels on a side. Accordingly, the picture size that can be coded is a multiple of 16, that is, a multiple of the number of pixels per block side. However, the picture signal of HDTV, for example, has 1920 pixels horizontally and 1080 pixels vertically, and 1080 is not a multiple of 16. For this reason, coding is performed block by block on a picture larger than the output area, and the area to be output (displayed on a screen) is clipped out of the coded (decoded) picture.
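As a rough illustration of the arithmetic above, the following Python sketch rounds a picture dimension up to the next multiple of the block size and reports how many extra coded pixels must later be clipped. The names `BLOCK` and `padded_size` are hypothetical and chosen only for this example.

```python
BLOCK = 16  # assumed block size in pixels per side


def padded_size(pixels, block=BLOCK):
    """Return (coded_size, excess): coded_size is the smallest multiple
    of `block` that is >= `pixels`; excess is coded_size - pixels."""
    coded = -(-pixels // block) * block  # ceiling division, then scale back
    return coded, coded - pixels


print(padded_size(1920))  # (1920, 0): already a multiple of 16
print(padded_size(1080))  # (1088, 8): 8 extra rows are coded
```

For the HDTV example, the vertical direction is coded as 1088 rows, of which 8 are not output.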
FIG. 1 is a drawing describing the display area of a picture. In FIG. 1, the number of horizontal pixels of the coded picture is represented by MBWidth and the number of vertical pixels by MBHeight. A black circle denotes a pixel that is output by the decoding apparatus, and a white circle denotes a pixel that is coded but not output by the decoding apparatus. In order to indicate which of the coded pixels are to be output, the number of horizontal pixels Width and the number of vertical pixels Height of the area to be output are expressed by means of the number of left pixels LCrop, right pixels RCrop, top pixels TCrop and bottom pixels BCrop. The following equalities hold:
Width = MBWidth − LCrop − RCrop
Height = MBHeight − TCrop − BCrop
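The two equalities can be evaluated directly. The sketch below computes the output size from the coded size and the four crop values, and clips a decoded picture held as a list of rows. The function names `output_size` and `clip_picture` are hypothetical, and placing all 8 excess HDTV rows at the bottom is an assumption made only for the usage line.

```python
def output_size(mb_width, mb_height, l_crop, r_crop, t_crop, b_crop):
    """Apply Width = MBWidth - LCrop - RCrop and
    Height = MBHeight - TCrop - BCrop."""
    width = mb_width - l_crop - r_crop
    height = mb_height - t_crop - b_crop
    if width <= 0 or height <= 0:
        raise ValueError("crop values remove the whole picture")
    return width, height


def clip_picture(picture, l_crop, r_crop, t_crop, b_crop):
    """Return only the pixels to be output; `picture` is a list of rows."""
    rows = picture[t_crop:len(picture) - b_crop]
    return [row[l_crop:len(row) - r_crop] for row in rows]


# Assumed HDTV layout: 1920x1088 coded, 8 bottom rows not output.
print(output_size(1920, 1088, 0, 0, 0, 8))  # (1920, 1080)
```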
Picture signals are usually represented by luminance and chrominance. The human ability to discriminate chrominance resolution is relatively weak compared with the ability to discriminate luminance resolution. Therefore, compression efficiency is improved by making the number of chrominance pixels smaller than the number of luminance pixels. Generally, the ratio of the number of chrominance pixels to the number of luminance pixels is relatively small for general consumer products, while the ratio is close to one for professional products.
FIG. 2A, FIG. 2B and FIG. 2C are drawings showing color formats of pictures. In the drawings, a white circle denotes a pixel location of luminance and a black circle denotes a pixel location of chrominance. FIG. 2A shows the 4:2:0 color format, FIG. 2B the 4:2:2 color format and FIG. 2C the 4:4:4 color format.
It should be noted that, in the case of a component picture signal represented by RGB, green includes a large amount of the luminance component, so a white circle denotes G (green) and a black circle denotes R (red) and B (blue).
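The relationship between the numbers of luminance and chrominance pixels in the 4:2:0, 4:2:2 and 4:4:4 color formats can be sketched as follows. The subsampling factors are the conventional ones for these formats; the names `SUBSAMPLING` and `chroma_plane_size` are hypothetical.

```python
# (horizontal factor, vertical factor): how many luminance pixels
# share one chrominance pixel in each direction.
SUBSAMPLING = {
    "4:2:0": (2, 2),
    "4:2:2": (2, 1),
    "4:4:4": (1, 1),
}


def chroma_plane_size(luma_width, luma_height, chroma_format):
    """Return (width, height) of one chrominance plane."""
    factor_x, factor_y = SUBSAMPLING[chroma_format]
    return luma_width // factor_x, luma_height // factor_y


for fmt in SUBSAMPLING:
    print(fmt, chroma_plane_size(1920, 1088, fmt))
```

For a 1920x1088 luminance plane this gives 960x544, 960x1088 and 1920x1088 chrominance planes respectively.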
FIG. 3 is a drawing showing the data structure of a bitstream. The bitstream Str is composed of PixelData, which is the data of each pixel value, and CommonData, which is data common to a frame or a plurality of frames. The CommonData includes a color format ChromaFormat and output area coding information CropData. The color format ChromaFormat indicates, for example, one of 4:2:0, 4:2:2 and 4:4:4. The output area coding information CropData indicates, for example, the number of left pixels LCrop, right pixels RCrop, top pixels TCrop and bottom pixels BCrop.
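The logical layout of the bitstream described above can be modeled as nested records. This is only a sketch of the field hierarchy, not of the actual bit-level syntax; the Python class and field names mirror the names in the text but are otherwise hypothetical, and holding PixelData as raw bytes is an assumption.

```python
from dataclasses import dataclass


@dataclass
class CropData:
    l_crop: int  # number of left pixels not output
    r_crop: int  # number of right pixels not output
    t_crop: int  # number of top pixels not output
    b_crop: int  # number of bottom pixels not output


@dataclass
class CommonData:
    chroma_format: str  # e.g. "4:2:0", "4:2:2" or "4:4:4"
    crop_data: CropData


@dataclass
class Bitstream:
    common_data: CommonData  # shared by one frame or several frames
    pixel_data: bytes        # coded pixel values (assumed opaque here)


stream = Bitstream(CommonData("4:2:0", CropData(0, 0, 0, 8)), b"")
print(stream.common_data.crop_data.b_crop)  # 8
```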
FIG. 4A and FIG. 4B are drawings showing variable-length code tables. FIG. 4A is an example of a variable-length code table for the color format ChromaFormat. FIG. 4B is an example of a variable-length code table for the output area coding information CropData, used for coding each value (Value) of the number of left pixels LCrop, right pixels RCrop, top pixels TCrop and bottom pixels BCrop. The larger the value (Value), the longer the code and the more bits are required (see ITU-T Rec. H.264 | ISO/IEC 14496-10 version 1, “Information technology—Coding of audio-visual objects—Part 10: Advanced video coding”, non-patent literature 1).
Now, the number of chrominance pixels is smaller than the number of luminance pixels. Consequently, the number of luminance pixels that can be output is in practice an integral multiple of the number of luminance pixels corresponding to one chrominance pixel. For example, in FIG. 2A, two luminance pixels in the horizontal direction and two in the vertical direction correspond to one chrominance pixel, so the number and the locations of luminance pixels that can be output are multiples of two in both the horizontal and vertical directions. Because of this, each value (Value) of the number of left pixels LCrop, right pixels RCrop, top pixels TCrop and bottom pixels BCrop is an even number. In the case of FIG. 2C, on the other hand, both even and odd values are possible. However, when only even values can occur, as in the case of FIG. 2A, coding the values directly with the table of FIG. 4B is inefficient, because the table also assigns codes to odd values that are never coded.
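The alignment constraint described above can be sketched as a small validity check. The step table is partly an assumption: the entry for the format in which 2x2 luminance pixels share one chrominance pixel (4:2:0) and the entry for 4:4:4 follow the text, while the entry for 4:2:2 (even values required horizontally only) is inferred from its horizontal-only subsampling. The names `CROP_STEP` and `crop_is_representable` are hypothetical.

```python
# (horizontal step, vertical step) in luminance pixels per color format.
CROP_STEP = {
    "4:2:0": (2, 2),
    "4:2:2": (2, 1),  # assumption: inferred, not stated in the text
    "4:4:4": (1, 1),
}


def crop_is_representable(chroma_format, l_crop, r_crop, t_crop, b_crop):
    """True if every crop value is a multiple of the step that the
    color format imposes in its direction."""
    step_x, step_y = CROP_STEP[chroma_format]
    horizontal_ok = l_crop % step_x == 0 and r_crop % step_x == 0
    vertical_ok = t_crop % step_y == 0 and b_crop % step_y == 0
    return horizontal_ok and vertical_ok


print(crop_is_representable("4:2:0", 0, 0, 0, 8))  # True
print(crop_is_representable("4:2:0", 1, 0, 0, 0))  # False
print(crop_is_representable("4:4:4", 1, 3, 5, 7))  # True
```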
When coding assumes only a color format ChromaFormat such as that of FIG. 2A (as in, for example, non-patent literature 1), the number of left pixels LCrop, right pixels RCrop, top pixels TCrop and bottom pixels BCrop are each multiplied by ½, and the resulting values are coded using the table in FIG. 4B. In this way, for example, RCrop=4, which originally requires the 5 bits “00101”, can be coded with the 3 bits “011” because RCrop/2=2, so that 2 bits are saved. With this method, however, only even positions can be specified in a case such as FIG. 2C.
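The bit counts in this example can be reproduced with a short sketch. The contents of FIG. 4B are not reproduced here; the code below assumes the table is the unsigned Exp-Golomb code used in H.264, which is consistent with the two codewords quoted above (“00101” for 4 and “011” for 2). The function name `ue_code` is hypothetical.

```python
def ue_code(value):
    """Unsigned Exp-Golomb codeword for a non-negative integer
    (assumed to correspond to the table of FIG. 4B)."""
    if value < 0:
        raise ValueError("value must be non-negative")
    binary = bin(value + 1)[2:]          # value + 1 in binary
    return "0" * (len(binary) - 1) + binary  # prefix of leading zeros


# Compare coding an even crop value directly with coding half of it.
for crop in (0, 2, 4, 8):
    direct = ue_code(crop)
    halved = ue_code(crop // 2)
    print(crop, direct, halved, len(direct) - len(halved))
```

Under this assumption, RCrop=4 is coded as “00101” directly and as “011” after halving, matching the 2-bit saving described in the text.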