The present invention relates to an encoder, a decoder, an encoding method and a decoding method for color moving images and also a method of transferring bitstreams of color moving images.
More specifically, this invention relates to an encoder, a decoder, an encoding method and a decoding method for color moving images, and also to a method of transferring bitstreams of color moving images, in which encoding or decoding is performed in an image format with a decreased number of pixels (or scanning lines) in the spatially vertical direction for color-difference signals, in moving-image encoding that includes intra-picture encoding, predictive encoding and bidirectionally predictive encoding.
Color moving-picture encoding generally processes component signals of a luminance signal and two color-difference signals in formats of images to be encoded.
The image formats are classified into the following three types: the 4:2:2 format (the number of samples of each color-difference signal is one-half that of the luminance signal in the spatially horizontal direction); the 4:1:1 format (the number of samples of each color-difference signal is one-fourth that of the luminance signal in the horizontal direction); and the 4:2:0 format (the number of samples of each color-difference signal is one-half that of the luminance signal in both the horizontal and vertical directions).
In the MPEG-2 (Moving Picture Experts Group 2) standard, the 4:2:2 format is used in the encoding called the 4:2:2 profile for broadcasting equipment, whereas the 4:2:0 format is used in the encoding called the main profile for digital broadcasting equipment and household electronic equipment.
In each of the 4:2:2, 4:1:1 and 4:2:0 formats, the second and third numbers indicate the sampling frequencies of the color-difference signal components relative to 4 (the number assigned to the luminance-signal sampling frequency of 13.5 MHz); that is, the ratio of each of the two color-difference signals to the luminance signal is 2 (or 1):4.
The 4:2:0 format, in which the number of samples of each color-difference signal is one-half that of the luminance signal in both the horizontal (the same as 4:2:2) and vertical directions, is not officially defined by the International Telecommunication Union (ITU).
For progressive moving-image signals, the number of scanning lines (pixels in the vertical direction) of the color-difference signals per frame in the 4:2:0 format is one-half that in the 4:2:2 format. The resolution of the color-difference signals in the 4:2:0 format is thus one-half that of the luminance signal in both the vertical and horizontal directions.
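The sampling ratios described above can be illustrated with a minimal sketch. The function names and the 720×480 frame size used in the usage lines are assumptions chosen for illustration only and do not appear in the formats' definitions.

```python
# Subsampling divisors (horizontal, vertical) of each color-difference signal
# relative to the luminance signal, as described in the text.
SUBSAMPLING = {
    "4:2:2": (2, 1),  # one-half horizontally, full vertically
    "4:1:1": (4, 1),  # one-fourth horizontally, full vertically
    "4:2:0": (2, 2),  # one-half horizontally and vertically
}


def chroma_size(luma_width, luma_height, fmt):
    """Return (width, height) of one color-difference plane of a progressive frame."""
    h_div, v_div = SUBSAMPLING[fmt]
    return luma_width // h_div, luma_height // v_div


def samples_per_frame(luma_width, luma_height, fmt):
    """Total samples per frame: one luminance plane plus two color-difference planes."""
    chroma_w, chroma_h = chroma_size(luma_width, luma_height, fmt)
    return luma_width * luma_height + 2 * chroma_w * chroma_h


# Hypothetical 720x480 progressive frame (an assumed size, for illustration).
for fmt in SUBSAMPLING:
    print(fmt, chroma_size(720, 480, fmt), samples_per_frame(720, 480, fmt))
```

Under these assumptions, the 4:2:0 format carries three-fourths of the samples of the 4:2:2 format per frame, which is the reduction in processing amount referred to in the text.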
This resolution of the color-difference signals in the 4:2:0 format is well matched to human visual characteristics. Moreover, the amount of data to be processed is reduced in the 4:2:0 format. The 4:2:0 format is therefore the best choice for efficient encoding of progressive images.
Two sampling points have been defined for the color-difference signals: the same locations as the luminance signal, for interlaced color-difference signals in SMPTE294M standard; and the points each corresponding to the middle point between sampling points for the luminance signal, for progressive color-difference signals in MPEG-2 standard.
Nevertheless, for interlaced moving-image signals, the 4:2:0 format halves the number of scanning lines (the number of pixels in the vertical direction) of the color-difference signals per field, which reduces the vertical resolution of the color-difference signals to one-fourth.
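The one-fourth figure can be checked with a short worked example. This is only an illustrative sketch: the helper name and the 480-line frame are assumptions, and each field of an interlaced signal is taken to hold one-half of the frame's scanning lines.

```python
def chroma_lines_per_picture(frame_lines, fmt, interlaced):
    """Color-difference scanning lines in one coded picture (a frame, or one field)."""
    vertical_divisor = {"4:2:2": 1, "4:2:0": 2}[fmt]
    # An interlaced field holds one-half of the scanning lines of a frame.
    picture_lines = frame_lines // 2 if interlaced else frame_lines
    return picture_lines // vertical_divisor


frame_lines = 480  # assumed frame height, for illustration only
print(chroma_lines_per_picture(frame_lines, "4:2:0", interlaced=False))  # per progressive frame
print(chroma_lines_per_picture(frame_lines, "4:2:0", interlaced=True))   # per interlaced field
```

With these assumed numbers, a 4:2:0 field carries 120 color-difference lines against 480 luminance lines in the full frame, that is, one-fourth.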
Illustrated in FIGS. 1A and 1B are ITU-defined 4:2:2-format sampling points and MPEG-defined 4:2:0-format sampling points, respectively, with symbols “∘” and “x” indicating luminance-signal sampling points and color-difference signal sampling points, respectively, in the vertical direction V on the time base T.
In interlaced scanning, the 4:2:2 format is the better choice for high resolution, whereas the 4:2:0 format is suited to a smaller processing amount. The 4:2:0 format carries less data than the 4:2:2 format; however, it is not very practical because its low resolution is out of balance with its amount of data.
In efficient encoding, luminance and color-difference signals are handled per block of pixels for motion compensation and orthogonal transform.
One block usually consists of (8×8) pixels, the unit of processing in orthogonal transform, within a luminance-signal area of (16×16) pixels (a macroblock), the unit of processing in motion compensation and adaptive-mode switching.
The 4:2:2 format has two blocks for each color-difference signal for every four luminance-signal blocks, whereas the 4:2:0 format has one block for each color-difference signal for every four luminance-signal blocks.
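The block counts per macroblock follow directly from the block and macroblock sizes given above, as the following sketch shows. The function and constant names are hypothetical.

```python
BLOCK = 8        # (8x8) pixels: unit of orthogonal transform
MACROBLOCK = 16  # (16x16) luminance pixels: unit of motion compensation

# Divisors (horizontal, vertical) of the color-difference signals per format.
DIVISORS = {"4:2:2": (2, 1), "4:2:0": (2, 2)}


def blocks_per_macroblock(fmt):
    """Count the (8x8) blocks of each signal component in one macroblock."""
    h_div, v_div = DIVISORS[fmt]
    luma_blocks = (MACROBLOCK // BLOCK) ** 2
    chroma_width = MACROBLOCK // h_div
    chroma_height = MACROBLOCK // v_div
    chroma_blocks = (chroma_width // BLOCK) * (chroma_height // BLOCK)
    # The two color-difference signals are labeled "cb" and "cr" here for convenience.
    return {"luma": luma_blocks, "cb": chroma_blocks, "cr": chroma_blocks}


print(blocks_per_macroblock("4:2:2"))
print(blocks_per_macroblock("4:2:0"))
```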
Moving-image encoding techniques, such as MPEG, process three types of pictures: I-pictures (intra-coded pictures), P-pictures (predictive-coded pictures) and B-pictures (bidirectionally predictive-coded pictures).
As one of such moving-image encoding techniques, the inventor of the present application has already invented a moving-image encoding technique disclosed in Japanese Unexamined Patent Publication Nos. 11-275591/1999 and 11-46365/1999 in which P(I)-pictures to be used as the reference pictures in inter-picture predictive encoding undergo progressive scanning whereas B-pictures not to be used as the reference pictures undergo interlaced scanning.
This moving-image encoding technique achieves high inter-picture prediction efficiency with no redundant scanning-line encoding for interlaced-scanning reproduction.
Explained below is such an encoding technique with progressive scanning for P(I)-pictures and interlaced scanning for B-pictures.
An input progressive moving-image signal is separated into signal components to be encoded as P(I)-pictures and other signal components to be encoded as B-pictures.
Each P(I)-picture signal component undergoes subtraction of a predictive signal obtained through inter-picture prediction, thereby producing a predictive error signal.
The predictive error signal undergoes (8×8)-DCT (Discrete Cosine Transform) processing and is thus transformed into coefficients. The coefficients are quantized at a given step width to become fixed-length codes.
The fixed-length codes undergo inverse quantization and (8×8)-IDCT, the inverse of the quantization and (8×8)-DCT processing described above, thereby reproducing the predictive error signal.
The reproduced predictive error signal is added to the predictive signal, thereby reproducing a local image. The reproduced image is used as a reference picture in inter-picture prediction, thereby generating the predictive signal for the subtraction and addition described above.
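The P(I)-picture steps described above (subtraction, (8×8)-DCT, quantization, inverse quantization, (8×8)-IDCT and addition) can be sketched for a single block as follows. This is a minimal illustration under stated assumptions, not the encoder of the cited publications: an orthonormal DCT-II and a single uniform step width are assumed, rounding to the nearest integer stands in for the quantizer, and all function names are hypothetical.

```python
import math


def dct_matrix(n):
    """Orthonormal DCT-II basis as an n x n list of rows (an assumed normalization)."""
    rows = []
    for k in range(n):
        scale = math.sqrt((1.0 if k == 0 else 2.0) / n)
        rows.append([scale * math.cos(math.pi * (2 * i + 1) * k / (2 * n))
                     for i in range(n)])
    return rows


def matmul(a, b):
    """Plain matrix product of two lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def transpose(m):
    return [list(col) for col in zip(*m)]


C8 = dct_matrix(8)


def dct8x8(block):
    """Separable (8x8)-DCT: C * X * C^T."""
    return matmul(matmul(C8, block), transpose(C8))


def idct8x8(coeffs):
    """Inverse of dct8x8: C^T * Y * C (valid because C is orthonormal)."""
    return matmul(matmul(transpose(C8), coeffs), C8)


def encode_block(source, prediction, step):
    """Return (quantized levels, locally reproduced block) for one 8x8 block."""
    # Subtraction of the predictive signal gives the predictive error signal.
    error = [[s - p for s, p in zip(sr, pr)] for sr, pr in zip(source, prediction)]
    # Transform and quantize at the given step width.
    levels = [[round(c / step) for c in row] for row in dct8x8(error)]
    # Inverse quantization and IDCT reproduce the predictive error signal.
    dequantized = [[q * step for q in row] for row in levels]
    reproduced_error = idct8x8(dequantized)
    # Adding the predictive signal reproduces the local image.
    local_image = [[p + e for p, e in zip(pr, er)]
                   for pr, er in zip(prediction, reproduced_error)]
    return levels, local_image
```

The locally reproduced block, rather than the source block, would then serve as reference data for inter-picture prediction, so that the encoder predicts from the same picture a decoder can reconstruct.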
Each progressive B-picture signal component is delayed in units of frames while the P(I)-pictures are encoded first. The delayed signal component undergoes subtraction of the predictive signal obtained through the inter-picture prediction. Scanning lines of the resultant progressive predictive error signal are decimated, thereby converting the predictive error signal into an interlaced predictive error signal.
The interlaced predictive error signal undergoes (8×4)-DCT processing per four scanning lines in the vertical direction. The resultant coefficients are quantized at a given step width to become fixed-length codes.
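The scanning-line decimation and the grouping into units of four scanning lines can be sketched as follows. The choice of which lines are kept for a given picture (the `parity` argument) and the function names are assumptions made for illustration, and the picture dimensions are assumed to be multiples of the block size.

```python
def decimate_to_field(frame, parity):
    """Keep every other scanning line: parity 0 keeps lines 0, 2, 4, ...; parity 1 keeps 1, 3, 5, ..."""
    return [list(line) for line in frame[parity::2]]


def split_into_8x4_blocks(field):
    """Split a field into blocks of 8 pixels by 4 scanning lines, in raster order."""
    blocks = []
    for top in range(0, len(field), 4):
        for left in range(0, len(field[0]), 8):
            blocks.append([line[left:left + 8] for line in field[top:top + 4]])
    return blocks


# Hypothetical 16-line x 16-pixel predictive error picture in which every
# sample equals its scanning-line number, so the decimation is easy to see.
frame = [[line_no] * 16 for line_no in range(16)]
field = decimate_to_field(frame, parity=0)
blocks = split_into_8x4_blocks(field)
print(len(field), len(blocks))
```

Each resulting block of 8 pixels by 4 lines would then be passed to the (8×4)-DCT and quantization described in the text.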
The fixed-length codes (predictive error signals) of P(I)-pictures and B-pictures are compressed with variable-length codes and thus converted into a bitstream.
The 4:2:2-format sampling points under the encoding procedure described above are illustrated in FIG. 1C with symbols “∘” and “x” indicating luminance-signal sampling points and color-difference signal sampling points, respectively, in the vertical direction V on the time base T.
The encoding technique with progressive scanning for P(I)-pictures and interlaced scanning for B-pictures described above, when applied to 4:2:0-format color moving-image signals, offers an appropriate resolution for progressive I- and P-pictures when the color-difference signals are processed in the same way as the luminance signal, as in the MPEG-2 standard.
Nevertheless, the encoding technique suffers from insufficient vertical resolution for the interlaced color-difference signals of B-pictures, which are decimated per field, when the color-difference signals are handled in the same way as the luminance signal, as in the MPEG-2 standard.
Moreover, for 4:2:2-format color moving-image signals, this encoding technique suffers from an increase in processing amount compared to 4:2:0-format processing, and requires a large amount of data relative to the subjective picture quality obtained, because the resolution of the color-difference signals is excessive compared to that of the luminance signal under progressive scanning.