FIELD OF THE INVENTION
The present invention relates to methods and systems for encoding and decoding picture signals and related picture signal recording media, and more particularly, relates to such methods and systems suitable for use in compressing high definition television signals (HDTV signals) and recording the compressed HDTV signals in a recording medium, such as an optical disc, magneto-optical disc or a magnetic tape, reproducing the recorded compressed HDTV signals, expanding the reproduced compressed HDTV signals to restore the normal range, and displaying the restored HDTV signals.
FIG. 1 shows a conventional picture signal encoder and a conventional picture signal decoder by way of example. The picture signal encoder includes a preprocessing circuit 1 which separates a luminance signal (Y signal) and a color difference signal (C signal) of an input video signal VD1, such as an HDTV signal. An analog-to-digital (A/D) converter 2 converts the luminance data into a corresponding digital luminance signal and stores the digital luminance signal temporarily in a frame memory 4. An A/D converter 3 converts the color difference data into a corresponding digital color difference signal and stores the digital color difference data temporarily in a frame memory 5. A format conversion circuit 6 converts the digital luminance data and the digital color difference data in frame format stored in the frame memories 4 and 5 into corresponding luminance data and color difference data in a block format, and provides the luminance data and the color difference data in block format to an encoder 7. The encoder 7 encodes the input data and supplies a bit stream representing the coded input signals to a recording medium 8, such as an optical disc, a magneto-optical disc, or a magnetic tape for recording.
A decoder 9 decodes the data reproduced from the recording medium 8 in a bit stream. A format conversion circuit 10 converts the decoded data in block format provided by the decoder 9 into corresponding decoded data in frame format. Luminance data and color difference data provided by the format conversion circuit 10 are stored respectively in frame memories 11 and 12. The luminance data and the color difference data read from the frame memories 11 and 12 are converted into an analog luminance signal and an analog color difference signal, respectively, by digital-to-analog (D/A) converters 13 and 14. A post processing circuit 15 combines the analog luminance signal and the analog color difference signal to provide an output video signal VD2 to an external circuit, not shown for purposes of simplicity and clarity.
As shown in FIG. 2, picture data representing a picture of one frame consists of V lines, each of H dots per inch, and is sliced into N slices, i.e., a slice 1 to a slice N, each of, for example, sixteen lines; each slice includes M macroblocks. Each macroblock comprises data blocks Y[1] to Y[4], each including the luminance data of a group of 8×8 pixels, and data blocks Cb[5] and Cr[6] including color difference data corresponding to all the pixel data (16×16 pixels) of the data blocks Y[1] to Y[4].
Thus, each macroblock includes the luminance data Y[1] to Y[4] of the 16×16 pixel area arranged along the horizontal and vertical scanning directions as a unit for the luminance signal. The two color difference signals are time-base multiplexed after data compression, and the color difference data for the 16×16 pixel area is allocated to the blocks Cb[5] and Cr[6], each having 8×8 pixels, to be processed as one unit. The picture data represented by the macroblocks are arranged successively in the slice, and the picture data represented by the blocks (8×8 pixels) are arranged successively in a raster scanning sequence in the macroblock (16×16 pixels).
The luminance data Y[1] to Y[4] and the color difference data Cb[5] and Cr[6] are transmitted in that order. The numerals in the reference characters denoting the data indicate the data's turn for transmission.
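The macroblock layout described above can be sketched as follows. This is a minimal illustration only: the function name split_macroblock and the list-of-lists representation are assumptions, and the two 8×8 color difference blocks are taken as already prepared, since the text does not specify how they are derived from the 16×16 area.

```python
def split_macroblock(luma16, cb8, cr8):
    """Return the six 8x8 blocks of one macroblock in transmission order.

    luma16 is a 16x16 list of rows of luminance samples; cb8 and cr8 are
    8x8 color difference blocks (how they are subsampled is not modeled).
    """
    def sub_block(rows, top, left):
        # Cut an 8x8 block whose top-left corner is at (top, left).
        return [row[left:left + 8] for row in rows[top:top + 8]]

    # Y[1]..Y[4] in raster order: top-left, top-right, bottom-left, bottom-right.
    y_blocks = [sub_block(luma16, top, left) for top in (0, 8) for left in (0, 8)]
    # Cb[5] and Cr[6] follow the four luminance blocks.
    return y_blocks + [cb8, cr8]
```

The position of each block in the returned list corresponds to the numeral in its reference character, so index 0 holds Y[1] and index 5 holds Cr[6].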
The encoder 7 compresses the received picture data and supplies the compressed picture data to the recording medium 8. The decoder 9 expands the compressed data received thereby and provides the expanded picture data to the format conversion circuit 10. The quantity of the data to be recorded in the recording medium 8 can be reduced by compression based on the line correlation and/or inter-frame correlation properties of picture signals. The line correlation property enables compression of the picture signal by, for example, discrete cosine transform (DCT).
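As an illustration of the DCT mentioned above, the following sketch computes a two-dimensional DCT (type II) of a square block directly from its definition. The function name dct_2d and the orthonormal scaling are assumptions chosen for illustration; the text does not state which normalization or implementation the encoder 7 uses, and a practical encoder would use a fast algorithm.

```python
import math

def dct_2d(block):
    """Two-dimensional DCT-II of an n x n block with orthonormal scaling."""
    n = len(block)

    def scale(k):
        # Orthonormal scaling factor (an assumed convention).
        return math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)

    out = [[0.0] * n for _ in range(n)]
    for u in range(n):
        for v in range(n):
            total = 0.0
            for x in range(n):
                for y in range(n):
                    total += (block[x][y]
                              * math.cos((2 * x + 1) * u * math.pi / (2 * n))
                              * math.cos((2 * y + 1) * v * math.pi / (2 * n)))
            out[u][v] = scale(u) * scale(v) * total
    return out
```

For a block of highly correlated samples, most of the energy falls into a few low-frequency coefficients; a completely flat 8×8 block yields a single nonzero coefficient at position (0, 0).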
Inter-frame correlation enables further compression of the picture signal. For example, suppose that frame pictures PC1, PC2, and PC3 are produced respectively at times t1, t2, and t3 as shown in FIG. 3. The differences between picture signals respectively representing the frame pictures PC1 and PC2 are calculated to produce a frame picture PC12, and the differences between the frame pictures PC2 and PC3 are calculated to produce a frame picture PC23. Since the differences between successive frame pictures, in general, are not very large, a signal representing such differences is small. The difference signal is coded to further reduce the quantity of data.
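The difference calculation can be sketched as below. The function names, the sign convention (later frame minus earlier frame), and the sample values are assumptions made purely for illustration.

```python
def frame_difference(previous, current):
    """Per-pixel difference current - previous (sign convention assumed)."""
    return [[c - p for p, c in zip(prow, crow)]
            for prow, crow in zip(previous, current)]

def reconstruct(previous, difference):
    """Recover the current frame from the previous frame and the difference."""
    return [[p + d for p, d in zip(prow, drow)]
            for prow, drow in zip(previous, difference)]

# Illustrative 2x3 "frames"; the values are made up.
pc1 = [[100, 100, 100], [100, 120, 100]]
pc2 = [[100, 101, 100], [100, 121, 100]]
pc12 = frame_difference(pc1, pc2)
```

Here the difference picture contains only small values even though the frames themselves contain large ones, which is why coding the difference requires less data.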
As shown in FIGS. 4A and 4B, a group of pictures including picture signals representing frames F1 to F17 is processed as a unit wherein each frame is encoded either as an "I picture", a "P picture" or a "B picture", as explained below. More specifically, the picture signal representing the head frame F1 is coded as an I picture, the picture signal representing the second frame F2 is coded as a B picture and the picture signal representing the third frame F3 is coded as a P picture. The picture signals representing the fourth frame F4 to the seventeenth frame F17 are coded alternately as B pictures and P pictures.
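The assignment of picture types within the group of pictures can be sketched as follows. The function name picture_types is an assumption; the pattern itself follows the description of frames F1 to F17.

```python
def picture_types(num_frames=17):
    """Return the picture type of frames F1..F<num_frames> as a list of letters."""
    types = []
    for n in range(1, num_frames + 1):
        if n == 1:
            types.append("I")   # head frame F1
        elif n % 2 == 0:
            types.append("B")   # F2, F4, ..., F16
        else:
            types.append("P")   # F3, F5, ..., F17
    return types
```

Joining the result for the seventeen-frame group gives one I picture followed by eight B, P pairs.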
The picture signal representing the I picture is obtained by coding the picture signal representing the corresponding frame (intra-frame encoding). The picture signal representing the P picture is basically encoded macroblock by macroblock, using for each macroblock whichever of two modes provides the greater efficiency. The two modes available for encoding the macroblocks of each P picture include (1) intra-frame encoding and (2) an inter-frame encoding technique in which the differences between the picture signal representing the corresponding frame and the picture signal representing the preceding I picture or P picture are encoded as shown in FIG. 4A. The picture signal representing the B picture is obtained by selectively encoding each macroblock using the most efficient one of (1) intra-frame encoding, (2) inter-frame encoding and (3) a bidirectional encoding technique in which the differences between the picture signal representing the corresponding frame and the mean of the picture signals representing the preceding and succeeding frames are encoded as indicated in FIG. 4B.
FIG. 5 is a diagrammatic view to assist in explaining the principles of a method for coding a moving picture. As shown in FIG. 5, the first frame F1 is processed as an I picture to provide data F1X on a transmission line (intra-frame coding). The second frame F2 is processed as a B picture coded to provide transmission data F2X.
As indicated above, the macroblocks of the second frame F2 as a B picture can be processed in any of a plurality of processing modes. In the first (intra-frame) processing mode, the data representing the frame F2 is coded to provide the transmission data F2X (SP1), which is the same as the processing mode for processing the I picture. In a second (inter-frame) processing mode, the differences (SP2) between the frame F2 and the succeeding frame F3 are calculated and coded for transmission in a backward predictive coding mode. In a third (also inter-frame) processing mode, the differences (SP3) between the frame F2 and the preceding frame F1 are coded for transmission in a forward predictive coding mode. In a fourth (bidirectional-predictive) processing mode, the differences (SP4) between the frame F2 and the mean of the preceding frame F1 and the succeeding frame F3 are calculated and coded to provide the transmission data F2X. The one of these processing modes that provides the least amount of data is employed for each macroblock.
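The selection among the four processing modes can be sketched as below. This is a simplified illustration under stated assumptions: the sum of absolute values stands in for "the least amount of data" (an actual encoder would compare the coded data), motion compensation is omitted, and all function names are hypothetical.

```python
def sum_abs(block):
    """Crude proxy for the amount of data needed to code a block (assumed)."""
    return sum(abs(v) for row in block for v in row)

def residual(block, prediction):
    """Per-sample difference between a block and its prediction."""
    return [[b - p for b, p in zip(brow, prow)]
            for brow, prow in zip(block, prediction)]

def mean_block(a, b):
    """Per-sample mean of two blocks."""
    return [[(x + y) / 2 for x, y in zip(arow, brow)]
            for arow, brow in zip(a, b)]

def choose_b_mode(current, previous, following):
    """Pick the B-picture mode whose data has the smallest proxy cost."""
    candidates = {
        "intra": current,                                   # SP1
        "backward": residual(current, following),           # SP2
        "forward": residual(current, previous),             # SP3
        "bidirectional": residual(current, mean_block(previous, following)),  # SP4
    }
    costs = {mode: sum_abs(data) for mode, data in candidates.items()}
    best = min(costs, key=costs.get)
    return best, candidates[best]
```

When a block of the frame F2 lies midway between the corresponding blocks of the frames F1 and F3, the bidirectional residual vanishes and that mode is chosen; when the block is unchanged from the frame F1, the forward mode is chosen.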
For each macroblock, one of the following is transmitted: a motion vector x1 representing the motion of the picture relative to the objective frame (F1) used for the calculation of the difference data, i.e., a motion vector between the frames F1 and F2 (forward prediction); a motion vector x2 between the frames F3 and F2 (backward prediction); or both of the motion vectors x1 and x2 (bidirectional prediction).
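One common way to obtain such a motion vector is exhaustive block matching, sketched below. The text does not say how the motion vectors are detected, so the method, the function names, the search range, and the convention that the vector points from the block position in the current frame to the best-matching block in the reference frame are all assumptions.

```python
def block_sad(frame, ref, top, left, dy, dx, size):
    """Sum of absolute differences between a block and a displaced reference block."""
    total = 0
    for r in range(size):
        for c in range(size):
            total += abs(frame[top + r][left + c]
                         - ref[top + dy + r][left + dx + c])
    return total

def find_motion_vector(frame, ref, top, left, size=16, search=2):
    """Return ((dy, dx), cost) of the best match within +/- search pixels."""
    height, width = len(ref), len(ref[0])
    best = None
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            # Skip candidates that would read outside the reference frame.
            if not (0 <= top + dy and top + dy + size <= height
                    and 0 <= left + dx and left + dx + size <= width):
                continue
            cost = block_sad(frame, ref, top, left, dy, dx, size)
            if best is None or cost < best[0]:
                best = (cost, (dy, dx))
    return best[1], best[0]
```

For forward prediction the reference would be the preceding frame, and for backward prediction the succeeding frame; the difference data is then calculated against the displaced reference block.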
For the frame F3, processed as a P picture, either difference data (SP3) representing the differences between the frame F3 and the preceding frame F1 as a predicted picture are calculated together with a motion vector x3, and the difference data and the motion vector x3 are transmitted as transmission data F3X (forward predictive coding mode), or the picture data (SP1) of the frame F3 is transmitted as the transmission data F3X (intra-frame coding mode). Whichever of the forward predictive coding mode and the intra-frame coding mode is more effective in reducing the amount of data is employed.
On the other hand, in the ISO-IEC/JTC1/SC29/WG11, encoding and decoding methods relating to COMPATIBILITY AND SCALABILITY are now being examined. Scalability is achieved by spatial reduction in the pel and temporal domain. Compatibility is a specific implementation of the spatial scalability. These are described in detail on pages 125 to 137 of "Document, AVC-400 (Test Model 3)", which was issued in November 1992 by the ISO-IEC/JTC1/SC29/WG11.
However, encoding and decoding methods for color difference signals that relate to COMPATIBILITY AND SCALABILITY have not been examined concretely.