The present invention relates to methods and systems for encoding and decoding picture signals and related picture signal recording media. For example, the present invention relates to such methods and systems suitable for use in compressing high definition television signals (HDTV signals) and recording the compressed HDTV signals in a recording medium, such as an optical disk, magnetooptic disk or a magnetic tape, reproducing the recorded compressed HDTV signals, expanding the reproduced compressed HDTV signals to restore the normal range and displaying the restored HDTV signals.
FIG. 1 shows a conventional image signal encoder and a conventional image signal decoder by way of example. The image signal encoder includes a preprocessing circuit 1 which separates a luminance signal (Y signal) and a color difference signal (C signal) of an input image signal VD, such as an HDTV signal. An AD converter 2 converts the luminance signal into a corresponding digital luminance signal and stores the digital luminance signal temporarily in a frame memory 4. An AD converter 3 converts the color difference signal into a corresponding digital color difference signal and stores the digital color difference signal temporarily in a frame memory 5. A format conversion circuit 6 converts the digital luminance signal and the digital color difference signal in frame format stored in the frame memories 4 and 5 into corresponding luminance data and color difference data in a block format, and provides the luminance data and the color difference data in block format to an encoder 7. The encoder 7 encodes the input signals and supplies a bit stream representing the coded input signals to a recording medium 8, such as an optical disk, a magnetooptic disk or a magnetic tape for recording.
A decoder 9 decodes the data reproduced from the recording medium 8 in a bit stream. A format conversion circuit 10 converts the decoded data in block format provided by the decoder 9 into corresponding decoded data in frame format. Luminance data and color difference data provided by the format conversion circuit 10 are stored respectively in frame memories 11 and 12. The luminance data and the color difference data read from the frame memories 11 and 12 are converted into an analog luminance signal and an analog color difference signal, respectively, by D/A converters 13 and 14. A post processing circuit 15 combines the analog luminance signal and the analog color difference signal to provide an output image signal to an external circuit, not shown for purposes of simplicity and clarity.
As shown in FIG. 2, image data representing a picture of one frame consists of V lines of H dots each, which is sliced into N slices, i.e., a slice 1 to a slice N, each of, for example, sixteen lines, and each slice includes M macroblocks. Each macroblock comprises data blocks Y[1] to Y[4] each including the luminance data of a group of 8×8 pixels, and data blocks Cb[5] and Cr[6] including color difference data corresponding to all the pixel data (16×16 pixels) of the data blocks Y[1] to Y[4].
Thus, each macroblock includes the image data Y[1] to Y[4] of the 16×16 pixel area arranged along the horizontal and vertical scanning directions as a unit for the luminance signal. The two color difference signals are time-base multiplexed after data compression, and the color difference data for the 16×16 pixel area is allocated to the blocks Cb[5] and Cr[6], each having 8×8 pixels, to process one unit. The image data represented by the macroblocks are arranged successively in the slice, and the image data represented by the blocks (8×8 pixels) are arranged successively in a raster scanning sequence in the macroblock (16×16 pixels).
The image data Y[1] to Y[4] and the color difference data Cb[5] and Cr[6] are transmitted in that order. The numerals in the reference characters denoting the data indicate the data's turn for transmission.
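The macroblock structure described above can be sketched as follows (a minimal numpy sketch; the function name and the 2×2 averaging used to subsample the color difference data are assumptions, since the text does not specify the subsampling filter):

```python
import numpy as np

def split_macroblock(luma16, cb16, cr16):
    """Split a 16x16 luminance area and its co-sited color difference data
    into the six 8x8 blocks of a macroblock: Y[1]..Y[4], Cb[5], Cr[6]."""
    y_blocks = [luma16[r:r+8, c:c+8]             # raster order: Y1 Y2 / Y3 Y4
                for r in (0, 8) for c in (0, 8)]
    def subsample(c16):
        # 2:1 subsampling in both directions by 2x2 averaging (assumed filter)
        return c16.reshape(8, 2, 8, 2).mean(axis=(1, 3))
    return y_blocks, subsample(cb16), subsample(cr16)
```

The six resulting 8×8 blocks would then be transmitted in the order Y[1], Y[2], Y[3], Y[4], Cb[5], Cr[6].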
The encoder 7 compresses the received image data and supplies the compressed image data to the recording medium 8. The decoder expands the compressed data received thereby and provides the expanded image data to the format conversion circuit 10. The quantity of the data to be recorded in the recording medium 8 can be reduced by compression based on the line correlation and/or interframe correlation properties of image signals. The line correlation property enables compression of the image signal by, for example, discrete cosine transformation (DCT).
Interframe correlation enables further compression of the image signal. For example, suppose that frame pictures PC1, PC2 and PC3 are produced respectively at times t1, t2 and t3 as shown in FIG. 3. The differences between image signals respectively representing the frame pictures PC1 and PC2 are calculated to produce a frame picture PC12, and the differences between the frame pictures PC2 and PC3 are calculated to produce a frame picture PC23. Since the differences between successive frame pictures, in general, are not very large, a signal representing such differences is small. The difference signal is coded to further reduce the quantity of data.
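A difference picture such as PC12 or PC23 amounts to a pixel-wise subtraction of successive frames, which can be sketched as follows (hypothetical function name; a real coder subtracts a motion-compensated prediction rather than the raw preceding frame):

```python
import numpy as np

def difference_picture(prev_frame, cur_frame):
    """Interframe difference, e.g. PC12 = PC2 - PC1. The result is small
    where the frames are similar, so it compresses better than cur_frame
    itself. Widened to int16 so negative differences are representable."""
    return cur_frame.astype(np.int16) - prev_frame.astype(np.int16)
```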
As shown in FIGS. 4A and 4B, a group of pictures including image signals representing frames F1 to F17 is processed as a unit wherein each frame is encoded either as an "I picture", a "P picture" or a "B picture", as explained below. More specifically, the image signal representing the head frame F1 is coded as an I picture, the image signal representing the second frame F2 is coded as a B picture and the image signal representing the third frame F3 is coded as a P picture. The image signals representing the fourth frame F4 to the seventeenth frame F17 are coded alternately as B pictures and P pictures.
The image signal representing the I picture is obtained by coding the image signal representing the corresponding frame by itself (intraframe encoding). The image signal representing the P picture is encoded by choosing, for each macroblock, whichever of two modes provides the greater efficiency. The two modes available for encoding the macroblocks of each P picture include (1) intraframe encoding, and (2) an interframe encoding technique in which the differences between the image signal representing the corresponding frame and the image signal representing the preceding I picture or P picture are encoded, as shown in FIG. 4A. The image signal representing the B picture is obtained by selectively encoding each macroblock using the most efficient one of (1) intraframe encoding, (2) interframe encoding, and (3) a bidirectional encoding technique in which the differences between the image signal representing the corresponding frame and the mean of the image signals representing the preceding and succeeding frames are encoded, as indicated in FIG. 4B.
FIG. 5 is a diagrammatic view to assist in explaining the principles of a method for coding a moving picture. As shown in FIG. 5, the first frame F1 is processed as an I picture to provide data F1X on a transmission line (intraframe coding). The second frame F2 is processed as a B picture coded to provide transmission data F2X.
As indicated above, the macroblocks of the second frame F2 as a B picture can be processed in any of a plurality of processing modes. In the first (intraframe) processing mode, the data representing the frame F2 is coded to provide the transmission data F2X (SP1), the same processing mode as for the I picture. In a second (interframe) processing mode, the differences (SP2) between the frame F2 and the succeeding frame F3 are calculated and coded for transmission in a backward predictive coding mode. In a third (also interframe) processing mode, the differences (SP3) between the frame F2 and the preceding frame F1 are coded for transmission in a forward predictive coding mode. In a fourth (bidirectional-predictive) processing mode, the differences (SP4) between the frame F2 and the mean of the preceding frame F1 and the succeeding frame F3 are calculated and coded to produce the transmission data F2X. That one of these processing modes providing the least amount of data is employed for each macroblock.
For each macroblock, the motion vector used to calculate the difference data is transmitted along with that data: the motion vector x1 between the frames F1 and F2 for forward prediction, the motion vector x2 between the frames F3 and F2 for backward prediction, or both motion vectors x1 and x2 for bilateral prediction.
Difference data (SP3) representing the differences between the frame F3 of the P picture and the preceding frame F1 as a predicted picture, and a motion vector x3, are calculated, and the difference data and the motion vector x3 are transmitted as transmission data F3X (forward predictive coding mode), or the picture data (SP1) of the frame F3 is transmitted as the transmission data F3X (intraframe coding mode). Whichever of the forward predictive coding mode and the intraframe coding mode is more effective in reducing the amount of data is employed.
Low-resolution image data can be obtained by compressing (thinning out or reducing) high-resolution image data, such as high definition television data, by half with respect to both the vertical and horizontal directions. The aspect ratio of the low-resolution image data can be changed from 16:9 to 4:3, and the low-resolution image can be reproduced with an NTSC type display.
When displaying a high-resolution picture after compressing the same to 1/4 the original amount (1/2 × 1/2), the decoder 9 shown in FIG. 1 is configured, for example, as shown in FIG. 6. In this case, the image data is compressed by the encoder 7 in a discrete cosine transformation (DCT) mode.
Image data (DCT coefficients) obtained by DCT of the image data in units of 8×8 pixel blocks is provided to the extraction circuit 21 of the decoder 9, and the extraction circuit 21 then extracts the 8×8 data as shown in the chart of FIG. 7.
The 8×8 data d(i,j) represent DCT coefficients. In FIG. 7, the vertical frequency of the picture component corresponding to a given DCT coefficient increases as the coefficient's position shifts toward the bottom of the chart, and the horizontal frequency likewise increases as the position shifts from left to right.
An extracting circuit 22 disposed after the extracting circuit 21 extracts the 4×4 DCT coefficients shown in FIG. 8, corresponding to the DC and lower-frequency AC components of the 8×8 DCT coefficients shown in FIG. 7. Thus, the 8×8 pixel data is thinned out by half in each of the horizontal and vertical directions to produce 4×4 coefficient data. The DCT coefficients shown in FIG. 8 are the 4×4 DCT coefficients in the upper left-hand portion of FIG. 7. The 4×4 DCT coefficients extracted by the extracting circuit 22 are supplied to an inverse discrete cosine transformation circuit (IDCT circuit) 23 for inverse discrete cosine transformation to obtain image data in 4×4 pixel groups having one-half resolution with respect to both the horizontal and vertical directions.
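The extraction and half-resolution reconstruction performed by the circuits 22 and 23 can be sketched as follows (a minimal numpy sketch; the orthonormal DCT formulation and the factor-of-2 rescaling are assumptions, as the normalization is not specified in the text):

```python
import numpy as np

def dct_matrix(n):
    """Orthonormal n-point DCT-II basis matrix C, so that a 2-D transform
    of block x is C @ x @ C.T and its inverse is C.T @ X @ C."""
    k = np.arange(n)
    C = np.cos(np.pi * (2 * k[None, :] + 1) * k[:, None] / (2 * n)) * np.sqrt(2 / n)
    C[0] /= np.sqrt(2)
    return C

def half_resolution(block8):
    """Decode an 8x8 pixel block at half resolution: keep only the 4x4
    DC/low-frequency DCT coefficients (upper left-hand corner, FIG. 8)
    and apply a 4x4 IDCT."""
    C8, C4 = dct_matrix(8), dct_matrix(4)
    coeffs = C8 @ block8 @ C8.T          # full 8x8 DCT coefficients (FIG. 7)
    low = coeffs[:4, :4]                 # DC + lower-frequency AC components
    # The orthonormal 8- and 4-point transforms differ by a gain of 2 in 2-D,
    # so rescale to preserve pixel amplitude (assumed normalization).
    return (C4.T @ low @ C4) / 2.0
```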
FIG. 9 is a block diagram showing a circuit configuration of the encoder 7 by way of example.
Macroblocks of image data to be coded, such as the image data of a high-resolution HDTV picture, are applied to a motion vector detecting circuit 50. The motion vector detecting circuit 50 processes the image data of each frame as an I picture, a P picture or a B picture according to a specified sequence. The mode of selection of an I picture, P picture or a B picture in processing the image data of the frames sequentially applied to the motion vector detecting circuit 50 is determined beforehand. For example, in one sequence the group of frames F1 to F17 are processed as I pictures, P pictures and B pictures, respectively, as shown in FIGS. 4A and 4B.
The image data of the frame to be processed as an I picture, for example, the frame F1, is transferred to and stored in a forward original image area 51a of a frame memory 51, the image data of a frame to be processed as a B picture, for example, the frame F2, is transferred to and stored in reference original image area 51b of the frame memory 51, and the image data of a frame to be processed as a P picture, for example, the frame F3, is transferred to and stored in a backward original image area 51c of the frame memory 51.
When the image data of a frame to be processed as a B picture (frame F4) or a P picture (frame F5) is provided in each cycle to the motion vector detecting circuit 50, the image data of the first P picture (frame F3) stored in the backward original image area 51c is transferred to the forward original image area 51a, the image data of the next B picture (frame F4) is stored (overwritten) in the reference original image area 51b, and the image data of the next P picture (frame F5) is stored (overwritten) in the backward original image area 51c. These operations are repeated sequentially.
The image data of the pictures stored in the frame memory 51 are read therefrom, and then a frame/field mode switching circuit 52 processes the image data in a frame encoding mode or a field encoding mode. An arithmetic unit (prediction circuit) 53 operates under control of an encoding mode selecting circuit 54 for intraimage prediction, forward prediction, backward prediction or bilateral prediction. The selection of a predictive coding mode is dependent on a prediction error signal representing the difference between the objective reference original image and the corresponding predicted image. Accordingly, the motion vector detecting circuit 50 produces the sum of absolute values or the sum of squares of prediction error signals for use in the selection of the prediction mode.
The operation of the frame/field mode switching circuit 52 for selecting either the frame encoding mode or the field encoding mode will be described hereinafter.
When the frame encoding mode is selected, the circuit 52 transfers the four luminance blocks Y[1] to Y[4] given thereto from the motion vector detecting circuit 50 as they are to the arithmetic unit 53. In this case, as shown in FIG. 10(A), each luminance block has, in combination, both the data representing the lines of odd fields and that representing the lines of even fields. In the frame encoding mode, the four luminance blocks forming each macroblock are processed as a unit, and a single motion vector is determined for the four luminance blocks.
When the field encoding mode is selected, the circuit 52 changes the luminance blocks Y[1] and Y[2] from an arrangement as shown in FIG. 10(A) as received from the motion vector detecting circuit 50, for example, into pixels of lines of odd fields, changes the other luminance blocks Y[3] and Y[4] into pixels in lines of even fields as shown in FIG. 10(B), and provides an output signal in the form as shown in FIG. 10(B) to the arithmetic unit 53. In this case, an odd field motion vector corresponds to the two luminance blocks Y[1] and Y[2], while an even field motion vector corresponds to the other two luminance blocks Y[3] and Y[4].
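The field-mode rearrangement of a luminance macroblock can be sketched as follows (assuming the odd field occupies the first, third, fifth, ... lines of the frame-structured macroblock; the function and key names are illustrative):

```python
import numpy as np

def to_field_blocks(mb16):
    """Rearrange a 16x16 frame-structured luminance macroblock, whose lines
    interleave the odd and even fields, into field form per FIG. 10(B):
    Y1 and Y2 take the odd-field lines, Y3 and Y4 the even-field lines."""
    odd, even = mb16[0::2, :], mb16[1::2, :]   # 8 lines from each field
    return {'Y1': odd[:, :8],  'Y2': odd[:, 8:],
            'Y3': even[:, :8], 'Y4': even[:, 8:]}
```

In frame mode the macroblock would instead pass through unchanged, with each 8×8 block containing lines of both fields in combination.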
The motion vector detecting circuit 50 provides signals representing the sum of absolute values of prediction errors for interframe and bidirectional predictive encoding, as well as a measure of the amount of data resulting from intraframe encoding when operating in the frame encoding mode, and similarly derived signals in the field encoding mode to the circuit 52. For simplicity, the foregoing data are referred to from time to time as prediction errors herein. The circuit 52 compares the sums of absolute values of prediction errors in the frame encoding mode and the field encoding mode, carries out a selection process corresponding to the mode having the smaller sum, and supplies the selected data to the arithmetic unit 53.
In practice, this arrangement of the data is carried out by the motion vector detecting circuit 50, which supplies the data in an arrangement corresponding to the selected mode to the frame/field mode switching circuit 52, and the circuit 52 then provides the input signal as is to the arithmetic unit 53.
In the frame encoding mode, the color difference signal having, in combination, both data representing the lines of the odd fields and data representing lines of the even fields as shown in FIG. 10(A) are supplied to the arithmetic unit 53. In the field encoding mode, the respective upper halves (four lines) of the color difference blocks Cb[5] and Cr[6] are rearranged to include a color difference signal representing odd fields corresponding to the luminance blocks Y[1] and Y[2], and the respective lower halves (four lines) of the color difference blocks Cb[5] and Cr[6] are rearranged to include a color difference signal representing even fields corresponding to the luminance blocks Y[3] and Y[4] as shown in FIG. 10(B).
The motion vector detecting circuit 50 produces the sum of absolute values of prediction errors for use for determining a prediction mode for intraimage encoding, forward prediction, backward prediction and bilateral prediction for each macroblock by means of the prediction mode selecting circuit 54.
The difference between the sum Σ|Aij| of the absolute values |Aij| of the signals Aij of a macroblock of a reference original image and the absolute value |ΣAij| of the sum ΣAij of those signals is calculated as the sum of absolute values of prediction errors for intraimage encoding. The sum Σ|Aij−Bij| of the absolute values |Aij−Bij| of the differences (Aij−Bij) between the signals Aij of the macroblock of the reference original image and the signals Bij of the macroblock of a predicted image is calculated as the sum of absolute values of prediction errors for forward prediction. The sums of absolute values of prediction errors for backward prediction and bilateral prediction are calculated in a similar manner, using predicted images different from that used for the calculation of the sum of absolute values of prediction errors for forward prediction.
These sums of absolute values are given to the prediction mode selecting circuit 54. The circuit 54 selects the smallest of the sums for forward, backward and bilateral prediction as the sum of absolute values of prediction errors for inter-prediction, and compares it with the sum of absolute values of prediction errors for intraimage prediction. If the intraimage sum is smaller, the intraimage encoding mode is selected. Otherwise, to the extent that the predictive encoding modes may be used (depending on the type of picture being encoded), the forward prediction mode, the backward prediction mode or the bilateral prediction mode corresponding to the smallest sum of absolute values of prediction errors is selected.
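Under these definitions, the per-macroblock mode decision might be sketched as follows (illustrative names; the intraimage measure follows the Σ|Aij| − |ΣAij| formula given above, and the bilateral prediction uses the mean of the forward and backward predicted macroblocks):

```python
import numpy as np

def select_prediction_mode(A, fwd, bwd):
    """Per-macroblock mode decision from the sums of absolute values of
    prediction errors. A holds the signals Aij of the reference original
    macroblock; fwd and bwd are forward and backward predicted macroblocks
    (the Bij of the respective predicted images)."""
    A = A.astype(np.float64)
    costs = {
        'intraimage': np.abs(A).sum() - abs(A.sum()),   # Σ|Aij| - |ΣAij|
        'forward':    np.abs(A - fwd).sum(),            # Σ|Aij - Bij|
        'backward':   np.abs(A - bwd).sum(),
        'bilateral':  np.abs(A - (fwd + bwd) / 2.0).sum(),
    }
    inter = min(('forward', 'backward', 'bilateral'), key=costs.get)
    return 'intraimage' if costs['intraimage'] < costs[inter] else inter
```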
Thus, the motion vector detecting circuit 50 supplies the signals representing the macroblocks of the reference original image and having an arrangement as in FIG. 10(A) or FIG. 10(B) corresponding to the prediction mode selected by the circuit 52, i.e., either the frame encoding mode or the field prediction mode, through the circuit 52 to the arithmetic circuit 53, detects a motion vector between a predicted image corresponding to the encoding mode selected by the circuit 54 among those four modes and the reference original image, and gives the detected motion vector to a variable-length encoding circuit 58 and a motion compensating circuit 64. As mentioned above, a motion vector that makes the corresponding sum of absolute values of prediction errors smallest is selected.
The prediction mode selecting circuit 54 sets an intraframe (image) encoding mode, in which motion compensation is not performed, as an encoding mode while the motion vector detecting circuit 50 is reading the image data of an I picture from the forward original image area 51a, and connects the movable contact 53d of the switch of the arithmetic unit 53 to the fixed contact a thereof. Consequently, the image data of the I picture is applied to a DCT mode switching circuit 55.
The DCT mode switching circuit 55 provides data representing four luminance blocks having, in combination, lines of odd fields and those of even fields as shown in FIG. 11(A) (i.e., in a frame DCT mode) or data representing four luminance blocks each having lines of either an odd field or those of an even field as shown in FIG. 11(B) (in a field DCT mode) to a DCT circuit 56.
The DCT mode switching circuit 55 compares the coding efficiency of the frame DCT mode and that of the field DCT mode, and selects the DCT mode which provides better coding efficiency than the other by producing less data.
For example, the DCT mode switching circuit 55 produces a frame DCT mode data estimate by forming the data representing blocks having, in combination, lines of odd fields and those of even fields as shown in FIG. 11(A), calculating the differences between signals representing the vertically adjacent lines of odd fields and even fields, and calculating the sum of absolute values of the differences (or the sum of squares of the differences). The circuit 55 also produces a field DCT mode data estimate by forming the data representing blocks of lines of odd fields and those of lines of even fields as shown in FIG. 11(B), calculating the differences between the vertically adjacent lines of odd fields and those between the vertically adjacent lines of even fields, and calculating the sum of absolute values (or the sum of squares) of the former differences and the sum of absolute values (or the sum of squares) of the latter differences. The circuit 55 then compares the former and latter sums of absolute values, and selects the DCT mode corresponding to the smaller sum; that is, the frame DCT mode is selected when the former sum of absolute values is smaller, and the field DCT mode is selected when the latter sum of absolute values is smaller.
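The two estimates reduce to comparing sums of absolute differences between vertically adjacent lines, which can be sketched as follows (illustrative names; the sum-of-squares variant mentioned above would substitute squared differences):

```python
import numpy as np

def choose_dct_mode(mb16):
    """Frame/field DCT mode decision for a 16x16 luminance macroblock.
    Frame estimate: differences between vertically adjacent lines, which
    interleave odd and even fields. Field estimate: differences between
    vertically adjacent lines within each field separately."""
    mb = mb16.astype(np.int32)
    frame_est = np.abs(np.diff(mb, axis=0)).sum()
    odd, even = mb[0::2], mb[1::2]
    field_est = (np.abs(np.diff(odd, axis=0)).sum()
                 + np.abs(np.diff(even, axis=0)).sum())
    return 'frame' if frame_est <= field_est else 'field'
```

A static or slowly moving picture tends to give the smaller frame estimate, while interfield motion (the two fields sampled at different times) tends to give the smaller field estimate.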
When the circuit 52 selects the frame encoding mode (FIG. 10(A)) and the DCT mode switching circuit 55 also selects the frame DCT mode (FIG. 11(A)), and likewise when the circuit 52 selects the field encoding mode (FIG. 10(B)) and the DCT mode switching circuit 55 also selects the field DCT mode (FIG. 11(B)), the DCT mode switching circuit 55 need not change the arrangement of the data.
When the circuit 52 selects the field encoding mode (FIG. 10(B)) and the DCT mode switching circuit 55 selects the frame DCT mode (FIG. 11(A)), and as well when the circuit 52 selects the frame encoding mode (FIG. 10(A)) and the DCT mode switching circuit 55 selects the field DCT mode (FIG. 11(B)), the DCT mode switching circuit 55 rearranges the data. The circuit 52 provides a frame/field encoding flag indicating either the frame encoding mode or the field encoding mode to the DCT mode switching circuit 55 to instruct the DCT mode switching circuit 55 whether and how to rearrange the data.
The DCT mode switching circuit 55 provides data arranged according to the selected DCT mode to the DCT circuit 56 and supplies a DCT flag indicating the selected DCT mode to the variable-length coding circuit 58 and an inverse discrete cosine transformation circuit (IDCT) 61.
The arrangement of the data in the luminance blocks is substantially the same in the frame and field modes as determined by the circuit 52 (FIGS. 10(A) and 10(B)) and the DCT mode switching circuit 55 (FIGS. 11(A) and 11(B)).
When the circuit 52 selects the frame encoding mode, in which the blocks have odd lines and even lines in combination, it is highly probable that the DCT mode switching circuit 55 will select the frame DCT mode, in which each of the blocks likewise has odd lines and even lines in combination. When the circuit 52 selects the field encoding mode, in which each of the blocks has only odd lines or only even lines, it is highly probable that the DCT mode switching circuit 55 will select the field DCT mode, in which the data of odd fields and that of even fields are separated from each other.
However, the DCT mode switching circuit 55 does not always select either the frame DCT mode or the field DCT mode in such a manner since the prediction mode switching circuit 52 determines the mode so that the sum of absolute values of prediction errors is the smallest, while the DCT mode switching circuit 55 determines the mode so that coding can be achieved with high efficiency.
The DCT mode switching circuit 55 provides image data representing an I picture to the DCT circuit 56 and the image data is transformed into DCT coefficients by DCT (discrete cosine transformation). The DCT coefficients are quantized at a quantizing step based on the amount of data stored in a transmission buffer memory 59 by a quantizing circuit 57, and the quantized DCT coefficients are supplied to the variable-length coding circuit 58.
The variable-length coding circuit 58 converts the image data (in this case, the data of the I picture) received from the quantizing circuit 57 into variable-length codes, such as Huffman codes, according to the quantizing step (scale) used for quantization by the quantizing circuit 57, and provides the variable-length codes to the transmission buffer memory 59.
The quantized DCT coefficients in the low-frequency range (DCT coefficients typically representing large power levels) are in the upper left-hand corner of the table of 8.times.8 DCT coefficients as shown in FIG. 7 due to the characteristics of DCT. Generally, a coefficient is coded in a combination of a run length of successive zeros (zero-run length) and a coefficient (level) by variable-length coding. A coding method using zero-run lengths and levels in combination is called a run length coding method. When coefficients are coded by a run length coding method, long zero-runs can be formed by transmitting the coefficients in zigzag scanning sequence as shown in FIG. 12, in which the numerals indicate the coefficients' sequence of transmission, so that the data can be compressed.
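The zigzag scan and run-length pairing can be sketched as follows (illustrative names; the mapping from (run, level) pairs to actual variable-length codewords is omitted):

```python
import numpy as np

def zigzag_order(n=8):
    """Zigzag scanning sequence for an n x n coefficient block, as in
    FIG. 12: diagonals of constant r+c, alternating direction."""
    return sorted(((r, c) for r in range(n) for c in range(n)),
                  key=lambda rc: (rc[0] + rc[1],
                                  rc[0] if (rc[0] + rc[1]) % 2 else -rc[0]))

def run_length(coeffs8x8):
    """Encode quantized coefficients as (zero-run, level) pairs in zigzag
    order; trailing zeros collapse into an end-of-block marker 'EOB'."""
    scan = [int(coeffs8x8[r, c]) for r, c in zigzag_order(8)]
    while scan and scan[-1] == 0:
        scan.pop()                       # trailing zeros -> just the EOB
    pairs, run = [], 0
    for v in scan:
        if v == 0:
            run += 1
        else:
            pairs.append((run, v))
            run = 0
    pairs.append('EOB')
    return pairs
```

Because the nonzero, large-power coefficients cluster in the upper left-hand corner, the zigzag scan tends to leave one long trailing run of zeros, which the EOB marker absorbs in a single code.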
The variable-length coding circuit 58 also variable length encodes the quantizing step (scale) provided by the quantizing circuit 57, the encoding mode (intraimage mode, forward prediction mode, backward prediction mode or bilateral prediction mode) selected by the prediction mode selecting circuit 54, the motion vector determined by the motion vector detecting circuit 50, the frame/field encoding flag set by the circuit 52, and the DCT flag (frame DCT mode flag or field DCT mode flag) set by the DCT mode switching circuit 55 together with the zigzag scanned quantized data.
After storing the transmission data temporarily, the transmission buffer 59 sends out the transmission data in a bit stream at a constant bit rate and controls the quantizing scale by sending a quantization control signal corresponding to the amount of the residual data for each macroblock to the quantizing circuit 57. The transmission buffer memory 59 thus regulates the amount of data sent out in a bit stream in order to hold an appropriate amount of data (amount of data that will not cause overflow or underflow) therein.
For example, upon an increase in the amount of the residual data held in the transmission buffer memory 59 to an upper limit, the transmission buffer memory 59 provides a quantization control signal to increase the quantizing scale to be used by the quantizing circuit 57 so that the amount of quantized data produced by the quantizing circuit 57 will be decreased. Upon a decrease in the amount of the residual data held in the transmission buffer memory 59 to a lower limit, the transmission buffer memory 59 provides a quantization control signal to decrease the quantizing scale to be used by the quantizing circuit 57 so that the amount of quantized data produced by the quantizing circuit 57 will be increased.
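The buffer feedback described above can be sketched as a simple threshold rule (the thresholds, step size and the quantizing scale limit of 31 are illustrative assumptions, not from the source):

```python
def update_quantizer_scale(scale, buffer_fullness, capacity,
                           upper=0.8, lower=0.2, step=2):
    """Buffer-feedback rate control: raise the quantizing scale (coarser
    quantization, less data produced) when the residual data nears the
    upper limit, and lower it (finer quantization, more data produced)
    near the lower limit."""
    fill = buffer_fullness / capacity
    if fill >= upper:
        scale = min(scale + step, 31)   # clamp at an assumed maximum scale
    elif fill <= lower:
        scale = max(scale - step, 1)
    return scale
```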
The output bit stream of the transmission buffer memory 59 is combined with a coded audio signal, synchronizing signals and the like to produce a multiplexed signal, an error correction code is added to the multiplexed signal, the multiplexed signal is subjected to predetermined modulation, and then the modulated multiplexed signal is recorded in pits on a master disk with a laser beam controlled according to the modulated multiplexed signal. A stamping disk for duplicating the master disk is formed by using the master disk to mass-produce records, such as optical disks.
The data of the I picture provided by the quantizing circuit 57 is inversely quantized by an inverse quantizing circuit 60 at a step provided by the quantizing circuit 57. The output of the inverse quantizing circuit 60 is subjected to IDCT (inverse DCT) in an IDCT circuit 61, and the output of the IDCT circuit 61 is provided to the converting circuit 65. The converting circuit 65 converts the input data from the IDCT circuit 61, according to the DCT flag provided by the DCT mode switching circuit 55 and the frame/field encoding flag provided by the circuit 52, into the frame encoding mode format (FIG. 10(A)) or the field encoding mode format (FIG. 10(B)) so that the converted data matches the predicted image data provided by the motion compensating circuit 64, and the converted data is then supplied to an adding circuit 62. Data provided by the adding circuit 62 is converted to the frame encoding mode format (FIG. 10(A)) according to the frame/field encoding flag by a converting circuit 66, and then the converted data is stored in a forward predicted image area 63a of a frame memory 63.
The frame memory 63 may be replaced by a field memory. When a field memory is used instead of the frame memory 63, the output data of the adding circuit 62 is converted into the field encoding mode format (FIG. 10(B)) by the converting circuit 66, because the data of each field is stored separately.
When sequentially processing input frames as, for example, I, B, P, B, P, B . . . pictures, the motion vector detecting circuit 50 processes the image data of the first input frame as an I picture, and then processes the image data of the third input frame as a P picture before processing the image data of the second input frame as a B picture, because the B picture requires backward prediction and cannot be decoded without the P picture, which is needed to produce a backward predicted image.
After processing the I picture, the motion vector detecting circuit 50 starts processing the image data of the P picture stored in the backward original image area 51c and, as mentioned above, the motion vector detecting circuit 50 supplies the sum of absolute values of the interframe differences (prediction errors), and the corresponding intraframe value, for each macroblock to the circuit 52 and the prediction mode selecting circuit 54. The circuit 52 and the prediction mode selecting circuit 54 set a frame/field encoding mode for each macroblock as intraimage encoding or forward prediction, according to the sum of absolute values of prediction errors (and the corresponding intraframe value) for each macroblock of the P picture.
When the intraframe encoding mode is set, the movable contact 53d of the circuit 53 is connected to the fixed contact a. Consequently, the data, similarly to the data of the I picture, is provided through the DCT mode switching circuit 55, the DCT circuit 56, the quantizing circuit 57, the variable-length coding circuit 58 and the transmitting buffer memory 59 to a transmission line. The data is also supplied through the inverse quantizing circuit 60, the IDCT circuit 61, the converting circuit 65, the adding circuit 62 and the converting circuit 66 to the backward predicted image area 63b of the frame memory 63 for storage.
When the forward prediction mode is set, the movable contact 53d of the circuit 53 is connected to the fixed contact b, and the motion compensating circuit 64 reads the data of the I picture from the forward predicted image area 63a of the frame memory 63 and executes motion compensation according to a motion vector provided by the motion vector detecting circuit 50. When the prediction mode selecting circuit 54 selects the forward prediction mode, the motion compensating circuit 64 shifts the read address from a position corresponding to the macroblock being provided by the motion vector detecting circuit 50 in the forward predicted image area 63a according to the motion vector, reads the data from the forward predicted image area 63a and produces predicted image data. The motion compensating circuit 64 arranges the predicted image data in either the frame or the field arrangement shown in FIG. 10(A) or 10(B) according to the frame/field encoding flag provided by the circuit 52.
The predicted image data provided by the motion compensating circuit 64 is provided to a subtracting circuit 53a. The circuit 53a subtracts the predicted image data of a macroblock given thereto by the motion compensating circuit 64 from the data of the corresponding macroblock of a reference original image provided by the circuit 52, and provides the resulting difference (prediction error) data through the DCT mode switching circuit 55, the DCT circuit 56, the quantizing circuit 57, the variable-length coding circuit 58 and the transmitting buffer memory 59 to the transmission line. The difference data is locally decoded by the inverse quantizing circuit 60, the IDCT circuit 61 and the converting circuit 65, and the locally decoded difference data is supplied to the adding circuit 62.
The predicted image data provided to the subtracting circuit 53a is supplied also to the adding circuit 62. The circuit 62 adds the predicted image data provided by the motion compensating circuit 64 to the difference data provided by the converting circuit 65 to reproduce the image data of the original (decoded) P picture. Since the image data of the P picture has been placed in one of the arrangements shown in FIGS. 10(A) and 10(B) by the circuit 52, a converting circuit 66 rearranges the image data, according to the frame/field encoding flag, into the frame encoding mode arrangement shown in FIG. 10(A) (or into the field encoding mode arrangement shown in FIG. 10(B) when the memory 63 is instead a field memory). The image data of the P picture is stored in the backward predicted image area 63b of the frame memory 63.
After the image data of the I picture and that of the P picture have been thus stored respectively in the forward predicted image area 63a and the backward predicted image area 63b, the motion vector detecting circuit 50 processes a B picture. The circuit 52 and the prediction mode selecting circuit 54 set either the frame encoding mode or the field encoding mode as described above for each macroblock, and the circuit 54 sets the intraframe encoding mode, the forward prediction mode, the backward prediction mode or the bilateral prediction mode.
As mentioned above, when the intraframe mode or the forward prediction mode is set, the movable contact 53d is connected to the fixed contact a or b, respectively, and then the same process as that carried out for the P picture is carried out and data is transmitted.
When the backward prediction mode or the bilateral prediction mode is set, the movable contact 53d is connected to the fixed contact c or d, respectively.
When the movable contact 53d is connected to the fixed contact c for the backward prediction mode, the image data of the P picture or I picture is read from the backward predicted image area 63b, and the image data is motion compensated by the circuit 64 according to a motion vector provided by the motion vector detecting circuit 50. When the backward prediction mode is set by the prediction mode selecting circuit 54, the motion compensating circuit 64 shifts the read address of the data in the backward predicted image area 63b based on the motion vector from a position corresponding to the position of a macroblock being provided by the motion vector detecting circuit 50, reads the data, produces predicted image data, and rearranges the data according to the frame/field encoding flag provided by the circuit 52.
The motion compensating circuit 64 supplies the predicted image data to a subtracting circuit 53b. The circuit 53b subtracts the predicted image data provided by the motion compensating circuit 64 from the data of the macroblock in the reference original image provided by the circuit 52 to obtain difference data representing the differences between the image data. The difference data is provided through the DCT mode switching circuit 55, the DCT circuit 56, the quantizing circuit 57, the variable-length coding circuit 58 and the transmitting buffer memory 59 to the transmission line.
When the movable contact 53d is connected to the fixed contact d in the bilateral prediction mode, the I or P picture data is read from the forward predicted image area 63a and the I or P picture data is read from the backward predicted image area 63b, and then the data of each image are motion compensated by the circuit 64 according to the motion vectors provided by the motion vector detecting circuit 50. When the prediction mode selecting circuit 54 sets the bilateral prediction mode, the motion compensating circuit 64 shifts the read addresses in the forward predicted image area 63a and the backward predicted image area 63b from positions corresponding to the position of the macroblock being provided by the motion vector detecting circuit 50 according to two motion vectors for the forward predicted image and the backward predicted image, respectively, reads data from the forward predicted image area 63a and the backward predicted image area 63b, and produces predicted image data. The predicted image data is rearranged according to the flag provided by the circuit 52.
The motion compensating circuit 64 supplies the predicted image data to a subtracting circuit 53c. The circuit 53c subtracts the mean of the predicted image data provided by the motion compensating circuit 64 from the data of the macroblock of the reference original image provided by the circuit 52 to provide difference data through the DCT mode switching circuit 55, the DCT circuit 56, the quantizing circuit 57, the variable-length coding circuit 58 and the transmitting buffer memory 59 to the transmission line.
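The bilateral case (subtracting the mean of the forward and backward predictions from the reference macroblock) can be sketched as follows. The integer averaging convention and the function name are assumptions; the text does not specify the rounding behavior of the circuit.

```python
import numpy as np

def bilateral_prediction_error(ref_mb, fwd_pred, bwd_pred):
    """Subtract the mean of the forward and backward predicted macroblocks
    from the reference macroblock, as done in the bilateral prediction mode.
    Truncating integer averaging is assumed here for illustration."""
    mean_pred = (fwd_pred.astype(np.int32) + bwd_pred.astype(np.int32)) // 2
    return ref_mb.astype(np.int32) - mean_pred
```

When the reference macroblock lies exactly between the two predictions, the difference data is zero and only the motion vectors need be transmitted for that macroblock.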
The image of the B picture is not stored in the frame memory 63 because the same is not used for forming predicted images.
When necessary, the banks of the forward predicted image area 63a and the backward predicted image area 63b of the frame memory 63 can be changed to provide the stored data for producing a forward predicted image and a backward predicted image, respectively, of a specified reference original image.
Although the encoder 7 has been explained as applied mainly to processing the luminance blocks, the macroblocks of the color difference blocks as shown in FIGS. 10(A) and 10(B), 11(A) and 11(B) can be similarly processed and transmitted. A motion vector for processing the color difference block is one half the motion vector of the corresponding luminance block with respect to both the vertical direction and the horizontal direction.
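The halving of the luminance motion vector for the color difference blocks can be sketched as follows; the function name and the tuple representation are illustrative, and the handling of any resulting half-pixel positions is not addressed here.

```python
def chroma_motion_vector(luma_mv):
    """Halve a luminance motion vector in both the vertical and the
    horizontal direction for processing the corresponding color
    difference (chroma) block, which covers the same picture area with
    half as many samples in each direction."""
    dx, dy = luma_mv
    return (dx / 2, dy / 2)
```

For example, a luminance motion vector of (6, -4) gives a color difference motion vector of (3.0, -2.0).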
The decoder 9 will be described hereinafter with reference to FIG. 13. An input bit stream representing image data stored in the recording medium 8 of FIG. 1, such as an optical disk, is provided to the decoder 9. The input bit stream is transmitted through a receiving buffer memory 70 to a variable-length decoding circuit (IVLC) 71 for decoding to obtain quantized data (DCT coefficients), motion vectors, a prediction flag, a DCT flag and a quantization scale. The data (DCT coefficients) provided by the variable-length decoding circuit 71 is inversely quantized by an inverse quantizing circuit 72 to provide representative data. The step size of the inverse quantization is regulated according to the quantization scale provided by the variable-length decoding circuit 71.
For each block, an 8×8 block of quantized representative values (DCT coefficients) is provided by the inverse quantizing circuit 72. An IDCT circuit 73 inverse transforms the 8×8 coefficient blocks to obtain corresponding blocks each having 8×8 pixels. The output of the IDCT circuit 73 is rearranged, according to a DCT flag and a prediction flag provided by the variable-length decoding circuit 71, by a converting circuit 77 into an arrangement coinciding with the arrangement of data provided by a motion compensating circuit 76. The output of the converting circuit 77 is supplied to an arithmetic unit 74.
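The 8×8 transform pair used on the encoder and decoder sides can be illustrated with an orthonormal type-II DCT matrix. This is a sketch of the mathematical transform only, not of the circuits 72 and 73, and omits quantization.

```python
import numpy as np

def dct_matrix(n=8):
    """Orthonormal type-II DCT matrix (row k, column i)."""
    k = np.arange(n)[:, None]
    i = np.arange(n)[None, :]
    c = np.sqrt(2.0 / n) * np.cos((2 * i + 1) * k * np.pi / (2 * n))
    c[0, :] = np.sqrt(1.0 / n)  # DC row normalization
    return c

C = dct_matrix(8)

def dct2(block):
    """Forward 8x8 DCT of a pixel block, as applied on the encoder side."""
    return C @ block @ C.T

def idct2(coeffs):
    """Inverse 8x8 DCT of a coefficient block, as performed by the IDCT
    circuit 73, recovering an 8x8 pixel block."""
    return C.T @ coeffs @ C
```

Without the intervening quantization step, `idct2(dct2(x))` recovers the pixel block `x` exactly (up to floating-point precision), since the matrix is orthonormal.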
When image data representing an I picture is supplied to an adding circuit 74, the adding circuit 74 gives the image data to a converting circuit 78, which in turn rearranges the image data, according to the frame/field encoding flag provided by the variable-length decoding circuit 71, into the frame encoding mode arrangement shown in FIG. 10(A) (or into the field encoding mode arrangement shown in FIG. 10(B) when a field memory is used instead of a frame memory 75), and stores the image data in the forward predicted image area 75a of the frame memory 75 for use in producing predicted image data for the next image received by the adding circuit 74. The image data is also supplied to a format conversion circuit 10 (FIG. 1).
When image data for a forward predicted P picture produced from the image data of the preceding frame is supplied to the circuit 74, the image data representing the preceding frame, an I picture, is read from the forward predicted image area 75a of the frame memory 75, subjected to motion compensation in a motion compensating circuit 76 according to a motion vector provided by the variable-length decoding circuit 71, and processed according to the encoding mode for the respective macroblock, the image data being rearranged as shown in FIG. 10(A) or FIG. 10(B) according to the frame/field encoding flag. The circuit 74 adds the rearranged data and image data (difference data) provided by the converting circuit 77. The added data, i.e., the decoded image data of the P picture, is stored in the backward predicted image area 75b of the frame memory 75 for producing predicted image data for the next image received by the adding circuit 74 (that is, the image data of either a B or P picture).
When the image data representing a P picture has been intraimage encoded, the image data, similarly to that of an I picture, is not processed by the circuit 74 but is stored in the backward predicted image area 75b as it is.
The P picture timewise succeeds a respective B picture, but is processed and transmitted before the B picture. Since the P picture is to be displayed after the B picture which is next received, the image data representing the P picture is not provided to the format converting circuit 10 at this time.
When the image data representing the next B picture is provided by the converting circuit 77, depending on the prediction mode signal supplied for each macroblock by the variable-length decoding circuit 71, either the image data of the I picture stored in the forward predicted image area 75a of the frame memory 75 (in the forward prediction mode) or the image data of the P picture stored in the backward predicted image area 75b (in the backward prediction mode) is read, or both images (in the bilateral prediction mode) are read. The image data read from memory is processed by the motion compensating circuit 76 according to the motion vector provided by the variable-length decoding circuit 71 and the motion-compensated image data is rearranged according to the frame/field encoding flag to produce a predicted image. When motion compensation is unnecessary (intraimage mode), a predicted image is not produced.
The arithmetic unit 74 adds the output of the converting circuit 77 and the motion-compensated image data, and the sum produced by the circuit 74 is supplied to the format converting circuit 10 after restoration to line-sequential format according to the frame/field flag by the converting circuit 78.
Since the data supplied by circuit 74 is a B picture and is not used for predicting other images, this output is not stored in the frame memory 75.
After the image of the B picture has been provided, the image data of the P picture is read from the backward predicted image area 75b and provided through the motion compensating circuit 76 (without motion compensation), the circuit 74 and the converting circuit 78 (without rearrangement).
The color difference signals can be processed in a similar manner. When processing the color difference signals, a motion vector which is one-half the motion vector used for processing the luminance signal both with respect to the vertical direction and the horizontal direction, is used.
The image thus reproduced is subjected to D/A conversion to obtain a decoded high-resolution HDTV signal.
The decoder 9 shown in FIG. 13 is provided with a configuration comprising an inverse quantizing circuit 81, a converting circuit 89 and the associated circuits for obtaining a decoded quarter-resolution image (that is, a standard TV image) in addition to the configuration for obtaining the decoded HDTV signal. The inverse quantizing circuit 81 obtains representative data by the inverse quantization of the data provided by the variable-length decoding circuit 71 according to a quantizing scale provided by the variable-length decoding circuit 71 and supplies the representative data to a selecting circuit 82.
The selecting circuit 82 selects the 4×4 DCT coefficients shown in FIG. 8 from the 8×8 DCT coefficients shown in FIG. 7 to obtain quarter-resolution 4×4 DCT coefficient groups by thinning out or reducing the 8×8 pixel data in both the vertical direction and the horizontal direction.
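The selection can be sketched as taking the top-left (lowest-frequency) quadrant of each coefficient block. The function name is illustrative, and any amplitude rescaling that may be needed before the 4×4 IDCT is omitted here.

```python
import numpy as np

def select_low_frequency(coeffs_8x8):
    """Keep only the top-left 4x4 (lowest-frequency) DCT coefficients of
    an 8x8 coefficient block; a 4x4 IDCT of the result yields a
    quarter-resolution block (half size in each direction)."""
    return np.asarray(coeffs_8x8)[:4, :4].copy()
```

Discarding the twelve high-frequency sub-blocks of coefficients in this way halves the resolution both vertically and horizontally, giving the quarter-resolution (standard TV) image.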
An IDCT circuit 83 inverse transforms the input 4×4 DCT coefficients and provides the recovered data to the converting circuit 88. The converting circuit 88 rearranges the data according to the DCT flag and the frame/field encoding flag provided by the variable-length decoding circuit 71 so that the arrangement of the data coincides with the arrangement of a predicted image provided by the motion compensating circuit 86. The data rearranged by the converting circuit 88 is provided to an adding circuit 84. The motion compensating circuit 86 changes the banks of a frame memory 85 and motion-compensates the data stored in the frame memory 85 according to a prediction mode and a motion vector provided by the variable-length decoding circuit 71, and rearranges the image data according to the frame/field encoding flag as shown in the indicated one of FIG. 10(A) or FIG. 10(B) to produce predicted image data.
The data provided by the motion compensating circuit 86 is added to the output of the converting circuit 88 by the circuit 84. The data is rearranged for the frame encoding mode (or for the field encoding mode when a field memory is used instead of the frame memory 85) according to the frame/field encoding flag by the converting circuit 89 to provide standard TV image data. Since motion in the standard TV image is about half that in the corresponding HDTV image, the motion vector provided by the variable-length decoding circuit 71 is reduced by half in a scaling circuit 87 and the reduced motion vector is supplied to the motion compensating circuit 86.
When the prediction mode selecting circuit 54 of the encoder 7 (FIG. 9) selects the frame encoding mode for the encoder 7, the decoder 9 is set to decode data encoded in the frame encoding mode. When the field encoding mode is selected for the encoder 7, the decoder 9 is correspondingly set. Thus, when the motion compensating circuit 64 of the encoder 7 (as well as the motion compensating circuit 76 of the decoder 9) produces a predicted image in the frame encoding mode, the motion compensating circuit 86 of the decoder 9 produces a predicted image in the frame prediction mode in cooperation with the scaling circuit 87 and, when the motion compensating circuit 64 (as well as the motion compensating circuit 76) produces a predicted image in the field prediction mode, the motion compensating circuit 86 produces a predicted image corresponding to the field encoding mode in cooperation with the scaling circuit 87.
The foregoing conventional image signal decoder uses, for motion compensation when producing a low-resolution image, motion vectors obtained simply by reducing, by means of the scaling circuit 87, the motion vectors provided for decoding high-resolution image data. Accordingly, the data of a predicted macroblock produced by the motion compensating circuit 86 does not coincide exactly with the data of the corresponding macroblock that would be obtained by DCT of the data motion-compensated by the motion compensating circuit 64 of the encoder 7 (or by the motion compensating circuit 76 of the decoder 9), followed by extraction of the low-frequency components and IDCT of those low-frequency components. Therefore, drift occurs between the data produced by the motion compensating circuit 86 and the data obtained by the encoder 7, producing a mismatching error. Consequently, the data obtained by processing the output of the motion compensating circuit 86 in the adding circuit 84, i.e., the quarter-resolution SDTV image data, does not coincide exactly with the original HDTV image data.
If the interval between intrapictures is increased, the interval between successive mismatching error resetting operations is also increased; the mismatching errors then accumulate in the frame memory 85, and the drift appears as noise on the screen, degrading the quality of the picture.