1. Field of the Invention
The present invention relates to content supplying apparatuses and methods and to recording media. More specifically, the present invention relates to a content supplying apparatus and method and to a recording medium, which are suitable for recording moving-picture signals on a recording medium, such as a magneto-optical disk or a magnetic tape, reading the signals, and displaying the signals on a display; suitable for transmitting moving-picture signals from a transmitting side to a receiving side through a transmission channel, in which the receiving side receives and displays the signals in the same way as in a teleconferencing system, a videophone system, a broadcasting system, a multimedia database search system, and the like; and suitable for editing and recording moving-picture signals.
2. Description of the Related Art
In a system such as a teleconferencing system or a videophone system for transmitting moving-picture signals to a remote place, image signals are compressed and encoded using line correlation between video signals and inter-frame correlation in order to efficiently make use of a transmission channel.
The Moving Picture Experts Group (MPEG) system, which is a coding system for stored moving pictures, is a typical high-efficiency coding system for moving pictures. This system has been discussed in ISO-IEC/JTC1/SC2/WG11 and has been proposed as a draft standard. It is a hybrid system combining motion-compensated predictive coding and discrete cosine transform (DCT) coding.
In MPEG, a few profiles and levels are defined in order to serve various applications and functions. The most elementary profile/level is the main profile at main level (MP@ML).
Referring to FIG. 1, an example of the structure of an encoder conforming to the main profile at main level (MP@ML) using the MPEG system is described.
An input image signal is input to a frame memory group 1 and is encoded in a predetermined order.
Image data to be encoded is input to a motion vector detecting circuit 2 in macroblock units. The motion vector detecting circuit 2 processes the image data of each frame as an I picture, a P picture, or a B picture in accordance with a predetermined sequence. That is, the picture type (I, P, or B) used to process each sequentially input frame is determined in advance (for example, the images are processed in the order I, B, P, B, P, . . . B, P).
The motion vector detecting circuit 2 refers to a predetermined reference frame and performs motion compensation to detect a motion vector. The motion compensation (inter-frame prediction) includes three modes, namely, forward prediction, backward prediction, and bidirectional prediction. P pictures employ only the forward prediction mode. For B pictures, all three prediction modes are available. The motion vector detecting circuit 2 selects the prediction mode that minimizes the prediction error and outputs the selected prediction mode.
At the same time, the prediction error is compared with, for example, the variance of the macroblock to be encoded. When the macroblock variance is smaller than the prediction error, no prediction is performed for that macroblock; instead, intra-frame coding is performed, and the intra-image coding prediction mode (intra) is used. The motion vector and the prediction mode are input to a variable-length coding circuit 6 and a motion compensation circuit 12.
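The mode decision described above can be illustrated with a minimal sketch. It is not the actual algorithm of the motion vector detecting circuit 2. The function names, the flat-list macroblock representation, the use of squared error as the prediction error, and the use of the sum of squared deviations from the mean as the "variance" are all assumptions made for illustration.

```python
def sum_squared_error(block, prediction):
    """Prediction error between a macroblock and a candidate prediction."""
    return sum((b - p) ** 2 for b, p in zip(block, prediction))


def variance_energy(block):
    """Sum of squared deviations from the block mean (assumed variance measure)."""
    mean = sum(block) / len(block)
    return sum((b - mean) ** 2 for b in block)


def select_prediction_mode(block, forward=None, backward=None):
    """Return (mode, error). Pass only `forward` for a P picture,
    both predictions for a B picture, and neither for an I picture."""
    candidates = {}
    if forward is not None:
        candidates["forward"] = forward
    if backward is not None:
        candidates["backward"] = backward
    if forward is not None and backward is not None:
        # Bidirectional prediction is assumed to be the average of both.
        candidates["bidirectional"] = [
            (f + b) / 2 for f, b in zip(forward, backward)
        ]
    if not candidates:
        return "intra", None

    errors = {mode: sum_squared_error(block, pred)
              for mode, pred in candidates.items()}
    best = min(errors, key=errors.get)

    # Fall back to intra coding when the block itself varies less
    # than the best prediction error.
    if variance_energy(block) < errors[best]:
        return "intra", None
    return best, errors[best]
```

For example, a block that is nearly identical to its forward prediction selects the forward mode, whereas a perfectly flat block with a poor prediction falls back to intra coding.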
The motion compensation circuit 12 generates a prediction image based on a predetermined motion vector and inputs the prediction image to an arithmetic circuit 3. The arithmetic circuit 3 outputs a differential signal between the value of the macroblock to be encoded and the value of the prediction image to a DCT circuit 4. In the case of an intra macroblock, the arithmetic circuit 3 directly outputs the signal of the macroblock to be encoded to the DCT circuit 4.
The DCT circuit 4 performs a discrete cosine transform (DCT) of the input data and converts it into DCT coefficients. The DCT coefficients are input to a quantization circuit 5 and are quantized using a quantization step corresponding to a data storage amount (buffer storage amount) of a transmission buffer 7. The quantized data is input to the variable-length coding circuit 6.
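The transform and quantization steps can be sketched as follows. This is an illustrative sketch only: it uses a direct (slow) orthonormal 2-D DCT-II and a single uniform quantization step for every coefficient, whereas a real MPEG encoder also applies a quantization matrix. The function names `dct_2d` and `quantize` are hypothetical.

```python
import math


def dct_2d(block):
    """Orthonormal 2-D DCT-II of a square block given as a list of rows."""
    n = len(block)

    def scale(k):
        return math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)

    coeffs = [[0.0] * n for _ in range(n)]
    for u in range(n):
        for v in range(n):
            total = 0.0
            for x in range(n):
                for y in range(n):
                    total += (block[x][y]
                              * math.cos((2 * x + 1) * u * math.pi / (2 * n))
                              * math.cos((2 * y + 1) * v * math.pi / (2 * n)))
            coeffs[u][v] = scale(u) * scale(v) * total
    return coeffs


def quantize(coeffs, step):
    """Uniform quantization: a larger step yields smaller levels and fewer bits."""
    return [[int(round(c / step)) for c in row] for row in coeffs]


# A flat 8x8 block has energy only in the DC coefficient.
flat = [[16] * 8 for _ in range(8)]
levels = quantize(dct_2d(flat), step=16)
```

Doubling the step in this sketch halves the DC level of the flat block, which is the mechanism by which the buffer-controlled quantization step trades image quality against the amount of generated data.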
The variable-length coding circuit 6 converts image data (in this example, I-picture data) supplied from the quantization circuit 5 into a variable-length code, such as a Huffman code or the like, in accordance with the quantization step (scale) supplied from the quantization circuit 5, and the variable-length coding circuit 6 outputs the variable-length code to the transmission buffer 7.
The quantization step (scale) is input to the variable-length coding circuit 6 from the quantization circuit 5. Also a prediction mode (mode indicating which one of intra-image prediction, forward prediction, backward prediction, and bidirectional prediction has been set) and the motion vector are input from the motion vector detecting circuit 2 to the variable-length coding circuit 6. These data are also variable-length coded.
The transmission buffer 7 temporarily stores the input data and outputs data corresponding to the stored amount to the quantization circuit 5.
When the residual amount of data increases to an upper allowable limit, the transmission buffer 7 enlarges the quantization scale of the quantization circuit 5 using a quantization control signal, thus reducing the amount of quantized data. In contrast, when the residual amount of data decreases to a lower allowable limit, the transmission buffer 7 reduces the quantization scale of the quantization circuit 5 using the quantization control signal, thereby increasing the amount of quantized data. In this way, overflow or underflow of the transmission buffer 7 is prevented.
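The buffer feedback described above might be sketched as a simple threshold rule. The function name, the adjustment of one unit per update, and the scale range of 1 to 31 are assumptions made for illustration; the source does not specify how much the scale changes at each step.

```python
def update_quantization_scale(scale, buffer_fullness, upper_limit, lower_limit,
                              min_scale=1, max_scale=31):
    """Return the next quantization scale given the buffer occupancy.

    A fuller buffer calls for coarser quantization (fewer bits);
    an emptier buffer calls for finer quantization (more bits).
    """
    if buffer_fullness >= upper_limit:
        return min(scale + 1, max_scale)   # coarser: guard against overflow
    if buffer_fullness <= lower_limit:
        return max(scale - 1, min_scale)   # finer: guard against underflow
    return scale                           # within limits: leave unchanged
```

Calling this once per macroblock or per picture with the current occupancy of the transmission buffer keeps the occupancy between the two limits, provided the scale range is wide enough for the content.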
The data stored in the transmission buffer 7 is read at a predetermined time and is output to a transmission channel.
The data output from the quantization circuit 5 is input to a dequantization circuit 8 and is dequantized in accordance with the quantization step supplied from the quantization circuit 5. The output from the dequantization circuit 8 is input to an inverse discrete cosine transform (IDCT) circuit 9, is inverse-DCT processed, and is in turn stored in a frame memory group 11 via an arithmetic unit 10.
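The local decoding path can be sketched as follows. To keep the example short, the DCT and inverse DCT are omitted and quantization is applied directly to the pixel-domain residual, which is a simplification and not what the circuits above do. All names are hypothetical. The point illustrated is that the reference stored by the encoder is built from the dequantized data, so that it matches what a decoder can reconstruct from the same quantized levels.

```python
def quantize(values, step):
    return [int(round(v / step)) for v in values]


def dequantize(levels, step):
    return [level * step for level in levels]


def encode_and_reconstruct(block, prediction, step):
    """Return the quantized levels and the locally decoded block."""
    # Differential signal between the block and its prediction.
    residual = [b - p for b, p in zip(block, prediction)]
    levels = quantize(residual, step)

    # Local decoding: dequantize and add the prediction back.
    decoded_residual = dequantize(levels, step)
    reconstructed = [p + r for p, r in zip(prediction, decoded_residual)]
    return levels, reconstructed


levels, reference = encode_and_reconstruct([101, 105, 97], [98, 98, 98], step=4)
```

In this sketch `reference`, not the original block, is what would be stored in the frame memory for predicting later pictures.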
Referring to FIG. 2, an example of the structure of a decoder at MP@ML in MPEG is described. Coded image data transmitted through a transmission channel is received by a receiving circuit (not shown) or is read by a reading unit. The data is temporarily stored in a reception buffer 31, and then is supplied to a variable-length decoding circuit 32. The variable-length decoding circuit 32 performs variable-length decoding of the data supplied from the reception buffer 31 and outputs the motion vector and the prediction mode to a motion compensation circuit 37 and outputs the quantization step to a dequantization circuit 33. In addition, the variable-length decoding circuit 32 outputs the decoded image data to the dequantization circuit 33.
The dequantization circuit 33 dequantizes the image data supplied from the variable-length decoding circuit 32 in accordance with the quantization step supplied from the variable-length decoding circuit 32 and outputs the data to an IDCT circuit 34. The data (DCT coefficients) output from the dequantization circuit 33 are inverse-DCT processed by the IDCT circuit 34 and are supplied to an arithmetic unit 35.
When the image data supplied from the IDCT circuit 34 is I-picture data, the data is output directly from the arithmetic unit 35 as a reproduced image. In order to generate prediction-image data for image data (P or B-picture data) input thereafter to the arithmetic unit 35, the image data is also supplied to a frame memory group 36 and is stored in the frame memory group 36.
When an input bit stream is a P or B picture, the motion compensation circuit 37 generates a prediction image in accordance with the motion vector and the prediction mode, which are supplied from the variable-length decoding circuit 32, and outputs the prediction image to the arithmetic unit 35. The arithmetic unit 35 adds the image data input from the IDCT circuit 34 and the prediction-image data supplied from the motion compensation circuit 37 and outputs the resulting image. When the input bit stream is a P picture, the output from the arithmetic unit 35 is input to the frame memory group 36 and is stored in the frame memory group 36, so that the data can be used as a reference image for subsequent image signals to be decoded.
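The reconstruction performed by the arithmetic unit 35 and the frame memory group 36 can be sketched as below. The function name and the list-based frame memory are hypothetical, and the sketch assumes that B pictures are not stored as reference images, which the text above implies but does not state outright.

```python
def decode_picture(picture_type, idct_output, prediction, frame_memory):
    """Reconstruct one picture and update the reference frame memory.

    picture_type: "I", "P", or "B"
    idct_output:  samples from the IDCT stage (pixels for I, residual otherwise)
    prediction:   motion-compensated prediction (ignored for I pictures)
    frame_memory: list of reference pictures, updated in place
    """
    if picture_type == "I":
        image = list(idct_output)          # output as is
    else:
        # P or B: add the prediction image to the decoded residual.
        image = [r + p for r, p in zip(idct_output, prediction)]

    if picture_type in ("I", "P"):
        frame_memory.append(image)         # keep as a future reference
    return image


memory = []
decode_picture("I", [10, 20], None, memory)
decode_picture("P", [1, -1], memory[-1], memory)
decode_picture("B", [0, 2], memory[0], memory)
```

After these three calls the memory holds the I and P pictures only; a practical decoder would also discard references that are no longer needed.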
In MPEG, various profiles and levels, other than MP@ML, are defined. Also, various tools are prepared. Scalability is one of the tools in MPEG.
In MPEG, a scalable coding system for implementing scalability corresponding to different image sizes and frame rates is introduced. For example, in the case of space scalability, when only decoding a bit stream at a lower layer, an image signal of a small image size is decoded. When decoding a bit stream at a lower layer and an upper layer, an image signal of a large image size is decoded.
Referring to FIG. 3, an encoder for space scalability is described. In the case of space scalability, a lower layer corresponds to an image signal of a small image size, and an upper layer corresponds to an image signal of a large image size.
An image signal at a lower layer is input to the frame memory group 1 and is encoded as in MP@ML. The output from the arithmetic unit 10 is supplied to the frame memory group 11. The output is used not only as a prediction reference image for the lower layer but also, after being enlarged by an image enlarging circuit 41 to the image size at the upper layer, as a prediction reference image for the upper layer.
An image signal at an upper layer is input to a frame memory group 51. A motion vector detecting circuit 52 determines a motion vector and a prediction mode, as in MP@ML.
A motion compensation circuit 62 generates a prediction image in accordance with the motion vector and the prediction mode determined by the motion vector detecting circuit 52 and outputs the prediction image to a weighting circuit 44. The weighting circuit 44 multiplies the prediction image by a weight (coefficient) W and outputs the product to an arithmetic unit 43.
As described above, the output from the arithmetic unit 10 is input to the frame memory group 11 and the image enlarging circuit 41. The image enlarging circuit 41 enlarges the image signal generated by the arithmetic unit 10 to the image size at the upper layer and outputs the enlarged image signal to a weighting circuit 42. The weighting circuit 42 multiplies the output from the image enlarging circuit 41 by a weight (1-W) and outputs the product to the arithmetic unit 43.
The arithmetic unit 43 adds the outputs from the weighting circuits 42 and 44 and outputs the sum as a prediction image to an arithmetic unit 53. The output from the arithmetic unit 43 is also input to an arithmetic unit 60 and is added to the output from an IDCT circuit 59. Subsequently, the sum is input to a frame memory group 61 and is used as a prediction reference frame for subsequent image signals to be encoded.
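The combination performed by the weighting circuits 42 and 44 and the arithmetic unit 43 amounts to W times the upper-layer prediction plus (1-W) times the enlarged lower-layer image. A minimal sketch follows. The nearest-neighbour enlargement and the fixed factor of two are assumptions made for illustration, since the source does not specify the enlargement method or ratio, and the function names are hypothetical.

```python
def enlarge_2x(image):
    """Nearest-neighbour 2x enlargement of an image given as a list of rows."""
    enlarged = []
    for row in image:
        wide = [pixel for pixel in row for _ in range(2)]
        enlarged.append(wide)
        enlarged.append(list(wide))
    return enlarged


def weighted_prediction(upper_prediction, lower_enlarged, w):
    """Blend the two predictions: w * upper + (1 - w) * enlarged lower."""
    return [
        [w * u + (1 - w) * l for u, l in zip(upper_row, lower_row)]
        for upper_row, lower_row in zip(upper_prediction, lower_enlarged)
    ]


lower = [[30, 40]]                       # decoded lower-layer image
upper_pred = [[10, 10, 20, 20],          # motion-compensated upper-layer prediction
              [10, 10, 20, 20]]
prediction = weighted_prediction(upper_pred, enlarge_2x(lower), w=0.5)
```

With W set to 1 the prediction uses only the upper layer, and with W set to 0 it uses only the enlarged lower-layer image.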
The arithmetic unit 53 computes the difference between the image signal to be encoded and the output from the arithmetic unit 43 and outputs the difference. In the case of an intra-frame coded macroblock, the arithmetic unit 53 directly outputs the image signal to be encoded to a DCT circuit 54.
The DCT circuit 54 performs a discrete cosine transform of the output from the arithmetic unit 53, generates DCT coefficients, and outputs the DCT coefficients to a quantization circuit 55. As in MP@ML, the quantization circuit 55 quantizes the DCT coefficients in accordance with a quantization scale determined based on the data storage amount of a transmission buffer 57 or the like and outputs the quantized DCT coefficients to a variable-length coding circuit 56. The variable-length coding circuit 56 performs variable-length coding of the quantized DCT coefficients and outputs the result as a bit stream at an upper layer via the transmission buffer 57.
The output from the quantization circuit 55 is dequantized by a dequantization circuit 58 in accordance with the quantization scale used by the quantization circuit 55. The IDCT circuit 59 performs the inverse discrete cosine transform of the dequantized result, and the transformed result is in turn input to the arithmetic unit 60. The arithmetic unit 60 adds the outputs from the arithmetic unit 43 and the IDCT circuit 59 and inputs the sum to the frame memory group 61.
The motion vector and the prediction mode detected by the motion vector detecting circuit 52, the quantization scale used by the quantization circuit 55, and the weight W used by the weighting circuits 42 and 44 are input to the variable-length coding circuit 56, and are all encoded and transmitted.
In conventional moving-picture encoders and decoders, it is assumed that the units are in one-to-one correspondence. For example, in a teleconferencing system, a transmitting side and a receiving side are always in one-to-one correspondence. Processing capacities and specifications of a transmitting terminal and a receiving terminal are determined in advance. In storage media such as DVDs and the like, the specifications and processing capacities of a decoder are strictly determined in advance, and an encoder encodes moving-picture signals on the assumption that only a decoder satisfying the specifications will be used. When the encoder encodes image signals so that a decoder conforming to the predetermined specifications can achieve optimal image quality, it is always possible to transmit images having optimal image quality.
However, when transmitting moving pictures over a transmission channel, such as the Internet, whose transmission capacity varies with time or path, or when transmitting moving pictures to an unspecified number of receiving terminals whose specifications are not determined in advance and which have various processing capacities, it is difficult to know what the optimal image quality is. Hence, it is difficult to transmit moving pictures efficiently.
Since the specifications of terminals are not unique, coding systems for encoders and decoders may differ from one terminal to another. In such cases, it is necessary to efficiently convert a coded bit stream into a predetermined format. However, an optimal converting method has not yet been established.