1. Field of the Invention
The present invention relates to an image conversion apparatus for converting the spatial resolution, temporal resolution, or image quality of compressed image data to obtain compressed image data of a different spatial resolution, temporal resolution, or image quality, for use in the transmission or database storage of compressed images.
2. Description of the Prior Art
The massive volume of information carried by digital image signals makes high efficiency coding essential for transmitting or recording digital image signals. Various image compression techniques have therefore been proposed in recent years, and some of these techniques are also scalable. Scalability makes it possible for the user to use images at any desired spatial resolution, temporal resolution, or image quality, and thus enables, for example, both HDTV and standard TV signals to be alternatively received from a single transmission path in response to user requests.
An image coding apparatus (including image conversion apparatuses) using MPEG, a conventional scalable image coding method, is described below with reference to FIGS. 1A and 1B, which are block diagrams of a conventional MPEG-compatible image coding apparatus. The image coding apparatus comprises a first image coder 7 shown in FIG. 1A, and a second image coder 8 shown in FIG. 1B. The image data size and number of pixels that can be processed by the first image coder 7 differ from those of the second image coder 8. Note that in FIGS. 1A and 1B the signal lines indicated by solid lines are video signal lines, and the signal lines indicated by dotted lines carry signals other than the video signal (including the so-called "side information" described below).
The video (image) signal is an interlace-scanned image input in frame units. The video signal input to input terminal 70 in FIG. 1A is an uncompressed digital video signal. The input image is first applied to the first resolution converter 91 shown in FIG. 1B, and converted to an image of half the resolution (number of pixels) both vertically and horizontally. The first frame coded is intraframe coded without obtaining any frame difference value. The input image data is passed through the differencer and applied to the DCT mode evaluator 82 and DCT processor 83. The DCT mode evaluator 82 detects the amount of motion in the image and determines whether frame or field unit DCT should be used by, for example, obtaining the inter-line difference in two-dimensional block units. The result of this evaluation is input to the DCT processor 83 as the DCT mode information.
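The inter-line difference heuristic used by the DCT mode evaluator can be sketched as follows. This is only an illustrative interpretation: the block representation, function name, and tie-breaking rule are assumptions, not taken from the apparatus itself.

```python
def dct_mode(block):
    """Choose frame- or field-unit DCT for one two-dimensional block of an
    interlaced image, using inter-line differences (illustrative sketch)."""
    # Sum of absolute differences between vertically adjacent lines when
    # the two fields remain interleaved (frame structure).
    frame_diff = sum(
        abs(a - b)
        for upper, lower in zip(block, block[1:])
        for a, b in zip(upper, lower)
    )
    # Sum of absolute differences between adjacent lines of the SAME field
    # (even lines 0, 2, 4, ... and odd lines 1, 3, 5, ... taken separately).
    field_diff = sum(
        abs(a - b)
        for field in (block[0::2], block[1::2])
        for upper, lower in zip(field, field[1:])
        for a, b in zip(upper, lower)
    )
    # Large motion raises frame_diff relative to field_diff, in which case
    # field-unit DCT is selected.
    return "frame" if frame_diff <= field_diff else "field"
```

For a static block the interleaved lines correlate well and frame-unit DCT is chosen; when the two fields differ strongly (large motion), field-unit DCT is chosen.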
The DCT processor 83 applies a DCT in either frame or field units based on the DCT mode information, and converts the image data to conversion coefficients. Quantizer 84 quantizes the conversion coefficients supplied from the DCT processor 83, and outputs the result to the variable length coder 85 and inverse quantizer 86. The variable length coder 85 variable length codes the quantized signal, and outputs the result through the multiplexer 93 shown in FIG. 1A to the transmission path. The quantized conversion coefficients are also inverse quantized by the inverse quantizer 86, and input to the inverse DCT processor 87. The inverse DCT processor 87 restores the input data to real image data, and stores the restored image data to the frame buffer 88.
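The quantizer 84 and inverse quantizer 86 pair can be illustrated by the following minimal sketch. A uniform step size is assumed purely for illustration; the actual quantizer of the apparatus is not specified at this level of detail.

```python
def quantize(coeffs, step):
    # Uniform quantization of DCT conversion coefficients to integer levels
    # (step size is an assumed parameter for this sketch).
    return [round(c / step) for c in coeffs]

def inverse_quantize(levels, step):
    # Reconstruction; the difference from the original coefficients is the
    # quantization noise referred to in the text.
    return [level * step for level in levels]
```

Note that the reconstruction error per coefficient is bounded by half the step size, which is why a coarser step (larger noise) yields a more compact code.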
Because there is usually a high correlation within images, energy is concentrated in the conversion coefficients corresponding to the low frequency components after the DCT is applied. As a result, by quantizing the high frequency components, to which the human visual system (HVS) is less sensitive, with large quantization noise, and quantizing the more important low frequency components with minimal quantization noise, image deterioration can be minimized and the compressed image data size can be made smaller (higher coding efficiency). In addition, the intra-frame correlation is high when there is little motion in interlaced images; when motion is large, the intra-frame correlation is low but the intra-field correlation is high. It follows that interlaced images can also be efficiently coded by using this characteristic of interlaced images to appropriately switch between frame and field unit DCT processing.
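The frequency-dependent quantization described above can be sketched with a step size that grows with spatial frequency. The linear weighting below is an illustrative assumption, not an actual MPEG quantization matrix.

```python
def weighted_quantize(dct_block, base_step=4):
    """Quantize a square block of DCT coefficients with a step size that
    grows with spatial frequency (illustrative weighting, not MPEG's)."""
    n = len(dct_block)
    out = []
    for u in range(n):
        row = []
        for v in range(n):
            # Step size grows with frequency index (u + v), so low-frequency
            # coefficients are preserved more precisely than high ones.
            step = base_step * (1 + u + v)
            row.append(round(dct_block[u][v] / step))
        out.append(row)
    return out
```

With this weighting, two coefficients of equal magnitude are represented by a large level at low frequency (fine quantization) and a small level at high frequency (coarse quantization, more noise but fewer bits).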
Images following the intraframe coded frame are then coded by calculating a predicted value for each frame, and then coding the difference between the actual frame value and the predicted value, i.e., coding the prediction error. The coding apparatus typically obtains the image used for predictive coding from the first resolution converter 91, and inputs the image to the motion detector 81; the motion detector 81 obtains the image motion vectors in two-dimensional block units using a common full search method.
The frame buffer 88 and motion compensator 89 then generate predicted values compensating for motion in the next frame in two-dimensional block units using the motion vectors detected by the motion detector 81. The difference between the predicted value and the actual image input data is calculated to obtain the prediction error, which is coded with the same method used for intraframe coding. The motion vectors used for motion compensation, the motion compensation information expressing the parameters for block unit motion compensation, and the DCT mode information are applied as "side information" to the variable length coder 85, and are transferred to the decoder (not shown in the figures) through multiplexer 93 with the coded coefficients.
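The full search motion estimation mentioned above can be sketched as follows: every candidate displacement within a search range is tested, and the one minimizing the sum of absolute differences (SAD) is kept. The function name, search range, and SAD criterion are illustrative assumptions.

```python
def full_search(prev, cur, bx, by, bsize, srange):
    """Full-search motion estimation for one two-dimensional block at
    (bx, by) in the current frame, matched against the previous frame.
    Returns (SAD, dx, dy) for the best displacement (illustrative sketch)."""
    h, w = len(prev), len(prev[0])
    best = None
    for dy in range(-srange, srange + 1):
        for dx in range(-srange, srange + 1):
            # Skip candidate displacements falling outside the frame.
            if not (0 <= by + dy and by + dy + bsize <= h
                    and 0 <= bx + dx and bx + dx + bsize <= w):
                continue
            # Sum of absolute differences between the current block and
            # the displaced block of the previous frame.
            sad = sum(
                abs(cur[by + y][bx + x] - prev[by + dy + y][bx + dx + x])
                for y in range(bsize) for x in range(bsize)
            )
            if best is None or sad < best[0]:
                best = (sad, dx, dy)
    return best
```

The winning (dx, dy) is the motion vector sent as side information; the residual SAD reflects the prediction error energy that remains to be coded.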
Because the above image coding apparatus codes the prediction error rather than the image data directly, the energy of the signal to be coded decreases, and higher coding efficiency can be achieved in comparison with direct image data coding schemes such as intraframe coding.
In contrast with this, the first image coder 7 (FIG. 1A) is an image compression coder in which the image resolution is not changed. The first image coder 7 shown in FIG. 1A comprises, similarly to the second image coder 8 in FIG. 1B, a motion detector 71, DCT mode evaluator 72, DCT processor 73, quantizer 74, variable length coder 75, an inverse quantizer 76, inverse DCT processor 77, adder, frame buffer 78, and motion compensator 79.
This image coder 7 essentially compression codes digital image signals in the same way as the second image coder 8, but differs from the second image coder 8 in its ability to use low resolution images to generate the predicted values. Predicted value generation itself is handled by the motion compensator 79. To accomplish this, the low resolution image of the previous frame stored in the frame buffer 88 is input to the second image resolution converter 92 (FIG. 1B) for resolution doubling in both vertical and horizontal directions. The motion compensator 79 then uses the image of the same size as the original image obtained by the second image resolution converter 92 as one of the predicted value candidates. The motion compensator 79 shown in FIG. 1A calculates the difference between the original image and the predicted value read from the frame buffer 78, and between the original image and the output of the second image resolution converter 92, and selects the candidate yielding the smaller difference for input to the DCT processor 73.
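The predictor selection between the same-resolution motion-compensated prediction and the upsampled low resolution picture can be sketched as follows. Pixel-repeat doubling and the SAD criterion are assumptions for the sketch; the actual resolution converter 92 and difference measure may differ.

```python
def upsample2x(img):
    # Double the image in both directions by pixel repetition (the actual
    # image resolution converter 92 may use interpolation instead).
    out = []
    for row in img:
        wide = [p for p in row for _ in (0, 1)]
        out.append(wide)
        out.append(list(wide))
    return out

def select_predictor(original, mc_pred, low_res):
    """Choose between the same-resolution motion-compensated prediction and
    the upsampled low resolution picture, whichever gives the smaller
    absolute prediction error (illustrative sketch)."""
    up = upsample2x(low_res)

    def sad(pred):
        return sum(abs(o - p)
                   for row_o, row_p in zip(original, pred)
                   for o, p in zip(row_o, row_p))

    return min((mc_pred, up), key=sad)
```

Whichever candidate yields the smaller residual is passed to the DCT processor, so areas already well represented at low resolution contribute almost no prediction error at high resolution.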
By means of the first image coder 7 coding high resolution images as thus described, image areas that can be predicted from the low resolution image do not need to be coded again, and coding efficiency can be increased. The high and low resolution coded images are then multiplexed by the multiplexer 93 and output to the transmission path.
The decoder (not shown in the figures) obtains a low resolution image by extracting and decoding only the low resolution coded image data from the multiplexed coded image data. In addition, by extracting both the high and low resolution coded image data for decoding, the high resolution image data can also be obtained. Depending on the situation, the user can thus switch between receiving low and high resolution images. Such scalable resolution coding is described in MPEG-2 (ISO/IEC JTC1/SC29 N659, "ISO/IEC CD 13818-2: Information technology--Generic coding of moving pictures and associated audio information--Part 2: Video", 1993.12).
For example, multiplexer 93 produces a multiplexed signal containing an HDTV signal from the variable length coder 75 and a standard TV signal from the variable length coder 85.
According to the prior art, the image coders 7 and 8 shown in FIGS. 1A and 1B may be provided at the television broadcasting station. In this case, the television broadcasting station will need to prepare both an HDTV signal and a standard TV signal for the same television program.
Achieving scalability with the above image coding method faces the following problems, however.
First, the quality of high resolution images deteriorates. If the transmission rate of the low resolution compressed image data is b, and that of the high resolution compressed image data is c, the combined transmission rate a of images compressed by means of this conventional coding method will be a = b + c. Tests have shown that when the decoded image quality is compared between images coded using all of transmission rate a for the high resolution compressed images, and images coded using transmission rates b and c to separately code compressed image data of two different resolutions, the quality of the image coded using all of transmission rate a is clearly superior.
Second, the decoding apparatus on the receiving side becomes more complex. More specifically, high resolution compressed image data cannot be decoded with the conventional coding method without using separate decoders for the high and low resolution images.