1. Field of the Invention
The present invention relates to an image processing apparatus and method and, more particularly, to an image processing apparatus capable of processing images with a plurality of different scalabilities by hierarchical coding.
2. Related Background Art
Conventionally, as a motion image encoding scheme, MPEG2 has been known. In this MPEG2 scheme, high image quality or a small amount of data to be transmitted can be selected with two scalabilities called spatial scalability and SNR (SN Ratio) scalability. The aim of spatial scalability in MPEG2 is to transmit one of two types of motion image information having different resolutions or image sizes. In general, only one type of image quality, i.e., one type of resolution, is selected, and data is transmitted with the selected resolution.
The aim of SNR scalability in MPEG2 is to transmit one of two types of motion image information having the same resolution but different code amounts, which are obtained by, for example, quantizing DCT coefficients with different quantization steps.
In the following description of this specification, of two pieces of motion image information with different data rates in each scalability, information with a smaller data amount will be referred to as a base layer, and information with a larger data amount will be referred to as an enhancement layer.
According to another known method, two types of images (base layer and enhancement layer) having different sizes (or resolutions) are simultaneously compressed/encoded by using spatial scalability, and the decoding apparatus chooses between reconstructing an image with low spatial resolution from the base layer and reconstructing an image with high resolution from the enhancement layer, depending on the performance of a decoding circuit, image display apparatus, and the like.
Spatial scalability will be described with reference to FIG. 1 by taking HDTV and NTSC image signals as examples. When an original image is an HDTV image (1440×1152 pixels), a (720×576)-pixel image obtained by thinning out the original image data by ½ in both the X and Y directions will be referred to as a base layer. An image obtained by up-sampling the base layer to the same size as that of the original image will be referred to as an expansion base layer. Image data obtained by encoding the original image using the expansion base layer as a prediction (comparison) image, in addition to forward prediction (P) and bidirectional prediction (B) for the original image, will be referred to as an enhancement layer.
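The relationship between the base layer, the expansion base layer, and the enhancement layer can be illustrated with a minimal sketch. It is not the MPEG2 procedure itself: it assumes plain decimation without filtering, up-sampling by pixel replication, even image dimensions, and an enhancement layer that is simply the difference between the original image and the expansion base layer, with no P or B prediction, DCT, or quantization. All function names are hypothetical and chosen only for this illustration.

```python
def downsample_by_2(image):
    """Thin out the image by 1/2 in X and Y (plain decimation, an assumption)."""
    return [row[::2] for row in image[::2]]


def upsample_by_2(image):
    """Form the expansion base layer by pixel replication (an assumption)."""
    expanded = []
    for row in image:
        wide = [pixel for pixel in row for _ in range(2)]
        expanded.append(wide)
        expanded.append(list(wide))
    return expanded


def encode_spatial(original):
    """Return (base layer, enhancement layer) for an image with even dimensions."""
    base = downsample_by_2(original)
    prediction = upsample_by_2(base)  # expansion base layer used as prediction image
    enhancement = [
        [o - p for o, p in zip(orig_row, pred_row)]
        for orig_row, pred_row in zip(original, prediction)
    ]
    return base, enhancement


def decode_spatial(base, enhancement=None):
    """Low resolution from the base layer alone, full resolution with both layers."""
    if enhancement is None:
        return base
    prediction = upsample_by_2(base)
    return [
        [p + e for p, e in zip(pred_row, enh_row)]
        for pred_row, enh_row in zip(prediction, enhancement)
    ]


# Hypothetical 4x4 image standing in for the 1440x1152 original.
original = [
    [10, 12, 20, 22],
    [11, 13, 21, 23],
    [30, 32, 40, 42],
    [31, 33, 41, 43],
]
base, enhancement = encode_spatial(original)
low_resolution = decode_spatial(base)
full_resolution = decode_spatial(base, enhancement)
```

A decoder with limited performance would call `decode_spatial(base)` and ignore the enhancement layer, while a more capable decoder would pass both layers.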
According to still another known method, two types of images (base layer and enhancement layer) having different code amounts (more specifically, quantization steps) are simultaneously compressed by using SNR scalability, and the decoding apparatus reconstructs an image with a low rate (low image quality) from the base layer. The decoding apparatus then reconstructs an image with a high rate (high image quality) from both the base layer and the enhancement layer.
SNR scalability will be briefly described below. FIG. 2 shows a conceptual rendering of SNR scalability. Two different quantization coefficients are applied to the same image to generate different pieces of compressed image information with different compression ratios from the same image. In this case, image information with a larger compression ratio, i.e., image information with a low bit rate and low image quality, is defined as a base layer, and image information with a smaller compression ratio, i.e., image information with a high bit rate and high image quality, is defined as an enhancement layer. On the decoding apparatus side, the image information of the base layer and the image information of the enhancement layer are added together to obtain an image with a small compression ratio and high image quality.
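The addition of the two layers on the decoding side can be sketched as follows. This is an illustration under stated assumptions rather than the MPEG2 procedure: the coefficient values and the two quantization steps are hypothetical, and the enhancement layer is assumed to carry the base-layer quantization error re-quantized with a finer step.

```python
def quantize(values, step):
    """Map each value to an integer level for the given quantization step."""
    return [round(v / step) for v in values]


def dequantize(levels, step):
    """Map integer levels back to reconstructed values."""
    return [level * step for level in levels]


def encode_snr(coefficients, base_step, enh_step):
    """Return (base levels, enhancement levels) for one block of coefficients."""
    base_levels = quantize(coefficients, base_step)
    base_recon = dequantize(base_levels, base_step)
    # Assumption: the enhancement layer encodes what the coarse step lost.
    residual = [c - b for c, b in zip(coefficients, base_recon)]
    enh_levels = quantize(residual, enh_step)
    return base_levels, enh_levels


def decode_snr(base_levels, base_step, enh_levels=None, enh_step=None):
    """Low quality from the base layer alone, high quality from both layers added."""
    recon = dequantize(base_levels, base_step)
    if enh_levels is None:
        return recon
    refinement = dequantize(enh_levels, enh_step)
    return [b + r for b, r in zip(recon, refinement)]


# Hypothetical DCT coefficients and quantization steps.
coefficients = [100, -37, 12, 5]
BASE_STEP = 16  # coarse step: large compression ratio, low image quality
ENH_STEP = 4    # fine step: small compression ratio, high image quality

base_levels, enh_levels = encode_snr(coefficients, BASE_STEP, ENH_STEP)
low_quality = decode_snr(base_levels, BASE_STEP)
high_quality = decode_snr(base_levels, BASE_STEP, enh_levels, ENH_STEP)
```

With these values the reconstruction error per coefficient shrinks when the enhancement layer is added, which is the sense in which the combined image has higher quality.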
As described above, when spatial scalability or SNR scalability is to be used in MPEG2, only one of these scalabilities can be selected. A conventional encoding apparatus cannot therefore encode image data by using a plurality of scalabilities at once. In other words, only one type of scalability can be designated in one image sequence. Upon reception of image information encoded by using any one of the scalabilities, the decoding apparatus has only the following two types of image qualities as choices: a low-quality image decoded from only the base layer and a high-quality image obtained by synthesizing an image decoded from the base layer and an image decoded from the enhancement layer.
In the prior art, therefore, selection of the image quality or decoding rate cannot be done in accordance with the performance of the decoding apparatus or the receiver's needs.
Obviously, the above problem is common to so-called hierarchical encoding performed for two or more types of factors as well as to MPEG2.
It is an object of the present invention to solve the above problem.
It is another object of the present invention to provide an image processing apparatus which allows selection of an arbitrary scalable factor on the receiving end and can minimize the amount of data transmitted.
In order to achieve the above objects, according to one preferred aspect of the present invention, there is provided an image processing apparatus comprising (a) means for forming a single base layer by reducing information of input image data with at least a plurality of scalable factors, and (b) means for forming a plurality of enhancement layers having information associated with a plurality of scalable factors by using the base layer.
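One possible reading of this arrangement is sketched below. It is only an illustrative interpretation, not the embodiment of the invention: it assumes a one-dimensional signal, that the two scalable factors are resolution and quantization step, and that each enhancement layer is a residual formed against the single base layer. All names and values are hypothetical.

```python
def quantize(values, step):
    """Quantize and immediately reconstruct values with the given step."""
    return [round(v / step) * step for v in values]


def expand(values):
    """Up-sample by 2 using sample replication (an assumption)."""
    return [v for v in values for _ in range(2)]


def form_base_layer(samples, coarse_step):
    """Single base layer: reduced in both resolution and quantization precision."""
    return quantize(samples[::2], coarse_step)


def form_snr_enhancement(samples, base, fine_step):
    """Enhancement layer for the quality factor, at base-layer resolution."""
    residual = [s - b for s, b in zip(samples[::2], base)]
    return quantize(residual, fine_step)


def form_spatial_enhancement(samples, base, coarse_step):
    """Enhancement layer for the resolution factor, at base-layer quality."""
    residual = [s - e for s, e in zip(samples, expand(base))]
    return quantize(residual, coarse_step)


def decode(base, snr=None, spatial=None):
    """The receiving end picks the enhancement layer it wants, if any."""
    if spatial is not None:
        return [e + r for e, r in zip(expand(base), spatial)]
    if snr is not None:
        return [b + r for b, r in zip(base, snr)]
    return base


# Hypothetical input samples and quantization steps.
samples = [100, 102, 60, 52, 20, 28, 80, 90]
COARSE_STEP, FINE_STEP = 16, 4

base = form_base_layer(samples, COARSE_STEP)
snr = form_snr_enhancement(samples, base, FINE_STEP)
spatial = form_spatial_enhancement(samples, base, COARSE_STEP)

base_only = decode(base)
higher_quality = decode(base, snr=snr)
higher_resolution = decode(base, spatial=spatial)
```

In this sketch both enhancement layers depend on the same base layer, so the base layer needs to be transmitted only once regardless of which factor the receiving end selects.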
According to another aspect of the present invention, there is provided an image processing apparatus comprising (a) reconstruction means for extracting data of a base layer obtained from input data by reducing information of image data by using at least a plurality of scalable factors, and reconstructing an image signal from the extracted data of the base layer, and (b) forming means for extracting data of a plurality of enhancement layers having information associated with a plurality of scalable factors from the input data, and forming a plurality of image signals having information associated with the plurality of factors in an amount larger than that of the reconstructed image signal of the base layer from the data of the base layer and the data of the plurality of enhancement layers.
The above and other objects, features, and advantages of the present invention will be apparent from the following detailed description in conjunction with the accompanying drawings and the appended claims.