Generally speaking, the amount of video data is extremely large compared with voice or character data, so that storage or transmission cannot be processed in real time without coding.
Coding the video data by a suitable method enables it to be processed in real time during storage or transmission. The international standards currently proposed for video coding include JPEG for still images, MPEG1 and MPEG2 for moving images, and MPEG4, which is under development, for low bit-rate transmission.
In video data, the amount of information actually contained and the amount of data used to express it are not equal; this difference is called data redundancy.
First, spatial redundancy is caused by the similarity in value between pixels: when a given pixel is selected, its value is similar to the values of the adjacent pixels. Spatial redundancy is processed with the discrete cosine transform (DCT).
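As an illustration of why the DCT suits spatially redundant data, the following is a minimal sketch, not the implementation of any standard. It assumes an orthonormal DCT-II and a hypothetical smooth 8*8 block (a gentle ramp), and measures how much of the block's energy falls into the 2*2 low-frequency corner of the coefficient block. The function names and the test block are assumptions made for this example.

```python
import math

def dct_1d(x):
    """Orthonormal DCT-II of a sequence (assumed normalization)."""
    n = len(x)
    out = []
    for k in range(n):
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        s = sum(x[i] * math.cos(math.pi * (2 * i + 1) * k / (2 * n))
                for i in range(n))
        out.append(scale * s)
    return out

def dct_2d(block):
    """Separable 2-D DCT: transform every row, then every column."""
    rows = [dct_1d(r) for r in block]
    cols = [dct_1d(list(c)) for c in zip(*rows)]
    return [list(r) for r in zip(*cols)]

# Hypothetical smooth 8*8 block: neighbouring pixels differ only slightly.
block = [[100 + 2 * r + 3 * c for c in range(8)] for r in range(8)]
coeffs = dct_2d(block)

total_energy = sum(v * v for row in coeffs for v in row)
low_energy = sum(coeffs[u][v] ** 2 for u in range(2) for v in range(2))
low_share = low_energy / total_energy
print(f"share of energy in the 2*2 low-frequency corner: {low_share:.4f}")
```

For a block this smooth, almost all of the energy lies in a few low-frequency coefficients, which is what allows the remaining coefficients to be coded coarsely or discarded.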
Secondly, probabilistic redundancy results from redundancy in the symbols that express the data. The symbols are not uniformly distributed; some symbols occur much more frequently than others. This redundancy is processed with entropy coding, which is a form of variable length coding.
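The benefit of variable length coding for unevenly distributed symbols can be sketched as follows. The text names entropy coding only in general; Huffman coding is used here as one well-known example, and the symbol sequence is hypothetical. The sketch compares the total bits needed by the Huffman code with a fixed-length code.

```python
import heapq
from collections import Counter

def huffman_code_lengths(symbols):
    """Return {symbol: code length in bits} of a Huffman code for the sequence."""
    counts = Counter(symbols)
    if len(counts) == 1:
        return {next(iter(counts)): 1}
    # Heap entries: (weight, unique tie-breaker, {symbol: depth so far}).
    heap = [(w, i, {s: 0}) for i, (s, w) in enumerate(counts.items())]
    heapq.heapify(heap)
    tie = len(heap)
    while len(heap) > 1:
        w1, _, a = heapq.heappop(heap)
        w2, _, b = heapq.heappop(heap)
        # Merging two subtrees pushes every contained symbol one level deeper.
        merged = {s: d + 1 for s, d in {**a, **b}.items()}
        heapq.heappush(heap, (w1 + w2, tie, merged))
        tie += 1
    return heap[0][2]

# Hypothetical quantized values: zero occurs far more often than the others.
data = [0] * 50 + [1] * 20 + [-1] * 15 + [2] * 10 + [-2] * 5
lengths = huffman_code_lengths(data)
counts = Counter(data)

variable_bits = sum(counts[s] * lengths[s] for s in counts)
fixed_bits = len(data) * 3  # 5 distinct symbols need 3 bits each at fixed length
print(lengths, variable_bits, fixed_bits)
```

The most frequent symbol receives the shortest code word, so the total bit count falls below that of the fixed-length code.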
Thirdly, temporal redundancy arises from the similarity between the previous frame image and the present frame image. It is processed with motion estimation and motion compensation.
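Motion estimation can be sketched with a simple block-matching search. This is an illustrative sketch only: the sum of absolute differences (SAD) cost, the full search, the search range, and the synthetic frames are all assumptions, since the text does not specify a matching criterion. The returned vector (dx, dy) is the displacement from the block position in the present frame to the best-matching block in the previous frame.

```python
def sad(cur, ref, bx, by, dx, dy, n):
    """Sum of absolute differences between the n*n block of cur at (bx, by)
    and the block of ref displaced by (dx, dy)."""
    return sum(
        abs(cur[by + y][bx + x] - ref[by + dy + y][bx + dx + x])
        for y in range(n) for x in range(n)
    )

def estimate_motion(cur, ref, bx, by, n, search):
    """Full search over +/- search pixels; returns ((dx, dy), cost)."""
    h, w = len(ref), len(ref[0])
    best, best_cost = (0, 0), sad(cur, ref, bx, by, 0, 0, n)
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            # Skip candidates that would read outside the reference frame.
            if not (0 <= bx + dx and bx + dx + n <= w
                    and 0 <= by + dy and by + dy + n <= h):
                continue
            cost = sad(cur, ref, bx, by, dx, dy, n)
            if cost < best_cost:
                best, best_cost = (dx, dy), cost
    return best, best_cost

# Hypothetical frames: the present frame is the previous frame shifted
# right by 2 pixels and down by 1 pixel (with wrap-around at the borders).
W = H = 32
prev = [[(7 * x + 13 * y + x * y) % 256 for x in range(W)] for y in range(H)]
cur = [[prev[(y - 1) % H][(x - 2) % W] for x in range(W)] for y in range(H)]

mv, cost = estimate_motion(cur, prev, bx=8, by=8, n=16, search=3)
print(mv, cost)
```

Motion compensation then copies the block of the previous frame at the displaced position, so that only the remaining difference needs to be coded.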
Meanwhile, with the rapid development of the information and communications industry, many services, such as video on demand, tele-teaching, videoconferencing, high-definition TV, tele-diagnosis, and teleshopping, are now under way or in preparation. If the compressed video signals of these various services are to be received with separate receivers, as many receivers as there are services are required. In order to overcome this drawback, scalable coding has been suggested, in which the services' signals are compressed in a single mode and decoded in accordance with the respective receivers. With scalable coding, these many services can be offered through a single receiver. Scalable coding is roughly divided into two kinds, sub-band coding and pyramid coding, which differ in how the original image is divided into smaller pieces.
FIG. 1 is a diagram of the overall configuration of a conventional scalable encoder. This encoder codes a video signal, input frame by frame, into a high-resolution image and a low-resolution image. Intraframe coding is performed on the high-resolution frame, and then interframe coding is carried out. Intraframe coding and interframe coding are explained in turn below.
First, the intraframe coding configuration includes: an 8*8 block divider 11 for dividing a video signal Sin, input frame by frame, into 8*8 blocks; an 8*8 block discrete cosine transformer (DCT) 12 for converting the video signal divided into 8*8 blocks from the spatial (plane) domain into the frequency domain through the DCT; an 8*8 block quantizer 13 for quantizing the difference signal (a video signal without the overlapped image) between the video signal (8*8 block frame) converted into the frequency domain and the video signal (4*4 block frame) of block inverse compensator 33; an 8*8 block variable length coding portion 14 for encoding the quantized video signal and outputting the encoded signal S14 to a multiplexer 60; a 4*4 block decimator 21 for decimating 4*4 blocks of video signal from the 8*8 blocks of video signal output from 8*8 block DCT 12; an energy coefficient compensator 22 for multiplying the video signal extracted into 4*4 blocks by 0.25 (1/4) in order to perform energy compensation; a 4*4 block quantizer 23 for quantizing the energy-compensated video signal; a 4*4 block variable length coding portion 24 for encoding the quantized video signal and outputting the encoded signal S24 to multiplexer 60; a 4*4 block inverse quantizer 31 for inversely quantizing the video signal from 4*4 block quantizer 23; an 8*8 block interpolator 32 for interpolating the inversely quantized 4*4 block video signal into 8*8 blocks of video signal by filling with zeros; a block inverse compensator 33 for inversely compensating for the energy of the interpolated video signal; an 8*8 block inverse quantizer 41 for inversely quantizing the video signal from 8*8 block quantizer 13; an 8*8 block inverse DCT 42 for performing the inverse DCT on the sum signal (an approximation of the video signal of 8*8 block DCT 12) of the video signal of 8*8 block inverse quantizer 41 and the video signal of block inverse compensator 33; an adder 43 for summing the video signal of 8*8 block inverse DCT 42 and the video signal (zero) of motion compensator 53; and a frame memory 44 for storing the frame signal passing through adder 43 for the purpose of interframe coding. Because the video signal of motion compensator 53 is used only during interframe coding, it is zero during intraframe coding; during interframe coding, it is a video signal of 8*8 blocks having a predetermined value.
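The coefficient-domain data flow of the two layers described above can be sketched as follows. This is a simplified illustration under stated assumptions, not the conventional encoder itself: uniform quantization with hypothetical step sizes stands in for quantizers 13 and 23, inverse compensation is assumed to be division by the same weight 0.25, and the input is a hypothetical 8*8 block of DCT coefficients. Reference numerals in the comments indicate the corresponding elements of FIG. 1.

```python
WEIGHT = 0.25  # fixed energy-compensation weight stated in the text

def decimate_4x4(c8):
    """Keep the top-left (low-frequency) 4*4 corner of an 8*8 block (21)."""
    return [row[:4] for row in c8[:4]]

def compensate(c4, w=WEIGHT):
    """Energy compensation by a constant weight (22)."""
    return [[w * v for v in row] for row in c4]

def quantize(block, step):
    """Uniform quantizer (assumed form of 13 and 23)."""
    return [[round(v / step) for v in row] for row in block]

def dequantize(block, step):
    """Uniform inverse quantizer (assumed form of 31 and 41)."""
    return [[v * step for v in row] for row in block]

def interpolate_8x8(c4):
    """Zero-fill a 4*4 block back to 8*8 (32)."""
    out = [[0.0] * 8 for _ in range(8)]
    for u in range(4):
        for v in range(4):
            out[u][v] = c4[u][v]
    return out

def inverse_compensate(c8, w=WEIGHT):
    """Assumed inverse of the energy compensation: divide by the weight (33)."""
    return [[v / w for v in row] for row in c8]

def encode_block(c8, step4, step8):
    """Return (4*4 levels, 8*8 difference levels) for one coefficient block."""
    q4 = quantize(compensate(decimate_4x4(c8)), step4)
    base = inverse_compensate(interpolate_8x8(dequantize(q4, step4)))
    diff = [[c8[u][v] - base[u][v] for v in range(8)] for u in range(8)]
    return q4, quantize(diff, step8)

def reconstruct_block(q4, q8, step4, step8):
    """Sum formed ahead of inverse DCT 42: outputs of 41 and 33 added."""
    base = inverse_compensate(interpolate_8x8(dequantize(q4, step4)))
    diff = dequantize(q8, step8)
    return [[base[u][v] + diff[u][v] for v in range(8)] for u in range(8)]

# Hypothetical coefficient block whose magnitude falls off with frequency.
c8 = [[1000.0 / (1 + u + v) ** 2 for v in range(8)] for u in range(8)]
q4, q8 = encode_block(c8, step4=4.0, step8=8.0)
rec = reconstruct_block(q4, q8, step4=4.0, step8=8.0)
max_err = max(abs(c8[u][v] - rec[u][v]) for u in range(8) for v in range(8))
print(max_err)
```

Because the 8*8 layer codes only the difference from the reconstructed 4*4 layer, the error of the full reconstruction is bounded by the 8*8 quantization step in this sketch, whereas a decoder that uses the 4*4 layer alone depends directly on how well the weight matches the block.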
The interframe coding configuration of the conventional encoder is added to the intraframe coding configuration described above. It includes a 16*16 block divider 51 for dividing a video signal into 16*16 blocks, a motion vector estimation portion 52 for detecting a motion vector MV from the video signal (present frame) divided into 16*16 blocks and the video signal (previous frame) of frame memory 44, and a motion compensator 53 for producing a new frame using the motion vector MV of motion vector estimation portion 52 and the frame of frame memory 44. Additionally, a multiplexer 60 selectively outputs, in a predetermined order, video signal S14 (8*8 block video signal) of 8*8 block variable length coding portion 14, video signal S24 (4*4 block video signal) of 4*4 block variable length coding portion 24, and motion vector MV of motion vector estimation portion 52.
FIG. 2a is a diagram of the configuration of a conventional high-resolution decoder, and FIG. 2b is a diagram of the configuration of a conventional low-resolution decoder. The configurations of these decoders, which decode the signals encoded by the aforementioned encoder, are explained with reference to FIGS. 2a and 2b.
First of all, referring to FIG. 2a, the high-resolution decoder (related to the 8*8 block image) includes: a demultiplexer 111 for separating an input compressed video signal Sin into signals S14 and S24 of 8*8 blocks and 4*4 blocks, respectively, and motion vector MV; an 8*8 block inverse quantizer 112 for inversely quantizing the 8*8 blocks of video signal S14; a 4*4 block inverse quantizer 113 for inversely quantizing the 4*4 blocks of video signal S24; an 8*8 block interpolator 114 for interpolating the 4*4 block video signal inversely quantized in 4*4 block inverse quantizer 113 into 8*8 blocks of video signal; an 8*8 block inverse DCT 115 for converting the sum signal of the video signal of 8*8 block inverse quantizer 112 and the video signal of 8*8 block interpolator 114 from the frequency domain into the spatial (plane) domain through the inverse DCT; an adder 116 for summing the video signal converted into the spatial domain and the video signal of motion compensator 118, and outputting a video signal Sout of the decoder; a frame memory 117 for storing the signal passing through adder 116 for the purpose of recovering interframe coded data; and a motion compensator 118 for compensating the video signal stored in frame memory 117 according to the motion vector from demultiplexer 111, and supplying the compensated result to adder 116.
Turning to FIG. 2b, the low-resolution decoder includes: a demultiplexer 121 for separating an input compressed video signal Sin into video signal S24 of 4*4 blocks and motion vector MV; a 4*4 block inverse quantizer 122 for inversely quantizing the 4*4 blocks of video signal S24; a 4*4 block inverse DCT 123 for converting the video signal of 4*4 block inverse quantizer 122 from the frequency domain into the spatial (plane) domain through the inverse DCT; a motion vector scaling portion 124 for scaling motion vector MV from demultiplexer 121; an adder 127 for summing the video signal of 4*4 block inverse DCT 123 and the video signal of motion compensator 126, and outputting a video signal Sout of the decoder; a frame memory 125 for storing the signal passing through adder 127; and a motion compensator 126 for compensating the video signal stored in frame memory 125 according to the output signal of motion vector scaling portion 124, and supplying the compensated result to adder 127.
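The role of motion vector scaling portion 124 can be sketched as follows. The text does not state the scaling factor; a factor of 1/2 is assumed here because a 4*4 block taken from an 8*8 block halves the resolution in each direction. The function names and the rounding rule are likewise assumptions made only for illustration.

```python
import math

def scale_motion_vector(mv, factor=0.5):
    """Scale a full-resolution (dx, dy) vector for the low-resolution frame.
    The factor 0.5 is an assumption (4*4 blocks derived from 8*8 blocks)."""
    dx, dy = mv
    return (dx * factor, dy * factor)

def to_integer_pel(mv):
    """Round a fractional vector to whole pixels (halves round upward).
    A decoder could instead interpolate at the fractional position."""
    return tuple(math.floor(c + 0.5) for c in mv)

full_res_mv = (6, -3)                      # hypothetical vector from the bitstream
scaled = scale_motion_vector(full_res_mv)  # (3.0, -1.5)
rounded = to_integer_pel(scaled)           # (3, -1)
print(scaled, rounded)
```

An odd full-resolution component yields a half-pixel displacement at low resolution, so the low-resolution motion compensation must either interpolate or round at that position.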
The conventional scalable encoder adopts pyramid coding. However, when the top-left 4*4 block is decimated from the 8*8 block frame, the energy of the 8*8 block is not suited to the extracted 4*4 block, so that it needs to be compensated.
The configuration of the conventional scalable encoder has been explained above, together with the conventional decoders for reference. This scalable encoder has the following drawbacks.
In the conventional scalable encoder, energy compensation is performed by multiplying the energy value of each pixel decimated from the 8*8 block frame by a weight of 0.25 (1/4). The constant weight W of 0.25 is always applied, without considering the optimal energy for the 4*4 block frame in relation to the energy distribution of the 8*8 block frame. However, the energy value in a frame does not depend only on the size of the frame, so that the imprecise energy compensation may produce an inappropriately decimated image during repeated interframe coding, and an image produced with the motion vector becomes inaccurate. As interframe coding proceeds, these errors accumulate, causing a drift effect in which the image becomes wavelike and the picture quality decreases.