The present invention relates to a multimedia communication system under the influence of transmission errors in communication channels.
The environment for video communication using wireless mobile terminals is steadily being put in place. It was once said that such a communication terminal would be difficult to develop because of three technical constraints: small channel capacity (a low transmission bit rate), a high transmission error rate, and small battery capacity (equivalent to low computational power). However, these three constraints are being overcome by the development of high bit rate mobile telephone systems represented by IMT-2000, image coding methods with high compression rates and high error resilience represented by MPEG-4, and high-performance batteries and low-power processors.
FIG. 1 shows an example of the configuration of an encoder 100 conforming to the international standard MPEG-4. In MPEG-4, a hybrid coding method (inter-/intra-frame adaptive coding) combining block matching and the discrete cosine transform (DCT) is adopted. A subtractor 102 calculates the difference between an input image signal (the original image signal of the current frame) 101 and an image signal 113 (described later) output from an inter-/intra-frame coding selection switch 119, and outputs an error image signal 103. The error image signal is converted to DCT coefficients in a DCT converter 104 and then quantized by a quantizer 105 to yield quantized DCT coefficients 106. The quantized DCT coefficients are output to a communication channel as transmitted information and, at the same time, are used for synthesizing an inter-frame prediction image signal in the encoder.
Next, a procedure for synthesizing the prediction image signal will be described. The prediction image signal is synthesized in a block 107 enclosed by an alternate long and short dash line in the encoder 100. The quantized DCT coefficients 106 become a decoded error image signal 110 (the same image signal as an error image signal reconstructed in a receiver) via an inverse quantizer 108 and an IDCT converter 109. The image signal 113 (described later) output from the inter-/intra-frame coding selection switch 119 is added to this signal in an adder 111, and a decoded image signal 112 of the current frame (the same signal as a decoded image signal of the current frame reconstructed in the receiver) is obtained. This image signal is temporarily stored in a frame memory 114 and is delayed by a time equivalent to one frame. Therefore, at the current time, the frame memory 114 outputs a decoded image signal 115 of the previous frame. The decoded image signal of the previous frame and the input image signal 101 of the current frame are input to a block matching module 116, where a block matching process is executed. In the block matching module, a prediction image signal 117 of the current frame is synthesized by dividing each image signal into plural blocks and, for each block, extracting from the decoded image signal of the previous frame the part most similar to the original image signal of the current frame. At this time, processing for detecting how far each block moves between the previous frame and the current frame (a motion estimation process) is required. The motion vectors of the blocks detected by the motion estimation process are transmitted to the receiver as motion vector information 120. The receiver can synthesize the same prediction image signal as the one obtained at the transmitter, based upon the motion vector information and the decoded image signal of the previous frame.
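The block matching process described above can be illustrated by the following minimal sketch. It assumes a full search over a small window and the sum of absolute differences (SAD) as the similarity measure; neither choice is stated in the text, and the function names, the block size `n` and the search range `search` are hypothetical.

```python
def sad(cur, ref, bx, by, dx, dy, n):
    """Sum of absolute differences between the n x n block of `cur` at (bx, by)
    and the block of `ref` displaced by (dx, dy)."""
    total = 0
    for y in range(n):
        for x in range(n):
            total += abs(cur[by + y][bx + x] - ref[by + dy + y][bx + dx + x])
    return total


def estimate_motion(cur, ref, bx, by, n=4, search=2):
    """Full search: return the displacement (dx, dy) within +/-search
    that minimizes the SAD for the block at (bx, by)."""
    h, w = len(ref), len(ref[0])
    best, best_cost = (0, 0), sad(cur, ref, bx, by, 0, 0, n)
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            # Skip candidates that would read outside the reference frame.
            if bx + dx < 0 or by + dy < 0 or bx + dx + n > w or by + dy + n > h:
                continue
            cost = sad(cur, ref, bx, by, dx, dy, n)
            if cost < best_cost:
                best, best_cost = (dx, dy), cost
    return best


def predict_block(ref, bx, by, mv, n=4):
    """Copy the displaced block out of the previous decoded frame."""
    dx, dy = mv
    return [[ref[by + dy + y][bx + dx + x] for x in range(n)] for y in range(n)]
```

Because `predict_block` needs only the previous decoded frame and the motion vector, a receiver holding the same two inputs can synthesize the same prediction as the transmitter.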
The prediction image signal 117 is input to the inter-/intra-frame coding selection switch 119 together with zero 118. The switch switches between inter-frame coding and intra-frame coding by selecting one of these input signals. If the prediction image signal 117 is selected (FIG. 1 shows this case), inter-frame coding is executed. If, on the other hand, zero is selected, the input image signal is coded using DCT as it is and output to the communication channel, so that intra-frame coding is executed. For the receiver to acquire a correctly decoded image signal, it must know whether inter-frame coding or intra-frame coding was executed at the transmitter. Therefore, an inter-/intra-frame coding distinction flag 121 is output to the communication channel. A final MPEG-4 coded bit stream 123 is obtained by multiplexing the quantized DCT coefficients, the motion vector information and the inter-/intra-frame coding distinction flag in a multiplexer 122.
FIG. 2 shows an example of the configuration of a decoder 200 that receives the coded bit stream 123 output from the encoder 100 shown in FIG. 1. A received MPEG-4 bit stream 217 is demultiplexed into quantized DCT coefficients 201, motion vector information 202 and an inter-/intra-frame coding distinction flag 203 in a demultiplexer 216. The quantized DCT coefficients 201 become a decoded error image signal 206 via an inverse quantizer 204 and an IDCT converter 205. An image signal 215 output from an inter-/intra-frame coding selection switch 214 is added to the error image signal in an adder 207, and a decoded image signal 208 is output. The inter-/intra-frame coding selection switch 214 switches its output according to the inter-/intra-frame coding distinction flag 203. A prediction image signal 212 used in the case of inter-frame coding is synthesized in a prediction image signal synthesizer 211. Here, processing for moving the position of each block according to the received motion vector information 202 is executed on a decoded image signal 210 of the previous frame stored in a frame memory 209. In the case of intra-frame coding, on the other hand, the inter-/intra-frame coding selection switch outputs zero 213 as it is.
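The relationship between the encoder of FIG. 1 and the decoder of FIG. 2 can be sketched as follows. This is a deliberately simplified model, not the MPEG-4 procedure: the DCT/IDCT stage and motion compensation are omitted (the prediction is simply the previous decoded frame), a frame is a flat list of sample values, and the uniform quantizer with step size `step` is an assumption made for illustration. The reference numerals in the comments indicate which element of the figures each line stands in for.

```python
def quantize(values, step):
    """Uniform quantizer (stand-in for DCT converter 104 + quantizer 105)."""
    return [round(v / step) for v in values]


def dequantize(levels, step):
    """Inverse quantizer (stand-in for inverse quantizer + IDCT converter)."""
    return [level * step for level in levels]


def encode_frame(frame, prev_decoded, intra, step=4):
    """Return (levels, local_decoded) for one frame."""
    pred = [0] * len(frame) if intra else prev_decoded        # switch 119
    error = [f - p for f, p in zip(frame, pred)]               # subtractor 102
    levels = quantize(error, step)                             # sent to the channel
    decoded_error = dequantize(levels, step)                   # block 107
    local_decoded = [p + e for p, e in zip(pred, decoded_error)]  # adder 111
    return levels, local_decoded


def decode_frame(levels, prev_decoded, intra, step=4):
    """Reconstruct one frame from received levels and the intra/inter flag."""
    pred = [0] * len(levels) if intra else prev_decoded        # switch 214
    return [p + e for p, e in zip(pred, dequantize(levels, step))]  # adder 207
```

The point of the sketch is that the encoder predicts from its own locally decoded frame rather than from the original, so that, in the absence of transmission errors, the frame memories of the encoder and the decoder hold identical contents.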
An image signal coded according to MPEG-4 is composed of one luminance plane (a Y plane) having luminance information and two chrominance planes (a U plane and a V plane) having chrominance information. When an image signal includes 2m pixels horizontally and 2n pixels vertically (m and n are positive integers), the Y plane has 2m pixels horizontally and 2n pixels vertically, and the U and V planes each have m pixels horizontally and n pixels vertically. The reason why the resolution of the chrominance planes is lower is that the human visual system is relatively insensitive to the spatial variation of chrominance. Encoding and decoding of such an input image signal are executed in units of a block called a macroblock in MPEG-4.
FIG. 3 shows the configuration of the macroblock. The macroblock is composed of three blocks: the Y block, the U block and the V block. The Y block 301 having luminance value information is composed of “16×16 pixels”, and the U block 302 and the V block 303 having chrominance information are both composed of “8×8 pixels”. In MPEG-4, intra-frame coding and inter-frame coding are switched in units of macroblocks. In the block matching process, a motion vector can be transmitted for each macroblock.
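The plane and macroblock dimensions given above can be checked with a short calculation. The following sketch assumes, for simplicity, that the frame width and height are multiples of 16 so that the frame divides into whole macroblocks; the function name `frame_layout` is hypothetical.

```python
def frame_layout(width, height):
    """Return the Y/U/V plane sizes and the macroblock grid for a frame.

    Assumes width and height are multiples of 16 (whole macroblocks only).
    """
    if width % 16 or height % 16:
        raise ValueError("width and height must be multiples of 16 in this sketch")
    return {
        "Y": (width, height),                  # 2m x 2n luminance samples
        "U": (width // 2, height // 2),        # m x n chrominance samples
        "V": (width // 2, height // 2),        # m x n chrominance samples
        "macroblocks": (width // 16, height // 16),  # columns, rows
    }


def samples_per_macroblock():
    """One 16x16 Y block plus one 8x8 U block and one 8x8 V block."""
    return 16 * 16 + 8 * 8 + 8 * 8
```

For example, a frame of 176×144 pixels has chrominance planes of 88×72 samples and is divided into 11×9 macroblocks.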
In wireless communication, it is almost impossible to prevent transmission errors from occurring in a transmission channel. Meanwhile, coded data compressed using information compression techniques is vulnerable to bit errors (the inversion of bits in the data), and the inversion of a few bits may cause the decoder to go out of control (a state in which the reconstruction process stops, a state in which input from a user is not accepted, etc.) or may cause serious deterioration of the reconstructed information. Generally, in data communication, it is often assumed that no error occurs in transmitted data or that an extremely small bit error rate (the probability that a bit included in the data is inverted) is achieved. This can be done by utilizing an error correction code such as a Reed-Solomon code, or a retransmission protocol that makes the transmitter retransmit corrupted packets. However, the use of an error correction code or a retransmission protocol substantially reduces the effective bit rate of the transmitted data and increases transmission delay, and is not necessarily a suitable solution for real-time communication at low bit rates.
For the reasons described in the previous paragraph, low bit rate image communication in a wireless environment, where bit errors are expected to occur frequently in the coded bit stream received by a receiver, requires error resilience coding techniques that minimize the deterioration of the reconstructed image obtained by decoding the bit stream. The simplest example of such an error resilience coding technique is to increase the rate of macroblocks using intra-frame coding. In inter-frame coding, where a reconstructed image of the previous frame is used for the decoding of the current frame, deterioration in a certain frame remains in the succeeding frames. To prevent this phenomenon, the rate of macroblocks using intra-frame coding should be increased so that deterioration caused in the previous frame is less likely to be carried over into the succeeding frames. However, increasing the rate of macroblocks using intra-frame coding generally deteriorates coding performance (the quality of decoded images when the bit rate is fixed). That is, when the error resilience of a coded bit stream is enhanced using the method described above, the quality of a reconstructed image when no transmission error occurs is conversely deteriorated.
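The effect of the intra-frame coding rate on how long deterioration persists can be illustrated with a toy model. The sketch below assumes a cyclic refresh schedule, in which a fixed number of macroblocks is intra-coded in every frame in raster order; this schedule is an assumption made for illustration and is not described in the text. The model also ignores the spreading of deterioration to neighboring macroblocks through motion compensation. The function name and parameters are hypothetical.

```python
def frames_until_clean(num_mb, corrupted, intra_per_frame):
    """Count the frames needed until no corrupted macroblock remains.

    num_mb          -- number of macroblocks per frame
    corrupted       -- indices of macroblocks corrupted by a transmission error
    intra_per_frame -- macroblocks intra-coded in each frame (cyclic order)

    Returns None when intra_per_frame is 0 and corruption exists, because
    inter-coded macroblocks carry the deterioration forward indefinitely
    in this model.
    """
    remaining = set(corrupted)
    if not remaining:
        return 0
    if intra_per_frame <= 0:
        return None
    next_mb = 0
    frames = 0
    while remaining:
        for _ in range(intra_per_frame):
            remaining.discard(next_mb)   # an intra-coded macroblock is clean again
            next_mb = (next_mb + 1) % num_mb
        frames += 1
    return frames
```

In this model, tripling the number of intra-coded macroblocks per frame shortens the worst-case recovery time to one third, at the cost of the reduced coding performance noted above.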
In an image coding method defined in the international standard MPEG-4, the following three types of error resilience coding techniques are further adopted.
(1) Resync Marker:
In a coded bit stream conforming to MPEG-4, 16 or more consecutive zero bits never appear except in special code words. There are two such special code words: a video object plane (VOP) start code indicating the starting point of a VOP (which means “frame” in MPEG-4), and a resync marker. The resync marker is a code word intentionally transmitted by the encoder in the transmitter to enhance the error resilience of a coded bit stream. It can be inserted immediately before the coding information of any macroblock except the first macroblock of a VOP. If the decoder finds that a coded bit stream includes an error, it searches for the next resync marker and restarts decoding from the data that follows it, so that at least the data after the resync marker can be correctly decoded. The encoder in the transmitter can adjust, at its own discretion, the frequency with which resync markers are inserted into a coded bit stream. If resync markers are inserted more frequently, error resilience is obviously enhanced. As a side effect, however, coding performance is usually deteriorated by the transmission of redundant information, so the quality of a reconstructed image when no transmission error occurs is deteriorated.
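The resynchronization step described above can be sketched as follows. For readability, the bit stream is modeled as a string of '0' and '1' characters, and the marker is assumed to be sixteen zero bits followed by a one bit; the exact marker format of the standard is not given in the text, so `MARKER` and the function name are illustrative assumptions only.

```python
# Assumed toy marker: sixteen zero bits followed by a one bit.
MARKER = "0" * 16 + "1"


def next_resync(bits, error_pos):
    """Return the index of the first bit after the next resync marker
    found at or after error_pos, or -1 if no marker follows.

    Decoding can restart from the returned index; the data between
    error_pos and the marker is discarded.
    """
    start = bits.find(MARKER, error_pos)
    if start < 0:
        return -1
    return start + len(MARKER)
```

Since payload data never contains 16 consecutive zero bits, a simple pattern search of this kind cannot mistake error-free payload for a marker.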
(2) Data Partitioning:
It is known that when a motion vector of a macroblock is decoded erroneously because of a transmission error, more serious deterioration is caused in the reconstructed image than when an error occurs in the information related to DCT coefficients. Based on this, the concept of data partitioning is that the important information included in the macroblocks between two resync markers is gathered and transmitted before the relatively unimportant information. The reason is that information transmitted immediately after a resync marker is less likely to be affected by a transmission error than information transmitted immediately before the next resync marker.
The following three types of information are given priority in data partitioning: type information of macroblocks; motion vector information (only in the case of inter-frame coding); and DC DCT coefficients (only in the case of intra-frame coding). For example, when information related to five inter-frame coded macroblocks is located between resync markers, the type information of the five macroblocks is transmitted first, then the five motion vectors, and finally the DCT coefficients related to the five macroblocks. The enhancement of error resilience by data partitioning is greater when the transmission frequency of resync markers is high. Therefore, here too, when error resilience is enhanced, coding performance is deteriorated.
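The transmission order in the five-macroblock example above can be sketched as a simple reordering. The sketch covers only inter-frame coded macroblocks and represents each macroblock as a dictionary with the hypothetical keys `type`, `mv` and `dct`; the actual bit stream syntax is not modeled.

```python
def partition(macroblocks):
    """Reorder the data of the macroblocks between two resync markers.

    Input : one dict per macroblock with keys 'type', 'mv' and 'dct'.
    Output: a list of (kind, value) pairs in transmission order:
            all types first, then all motion vectors, then all DCT data.
    """
    types = [("type", mb["type"]) for mb in macroblocks]
    vectors = [("mv", mb["mv"]) for mb in macroblocks]
    coeffs = [("dct", mb["dct"]) for mb in macroblocks]
    return types + vectors + coeffs


def interleaved(macroblocks):
    """Order without data partitioning: macroblock by macroblock."""
    out = []
    for mb in macroblocks:
        out += [("type", mb["type"]), ("mv", mb["mv"]), ("dct", mb["dct"])]
    return out
```

With `partition`, an error that strikes late in the packet damages only DCT data, whereas with `interleaved` the same error position may also destroy the types and motion vectors of the later macroblocks.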
(3) Reversible Variable Length Code (VLC):
In MPEG-4, a variable length code, in which the code length varies according to the generated information, is used. Generally, however, data coded using a variable length code can be decoded in only one direction. That is, after the starting point of the coded data is found correctly, the coded data must be decoded in the order of transmission. By using special techniques, however, a variable length code that can also be decoded in the reverse direction (a reversible variable length code) can be designed. With a reversible variable length code, decoding in the reverse direction from the end point of a bit stream becomes possible. This makes it possible to decode part of a corrupted bit stream that cannot be decoded in the forward direction, minimizing the amount of lost information.
In MPEG-4, a reversible variable length code can be used for coding DCT coefficients. Even if the reversible variable length code is used, it is still true that the starting point (or, for reverse decoding, the ending point) of the data must be found correctly. Moreover, the reversible variable length code is not applicable to the coding of motion vectors and macroblock types. Therefore, to enhance error resilience, the transmission frequency of resync markers must be increased, and this deteriorates coding performance.
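The idea of a reversible variable length code can be illustrated with a toy code table. The table below is not the MPEG-4 table; it is an assumed three-symbol code whose code words are palindromes and are both prefix-free and suffix-free, which is one known way to make a code decodable in both directions. All names are hypothetical.

```python
# Toy symmetric code: every code word reads the same in both directions,
# and no code word is a prefix or a suffix of another.
CODE = {"a": "0", "b": "11", "c": "101"}
DECODE = {bits: sym for sym, bits in CODE.items()}
MAX_LEN = max(len(bits) for bits in DECODE)


def encode(symbols):
    """Concatenate the code words of the given symbols."""
    return "".join(CODE[s] for s in symbols)


def decode_forward(bits):
    """Decode from the starting point in the order of transmission."""
    out, cur = [], ""
    for b in bits:
        cur += b
        if cur in DECODE:
            out.append(DECODE[cur])
            cur = ""
        elif len(cur) >= MAX_LEN:
            raise ValueError("invalid code word: " + cur)
    if cur:
        raise ValueError("incomplete code word: " + cur)
    return out


def decode_backward(bits):
    """Decode from the end point toward the start.

    Because the code words are palindromes, the reversed bit string can be
    decoded with the same table; the symbols then come out in reverse order.
    """
    return decode_forward(bits[::-1])[::-1]
```

On an error-free stream both directions yield the same symbols. When a forward decoder stops at a corrupted position, a backward pass from the next resync marker can still recover symbols located after the error, provided that the end point is known.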
The deterioration of the quality of a decoded image when the coded bit stream is corrupted can be greatly reduced by using the above error resilience techniques. However, attention must be paid to the fact that these techniques share a common property: when error resilience is enhanced, coding performance is deteriorated. That is, when error resilience is enhanced more than necessary, the quality of a reconstructed image in a receiver may conversely be deteriorated. The deterioration of image quality caused by bit errors (the emergence of a pattern not found in the original image, etc.) and the deterioration of image quality caused by reduced coding performance (a blurry image, etc.) generally differ in character. How disturbing a viewer finds each of these two types of deterioration is a matter of personal taste, and even if the bit error rate is the same, the optimum level of error resilience often differs depending upon the preference of the viewer.