Predictive coding is widely used for compression of digital signals (e.g., speech, image or video) by removing the statistical redundancy among neighboring samples of these waveforms. Several ITU-T Recommendations for speech coding have adopted predictive coding (for example, adaptive differential pulse-code modulation, or ADPCM, is used in G.721). In these predictive speech coders, an original speech sample is predicted based on past speech samples, and the prediction error, which is the difference between the original and the predicted sample, is quantized and encoded. Because the energy of the prediction error signal is, on average, much smaller than that of the original signal, a high compression ratio can be obtained.
Predictive coding also finds wide applications in image and video coding. In image coding, the value of a pixel can be predicted from its neighboring pixels and the prediction error is quantized and coded. In video applications, a video frame can be predicted from its preceding frames and the prediction error frame is quantized and coded.
State-of-the-art digital video compression systems use a hybrid coding method which reduces the temporal correlation by motion compensated prediction (MCP) and the spatial redundancy by the Discrete Cosine Transform (DCT). Two video compression standards are based on this hybrid scheme: ITU-T H.261 and the standard of the International Organization for Standardization's Moving Picture Experts Group (ISO MPEG). In addition, two other international standards being developed, MPEG II and ITU-T H.26p, are based on the same scheme. In this hybrid coding method, a video frame is first segmented into non-overlapping square blocks of size NxN, called macroblocks (MBs), where N is a predetermined integer. For each MB, the previously decoded frame is searched to find an NxN block which minimizes a predetermined error function. This block is then used to predict the current MB. The prediction error block and the motion vector, which is defined as the displacement of the position of the prediction block with respect to the original block, are coded and transmitted to the receiver. The prediction error block first undergoes a DCT; the transform coefficients are then quantized and losslessly encoded by runlength and entropy coding.
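The block search described above can be sketched as follows. This is a minimal illustration only: the text says only that a "predetermined error function" is minimized, so the sum of absolute differences (SAD), the exhaustive search, and all function and parameter names here are assumptions chosen for the example.

```python
def sad(frame_a, ax, ay, frame_b, bx, by, n):
    """Sum of absolute differences between two n x n blocks (assumed error function)."""
    total = 0
    for dy in range(n):
        for dx in range(n):
            total += abs(frame_a[ay + dy][ax + dx] - frame_b[by + dy][bx + dx])
    return total


def find_motion_vector(current, reference, x, y, n, search_range):
    """Exhaustively search `reference` for the best match of the n x n block
    whose top-left corner is (x, y) in `current`.

    Returns ((mx, my), cost), where (mx, my) is the displacement of the
    prediction block relative to the original block.
    """
    height, width = len(reference), len(reference[0])
    best = None
    for my in range(-search_range, search_range + 1):
        for mx in range(-search_range, search_range + 1):
            rx, ry = x + mx, y + my
            # Skip candidate blocks that fall outside the reference frame.
            if rx < 0 or ry < 0 or rx + n > width or ry + n > height:
                continue
            cost = sad(current, x, y, reference, rx, ry, n)
            if best is None or cost < best[0]:
                best = (cost, (mx, my))
    return best[1], best[0]


# Toy 8x8 frames: a 2x2 patch sits at column 3, row 2 in the reference frame
# and at column 4, row 4 in the current frame.
reference = [[0] * 8 for _ in range(8)]
current = [[0] * 8 for _ in range(8)]
patch = [[10, 20], [30, 40]]
for dy in range(2):
    for dx in range(2):
        reference[2 + dy][3 + dx] = patch[dy][dx]
        current[4 + dy][4 + dx] = patch[dy][dx]

mv, cost = find_motion_vector(current, reference, x=4, y=4, n=2, search_range=2)
```

In this toy case the patch is matched exactly, so the prediction error block is all zeros and only the motion vector carries information.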
FIG. 1, numeral 100, is a block diagram schematic of a predictive waveform encoder as is known in the art. A sequence of vectors, each consisting of a group of samples r.sub.i taken from an original waveform, is processed to generate a sequence of quantized vectors Y.sub.i, where i=0, 1, . . . is a time index indicating the order in which the input vectors are processed. The dimension L of the input vectors is arbitrary. For typical speech applications, L=1, whereas for most video applications, L>1.
The encoder operates iteratively in the following way: (1) a predictor unit (102) generates a prediction of the input vector r.sub.i, represented by the vector x.sub.i, based on n past reconstructed vectors z.sub.i-n, . . . , z.sub.i-1, where n is the order of the predictor. For typical speech applications, n>1, whereas for most video applications, n=1; (2) the vector x.sub.i is subtracted from r.sub.i at a first adder (104) to obtain the prediction error vector e.sub.i =r.sub.i -x.sub.i, wherein a predictor P.sub.i is typically chosen to minimize the average energy of the prediction error vector e.sub.i ; (3) the prediction error vector e.sub.i is transformed by a transformation unit (106) according to E.sub.i =A.sub.i [e.sub.i ], where A.sub.i [] represents a linear transformation such as DCT; (4) the vector E.sub.i is quantized using a quantizer Q.sub.i (108) to obtain the quantized vector Y.sub.i, which is encoded into a binary word using a lossless coder such as a Huffman coder and then transmitted or stored; (5) the quantized vector Y.sub.i is then inverse transformed at the inverse transformation unit A.sub.i.sup.-1 (110) to obtain the vector y.sub.i =A.sub.i.sup.-1 [Y.sub.i ], where A.sub.i.sup.-1 is the inverse transformation of A.sub.i (i.e., A.sub.i.sup.-1 [A.sub.i [x]]=x); (6) the vector x.sub.i is added to y.sub.i by a second adder (112) to obtain the reconstructed vector z.sub.i =y.sub.i +x.sub.i ; (7) the reconstructed vector z.sub.i is stored in the memory unit M.sub.i (114) for use in later iterations. The capacity of the memory is chosen such that n vectors of dimension L may be stored.
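One iteration of the encoder loop, steps (1) through (7), might look like the sketch below. Several choices are assumptions made purely for illustration and are not taken from the text: L=2, a predictor of order n=1 that simply repeats the previous reconstructed vector, a two-point sum/difference transform standing in for the DCT, and a uniform quantizer with a fixed step. All function names are hypothetical, and the lossless coding of Y.sub.i is omitted.

```python
def transform(e):
    """A: two-point sum/difference transform (an assumed stand-in for the DCT)."""
    return [e[0] + e[1], e[0] - e[1]]


def inverse_transform(Y):
    """Inverse of `transform`, so inverse_transform(transform(v)) == v."""
    return [(Y[0] + Y[1]) / 2, (Y[0] - Y[1]) / 2]


def quantize(E, step):
    """Q: uniform quantizer; each component is mapped to a multiple of `step`."""
    return [step * round(v / step) for v in E]


def encode(inputs, step=2):
    """Run the predictive encoder loop over a list of 2-dimensional vectors.

    Returns the quantized vectors Y_i and the reconstructed vectors z_i.
    """
    z_prev = [0.0, 0.0]  # memory M: holds the n = 1 past reconstructed vector
    quantized, reconstructed = [], []
    for r in inputs:
        x = z_prev                               # (1) prediction x_i
        e = [r[k] - x[k] for k in range(2)]      # (2) prediction error e_i
        E = transform(e)                         # (3) E_i = A[e_i]
        Y = quantize(E, step)                    # (4) quantized vector Y_i
        y = inverse_transform(Y)                 # (5) y_i = A^-1[Y_i]
        z = [y[k] + x[k] for k in range(2)]      # (6) z_i = y_i + x_i
        z_prev = z                               # (7) store z_i in memory
        quantized.append(Y)
        reconstructed.append(z)
    return quantized, reconstructed


Y_list, z_list = encode([[10, 12], [11, 13], [13, 12]], step=2)
```

Note that the prediction is formed from the reconstructed vectors rather than from the original inputs, so the encoder tracks exactly the state that a decoder can reproduce.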
In most applications, the transformation A.sub.i is fixed a priori, i.e., is predetermined, whereas Q.sub.i and P.sub.i are varied using preselected adaptation algorithms. In some applications, the transformation A.sub.i is not used. In such a case, A.sub.i =I, where I is an L.times.L identity matrix. In forward adaptation, the parameters of Q.sub.i and P.sub.i are passed to the receiver as side information. On the other hand, in backward adaptation, the parameters are determined at the decoder from previously received information; hence no side information needs to be sent.
Given the information of Q.sub.i, P.sub.i and A.sub.i, a decoder can reconstruct the vector z.sub.i. FIG. 2, numeral 200, is a block diagram schematic of a decoder as is known in the art. It may be viewed as a subset of the corresponding encoder shown in FIG. 1 (100). The decoder (200) first recovers the quantized vector Y.sub.i from the received bitstream and then obtains z.sub.i iteratively in the following way: (1) the quantized vector Y.sub.i is first inverse transformed using the inverse transformation unit A.sub.i.sup.-1 (202) to obtain the vector y.sub.i =A.sub.i.sup.-1 [Y.sub.i ]; (2) a predictor (206) generates the prediction vector x.sub.i from the past n reconstructed vectors, z.sub.i-n, . . . , z.sub.i-1, using the same predictor P.sub.i as in the encoder; (3) the reconstructed vector z.sub.i is obtained by summing the two vectors y.sub.i and x.sub.i by the adder (204); (4) the reconstructed vector z.sub.i is stored in the memory unit M.sub.i for future iterations. As in the encoder, the memory capacity is chosen to hold n reconstructed vectors of dimension L.
If forward adaptation is used in the encoder, the side information is also decoded and used to assist inverse quantization and prediction.
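The decoder iteration, steps (1) through (4), can be sketched in the same spirit. As assumptions for illustration only: L=2, the predictor repeats the previous reconstructed vector (n=1), the inverse transformation is that of a two-point sum/difference transform, and the quantized vectors Y.sub.i are taken as already recovered from the bitstream. Function names are hypothetical.

```python
def inverse_transform(Y):
    """A^-1 for an assumed two-point sum/difference transform A[e] = [e0+e1, e0-e1]."""
    return [(Y[0] + Y[1]) / 2, (Y[0] - Y[1]) / 2]


def decode(quantized):
    """Reconstruct the vectors z_i from the received quantized vectors Y_i."""
    z_prev = [0.0, 0.0]  # memory: must start in the same state as the encoder's
    reconstructed = []
    for Y in quantized:
        y = inverse_transform(Y)                 # (1) y_i = A^-1[Y_i]
        x = z_prev                               # (2) prediction x_i from memory
        z = [y[k] + x[k] for k in range(2)]      # (3) z_i = y_i + x_i
        z_prev = z                               # (4) store z_i for the next step
        reconstructed.append(z)
    return reconstructed


decoded = decode([[22, -2], [2, 0], [0, 4]])
```

The loop contains no quantizer and no forward transform, which is why the decoder can be viewed as a subset of the encoder.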
The above decoder will operate perfectly if no error occurs in the channel between the encoder and decoder pair. However, no physical channel is perfect. If any of the information bits are damaged during transmission, the decoder will have a problem reconstructing the vectors {z.sub.i }. For example, if the vector Y.sub.i is corrupted, then the vector y.sub.i will also be damaged. Subsequently, z.sub.i is damaged, which will in turn lead to a wrong prediction vector x.sub.i+1 for the next iteration, and therefore a damaged z.sub.i+1. Because of the prediction loop structure, the error will propagate indefinitely. In order to avoid such devastation, in typical applications, some vectors are chosen to be coded directly, i.e., without using prediction. In this way, the above error propagation will stop when the decoder comes across such directly coded samples. However, such direct coding of samples will likely reduce the compression gain. Therefore, the frequency of use of such directly coded samples has to be low enough so that the sacrifice of compression gain is not significant.
In digital video compression applications, such direct coding is called intraframe coding, or simply intra coding, in contrast to interframe coding, which uses predictive coding as described above. The period between two intra coded frames varies for different applications. In H.261, the maximum of this period is 132 video frames.
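The propagation of a single corrupted value, and its termination at the next directly coded sample, can be illustrated with a small scalar sketch. The assumptions here are for illustration only: L=1, n=1, no transform, no quantization, and a fixed refresh period (the hypothetical parameter `intra_period`) that marks every fourth sample as directly coded.

```python
def encode(samples, intra_period):
    """Predictive encoder with periodic directly coded (intra) samples."""
    stream = []
    z_prev = 0
    for i, r in enumerate(samples):
        intra = (i % intra_period == 0)
        x = 0 if intra else z_prev   # no prediction for directly coded samples
        Y = r - x                    # lossless in this sketch: no quantizer
        z_prev = Y + x
        stream.append((intra, Y))
    return stream


def decode(stream):
    """Matching decoder: prediction is bypassed whenever the intra flag is set."""
    out = []
    z_prev = 0
    for intra, Y in stream:
        x = 0 if intra else z_prev
        z_prev = Y + x
        out.append(z_prev)
    return out


samples = [10, 11, 13, 12, 14, 15, 15, 16]
stream = encode(samples, intra_period=4)

# Simulate a channel error that damages the value at index 1.
corrupted = list(stream)
flag, value = corrupted[1]
corrupted[1] = (flag, value + 5)

decoded = decode(corrupted)
bad = [i for i, (a, b) in enumerate(zip(samples, decoded)) if a != b]
```

Here a single damaged value corrupts every sample up to, but not including, the next directly coded sample at index 4. Without any directly coded samples, all samples after the error would be wrong.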
Transmission errors can be roughly classified into two categories: random bit errors and erasure errors. A random bit error is caused by the imperfection of physical channels, which results in inversion, insertion or deletion of an information bit. Erasure errors, on the other hand, include such information loss as cell loss in packet switched networks (e.g., Asynchronous Transfer Mode, or ATM, networks) and burst errors in storage media due to physical defects. Therefore, a random bit error usually leads to information damage, whereas an erasure error leads to information loss at the receiver. Since Variable Length Coding (VLC, for example, Huffman coding) is usually used to exploit the statistical redundancy among symbols to be transmitted, a single bit error can render many of the following information bits undecodable, hence useless, until the next synchronization symbol. Therefore, a random bit error in VLC can also be thought of as one kind of erasure error.
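The effect of one inverted bit on a variable length code can be seen in a toy example. The four-symbol prefix code below and the function names are assumptions made up for this sketch; they are not taken from any standard. In this particular code the decoder happens to regain bit alignment quickly, but the number of decoded symbols changes, so every later symbol lands at the wrong position.

```python
CODE = {"a": "0", "b": "10", "c": "110", "d": "111"}  # hypothetical prefix code
INVERSE = {bits: symbol for symbol, bits in CODE.items()}


def vlc_encode(message):
    """Concatenate the codeword of each symbol."""
    return "".join(CODE[symbol] for symbol in message)


def vlc_decode(bitstring):
    """Greedy prefix decoding; trailing bits that form no codeword are dropped."""
    decoded = []
    buffer = ""
    for bit in bitstring:
        buffer += bit
        if buffer in INVERSE:
            decoded.append(INVERSE[buffer])
            buffer = ""
    return "".join(decoded)


def flip_bit(bitstring, index):
    """Invert the bit at `index`, modelling a single random bit error."""
    flipped = "1" if bitstring[index] == "0" else "0"
    return bitstring[:index] + flipped + bitstring[index + 1:]


message = "abcdab"
bits = vlc_encode(message)
received = vlc_decode(flip_bit(bits, 1))
mismatches = sum(1 for sent, got in zip(message, received) if sent != got)
```

A single inverted bit leaves only the first symbol in its correct position in this example, which is why such an error behaves much like an erasure up to the next synchronization symbol.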
The state of the art for information loss protection and recovery includes Automatic Repeat reQuest (ARQ), error control coding (ECC) and error concealment. In the ARQ method, the transmitter keeps a copy of the transmitted information in a memory of predetermined size. When the receiver detects information damage or loss, it sends a request to the transmitter for retransmission of the damaged or lost portion of the information. Although ARQ has been quite successful in data communication, it has generally been thought to be inappropriate for services demanding real-time and/or interactive signal delivery because of the delay involved. ECC combats transmission errors by adding redundancy to the bitstream to be transmitted in a controlled way, such that some of the bit errors can be detected and, in some cases, corrected. While it may be effective in protecting against random bit errors, use of ECC to protect against erasure errors is extremely difficult, if not impossible. For example, in ATM networks, to protect against the loss of a cell containing several hundred bits, data interleaving has to be performed and substantial redundancy has to be added. This will not only reduce the compression gain, but will also increase the hardware complexity and processing delay. Finally, error concealment is a technique which tries to conceal the effect of information loss by interpolating the lost samples from the neighboring correctly received samples. The reconstructed signal quality usually depends on the content of the original signal and the complexity of the applied algorithm.
Thus, there is a need for a device and method that provide efficient signal loss recovery for real-time and/or interactive communications.
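As one simple illustration of error concealment, lost samples in a one-dimensional signal can be filled in by linear interpolation between the nearest correctly received neighbors. This is only a sketch under stated assumptions: lost samples are marked with `None`, linear interpolation is just one of many possible concealment algorithms, and the function name is hypothetical.

```python
def conceal(samples):
    """Replace lost samples (None) using the nearest received neighbors.

    Runs with a received sample on both sides are linearly interpolated;
    runs at either end of the signal repeat the single available neighbor.
    """
    out = list(samples)
    n = len(out)
    i = 0
    while i < n:
        if out[i] is not None:
            i += 1
            continue
        start = i
        while i < n and out[i] is None:
            i += 1
        end = i  # index of the first received sample after the lost run
        left = out[start - 1] if start > 0 else None
        right = out[end] if end < n else None
        for k in range(start, end):
            if left is not None and right is not None:
                # Multiply before dividing to limit floating-point rounding.
                out[k] = left + (right - left) * (k - start + 1) / (end - start + 1)
            elif left is not None:
                out[k] = left
            elif right is not None:
                out[k] = right
            else:
                out[k] = 0  # nothing was received at all
    return out


concealed = conceal([10, None, None, 16, None])
```

How well such a method works depends on the signal: a smooth ramp is recovered almost exactly, whereas a lost sample containing an abrupt change cannot be recovered from its neighbors.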