A predictive waveform encoder is a device for compressing the amount of information in a waveform (e.g., speech, image or video) by removing the statistical redundancy among its neighboring samples using prediction methods. Several ITU-T Recommendations for speech coding have adopted predictive coding techniques; for example, differential pulse-code modulation (DPCM) is used in Recommendation G.721. (ITU-T stands for the Telecommunication Standardization Sector of the International Telecommunication Union, formerly known as CCITT, the International Telegraph and Telephone Consultative Committee.) In these predictive speech coders, an original speech sample is predicted based on past speech samples, and the prediction error (the difference between the original and the predicted samples), instead of the original sample, is quantized and then digitally encoded by a noiseless coder into a bit stream. Since the energy of the prediction error is, on average, much smaller than that of the original speech signal, a high compression ratio can generally be obtained.
Predictive coding methods have also been used for image and video compression. In these applications, the spatial correlation among neighboring pixels in an image and, in the case of video, the temporal correlation between successive images can be exploited.
Typical predictive coders perform the prediction based on a replica of the reconstructed waveform. This ensures that the quantization error does not accumulate during reconstruction. Although the prediction accuracy is reduced (for coarse quantization), overall compression performance is generally improved.
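The effect of predicting from the reconstructed waveform can be illustrated with a minimal scalar sketch. The uniform quantizer, the previous-sample predictor, the zero initial state and all function names are assumptions chosen for illustration, not details taken from any particular coder.

```python
def quantize(x, step):
    """Uniform scalar quantizer: round to the nearest multiple of step."""
    return step * round(x / step)


def dpcm_closed_loop(samples, step):
    """Predict each sample from the previous *reconstructed* sample."""
    recon = []
    prev = 0.0  # predictor state, identical at encoder and decoder
    for r in samples:
        e = r - prev            # prediction error
        y = quantize(e, step)   # quantized prediction error
        prev = prev + y         # reconstruction available to both ends
        recon.append(prev)
    return recon


def dpcm_open_loop(samples, step):
    """Predict each sample from the previous *original* sample."""
    recon = []
    prev_orig = 0.0   # the encoder predicts from original samples
    prev_recon = 0.0  # the decoder only has its own reconstructions
    for r in samples:
        y = quantize(r - prev_orig, step)
        prev_orig = r
        prev_recon = prev_recon + y  # quantization errors add up here
        recon.append(prev_recon)
    return recon
```

On a slowly rising ramp with a coarse step, the closed-loop reconstruction stays within half a quantizer step of the input, whereas the open-loop reconstruction drifts further away with every sample.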
State-of-the-art digital video coding systems utilize transform coding for spatial compression and a form of predictive coding known as motion-compensated prediction (MCP) for temporal compression. Video compression techniques that have recently been adopted in international standards (e.g., the MPEG standard developed by the Moving Picture Experts Group (MPEG) of the International Organization for Standardization (ISO), and ITU-T's H.261), as well as others that are under consideration for future standards, all employ a so-called block-matching MCP technique. In this method, each image in a video sequence is partitioned into N.times.N blocks, called macro blocks (MBs), where N is a predetermined integer. For each MB, a replica of the previously decoded image is searched to find an N.times.N window that best resembles that MB, and the pixels in that window are used as a prediction for that MB. The prediction error is then encoded using a combination of transform coding and scalar quantization followed by variable-length noiseless encoding.
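The block-matching search described above can be sketched as follows. The text only requires the window that "best resembles" the MB; the sum of absolute differences (SAD) used here as the similarity measure, the exhaustive search over a limited range, and the function names are assumptions made for illustration.

```python
def sad(cur, ref, bx, by, rx, ry, n):
    """Sum of absolute differences between the n-by-n block of cur at
    (bx, by) and the n-by-n window of ref at (rx, ry)."""
    total = 0
    for dy in range(n):
        for dx in range(n):
            total += abs(cur[by + dy][bx + dx] - ref[ry + dy][rx + dx])
    return total


def best_match(cur, ref, bx, by, n, search):
    """Exhaustive search within +/- search pixels of the block position.

    Returns ((mvx, mvy), cost) for the lowest-cost window, where
    (mvx, mvy) is the displacement of the window relative to the block.
    """
    height, width = len(ref), len(ref[0])
    best = None
    for mvy in range(-search, search + 1):
        for mvx in range(-search, search + 1):
            rx, ry = bx + mvx, by + mvy
            if rx < 0 or ry < 0 or rx + n > width or ry + n > height:
                continue  # window would fall outside the reference image
            cost = sad(cur, ref, bx, by, rx, ry, n)
            if best is None or cost < best[1]:
                best = ((mvx, mvy), cost)
    return best
```

The pixels of the selected window serve as the prediction for the MB, and only the displacement and the prediction error need to be coded.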
Transcoding will be required in many applications of compressed digital video. For example, in some instances, it may be desirable to change the rate of a digital video bit stream in the network. Alternatively, when constant bit-rate (CBR) video traffic is to be carried over a cell-relay or Asynchronous Transfer Mode (ATM) network, it may be desirable to convert the CBR stream into a variable bit-rate (VBR) stream to save bandwidth through statistical multiplexing. Transcoding may also be required for conversion between two video compression formats. For example, it may be necessary to convert an MPEG-encoded video bit stream into an H.261 bit stream, or vice versa. Another important application of transcoding is multipoint video conferencing; here, transcoding may be needed to implement video mixing for continuous presence multipoint bridging.
FIG. 1, numeral 100, is a block diagram schematic of a predictive waveform encoder as is known in the art. A sequence of vectors r.sub.i, each consisting of a group of samples taken from an original waveform, is processed to generate a sequence of quantized vectors Y.sub.i, where i=0, 1, . . . is a time index indicating the order in which the input vectors are processed. The dimensionality L of the input vectors is arbitrary. In typical speech applications L=1, whereas in many video compression applications, L>1.
The encoder operates iteratively such that: (1) a predictor unit (102) generates a prediction of the input vector r.sub.i represented by the vector p.sub.i based on one or more past reconstructed vectors z.sub.j, j<i, using a predetermined linear prediction operator P.sub.i ; (2) the vector p.sub.i is subtracted from r.sub.i at a first combiner (104) to obtain the prediction error vector e.sub.i =r.sub.i -p.sub.i, wherein the predictor P.sub.i is typically chosen to minimize the average energy of the prediction error e.sub.i ; (3) the prediction error vector e.sub.i is transformed by a transformation unit (106) according to E.sub.i =A.sub.i [e.sub.i ], where A.sub.i [ ] represents a linear transformation; (4) the vector E.sub.i is quantized using a quantizer Q.sub.i (108) to obtain the quantized vector Y.sub.i =E.sub.i +D.sub.i, where D.sub.i is a quantization error vector, and the quantized vector Y.sub.i is encoded into a binary word using a noiseless encoding method (e.g., a Huffman code), and then it is transmitted or stored; (5) the quantized vector Y.sub.i is then inverse transformed at Inverse Transformation Unit A.sub.i.sup.-1 (110) to find the vector y.sub.i =A.sub.i.sup.-1 [Y.sub.i ], where A.sub.i.sup.-1 [ ] is an inverse transformation (i.e., A.sub.i.sup.-1 [A.sub.i [x]]=x); and (6) the vector p.sub.i is added by a second combiner (112) to y.sub.i to obtain the reconstructed vector z.sub.i =y.sub.i +p.sub.i, which is stored for use in later iterations.
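Steps (1) through (6) can be traced in a minimal sketch for L=2. The 2-point sum/difference transform standing in for A.sub.i, the uniform quantizer standing in for Q.sub.i, the predictor that simply repeats the previous reconstructed vector, and the zero initial state are all illustrative assumptions; the noiseless coding of Y.sub.i is omitted.

```python
def transform(e):
    """A[e]: 2-point sum/difference transform (an illustrative linear A)."""
    return [e[0] + e[1], e[0] - e[1]]


def inverse_transform(E):
    """A^-1[E], so that inverse_transform(transform(x)) == x."""
    return [(E[0] + E[1]) / 2, (E[0] - E[1]) / 2]


def quantize(E, step):
    """Uniform quantizer applied to each transform coefficient."""
    return [step * round(v / step) for v in E]


def encode(vectors, step):
    """Run steps (1)-(6) over a sequence of 2-sample vectors.

    Returns the quantized vectors Y_i and the reconstructed vectors z_i.
    """
    z_prev = [0.0, 0.0]  # assumed state before the first vector
    Ys, zs = [], []
    for r in vectors:
        p = list(z_prev)                        # (1) prediction p_i
        e = [r[k] - p[k] for k in range(2)]     # (2) e_i = r_i - p_i
        E = transform(e)                        # (3) E_i = A[e_i]
        Y = quantize(E, step)                   # (4) Y_i = E_i + D_i
        y = inverse_transform(Y)                # (5) y_i = A^-1[Y_i]
        z = [y[k] + p[k] for k in range(2)]     # (6) z_i = y_i + p_i
        Ys.append(Y)
        zs.append(z)
        z_prev = z                              # stored for later iterations
    return Ys, zs
```

With this transform and a quantizer step of 2.0, each component of z.sub.i stays within 1.0 of the corresponding component of r.sub.i, because each component of D.sub.i is at most half a step in magnitude.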
In most applications, the transformation A.sub.i is fixed a priori, i.e., is predetermined, whereas Q.sub.i and P.sub.i are varied using preselected adaptation algorithms. In some applications, the transformation A.sub.i is not used; then A.sub.i =I, where I is an L.times.L identity matrix. In so-called forward adaptation, the parameters of Q.sub.i, P.sub.i and A.sub.i are passed to the decoder as side information, while in so-called backward adaptation, Q.sub.i, P.sub.i and A.sub.i are determined at the decoder from previously received information, so no side information needs to be sent.
Given the information on Q.sub.i, P.sub.i and A.sub.i, a decoder can reconstruct the vector z.sub.i. The decoder (200) first recovers the quantized vectors {Y.sub.i } from the received bit stream by decoding the noiseless source code and then obtains z.sub.i. As shown in FIG. 2, numeral 200, (1) the quantized vector Y.sub.i is first inverse transformed using the inverse transformation unit A.sub.i.sup.-1 (202) to obtain y.sub.i =A.sub.i.sup.-1 [Y.sub.i ]; (2) a predictor (206) obtains the prediction p.sub.i of the input vector r.sub.i from one or more past reconstructed vectors z.sub.j, j<i, using the prediction operator P.sub.i, as in the encoder; and (3) a combiner (204), operably coupled to the predictor (206) and to the inverse transformation unit A.sub.i.sup.-1 (202), adds the vector p.sub.i to y.sub.i to obtain the reconstructed vector z.sub.i.
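The three decoder steps can be sketched in the same spirit. The function below is hypothetical: it takes the inverse transformation and the predictor as arguments, assumes a zero prediction before any vector has been reconstructed, and starts from already recovered quantized vectors, leaving out the noiseless decoding.

```python
def decode(Ys, inverse_transform, predict, dim):
    """Reconstruct the vectors z_i from the quantized vectors Y_i.

    inverse_transform(Y) plays the role of A^-1; predict(history) plays
    the role of P and sees only past reconstructed vectors z_j, j < i.
    """
    history = []
    for Y in Ys:
        y = inverse_transform(Y)                       # (1) y_i = A^-1[Y_i]
        if history:
            p = predict(history)                       # (2) p_i from past z_j
        else:
            p = [0.0] * dim                            # assumed initial state
        z = [y[k] + p[k] for k in range(dim)]          # (3) z_i = y_i + p_i
        history.append(z)
    return history


# Usage: a 2-point sum/difference inverse and a "repeat the last vector" predictor.
inv = lambda Y: [(Y[0] + Y[1]) / 2, (Y[0] - Y[1]) / 2]
last = lambda history: history[-1]
print(decode([[22, -2], [2, 0], [0, 8]], inv, last, 2))
```

Because the predictor sees only reconstructed vectors, a decoder of this form stays in step with an encoder that uses the same operators and the same initial state.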
The reconstructed vector z.sub.i can be represented as z.sub.i =r.sub.i +d.sub.i, where d.sub.i =A.sub.i.sup.-1 [D.sub.i ] is an inverse-transformed version of the quantization error vector D.sub.i. In other words, z.sub.i differs from the original vector r.sub.i only by d.sub.i =A.sub.i.sup.-1 [D.sub.i ]. To obtain good performance, the transformation A.sub.i is chosen such that the error A.sub.i.sup.-1 [D.sub.i ], or an appropriately weighted version of it, is kept small.
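The relation z.sub.i =r.sub.i +d.sub.i follows directly from the encoder equations and the linearity of A.sub.i (and hence of its inverse). Written out step by step, in ordinary subscript notation:

```latex
\begin{aligned}
z_i &= y_i + p_i \\
    &= A_i^{-1}[Y_i] + p_i \\
    &= A_i^{-1}[E_i + D_i] + p_i \\
    &= A_i^{-1}[A_i[e_i]] + A_i^{-1}[D_i] + p_i \\
    &= e_i + d_i + p_i \\
    &= (r_i - p_i) + d_i + p_i = r_i + d_i .
\end{aligned}
```

The prediction p.sub.i cancels, so the reconstruction error at time i depends only on the quantization error at time i and not on earlier quantization errors.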
A transcoder first recovers the sequence of quantized vectors {Y.sub.i } from the received bit stream by decoding the noiseless source code, converts {Y.sub.i } into a sequence of transcoded vectors {Y.sub.i '}, and then generates a new bit stream representing {Y.sub.i '} using the noiseless source code. The transcoder has full knowledge of the operators Q.sub.i, A.sub.i and P.sub.i used at the original encoder and decoder, either a priori or through received side information.
In prior art "decode and re-encode" transcoding, a quantized vector Y.sub.i is first decoded using the decoder of FIG. 2 to obtain the reconstructed vector z.sub.i =r.sub.i +d.sub.i and then z.sub.i is re-encoded using an encoder, possibly with a different quantizer Q.sub.i ', a different predictor P.sub.i ' or even a different transformation A.sub.i ', to obtain the transcoded vector Y.sub.i '. The transcoded vector can be decoded by the decoder of FIG. 2 using Q.sub.i ', P.sub.i ' and A.sub.i '. The final reconstructed vector z.sub.i ' can then be represented as z.sub.i '=r.sub.i +d.sub.i +d.sub.i ', where d.sub.i '=(A.sub.i ').sup.-1 [D.sub.i '] is an inverse-transformed version of the quantization error vector D.sub.i ' introduced by the transcoder.
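The decode and re-encode approach can be sketched for the simplest case, L=1 with no transformation (A.sub.i =I). The uniform quantizers, the previous-sample predictor, the zero initial state and the function names are assumptions for illustration only.

```python
def quantize(x, step):
    """Uniform scalar quantizer: round to the nearest multiple of step."""
    return step * round(x / step)


def encode(samples, step):
    """Closed-loop scalar predictive encoder (L = 1, A = I)."""
    Ys, z_prev = [], 0.0
    for r in samples:
        Y = quantize(r - z_prev, step)  # quantized prediction error
        z_prev = z_prev + Y             # reconstruction used as next prediction
        Ys.append(Y)
    return Ys


def decode(Ys):
    """Matching decoder: add each quantized error to the running prediction."""
    zs, z_prev = [], 0.0
    for Y in Ys:
        z_prev = z_prev + Y
        zs.append(z_prev)
    return zs


def transcode(Ys, new_step):
    """Fully decode to z_i, then re-encode z_i with a different quantizer Q'."""
    return encode(decode(Ys), new_step)
```

Decoding the output of `transcode` yields samples that differ from the originals by the sum of the two quantization errors, one from each encoder, which matches the expression for z.sub.i ' above. Even in this reduced form, the transcoder runs a complete decoder followed by a complete encoder, each with its own prediction loop.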
Although conceptually straightforward, the implementation of the decode and re-encode method can be quite costly because of its high computational and storage requirements. Thus, there is a need for an efficient transcoding device and method that can be implemented with low complexity.