Transmission of large amounts of temporally coherent data, such as images or video, across communication links generally requires encoding the source data. The encoding typically compresses the source data using a transform, such as a wavelet transform or lapped DCT (discrete cosine transform), that produces coefficients representing the source data. In addition, the encoding sub-samples the source data to create a number of streams, also referred to as channels. Each stream contains a set of descriptions, or packets, and represents the whole of the original source data but at a reduced fidelity. The source data may be compressed before the descriptions are generated, or the descriptions may be compressed after they are generated. One or more of the description streams are transmitted to a corresponding decoder through the link. The process of generating the descriptions is sometimes referred to as description generation or “packetization.” The packets/descriptions described herein should not be confused with packets prepared according to a particular network transmission protocol, such as TCP/IP.
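By way of illustration only, the sub-sampling of source data into description streams can be sketched as follows. The sketch assumes a one-dimensional list of samples and a simple interleaved (polyphase) sub-sampling pattern; the function names `generate_descriptions` and `merge_descriptions` are hypothetical and are not taken from any particular encoder. Compression of the descriptions is omitted.

```python
def generate_descriptions(samples, num_descriptions):
    """Sub-sample `samples` into interleaved streams (assumed polyphase pattern).

    Description k holds samples k, k + N, k + 2N, ... so every description
    spans the whole source, at 1/N of the original sampling density.
    """
    return [samples[k::num_descriptions] for k in range(num_descriptions)]


def merge_descriptions(descriptions, length):
    """Re-interleave received descriptions; a lost description is given as None.

    Positions belonging to a lost description are left as None so that a
    later recovery step can fill them in.
    """
    n = len(descriptions)
    merged = [None] * length
    for k, description in enumerate(descriptions):
        if description is None:
            continue
        for j, value in enumerate(description):
            merged[k + j * n] = value
    return merged


source = list(range(8))
streams = generate_descriptions(source, 4)
streams[1] = None  # simulate loss of one of the four descriptions
received = merge_descriptions(streams, len(source))
```

In this sketch, losing one description out of four leaves every fourth sample missing, while the remaining descriptions still cover the full extent of the source.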
Because the communication links may be unreliable, some error recovery technique is typically employed to handle description loss or corruption during transmission, thus providing robustness to the transmission. Common recovery techniques include re-transmission protocols, error correction or channel coding, and interpolation recovery. Re-transmissions introduce delay and so are not favored for real-time applications. For large burst errors, error correction/channel coding does not provide sufficient protection at low bit cost. Interpolation recovery techniques recover missing data from available surrounding data but are of limited use when the surrounding data is also erroneous.
A multi-resolution/layered transmission method sends the descriptions that contain important information (i.e., low pass or anchor data) with a higher priority than those containing less important information. However, because descriptions/packets may be lost at random, and the network does not look inside the packets to discriminate important from less important packets, this approach provides limited robustness.
In a more robust encoding method, referred to as multiple description (MD) encoding, the multiple descriptions have equal importance. Each description is encoded independently and carries some new information about the source data. The descriptions should, in principle, complement each other, such that any number of received descriptions/packets can be used to provide some useful reconstruction of the source. In addition, the MD approach supports a wider range of applications, such as networks that do not have priority support.
Traditionally, the description generation and compression processes have been considered as separate operations. The order of the operations and the specifics of each result in a trade-off between compression and robustness of the encoded data. FIGS. 1A and 1B illustrate two extreme points in a compression-robustness characterization space.
System A 100 of FIG. 1A first packetizes 103 the source data 101 into descriptions 105 and then transforms 107 the descriptions 105 into compressed descriptions 109. Thus, System A operates within description boundaries. Because System A 100 involves description generation (sub-sampling) in the time domain, correlation within a description is poor, i.e., neighboring pixels within a description correspond to pixels farther apart in the original domain, leading to poor error-free compression (error-free signal-to-noise ratio of 31.34 dB). However, the error pattern is individual pixel loss (see pattern 109, which reflects 25% description loss), and so is amenable to time-domain interpolation recovery methods, such as classified LS (least-squares) filters. Thus, System A 100 provides poor error-free compression but has very little error propagation from packet/description loss.
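As a rough illustration of why isolated sample loss is amenable to time-domain interpolation, the following sketch fills each missing sample with the mean of its nearest available neighbors. Plain neighbor averaging is used here only as a simple stand-in and is not the classified LS filter mentioned above; the function name `interpolate_missing` and the use of None to mark lost samples are assumptions made for the example.

```python
def interpolate_missing(samples):
    """Fill None entries with the mean of the nearest available neighbors.

    A deliberately simple stand-in for a time-domain recovery filter:
    it looks left and right for the closest received sample and averages
    whatever it finds.
    """
    recovered = list(samples)
    for i, value in enumerate(samples):
        if value is not None:
            continue
        left = next((samples[j] for j in range(i - 1, -1, -1)
                     if samples[j] is not None), None)
        right = next((samples[j] for j in range(i + 1, len(samples))
                      if samples[j] is not None), None)
        known = [v for v in (left, right) if v is not None]
        # If nothing at all was received, fall back to zero (an assumption).
        recovered[i] = sum(known) / len(known) if known else 0.0
    return recovered


# One of four interleaved descriptions lost (25% description loss):
received = [0, None, 2, 3, 4, None, 6, 7]
recovered = interpolate_missing(received)
```

Because each lost sample is surrounded by correctly received samples, the damage stays confined to the lost positions and the recovery uses only local data.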
System B 120 of FIG. 1B first transforms 123 the source data 101 into a compressed form 125 and then packetizes 127 the compressed data 125 into compressed descriptions 129. Thus, System B operates across description boundaries. Because System B transforms/filters the source data as a whole, and generates descriptions in the transform domain, it provides high error-free compression (SNR of 36.13 dB), i.e., transform/filtering is effective because of high pixel correlation. However, the error pattern is very difficult to handle (see pattern 131). There are strong localized error holes (from loss of essential anchor transform data), and the error spreads across the descriptions because of the support of the transform filters. Error recovery from the loss of a description relies strongly on channel coding. If the transform is a lapped block transform, some recovery attempt (in the transform domain) related to the overlap of the transform may be possible. Thus, System B provides good error-free compression, but has very strong error propagation from packet/description loss.
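The error propagation of a transform-then-packetize arrangement can be illustrated with a small sketch. It assumes a multi-level Haar decomposition as a simple stand-in for the wavelet or lapped transform, omits quantization and entropy coding, and treats a lost description as zeroed coefficients; the names `haar_forward`, `haar_inverse`, and `damaged_samples` are hypothetical.

```python
def haar_forward(x):
    """Multi-level Haar analysis (length must be a power of two).

    Returns [overall average, detail coefficients from coarse to fine].
    """
    details = []
    current = list(x)
    while len(current) > 1:
        averages = [(current[i] + current[i + 1]) / 2
                    for i in range(0, len(current), 2)]
        differences = [(current[i] - current[i + 1]) / 2
                       for i in range(0, len(current), 2)]
        details = differences + details  # coarser levels go in front
        current = averages
    return current + details


def haar_inverse(coeffs):
    """Invert haar_forward."""
    current = coeffs[:1]
    pos = 1
    while pos < len(coeffs):
        level = coeffs[pos:pos + len(current)]
        pos += len(current)
        expanded = []
        for average, difference in zip(current, level):
            expanded += [average + difference, average - difference]
        current = expanded
    return current


def damaged_samples(x, lost, num_descriptions):
    """Count samples altered when one transform-domain description is lost.

    Coefficients are interleaved into descriptions; the lost description's
    coefficients are replaced by zero before the inverse transform.
    """
    coeffs = haar_forward(x)
    for i in range(lost, len(coeffs), num_descriptions):
        coeffs[i] = 0.0
    decoded = haar_inverse(coeffs)
    return sum(1 for a, b in zip(x, decoded) if a != b)


source = [1, 2, 3, 4, 5, 6, 7, 8]
spread = damaged_samples(source, lost=0, num_descriptions=4)
```

In this sketch the lost description carries the overall average (the anchor data), so the loss of two coefficients out of eight alters every reconstructed sample, in contrast to the isolated sample loss that results when descriptions are generated before the transform.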