The present invention relates generally to multiple description (MD) coding of data, speech, audio, images, video and other types of signals, and more particularly to MD coding which utilizes lattice vector quantization.
Multiple description (MD) coding is a source coding technique in which multiple bit streams are used to describe a given source signal. Each of these bit streams represents a different description of the signal, and the bit streams can be decoded separately or in any combination. Each bit stream may be viewed as corresponding to a different transmission channel subject to different loss probabilities. The goal of MD coding is generally to provide a signal reconstruction quality that improves as the number of received descriptions increases, without introducing excessive redundancy between the descriptions.
By way of example, two-description MD coding is characterized by two descriptions having rates R1 and R2 and corresponding single-description reconstruction distortions D1 and D2, respectively. The single-description distortions D1 and D2 are also referred to as side distortions. The distortion resulting from reconstruction of the original signal from both of the descriptions is designated D0 and referred to as the central distortion. Similarly, the corresponding single-description and two-description decoders are called side and central decoders, respectively. A balanced two-description MD coding technique refers to a technique in which the rates R1 and R2 are equal and the expected values of the side distortions D1 and D2 are equal.
A well-known MD coding approach, MD scalar quantization (MDSQ), is described in V. A. Vaishampayan, “Design of multiple description scalar quantizers,” IEEE Transactions on Information Theory, Vol. 39, No. 3, pp. 821-834, May 1993. In an example of two-description MDSQ, a real number x ∈ ℝ is quantized using two different scalar quantizers, and each quantizer output is transmitted on a corresponding one of two different channels. If either channel is received by itself, the original number x is known to lie within a given quantization cell of that channel. If both channels are received, the original value is known to lie within the intersection of its quantization cell in one channel and its quantization cell in the other. In this manner, an MDSQ system provides coarse information to side decoders and finer information to a central decoder.
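The cell-intersection idea can be illustrated with a minimal sketch. It assumes two uniform scalar quantizers of equal step size, offset from each other by half a step; this staggered arrangement and the names `encode`, `cell_1`, `cell_2` and `central_cell` are illustrative choices, not the design of the cited reference.

```python
import math

STEP = 1.0  # cell width of each side quantizer (illustrative assumption)

def encode(x):
    """Return the index pair (i, j) sent on channel 1 and channel 2."""
    i = math.floor(x / STEP)        # quantizer 1: cells [i, i + 1) * STEP
    j = math.floor(x / STEP + 0.5)  # quantizer 2: same cells shifted by STEP / 2
    return i, j

def cell_1(i):
    """Interval known to a side decoder that receives only channel 1."""
    return (i * STEP, (i + 1) * STEP)

def cell_2(j):
    """Interval known to a side decoder that receives only channel 2."""
    return ((j - 0.5) * STEP, (j + 0.5) * STEP)

def central_cell(i, j):
    """Intersection of the two side cells, known to the central decoder."""
    lo1, hi1 = cell_1(i)
    lo2, hi2 = cell_2(j)
    return (max(lo1, lo2), min(hi1, hi2))

def midpoint(cell):
    """Reconstruct at the center of the known interval."""
    return 0.5 * (cell[0] + cell[1])

i, j = encode(2.3)
side_1 = midpoint(cell_1(i))            # coarse estimate from channel 1 alone
side_2 = midpoint(cell_2(j))            # coarse estimate from channel 2 alone
central = midpoint(central_cell(i, j))  # finer estimate from both channels
```

In this sketch each side decoder localizes x to an interval of width STEP, while the central decoder localizes it to an interval of width STEP / 2.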
An MDSQ system may alternatively be viewed as a partition of the real line along with an injective mapping between partition cells and ordered pairs of indices, i.e., discrete sets of indices I1 and I2 and a map l: ℝ → I1 × I2. A partition cell is then given by the set {x ∈ ℝ | l(x) = (i, j)} for a given i ∈ I1, j ∈ I2. The individual scalar quantizers are given by the induced projected mappings l1 = π1(l): ℝ → I1 and l2 = π2(l): ℝ → I2.
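The partition-and-index-assignment view can be sketched as follows. The partition into cells of width 0.5 and the particular assignment in `index_pair` are assumptions chosen for illustration; any injective assignment of cells to index pairs would fit the description above.

```python
import math

def cell_number(x):
    """Number the partition cells [n / 2, (n + 1) / 2) of the real line."""
    return math.floor(2 * x)

def index_pair(n):
    """Illustrative injective assignment of cell n to an ordered pair (i, j)."""
    return (n // 2, (n + 1) // 2)

def l(x):
    """The map l from a real number to an ordered pair of indices."""
    return index_pair(cell_number(x))

def l1(x):
    """First projection of l: the scalar quantizer for channel 1."""
    return l(x)[0]

def l2(x):
    """Second projection of l: the scalar quantizer for channel 2."""
    return l(x)[1]
```

Because `index_pair` never assigns the same pair to two different cells, receiving both indices identifies the partition cell uniquely, whereas either index alone identifies only a union of cells.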
However, just as it is possible to construct single description vector quantizers that improve upon the performance of scalar quantizers, it is also possible to construct multiple description vector quantizers that outperform their scalar counterparts. In vector quantization, a given data value to be transmitted is represented as a point in a space of two or more dimensions.
Like the above-described MDSQ approach, multiple description vector quantization (MDVQ) may be viewed as discrete sets of indices I1 and I2 along with a map l: ℝ^N → I1 × I2 (which induces the projected mappings l1 = π1(l): ℝ^N → I1 and l2 = π2(l): ℝ^N → I2). The partition cells are given by {x ∈ ℝ^N | l(x) = (i, j)} for a given i ∈ I1, j ∈ I2. These cells are typically designed to be so-called Voronoi cells of some collection of points. A Voronoi cell is more generally referred to herein as a unit cell.
Although superior in performance to its scalar counterpart, general vector quantization is computationally expensive. However, significant reductions in computational complexity can be attained by organizing the data points into two or more lattices that intersect or are related as lattice and sublattice. More particularly, restricting MDVQ codebooks to lattices simplifies the necessary calculations for encoding and decoding. The problem then becomes that of choosing a lattice and designing a way of assigning the indices. The resulting coding techniques are referred to as multiple description lattice vector quantization (MDLVQ) techniques. An example of a coding technique of this type is described in S. D. Servetto, V. A. Vaishampayan, and N. J. A. Sloane, “Multiple description lattice vector quantization,” Proc. IEEE Data Compression Conf., pp. 13-22, Snowbird, Utah, April 1999, which is incorporated by reference herein. This algorithm is also referred to herein as the SVS algorithm.
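The computational benefit of a lattice codebook can be seen in a minimal sketch. It assumes the integer lattice Z^N and a scaled sublattice of it, chosen only because their nearest-point rule is simple coordinate rounding; these are not the lattices of the SVS algorithm, and the function names are hypothetical.

```python
def nearest_lattice_point(x, scale=1):
    """Nearest point of the lattice scale * Z^N, found by rounding each coordinate.

    Ties on a cell boundary are broken by Python's round-half-to-even rule.
    """
    return tuple(scale * round(c / scale) for c in x)

def brute_force_nearest(x, codebook):
    """Nearest codeword in an unstructured codebook: one distance per codeword."""
    return min(codebook, key=lambda p: sum((a - b) ** 2 for a, b in zip(x, p)))

x = (4.4, -2.0)
fine = nearest_lattice_point(x)             # point of the lattice Z^2
coarse = nearest_lattice_point(x, scale=3)  # point of the sublattice 3Z^2

# The same answer by exhaustive search over a finite window of Z^2.
window = [(a, b) for a in range(-6, 7) for b in range(-6, 7)]
searched = brute_force_nearest(x, window)
```

The rounding rule costs a fixed amount of work per dimension regardless of codebook size, whereas the exhaustive search grows with the number of codewords.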
Although the SVS algorithm facilitates the implementation of MDLVQ encoding, thereby allowing performance improvements relative to MDSQ encoding, this approach has a number of significant drawbacks. For example, the SVS algorithm is inherently optimized for the central decoder, i.e., for a zero probability of a lost description. In other words, an SVS encoder is designed to minimize the central distortion D0. Since MD techniques are generally only useful when both descriptions are not always received, this type of minimization is inappropriate and does not lead to optimal performance. In addition, the SVS algorithm and other known MDLVQ approaches are unduly inflexible as to the structure of the lattices. Another drawback is that there is no known technique for extending the known MDLVQ approaches to applications involving more than two descriptions.
The present invention provides improved coding techniques referred to herein as lattice-structured multiple description vector quantization (LSMDVQ) techniques.
In accordance with a first aspect of the invention, one or more lattices are configured in a manner that tends to optimize the distortion-rate performance of the system, i.e., the expected performance for a given distortion rate. An LSMDVQ encoder generates M descriptions of a signal to be encoded, each of the descriptions being transmittable over a corresponding one of M channels. The encoder in an illustrative embodiment utilizes one or more lattices configured to minimize a distortion measure which is a function of a central distortion and at least one side distortion. For example, if M=2, the distortion measure may be an average mean-squared error (AMSE) function of the form ƒ(D0, D1, D2), where D0 is a central distortion resulting from reconstruction based on receipt of both a first and a second description, and D1 and D2 are side distortions resulting from reconstruction using only a first description and a second description, respectively. In the illustrative embodiment, the above-noted distortion measure is used as the basis for a distance metric used to characterize the distance between lattice points, and a unit cell of the lattice is defined in terms of the distance metric.
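One possible form of such a distortion measure can be sketched as follows. The sketch assumes that each of the two channels is lost independently with the same probability p and weights each distortion by the probability of the corresponding reception event; this specific weighting, the function name `amse`, and the numerical values are illustrative assumptions and are not taken from the text above.

```python
def amse(d0, d1, d2, p, d_none=0.0):
    """Expected distortion for two channels, each lost independently with probability p.

    d0 is the central distortion, d1 and d2 are the side distortions, and
    d_none is the distortion when nothing is received. d_none is the same
    for every design, so it does not affect comparisons between designs.
    """
    both = (1 - p) ** 2   # probability that both descriptions arrive
    one = p * (1 - p)     # probability that exactly one given description arrives
    none = p ** 2         # probability that neither description arrives
    return both * d0 + one * (d1 + d2) + none * d_none

# Two hypothetical designs, given as (D0, D1, D2); the values are invented.
design_a = (0.01, 0.5, 0.5)  # tuned for the central decoder only
design_b = (0.02, 0.2, 0.2)  # gives up some central quality for better side decoders

a_lossless = amse(*design_a, p=0.0)
b_lossless = amse(*design_b, p=0.0)
a_lossy = amse(*design_a, p=0.1)
b_lossy = amse(*design_b, p=0.1)
```

Under these assumed numbers, design A is preferable only when no description is ever lost, while design B gives the lower expected distortion at a 10% loss probability, which is the kind of trade-off a measure combining D0, D1 and D2 is meant to capture.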
In accordance with another aspect of the invention, a lattice is perturbed in order to provide further performance improvements. For example, the encoder may utilize a lattice in which the locations of the lattice points other than the points in at least one designated sublattice have been perturbed relative to a regular lattice structure based at least in part on a grouping of points into equivalence classes, with the position of a subset of the points in a given class being adjusted as part of the lattice perturbation.
Although illustrated herein using lattices, the present invention can be more generally applied to ordered sets of codebooks, e.g., an ordered set of codebooks of increasing size in which only the coarsest of the codebooks corresponds to a lattice.
In accordance with a further aspect of the invention, an extension of LSMDVQ to more than two descriptions is provided. The encoder utilizes an ordered set of M codebooks Λ1, Λ2, . . . , ΛM of increasing size, with the coarsest codebook corresponding to a lattice. In such cases, for each number k of descriptions received, there is a single decoding function that maps the received vector to a corresponding one of the codebooks Λk, such that reconstruction of the signal requires no more than M such decoding functions.
The LSMDVQ techniques of the invention are suitable for use in conjunction with signal transmission over many different types of channels, including lossy packet networks such as the Internet as well as broadband ATM networks, and may be used with data, speech, audio, images, video and other types of signals.