This invention generally relates to a system and method for lossy data compression. More particularly, this invention relates to a system that, given a source sequence drawn from a source alphabet, finds a distorted sequence drawn from a reproduction alphabet. The distorted sequence has the same length as the source sequence and is highly compressible, and does not differ from the source sequence more than a certain specified distortion level.
Lossy data compression algorithms are of significant academic and commercial interest owing to broadband applications such as streaming multimedia, images, audio, cellular voice, and text, and may also be useful in discrete filtering of noisy finite alphabet sequences. At issue is the transmission of a binary sequence from a source to a destination. If the sequence has redundancy, a compressed version of the sequence can be transmitted. The redundant portions of the binary sequence can be recovered at the destination with no loss in data; this approach is referred to as xe2x80x9clossless compressionxe2x80x9d.
When processing lossless compression, the amount of compression achieved is limited by the redundancy in the sequence. For large sequences such as those derived from images, the compressed version is often too large to store or transmit. To increase the compression ratio, errors may be introduced intentionally to increase the redundancy in the sequence. The destination must be willing to tolerate the distorted sequence.
Given a source sequence drawn from a source alphabet, the problem is to find a distorted sequence, of the same length as the source sequence, that is drawn from a reproduction alphabet, such that the latter sequence is highly compressible. The resulting distorted sequence is subject to the constraint that it does not differ from the former sequence by more than a certain specified distortion level, D.
The problem of lossy data compression at a fixed distortion level is summarized as follows. Given a finite string X1n=X1,X2, . . . ,Xn of length n drawn from a finite source alphabet B, a distorted or lossy version of X1n is desired, for example Y1n=Y1,Y2, . . . ,Yn, that is drawn from a finite reproduction alphabet {circumflex over (B)}. The average single-letter distortion between the two strings is at most D (according to some bounded, non-negative distortion measure d) and the lossy sequence Y1n is highly compressible.
In addition, the transmission rate must be minimized, where the transmission rate is the rate at which the lossy sequence Y1n must be transmitted subject to the distortion constraint. The central result of rate-distortion theory is that for source sequences generated by a stationary, ergodic stochastic process, asymptotically (as nxe2x86x92∞), the rate-distortion function R(D) is an achievable lower bound on the compression rate. The main problem in lossy coding is finding practical codes that asymptotically achieve the rate-distortion bound.
A code is asymptotically optimal if it achieves the rate-distortion bound as nxe2x86x92∞. A code is universal if it is asymptotically optimal for a class of sources without any a priori knowledge of which specific source in the class generated the given source sequence that is to be compressed. Roughly, a code is sequential or online if its encoding delay is O(n), where n is the length of the source string compressed so far.
Sequential codes have many practical applications, especially in streaming multi-media. A sequential code is adaptive if no codebook (or other information) must be transmitted separately by the encoder to the decoder. In other terms, an adaptive code builds its codebook xe2x80x9con the flyxe2x80x9d in response to an observed source sequence. Moreover, both the encoder and decoder can keep updating their codebooks by the same rule. Also of interest in solving the lossy data compression problem are polynomial-time algorithms with computational complexity of the form O(nk) for some k.
The goal of lossy source coding is to find a universal (for stationary, ergodic sources), sequential, adaptive, and polynomial-time algorithm. The quest for such an algorithm is important in theory as well in practice. When no distortion is desired, that is, D=0, the lossy coding problem simplifies to the problem of lossless data compression. In this case, the rate-distortion function coincides with the entropy rate of the source.
Various well-known algorithms for lossless coding are dynamic Huffman coding, arithmetic coding, Lempel-Ziv algorithms, and locally adaptive schemes. In particular, these algorithms present universal (for stationary, ergodic sources), sequential, adaptive, and polynomial-time algorithms for lossless data compression. These algorithms and their variants have had a significant practical and theoretical influence on the field of lossless data compression.
In contrast to lossless data compression, when the maximum average single-letter distortion between the two strings is such that specified distortion level, D is greater than zero (D greater than 0), no algorithm attaining all the desiderata has been developed. Low computational complexity has been achieved only at the expense of yielding a non-optimal distortion. In addition, all universal lossy coding schemes found thus far lack the relative simplicity that imbues Lempel-Ziv coders and arithmetic coders with economic viability.
Exemplary lossy coding schemes will now be reviewed. Move-to-front algorithms have been extended to lossy source coding but give no performance guarantees. An on-line lossy coding algorithm has been proposed for the fixed-rate case that is universal for memoryless sources. Similar to locally adaptive schemes for lossless compression, this approach uses a xe2x80x9cgold-washingxe2x80x9d mechanism for promoting frequently used code words to permanent status (during a time interval of specified length) while randomly generating new candidate code words. Variations of this method have been shown to be universal for certain phi-mixing sources.
Other lossy coding schemes focus on lossy extensions of the Lempel-Ziv algorithm. In these schemes, the central idea is to use approximate string matching instead of exact string matching used in the Lempel-Ziv algorithms. The Lempel-Ziv algorithm has been extended, but without performance guarantees. In addition, the fixed-database version of the Lempel-Ziv algorithm has been considered but only sub-optimal performance guarantees could be obtained.
The mismatch between the distribution generating and fixed database (the so-called training sequence) and the optimal reproduction distribution causes the fixed-database version of the Lempel-Ziv algorithms to be sub-optimal. In response, a Lempel-Ziv-type scheme has been devised such that instead of a single fixed database, multiple databases (each drawn according to a different reproduction distribution) are used at the encoder and must also be known to the decoder. The major limitation of this algorithm is that when the reproduction alphabet is large, the number of training databases is unreasonably large.
A natural type selection scheme has been proposed using string matching for finding the type of the optimal reproduction distribution. This procedure can be thought of as a stochastic simulation of the Arimoto-Blahut algorithm for computing the rate-distortion function. A Lempel-Ziv-type algorithm for lossy compression has also been proposed. From the multiple code words that may match a source word, this algorithm chooses one xe2x80x9cat random.xe2x80x9d
Along a different line, Lempel-Ziv-type codes that are universal (for stationary, ergodic sources and for individual sequences), but not sequential, adaptive, or polynomial-time have also been proposed. A fixed-slope universal lossy coding scheme has also been devised that searches for the reproduction sequence through a trellis in a fashion reminiscent of the Viterbi algorithm. The trellis structure allows computationally efficient heuristic codes that are also sequential in nature. However, finding the optimal reproduction sequence still takes exponential-time, rendering this approach computationally intractable.
Therefore, what is needed is a system and associated method that finds a highly compressible distorted sequence with an average per-letter distortion that does not exceed the specified distortion level D. The system and method should be computationally tractable with a high level of performance guarantee. The need for such system and method has heretofore remained unsatisfied.
The present invention satisfies this need, and presents a system and associated method (collectively referred to herein as xe2x80x9cthe systemxe2x80x9d or xe2x80x9cthe present systemxe2x80x9d) for polynomial-time, sequential, adaptive algorithm for lossy data compression. The present system outputs a distorted sequence that can be naturally losslessly compressed using, for example, the Lempel-Ziv (LZ78) algorithm, or any variation thereof.
Additionally, the present system offers various refinements and extensions of the Lempel-Ziv algorithm. The present system carries out a judicious minimization procedure to select the code word of the most appropriate type, using approximate string matching.
The present system sequentially parses the source sequence into phrases and then iteratively maps each source phrase to a distorted phrase until the source sequence is exhausted. The mapping is performed such that the average per-letter distortion between the two phrases does not exceed the desired distortion. In addition, the present system adaptively maintains a codebook, or dictionary, as the collection of all one-letter extensions of all previously emitted distorted phrases.
A sequential alternative minimization procedure is implemented by iterating between the following steps:
given the current codebook, find the longest source phrase that matches at least one of the phrases in the codebook subject to the distortion constraint; and
choose from all code words that match the source phrase, the code word most likely to be useful in the future.
An aspect of the present system is based on the observation that in the first step typically more than one phrase from the codebook will match the source phrase. From these competing phrases, the present system selects, in the second step, the distorted phrase that creates the least typical source phase when the source phrase is conditioned on the distorted phrase.
The present system offers numerous advantages, among which are the following: it is quadratic-time in the length of the source sequence; it is completely sequential or online; and it is adaptive in that it generates and continually grows the codebook xe2x80x9con-the-flyxe2x80x9d in response to the observed source sequence without any a priori knowledge of the source. To implement this computation, the present system relies on the Lempel-Ziv algorithm for compression with side information. Such judicious selection is intended to iteratively improve the codebook quality, resulting in an algorithm that is relatively simple, elegant, and natural.