In order to transmit data over channels with limited throughput rates or to store data in limited memory space, it is frequently necessary to compress the data so as to represent the data with fewer bits. Upon receiving or retrieving the compressed data, the data may be decompressed to recover the original data or an approximation thereof.
Compression and decompression techniques are commonly applied to signal, image, and video data in order to reduce otherwise massive transmission and storage requirements. By way of example, a single monochrome image is typically formed by an array of pixels, such as a 512×512 array of pixels. In addition, the intensity level of each pixel is generally assigned a numeric value between 0 and 255, which is digitally represented by an 8-bit pattern. Therefore, the digital representation of such a monochrome image requires approximately 2 million bits of data (512×512×8 = 2,097,152 bits). As a further example, a typical digital color video format is the Common Intermediate Format (CIF) having a resolution of 360×288. This color video format includes three color components, each represented as an array of pixels, which are displayed at a rate of 30 frames per second. The three color components are an intensity component at full resolution (360×288) and two chrominance components at half resolution (180×144 each). At 8 bits per sample, the total throughput requirement for CIF is therefore about 37 million bits per second. Consequently, the transmission of uncompressed digital imagery requires relatively high throughput rates or, alternatively, relatively long transmission times. Likewise, the storage of digital imagery requires relatively large memory or storage devices.
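By way of illustration, the arithmetic behind these figures may be sketched as follows. The function names are hypothetical, and 8 bits per sample is assumed for every component.

```python
def raw_bits(width, height, bits_per_sample=8):
    """Bits needed for one uncompressed plane of samples."""
    return width * height * bits_per_sample


def cif_bits_per_second(frame_rate=30):
    """Raw throughput for the CIF example: one full-resolution intensity
    plane plus two half-resolution chrominance planes per frame."""
    luma = raw_bits(360, 288)
    chroma = 2 * raw_bits(180, 144)
    return (luma + chroma) * frame_rate


mono_bits = raw_bits(512, 512)      # the 512x512 monochrome image
video_rate = cif_bits_per_second()  # the CIF video stream
print(mono_bits, video_rate)
```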
In order to reduce the storage and transmission requirements for signal, image, and video processing applications, a variety of compression techniques have been developed. Substantial compression is possible due to the statistical redundancy typically found in signal, image, and video data. Such redundancy takes several forms, namely, spatial redundancy due to correlations between pixels within an image or video frame as a function of relative spatial position, temporal redundancy due to correlation between successive or temporally proximate frames in a video sequence or between neighboring signal samples, and spectral redundancy between the color planes or bands of multispectral images. By way of example, this discussion will primarily address the subject of image compression, but the approaches discussed apply to the compression of signal and video data as well.
Image compression techniques attempt to reduce the volume of data by removing these redundancies. These compression techniques may be classified into two broad categories: lossless and lossy. With lossless image compression, a reconstructed image is guaranteed to be numerically identical to the original image. Unfortunately, lossless approaches can provide only a very limited amount of compression (typically less than 3:1). In contrast, lossy techniques achieve higher compression ratios (or lower bit rates) by allowing some of the visual information to be removed, thereby resulting in a difference or distortion between the original image and the reconstructed image. (The bit rate is reciprocally related to the compression ratio.) In general, higher compression ratios (lower bit rates) can be achieved by tolerating higher amounts of distortion. Distortion is commonly measured in terms of mean squared error (MSE), i.e., the mean squared pixel intensity differences between the original and reconstructed images, or conversely by measures of image quality. Root mean squared error (RMS error or RMSE), which is the square root of mean squared error, is also commonly used as an error measure. Common measures of image quality include Peak Signal to Noise Ratio (PSNR), which is closely related to RMS error, or Structural Similarity (SSIM), which measures changes in the “structural information” perceived by the human visual system. Lossy compression approaches seek to maximize the amount of compression while minimizing the resulting distortion (or maximizing the quality of the reconstructed image). If the distortion from a lossy compression process is not visually apparent under normal viewing conditions, the process is commonly called visually lossless compression.
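The distortion measures named above may be computed as in the following minimal sketch. Images are represented here as flat lists of pixel intensities, and the peak value of 255 assumes 8-bit pixels; both choices are assumptions made for illustration only.

```python
import math


def mse(original, reconstructed):
    """Mean squared pixel intensity difference between two images."""
    if len(original) != len(reconstructed) or not original:
        raise ValueError("images must be non-empty and the same size")
    total = sum((a - b) ** 2 for a, b in zip(original, reconstructed))
    return total / len(original)


def rmse(original, reconstructed):
    """Root mean squared error: the square root of the MSE."""
    return math.sqrt(mse(original, reconstructed))


def psnr(original, reconstructed, peak=255):
    """Peak signal to noise ratio in decibels (infinite if identical)."""
    error = mse(original, reconstructed)
    if error == 0:
        return math.inf
    return 10 * math.log10(peak ** 2 / error)
```

Because PSNR is a monotonically decreasing function of RMS error for a fixed peak value, ranking reconstructions by either measure gives the same order.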
The performance of a signal, image, or video compression algorithm may be characterized by a rate-distortion curve, which plots the distortion measure as a function of the bit rate for the compressed representation. Thus, the objective of a compression algorithm is to minimize the bit rate subject to a constraint on distortion, or conversely to minimize the distortion subject to a constraint on bit rate.
A common approach to image compression, called transform-based compression or transform coding, involves three primary steps, namely, a transform step, a quantization step, and an encoding step. As described in U.S. Pat. No. 5,014,134 to Wayne M. Lawton, et al. and U.S. Pat. No. 4,817,182 to Adelson, et al., an invertible transform decomposes the original image data into a weighted sum of simple building blocks, called basis functions, such as sinusoids or wavelet functions. Accordingly, a number of image transforms have been developed, including the Fourier transform, the discrete cosine transform and the wavelet transform. If the basis functions have sufficient correspondence to the correlation structure of the imagery to be compressed, most of the energy (or information) in the image will be concentrated into relatively few of the transform coefficients with correspondingly large coefficient values. Consequently, the preponderance of the transform coefficients will have small or zero coefficient values.
The wavelet transform decorrelates the image data at multiple resolutions by use of basis functions which are dilations and translations of a single prototype function. The prototype basis function is a bandpass filter called a wavelet, so named because the filter is both oscillatory and spatially localized. The translations and dilations of the prototype wavelet yield a set of basis functions which produce a signal or image decomposition localized in position and resolution, respectively.
The wavelet transform can be computed using a fast discrete algorithm, called the Fast Wavelet Transform (FWT), which recursively applies the wavelet filter and a companion lowpass filter called a scaling filter. For a single iteration of the FWT applied to a one-dimensional signal, the wavelet and scaling filters are convolved against the signal, followed by decimation by two. This process splits the signal into a low resolution approximation signal (extracted by the scaling filter) and a high resolution detail signal (extracted by the wavelet filter). By recursively applying the wavelet filter and the scaling filter to the low resolution approximation signal generated by the prior iteration of the FWT, a multiresolution decomposition of the original signal is produced which consists of the detail signals at various resolutions and a final low resolution approximation signal.
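The recursion described above may be sketched as follows. For simplicity the sketch assumes unnormalized Haar filters (pairwise average as the scaling filter, pairwise half-difference as the wavelet filter), for which filtering followed by decimation by two reduces to processing non-overlapping sample pairs. The text does not prescribe a particular wavelet, and the function names are hypothetical.

```python
def fwt_step(signal):
    """One FWT iteration: split into approximation and detail signals."""
    if len(signal) % 2:
        raise ValueError("signal length must be even")
    pairs = range(0, len(signal), 2)
    approx = [(signal[i] + signal[i + 1]) / 2 for i in pairs]
    detail = [(signal[i] - signal[i + 1]) / 2 for i in pairs]
    return approx, detail


def fwt(signal, levels):
    """Recursively transform the approximation signal `levels` times.
    Returns the final approximation and the details, finest first."""
    approx, details = list(signal), []
    for _ in range(levels):
        approx, detail = fwt_step(approx)
        details.append(detail)
    return approx, details


def inverse_fwt(approx, details):
    """Invert fwt() by undoing the iterations from coarsest to finest."""
    signal = list(approx)
    for detail in reversed(details):
        rebuilt = []
        for a, d in zip(signal, detail):
            rebuilt.extend([a + d, a - d])
        signal = rebuilt
    return signal
```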
The wavelet transform can be easily extended to two-dimensional imagery by separately filtering the rows and columns and by iteratively processing the lowpass approximation image. This wavelet transform is equivalent to decomposing the image in terms of basis functions which are 2-D tensor products of the 1-D wavelet and scaling filters. See, for example, U.S. Pat. Nos. 5,014,134 and 4,817,182, the contents of which are expressly incorporated by reference herein. See also Olivier Rioul, et al., “Wavelets and Signal Processing”, IEEE Signal Processing Magazine, vol. 8, no. 4, pp. 14-38 (October 1991); Bjorn Jawerth, et al., “An Overview of Wavelet-Based Multi-Resolution Analyses”, SIAM Review, Vol. 36, No. 3, pp. 377-412 (1994); and Michael L. Hilton, et al., “Compressing Still and Moving Images with Wavelets”, Multimedia Systems, Vol. 2, No. 3 (1994) for further descriptions of the wavelet transform.
Once the image data has been transformed, the compression algorithm then proceeds to quantize and encode the transform coefficients which are generated by the wavelet transform. The quantization step discards some of the image content by approximating the coefficient values. Quantization generally involves a mapping from many input values (or a continuum of input values) to a smaller, finite number of output levels. The quantization step divides the range of input values by a set of thresholds {t_i, i = 0, ..., N-1} and maps an input value falling within the interval (t_i, t_{i+1}] to the output value represented by the discrete symbol or quantization index i. Correspondingly, dequantization (used to recover approximate coefficient values during decompression) maps the quantization index i to a reconstructed value r_i which lies in the same interval, i.e., (t_i, t_{i+1}]. For minimum RMS error, the reconstructed value should correspond to the mean of those coefficient values falling within the interval, but, in practice, a reconstruction value at the center of the interval is often used. Further, scalar quantization maps a single scalar value to a single discrete variable, whereas vector quantization jointly maps a plurality (or vector) of M values to each discrete variable.
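The threshold-based mapping just described may be sketched for the scalar case as follows. The sketch assumes the thresholds are supplied as a sorted list of interval edges and uses the center of each interval as the reconstruction value; the function names are hypothetical.

```python
from bisect import bisect_left


def quantize(value, thresholds):
    """Return the index i such that thresholds[i] < value <= thresholds[i+1]."""
    if not thresholds[0] < value <= thresholds[-1]:
        raise ValueError("value lies outside the quantizer range")
    # bisect_left finds the first edge >= value, i.e. the upper edge i+1.
    return bisect_left(thresholds, value) - 1


def dequantize(index, thresholds):
    """Map a quantization index to the center of its interval."""
    return (thresholds[index] + thresholds[index + 1]) / 2
```

Replacing the interval center in `dequantize` with the mean of the coefficient values observed in each interval would give the minimum RMS error reconstruction mentioned above, at the cost of conveying those means to the decoder.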
While the quantized coefficient values have reduced precision, they also can be represented with fewer bits, thus allowing higher compression at the expense of distortion in the reconstructed image. This image distortion is referred to as quantization error and accounts for all of the distortion inherent in lossy compression schemes. Thus, the quantization step is omitted for lossless compression approaches.
A variety of factors contribute to the choice of the actual quantization intervals, such as the desired compression ratio, the statistical distribution of the coefficient values, the manner in which the quantized coefficient values will be encoded, and the distortion metric used to measure image degradation. When the quantized coefficients will be entropy-coded, RMS error can be (approximately) minimized by using uniform quantization intervals. See R. C. Wood, “On Optimum Quantization”, IEEE Transactions on Information Theory, Vol. 15, pp. 248-52 (1969). In the absence of entropy coding, the RMS error is minimized by choosing nonuniform quantization intervals in accordance with the Lloyd-Max algorithm as described in S. P. Lloyd, “Least Squares Quantization in PCM”, Bell Lab. Memo. (July 1957), reprinted in IEEE Transactions on Information Theory, Vol. 28, pp. 129-37 (1982), and also in J. Max, “Quantizing for Minimum Distortion”, IRE Transactions on Information Theory, Vol. 6, pp. 7-12 (1960).
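For the case without entropy coding, the Lloyd-Max iteration may be sketched as below for an empirical set of samples. The sketch alternates between placing each threshold midway between adjacent reconstruction levels and moving each level to the mean of the samples in its interval. The function names, the fixed iteration count, and the handling of empty intervals are assumptions made for illustration.

```python
def midpoints(levels):
    """Thresholds halfway between adjacent reconstruction levels."""
    return [(a + b) / 2 for a, b in zip(levels, levels[1:])]


def lloyd_max(samples, initial_levels, iterations=50):
    """Return (thresholds, levels) after a fixed number of iterations."""
    levels = sorted(initial_levels)
    for _ in range(iterations):
        thresholds = midpoints(levels)
        cells = [[] for _ in levels]
        for x in samples:
            # Count the thresholds strictly below x to find its interval.
            cells[sum(x > t for t in thresholds)].append(x)
        # Move each level to the centroid of its cell; keep empty cells fixed.
        levels = [sum(c) / len(c) if c else old
                  for c, old in zip(cells, levels)]
    return midpoints(levels), levels
```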
Due to the decorrelating properties of the wavelet transform, the distribution of transform coefficient values is typically sharply peaked at zero. Similar decorrelating effects may be commonly achieved by alternate techniques such as the block discrete cosine transform or predictive coding. This type of peaked coefficient distribution results in a preponderance of coefficients falling into the quantization interval at the origin, i.e., the quantization interval centered on the value of zero. Due to the preponderance of coefficients near zero, more efficient compression performance can be achieved by treating the quantization interval at the origin separately. In particular, the overall coding efficiency may be increased by using a larger quantization interval around the origin, often called a dead zone. In one preferred embodiment, the dead zone interval is twice as large as the adjacent intervals. The dead zone is centered about the origin with a reconstruction value exactly equal to zero to prevent artifacts resulting from the use of nonzero reconstruction values for the many coefficients close to zero. The magnitude of the positive and negative bounds of the dead zone is termed the clipping threshold because all coefficients whose magnitudes fall below this threshold are “clipped” to zero. In addition, those coefficients whose magnitudes exceed the clipping threshold are termed significant coefficients, while those coefficients whose values lie below the threshold (i.e., within the dead zone) are termed insignificant coefficients.
Dead zone quantization may be simply implemented by applying the quantization process to the magnitudes (or absolute values) of the coefficients, and discarding the sign information for the insignificant coefficients (i.e., with magnitudes less than the clipping threshold). This strategy used in combination with uniform quantization results in a dead zone which is approximately twice as large as the other quantization intervals.
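A minimal sketch of this strategy follows, assuming a uniform step size equal to the clipping threshold and reconstruction at the center of each nonzero interval. The function names are hypothetical.

```python
def deadzone_quantize(coeff, step):
    """Uniformly quantize the magnitude; the sign is kept only for
    significant coefficients (magnitude >= step)."""
    magnitude_index = int(abs(coeff) // step)
    if magnitude_index == 0:
        return 0  # insignificant: clipped to zero, sign discarded
    return magnitude_index if coeff > 0 else -magnitude_index


def deadzone_dequantize(index, step):
    """Reconstruct zero for the dead zone, interval centers elsewhere."""
    if index == 0:
        return 0.0
    magnitude = (abs(index) + 0.5) * step
    return magnitude if index > 0 else -magnitude
```

With a step of 4, every coefficient in the open interval (-4, 4) maps to index 0, so the dead zone has width 8 while every other interval has width 4.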
Because most of the coefficients produced by the transform have small magnitudes or are equal to zero, the quantization process typically results in the majority of the coefficients being deemed insignificant, while only relatively few of the quantized coefficients have magnitudes exceeding the clipping threshold and are therefore deemed significant. Thus, as indicated above, it is advantageous to treat the significant coefficients separately from the insignificant coefficients.
This separate treatment may be accomplished by separately indicating the positions and the quantized values of the significant coefficients. In addition, the positions of the significant coefficients can be represented using one of a variety of conventional approaches, such as tree structures, coefficient maps, or run length coding. For example, the positions of the significant coefficients may be represented by means of run lengths of consecutively occurring insignificant coefficients or by alternating run lengths of consecutive significant coefficients and consecutive insignificant coefficients.
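The first of these run length representations may be sketched as follows: each significant coefficient is preceded by the count of insignificant (zero) coefficients since the previous significant one, and the values are kept in a separate list. The function names and the explicit trailing run are assumptions of this sketch.

```python
def encode_positions(indices):
    """Split quantization indices into zero-run lengths and nonzero values."""
    runs, values, run = [], [], 0
    for q in indices:
        if q == 0:
            run += 1
        else:
            runs.append(run)
            values.append(q)
            run = 0
    return runs, values, run  # `run` counts the zeros after the last value


def decode_positions(runs, values, trailing):
    """Rebuild the original index sequence from runs and values."""
    out = []
    for run, value in zip(runs, values):
        out.extend([0] * run)
        out.append(value)
    out.extend([0] * trailing)
    return out
```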
To achieve further compression, the quantized values and the position information for the significant coefficients may both be coded using any of a variety of entropy coding approaches, which are designed to encode data sources with reduced size or volume, while allowing the original sequence of data to be recovered exactly. The entropy of a data source is a mathematical measure of the information in the source as a function of the source statistics. Entropy is commonly defined in units of bits per source symbol. The entropy of a particular source defines the lower bound or the theoretical minimum bit rate at which the source can be represented. The objective of entropy coding is to minimize the bit rate for the compressed representation, with the entropy of the source defining the ultimate limit of that compression performance. The amount by which the bit rate for the compressed representation exceeds the entropy of the source is called the redundancy, and is often defined in terms of a percentage.
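As an illustration, the following sketch estimates the entropy of a source from the empirical symbol frequencies of a sample sequence and expresses the redundancy of a given bit rate as a percentage of that entropy. Treating the source as memoryless (zeroth-order entropy) is an assumption of the sketch, as are the function names.

```python
import math
from collections import Counter


def entropy_bits_per_symbol(symbols):
    """Zeroth-order empirical entropy in bits per source symbol."""
    if not symbols:
        raise ValueError("need at least one symbol")
    total = len(symbols)
    counts = Counter(symbols)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def redundancy_percent(bit_rate, entropy):
    """Excess of the coded bit rate over the entropy, as a percentage."""
    return 100 * (bit_rate - entropy) / entropy
```

For the four-symbol sequence "aabc", the entropy is 1.5 bits per symbol, so a fixed 2-bit code for the three-letter alphabet carries a redundancy of about 33 percent.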
Entropy coding may reduce the number of bits required to represent a data set by using variable length coding in a manner which exploits the statistical probabilities of various symbols in the data set. For example, entropy coding assigns shorter code words to those symbols which occur frequently, while longer code words are assigned to those symbols which occur less frequently. Entropy coding is completely reversible so that no additional distortion is introduced beyond that due to the quantization process.
Entropy coding (as well as other forms of coding) defines a mapping from an alphabet of source symbols to a set of code words constructed from a code alphabet. The mapping from source symbols to code words is defined by a codebook which must be known to both the encoder and decoder. Various coding methods are sometimes categorized based on how the codebook is determined or defined. Static codes are codes in which the codebook is fixed and known a priori to both the encoder and decoder. Static codes are appropriate when the statistics of the source sequence are known in advance and relatively stationary.
Forward adaptive codes define a codebook based on the statistics of source symbols which have not yet been coded (i.e., the codebook estimation process looks forward). Forward adaptive coding requires two passes through the sequence or block of source symbols, a first pass to collect source statistics followed by a second pass to encode the data. Forward adaptive codes require that the codebook information (or the underlying statistics) be transmitted from the encoder to the decoder as side information or overhead. Forward adaptive codes are also sometimes called off-line adaptive or block adaptive codes.
Backward adaptive codes estimate the codebook “on the fly”, based on the statistics of previously coded symbols (i.e., the codebook estimation process looks backward). Backward adaptive coding requires only a single pass through the sequence or block of source symbols, performing both the coding and the statistical estimation. Typically, the source statistics are modeled in the form of a histogram of the symbol occurrences or by one or more scalar parameters which characterize a mathematical model of the source. (A histogram is a discrete distribution consisting of individual bins representing the frequency or probability of occurrence of the associated distribution values). The statistical model and the resulting codebook may be updated after every source symbol or at less frequent intervals. The statistical model typically includes some form of “forgetting factor” to prevent overflow of accumulated statistics and to weight recent symbols more heavily. The codebook estimation is performed in synchronized fashion by both the encoder and decoder, thus eliminating the need for any explicit side information or overhead to pass the codebook from the encoder to the decoder. While backward adaptive coding requires no explicit side information, the coding is inefficient until the codebook converges to an effective approximation of the source statistics. The resulting coding inefficiencies constitute a form of “implicit side information”. An important factor in the performance of a backward adaptive code is the speed of convergence. Backward adaptive codes are also sometimes called on-line adaptive codes.
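A minimal sketch of a backward adaptive histogram model follows. Counts start uniform, are updated after every symbol, and are halved whenever their total exceeds a limit, which serves as a simple forgetting factor. The class name, the limit value, and the halving rule are assumptions; the ideal code length of -log2(p) bits per symbol stands in for an actual entropy coder.

```python
import math


class AdaptiveModel:
    """Symbol histogram kept in lockstep by both encoder and decoder."""

    def __init__(self, alphabet, limit=64):
        self.counts = {s: 1 for s in alphabet}
        self.limit = limit

    def probability(self, symbol):
        return self.counts[symbol] / sum(self.counts.values())

    def update(self, symbol):
        self.counts[symbol] += 1
        if sum(self.counts.values()) > self.limit:
            # Forgetting: halve all counts so recent symbols weigh more.
            self.counts = {s: max(1, c // 2) for s, c in self.counts.items()}


def ideal_code_length(symbols, alphabet, limit=64):
    """Total bits an ideal coder would spend using the adaptive model."""
    model = AdaptiveModel(alphabet, limit)
    bits = 0.0
    for s in symbols:
        bits += -math.log2(model.probability(s))  # code with the current model
        model.update(s)                           # then learn from the symbol
    return bits
```

Because the model is updated only after each symbol is coded, a decoder running the same class reproduces the same probabilities without side information. The first symbols are coded inefficiently while the counts are still nearly uniform, which illustrates the implicit side information described above.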
Perhaps the best-known method of entropy coding is Huffman coding, which represents the data symbols using code words that each consist of an integer number of bits (as described in D. A. Huffman, “A Method for the Construction of Minimum Redundancy Codes,” Proceedings of the IRE, vol. 40, no. 9, pp. 1098-1101, (1952)). Due to the constraint of an integer code word length, Huffman coding cannot in general achieve a bit rate equal to the entropy of the source. However, Huffman coding is optimal in the sense that it minimizes the bit rate subject to the constraint of integer code word length. Hence, Huffman coding is an example of what are known as minimum redundancy codes. The minimum redundancy property for Huffman codes rests on the observation that the two least probable symbols must have the same code word length (or else it would be possible to reduce the length of the longer code word). Based on this observation, the Huffman coding procedure forms a codebook starting from the two least probable symbols and works towards more probable symbols.
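The procedure may be sketched as follows, using a priority queue to repeatedly merge the two least probable groups of symbols. The function name and the tie-breaking rule are assumptions of this sketch; different tie-breaking yields different but equally short codebooks.

```python
import heapq
from collections import Counter


def huffman_codebook(symbols):
    """Build a prefix-free codebook {symbol: bit string} from a sample."""
    counts = Counter(symbols)
    if not counts:
        raise ValueError("need at least one symbol")
    if len(counts) == 1:
        return {next(iter(counts)): "0"}
    # Heap entries: (weight, unique tiebreak, {symbol: partial code}).
    heap = [(w, i, {s: ""}) for i, (s, w) in enumerate(counts.items())]
    heapq.heapify(heap)
    tiebreak = len(heap)
    while len(heap) > 1:
        w0, _, group0 = heapq.heappop(heap)  # least probable group
        w1, _, group1 = heapq.heappop(heap)  # next least probable group
        merged = {s: "0" + c for s, c in group0.items()}
        merged.update({s: "1" + c for s, c in group1.items()})
        heapq.heappush(heap, (w0 + w1, tiebreak, merged))
        tiebreak += 1
    return heap[0][2]
```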
Another popular entropy coding approach is arithmetic coding, which is capable of producing code words whose length is a fractional number of bits (as described in I. H. Witten, R. M. Neal, and J. G. Cleary, “Arithmetic Coding for Data Compression,” Communications of the ACM, vol. 30, no. 6, pp. 520-540, (June 1987)). In arithmetic coding, the entire bitstream is essentially a single large codeword that represents the entire message, which makes it feasible to use a fractional number of bits to represent specific source symbols. Because arithmetic coding is not constrained to integer codeword lengths, it can achieve bit rates which are only negligibly larger than the entropy. However, arithmetic coding typically requires relatively more computation than alternative approaches.
Parametric entropy coding provides a simplified approach to entropy coding. While traditional entropy coders estimate a codebook based on the raw empirical statistics of the source (typically represented in a histogram), parametric entropy coders use an approximation of the source statistics based on an appropriate mathematical or statistical distribution function. For example, the exponential distributions and Laplacian (two-sided exponential) distributions have proven particularly useful for modeling data sets that arise during signal, image, and video compression. With the parametric approach, the codebook is determined by one to several parameters, which may be either parameters which characterize the distribution model or parameters of the entropy coder itself which in turn may impute an appropriate distribution function. In addition to computational simplicity, parametric entropy coding provides a simplified model for the source, which thus may reduce the amount of side information for forward adaptive coders or provide faster adaptation to the source statistics for backward adaptive coders. A simplified source model may be particularly beneficial when the source statistics are highly nonstationary (i.e., when the statistical characteristics of the source change rapidly). Two popular examples of parametric coding are Golomb coding (described in S. W. Golomb, “Run-Length Encodings,” IEEE Transactions on Information Theory, vol. 12, no. 3, pp. 399-401, (July 1966)) and Rice coding (described in R. F. Rice, “Some Practical Universal Noiseless Coding Techniques,” JPL Publication 79-22, Jet Propulsion Laboratory, Pasadena, Calif. (March 15, 1979)), which is a constrained form of Golomb coding.
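By way of example, Rice coding with a single parameter k may be sketched as follows. A non-negative integer is split into a quotient, sent in unary, and a k-bit remainder, sent in binary. The unary convention used here (ones terminated by a zero) and the function names are assumptions; other conventions are in use.

```python
def rice_encode(n, k):
    """Encode a non-negative integer as a bit string with parameter k."""
    if n < 0 or k < 0:
        raise ValueError("n and k must be non-negative")
    quotient = n >> k
    remainder = n & ((1 << k) - 1)
    bits = "1" * quotient + "0"  # unary quotient with a terminating zero
    if k:
        bits += format(remainder, "0{}b".format(k))  # k-bit remainder
    return bits


def rice_decode(bits, k):
    """Decode a single code word produced by rice_encode."""
    quotient = bits.index("0")  # number of leading ones
    start = quotient + 1
    remainder = int(bits[start:start + k], 2) if k else 0
    return (quotient << k) | remainder
```

The entire codebook is fixed by the one parameter k, so a forward adaptive coder need only transmit k as side information, and a backward adaptive coder need only track a single statistic to select it.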
A popular variation of transform coding is embedded coding. See, for example, U.S. Pat. No. 5,315,670 to Jerome M. Shapiro. See also Jerome M. Shapiro, “Embedded Image Coding Using Zerotrees of Wavelet Coefficients,” IEEE Transactions on Signal Processing, vol. 41, pp. 3445-3462 (1993); A. Zandi, J. D. Allen, E. L. Schwartz, and M. Boliek, “CREW: Compression with Reversible Embedded Wavelets,” Preprint, Ricoh California Research Center, Menlo Park, Calif. (1995); and Amir Said and William A. Pearlman, “A new, fast, and efficient image codec based on set partitioning in hierarchical trees,” IEEE Transactions on Circuits and Systems for Video Technology, Vol. 6, No. 3, pp. 243 (June 1996) for descriptions of various embedded coding approaches.
Commonly in embedded coding, a decorrelating transformation, such as the wavelet transform or block discrete cosine transform, is first applied to the signal or image data (or to each image plane of video, color, or multispectral data), as described above. The coefficients within the resulting subbands or frequency bands are then encoded one bitplane at a time, starting from the most significant bitplane of the coefficient magnitudes and continuing to less significant bitplanes. Commonly, the subbands or frequency bands (for each image plane) are spatially partitioned (to support improved scalability and random access on a spatial basis). The spatial partitions of the subbands are sometimes referred to as precincts, and the precincts may commonly be subdivided into even smaller spatial subdivisions sometimes called codeblocks. Each bitplane of each codeblock is encoded independently, with causal dependence only upon the more significant bitplanes of the same codeblock. In some variations of embedded coding, this dependence extends also to “parent coefficients” at coarser resolution subbands (where parent coefficients are coarser resolution coefficients at the corresponding spatial position).
The encoding of each bitplane is commonly performed in multiple passes. The first pass in each bitplane encodes information about which coefficients are “newly significant” for the current bitplane. (Newly significant coefficients are coefficients whose magnitudes are insignificant (zero) at more significant bitplanes, but which become significant (non-zero) when the current bitplane is included.) The sign for each newly significant coefficient is also encoded. The second pass in each bitplane encodes refinement information for coefficients which were previously significant (i.e., the magnitude bit within the current bitplane for coefficients which were significant at the more significant bitplanes). Some implementations subdivide the encoding of refinement information into two passes in order to provide finer granularity on the encoded data. The entropy coding method used in embedded coding is commonly a context-adaptive binary arithmetic coder, although sometimes other coding methods are used. Commonly, the binary arithmetic coder encodes the significance or non-significance for each previously insignificant coefficient, as well as the sign bit for each newly significant coefficient, and finally the refinement bit for each previously significant coefficient. For context adaptivity, the arithmetic coder makes use of the already encoded information from nearby coefficients in the same codeblock (or from coarser resolutions when dependence extends to parent coefficients).
The sequential coding of the coefficient bitplanes, starting from the most significant bitplane and proceeding to less significant bitplanes, provides a mechanism for successive approximation of the coefficient values wherein the number of bitplanes included in the coding determines the fidelity of the resulting approximation. From this perspective, the embedded coding process can be seen to implement a progressive uniform quantizer with a dead zone approximately twice as large as the nominal quantization interval. Accordingly, embedded coding is sometimes referred to as embedded quantization. If the transform is a reversible transform and the bitplane coding proceeds to “completion” (i.e., to the coding of the least significant bitplane for all subbands), then lossless compression will be supported.
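The successive approximation property may be illustrated with the following sketch, which splits integer coefficient magnitudes into bitplanes and reconstructs them from only the most significant planes received. Sign coding, the multiple passes per bitplane, and entropy coding are deliberately omitted, and the function names are hypothetical.

```python
def to_bitplanes(magnitudes, num_planes):
    """Return the bitplanes of the magnitudes, most significant first."""
    return [[(m >> p) & 1 for m in magnitudes]
            for p in range(num_planes - 1, -1, -1)]


def from_bitplanes(planes, num_planes, count):
    """Rebuild `count` magnitudes from the leading planes received so far;
    bits in planes not yet received are taken as zero."""
    values = [0] * count
    for k, plane in enumerate(planes):
        shift = num_planes - 1 - k
        for i, bit in enumerate(plane):
            values[i] |= bit << shift
    return values
```

Decoding only the two most significant of four bitplanes reproduces each magnitude rounded down to a multiple of 4, so every magnitude below 4 becomes zero. Once signs are taken into account, the zero bin spans (-4, 4), which is twice the width of the other bins, consistent with the dead zone noted above.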
The encoded data from each bitplane pass or from a set of consecutive passes may be aggregated to form a bitstream packet. When limited causality or dependence is used in the bitplane coding, then the bitstream packets generated by embedded coding may be flexibly sequenced to achieve a variety of progressive coding effects. Progressive coding refers to the sequencing or ordering of encoded content within a compressed bitstream in accordance with a prescribed priority scheme, in order to provide more favorable treatment to higher priority content.
The prioritization of content for progressive coding may be based on a variety of considerations including resolution, spatial position, image quality, spectral or color band, region of interest designation, or combinations thereof. For the purposes of this discussion, we shall refer to the considerations by which a progression is prioritized as progression aspects. For example, when the packets are ordered according to the progression aspect of distortion in the reconstructed image or signal, then a “quality progression” is obtained. The choice of progression priorities is typically set by the user or fixed for the application. Packet ordering is constrained by the causality of the coding, i.e., no packet may precede another packet on which it is dependent.
An example of progressive coding options may be found in the JPEG 2000 standard (specified in ISO/IEC 15444-1:2002, “Information technology—JPEG 2000 image coding system—Part 1: Core coding system,” (January 2001)), which defines progression priority based on such progression aspects as quality layer (L), resolution (R), position (P), and color or spectral component (C). Thus an LRCP priority orders the encoded blocks first according to quality layer (PSNR), followed by resolution, component, and finally position. The RLCP priority orders the encoded blocks first according to resolution, followed by quality layer, component, and finally position. The five progression sequences supported in JPEG 2000 are LRCP, RLCP, RPCL, PCRL, and CPRL.
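The effect of a progression priority on packet order may be sketched as a lexicographic sort. Each packet is modeled here as a dictionary holding integer indices for the four progression aspects; this representation and the function name are assumptions, and the sketch does not reproduce the detailed sequencing rules of the standard.

```python
def order_packets(packets, progression):
    """Sort packets by the aspects named in `progression`, e.g. "LRCP".
    The first letter is the highest priority (slowest varying) aspect."""
    if sorted(progression) != sorted("LRCP"):
        raise ValueError("progression must be a permutation of L, R, C, P")
    return sorted(packets, key=lambda pkt: tuple(pkt[a] for a in progression))
```

For packets covering two quality layers and two resolutions, an "LRCP" order emits both resolutions of layer 0 before any packet of layer 1, whereas "RLCP" emits both layers of resolution 0 before any packet of resolution 1.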
The ability to dynamically truncate a progressively coded bitstream is commonly referred to as scalability, such as resolution scalability, spatial scalability, or quality scalability depending upon the progression priority of the bitstream. The granularity of scalability refers to the resolution or size at which the scalable bitstream may be truncated. If the bitstream may be truncated at a very fine resolution, such as at the level of bits, bytes, or very small packets, then the scalability is called fine grain scalability.
Progressively encoded data is often generated in two processing stages: (1) a coding stage in which the data is actually encoded, and (2) a sequencing stage in which the encoded content is ordered or sequenced in accordance with the defined progression priority. (In JPEG 2000, these two stages are referred to as Tier 1 and Tier 2 processing, which terminology will be retained herein.) For decoding, the Tier 1 and Tier 2 processing stages are reversed in order. The order in which the data is generated during Tier 1 processing is typically determined based on considerations which minimize memory and computational requirements or in accordance with the order in which the data may be provided from the source. The Tier 2 processing typically has comparatively low computational cost relative to Tier 1.
When progressively coded data is transmitted over a channel, the process is commonly referred to as progressive transmission. In this case, the transmission is progressively received (with content ordered in accordance with the progression priority) until it is completed or until it is terminated by events such as packet loss, timeout, or user action. With progressive transmission, it is possible to incrementally update the display as the content is received. This approach is commonly used within web browsers to display progressively received image content.
Favorable treatment of high priority content in progressive coding is achieved by placement earlier in the bitstream. Because later positions in the bitstream are subject to greater exposure to a wide variety of error effects, earlier placement within the bitstream offers a very simple mechanism for providing more protection to higher priority content. Some examples of effects which expose later portions of the bitstream to a higher incidence of errors include termination of progressive transmission, truncation of a bitstream to meet bit rate targets, the use of Droptail buffering policies in QoS protocols, and various other limitations on storage and transmission channel bandwidth. In some applications, more elaborate protection schemes may be used to provide even higher levels of protection for high priority content. One common approach is to use varying levels of error correction coding (ECC) according to the priority of the content. Another approach is to transmit different blocks using different channels of a quadrature amplitude modulation (QAM) constellation, with the different channels inherently subject to differing signal to noise ratios. Both of these approaches are commonly called unequal error protection (UEP). Still a third approach is to apply different strategies for block retransmission in an Automatic Repeat-reQuest (ARQ) error control protocol, so that higher priority blocks are retransmitted upon errors, while lower priority blocks are not.
With embedded coding, the resulting bitstream may be truncated (by dropping packets from the end of the bitstream) to meet a target bit rate (i.e., compression ratio). (For some implementations, the truncation may operate at granularities finer than the individual packet level, down to individual bytes or bits, providing fine grain scalability.) The name embedded coding derives from the fact that a compressed bitstream contains all compressed bitstreams of higher compression ratio “embedded” within it. By truncation of the embedded bitstream, the compression ratio or bit rate may be dynamically varied. Such truncation results in requantization of the affected coefficients, resulting in increased distortion in the reconstructed image. When the packets are aggregated at the bitplane level (or a finer level) and are ordered in a quality progression, then the truncation of the resulting bitstream achieves rate-distortion performance which is nearly optimal, subject to the constraints of the transform and encoding methods used to produce the bitstream.
Embedded coding thus provides many desirable properties, including variable rate control from a single compressed image file, support for a variety of progressive coding modes, and suitability for use with schemes which provide more favorable treatment or protection to higher priority content. The primary drawback of embedded coding is that it is computationally expensive, due to the fact that the encoding of the bitplanes requires repeated analysis passes through the coefficients. The context adaptivity of the entropy coding can also add considerably to this complexity. The decoding is similarly expensive with decoding of the bitplanes requiring repeated conditional refinement of the coefficients. Thus, the computational cost of embedded coding has limited its use in many applications.
In addition to the embedded coding methods described above, an alternative approach offers reduced computation by adaptively varying the granularity of the embedded coding, as described in C. D. Creusere, “Fast Embedded Compression for Video,” IEEE Transactions on Image Processing, vol. 8, pp. 1811-1816 (December 1999). The granularity of the embedding may range from coarse embedding, for which each quality layer is generated from two or more bitplanes encoded during a single pass, to fine embedding, for which a quality layer (possibly subdivided into significance and refinement layers) is generated from a single bitplane encoded during a single pass. With this approach, multiple bitplanes may be encoded during each pass through the coefficients, thus reducing the number of passes. Since truncation of the bitstream operates first on the final quality layer, the granularity of the embedding typically includes only a single bit plane (the least significant bitplane after any quantization) in the final embedded layer. If the truncation of the embedded bitstream is limited to this final (single bit plane) layer, then the rate-distortion performance is not degraded in comparison to a fully embedded bitstream.
It is thus desirable to provide improved techniques for encoding data which provide some of the progressive and layered coding properties of embedded coding without requiring repeated passes through the coefficients for the encoding/decoding of individual bitplanes. If such a method can be achieved as a natural byproduct of the entropy coding method, then many of the benefits of embedded coding can be obtained with significant computational savings and a simpler implementation.