This invention relates to techniques for data compression to realize more efficient data storage and transmission. More particularly, the invention is a family of processes which comprise a generalization of set partitioning in hierarchical trees for image compression that is computationally less complex and requires less memory overhead.
Wavelet-based compression of images for storage and transmission involving hierarchical subband decomposition is an attractive way for achieving needed computational and coding efficiency. A desirable property of compression schemes is that if the compressed bit stream is truncated at an arbitrary point, the bit stream contains a lower rate representation of the image. Consequently, bits are transmitted in the order of importance. This embedded property can be applied, for example, to budget a fixed number of bits per frame, such as in constant frame-rate synchronous bit rate video communications. It further allows coding to be terminated when a given distortion metric is met, as in archiving images.
One of the prior art approaches for compressing the wavelet coefficient array by exploiting its statistical properties, is a process known as set partitioning in hierarchical trees (hereinafter, xe2x80x9cSPIHTxe2x80x9d). The process is described in U.S. Pat. No. 5,764,807 to Pearlman et. al., issued June, 1998. SPIHT in turn is a refinement of a process known as Embedded Zerotree Wavelet (xe2x80x9cEZWxe2x80x9d) which is described in U.S. Pat. No. 5,315,670 to Shapiro, issued June 1994. These patents to the extent relevant are hereby incorporated by reference. By structuring the data in hierarchical tree sets that are likely to be highly correlated, the EZW process exploits the self-similarity of the wavelet transform at different scales. Pearlman et. al. teaches the further partitioning of the data into lists, to gain additional compression.
More specifically, the scheme of U.S. Pat. No. 5,764,807 requires partitioning the wavelet coefficients into a number of lists, with list membership changing as the execution proceeds, and in some cases involving dual membership of a coefficient in different lists. The constantly changing list memberships are difficult and expensive to maintain. Direct implementation of the process unavoidably requires an inner loop of execution that involves the steps xe2x80x9cread value and list membership information, compute significance information, derive and output encoding bits, and modify and save new list membership informationxe2x80x9d. The memory to preserve list membership information, the instructions to compute new list membership information, and I/O traffic (bandwidth) to save and retrieve that information, all contribute to process overhead and execution time.
With the above-described protocol, the execution speed of the processing on available DRAM hardware is slow. Implementing the SPIHT process in the manner described in U.S. Pat. No. 5,764,807 for HDTV-rate compression for example, requires memory access rates that are close to the limits of currently available commercial DRAM devices.
Therefore, a need exists for a compression scheme which can execute faster, thus accommodating the size and rate of available memory technology, and yet achieve the SPIHT level of performance.
It has been realized that more efficient compression may be gained by considering simply the tree structure and the significance values derived from the tree structure. In this conceptualization, lists are not needed. Bits are produced in any order desired, not in the order dictated by use of lists. Trees are traversed and bits are emitted to describe the relative magnitude of coefficients in the sub-trees of the current tree-node, in some appropriate way that is either known to the decoder or that can be derived by the decoder. Rather than accessing each Wavelet coefficient and associated parameters numerous times to generate an output, the invention teaches accessing each coefficient in a predetermined order. All bits emitted by a given node are produced as soon as that node is examined. Advantageously, the encoding itself (i.e., the bits that are emitted as well as the case of emitting no bits until a defined significance is detected) is essentially the same as that used in SPIHT.
In one embodiment, the traversal process produces relevant bits in a given tree before moving on to the next tree, in a scheme of xe2x80x9csubtrees first, then siblingsxe2x80x9d. It is more advantageous in certain circumstances to use a traversal scheme in which siblings of the current node are traversed, and thereafter the subtrees down from the current node are traversed. Alternatively, it may be advantageous to encode certain sub-trees only; or choose to refine some subset of nodes based on a variety of criteria, such as precomputed metrics or image subsets of particular interest. Traversal algorithms may be varied, e.g., pre-order and in-order. Variations on the process allow accessing the coefficients only once, or once per bit-plane. A further variation is to separate the produced bits into output queues for further reordering or other processing. For example, it may be of interest to use three output queues dependent in part on whether the current node was already emitting value v bits in the previous plane.
The fact that the magnitudes of the coefficients in the hierarchical tree data organization tend to decrease with depth is exploited by a coding scheme that uses specific definitions of significance for sets of transform coefficients. We define the significance B(v) of the coefficient associated with a node v to be the position of its most significant bit. We similarly define B(V) for a set of nodes V to be the maximum significance of all the nodes in the set. Two of the sets associated with a given node v are of particular interest: the set D1(v)of all descendants of v, and the set D2(v) of all grandchildren of v and their descendants. The associated significances B1(v)=B(D1(v)) and B2(v)=B(D2(v)) can be precomputed for all nodes in any given tree. The encoding scheme then specifies all bits emitted by a given node v as a function of the traversal of its parents (i.e., the node is reached), the bit-plane b, the coefficient value c of v, the significances B1(v) and B2(v), and whether the parent node p still emits B2(p).