1. Field of the Invention
The present invention relates to signal processing, including signal analysis, synthesis, approximation and compression. More particularly, the present invention relates to methods for efficient and precise compression of data, including image and video data.
2. Description of the Related Art
Wavelets are uniquely suited to analyzing and compactly representing signals by a computer where the signals represent a physical phenomenon such as a video image, an audio signal and so on. Wavelet analysis divides the signal data into components corresponding to different frequency ranges and retains, for each component, near-optimal time-domain information as appropriate to that frequency range (e.g., low frequency signals are by definition not well isolated in time). In contrast, Fourier Transform based signal analysis methods convert all of these frequency ranges into a single frequency spectrum, destroying all time-related information. (Time and frequency are general variables and may, for example, be replaced with distance and spatial frequency.)
Signals that contain a temporally well-isolated high frequency spike or edge are well suited for compact representation through wavelet analysis. By applying successive wavelet decompositions, low frequency signal features may also be compactly represented, leaving large numbers of adjacent coefficients of negligible magnitude. Wavelet analysis thus provides a general purpose approach for compactly representing both low and high frequency features.
For an input signal vector f, wavelet decomposition or, more generally, a sub-band decomposition is achieved by multiplication with two decimating matrices A, B (i.e., matrices that reduce the length of such vectors by half (1/2)), expressed as follows:
$$c = Af \tag{1}$$
$$d = Bf \tag{2}$$
where vector c is generally referred to as the approximation or "shape" of the original signal, and where vector d is referred to as the error or "detail" signal as it generally provides a critically downsampled vector containing the difference between the original and the shape vectors. If the length of the original vector f is N, then the dimensions of each of the decimating matrices A and B are (N/2 × N). Thus, the shape vector c and the detail vector d are each of length N/2.
Reconstruction of the original signal is achieved by multiplying the shape vector c and the detail vector d by two restoring matrices P and Q, which satisfy the following expression:
$$Pc + Qd = f \tag{3}$$
Substituting expressions (1) and (2) for the shape vector c and the detail vector d gives (PA+QB)f = f for every input vector f, so expression (3) can be reduced to the following equation:
$$PA + QB = 1 \tag{4}$$
where 1 is the identity matrix.
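As an illustration of expressions (1) through (4), the following minimal sketch builds one possible set of matrices and checks the reconstruction condition with exact rational arithmetic. The Haar averaging/differencing pair used here is an assumption chosen for simplicity; the text does not prescribe particular matrices A, B, P, Q, and the helper names are hypothetical.

```python
from fractions import Fraction

def haar_matrices(n):
    """Build A, B (n/2 x n) and P, Q (n x n/2) for the Haar pair.

    The Haar pair is an assumed example, not a choice made by the text.
    """
    half = n // 2
    h = Fraction(1, 2)
    A = [[Fraction(0)] * n for _ in range(half)]
    B = [[Fraction(0)] * n for _ in range(half)]
    P = [[Fraction(0)] * half for _ in range(n)]
    Q = [[Fraction(0)] * half for _ in range(n)]
    for k in range(half):
        A[k][2 * k], A[k][2 * k + 1] = h, h      # average of each pair
        B[k][2 * k], B[k][2 * k + 1] = h, -h     # half-difference of each pair
        P[2 * k][k], P[2 * k + 1][k] = Fraction(1), Fraction(1)
        Q[2 * k][k], Q[2 * k + 1][k] = Fraction(1), Fraction(-1)
    return A, B, P, Q

def matmul(X, Y):
    """Matrix product of two list-of-lists matrices."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)] for row in X]

def matvec(X, v):
    """Matrix-vector product."""
    return [sum(x * y for x, y in zip(row, v)) for row in X]

def matadd(X, Y):
    """Element-wise sum of two matrices."""
    return [[x + y for x, y in zip(rx, ry)] for rx, ry in zip(X, Y)]

def identity(n):
    """n x n identity matrix."""
    return [[1 if i == j else 0 for j in range(n)] for i in range(n)]

f = [4, 6, 10, 12, 8, 8, 0, 2]           # arbitrary input vector, N = 8
A, B, P, Q = haar_matrices(len(f))
c = matvec(A, f)                          # shape vector, expression (1)
d = matvec(B, f)                          # detail vector, expression (2)
restored = [x + y for x, y in zip(matvec(P, c), matvec(Q, d))]  # expression (3)
condition_holds = matadd(matmul(P, A), matmul(Q, B)) == identity(len(f))  # (4)
```

In this sketch c and d each have length N/2, and the reconstruction is exact because PA + QB equals the identity matrix for the assumed pair.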
An alternative approach, disclosed in E. Adelson & E. Simoncelli, Subband Image Coding with Three-tap Pyramids, Picture Coding Symposium 1990, MIT Media Laboratory, Cambridge, Mass., interleaves both the decimating matrices A, B and the resulting shape and detail vectors c, d. Accordingly, given a decomposition vector p, the following expression is obtained for the reconstruction of the original vector f:
$$f = Fp \tag{5}$$
where matrix F is an interleaved combination of matrices A and B, and the decomposition vector p is the corresponding interleaved combination of the shape and detail vectors c, d. According to the Adelson & Simoncelli approach, the interleaved combination matrix F is selected such that the following vector filters a, b alternate in the columns with offsets that place them on the diagonal of the matrix:
$$a = [\,1 \;\; 2 \;\; 1\,], \qquad b = [\,0 \;\; {-1} \;\; 2 \;\; {-1}\,] \tag{6}$$
It is to be appreciated that these filters a and b are orthogonal such that there is no overlap or redundancy in the resulting shape vector c and the detail vector d.
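The interleaved layout can be sketched as follows. Two points are assumptions of this sketch rather than statements from the text: the leading zero of b is read as a one-sample shift relative to a, so that each filter is centred on the diagonal of its own column, and the filters wrap around periodically at the ends of the matrix. The function names are hypothetical.

```python
def build_interleaved(n):
    """Return an n x n matrix with a = [1, 2, 1] in the even columns and the
    non-zero taps [-1, 2, -1] of b in the odd columns, centred on the diagonal.

    Periodic wrap-around at the boundaries is an assumption of this sketch.
    """
    a = (1, 2, 1)
    b = (-1, 2, -1)
    F = [[0] * n for _ in range(n)]
    for j in range(n):
        taps = a if j % 2 == 0 else b
        for offset, value in zip((-1, 0, 1), taps):
            F[(j + offset) % n][j] += value
    return F

def column(M, j):
    """Extract column j of a list-of-lists matrix."""
    return [row[j] for row in M]

def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(x * y for x, y in zip(u, v))

N = 8
F = build_interleaved(N)

# Inner product of every a-column with every b-column.
cross = [dot(column(F, i), column(F, j))
         for i in range(0, N, 2) for j in range(1, N, 2)]

# Inner product of two neighbouring a-columns, for comparison.
neighbour_a = dot(column(F, 0), column(F, 2))
```

Under these assumptions every entry of `cross` is zero, which matches the orthogonality of a and b. Neighbouring columns that carry the same filter still overlap by one tap in this sketch, so F as a whole is not an orthogonal matrix.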
According to the Adelson & Simoncelli approach, a reconstructing matrix G is determined by a simple inversion of the system, as illustrated by the following expression:
$$GF = 1 \tag{7}$$
which can alternatively be expressed as:
$$G = F^{-1} \tag{8}$$
where the values for the reconstruction matrix G can be determined for the various filter lengths. Referring back to Equation (5), the decomposition vector p can alternatively be expressed as follows:
$$p = F^{-1} f \tag{9}$$
One drawback to this approach is that the reconstruction matrix G, which, according to equation (8), is the inverse of the interleaved combination matrix F, is generally not sparse; that is, it is dense with non-zero coefficients, which generally also require floating point representation in a computer. Floating point operations may, in general, take twice as long for a computer to perform as integer operations. Furthermore, where floating point operations are used, either the results must be converted to integers, which costs further processing cycles and adds more noise, or a separate, larger array must be created to store the results of the decomposition process in floating point form, because floating point numbers require more bits of storage.
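The density of the inverse can be observed on a small case. The sketch below builds an 8 × 8 interleaved matrix F from the filters of expression (6), inverts it exactly over the rationals, and counts non-zero entries. The periodic boundary handling, the reading of the leading zero of b as a one-sample shift, the matrix size and the helper names are all assumptions of this sketch, not details taken from the Adelson & Simoncelli paper.

```python
from fractions import Fraction

def build_interleaved(n):
    """n x n matrix with [1, 2, 1] in even columns and [-1, 2, -1] in odd
    columns, centred on the diagonal (periodic wrap-around is assumed)."""
    F = [[0] * n for _ in range(n)]
    for j in range(n):
        taps = (1, 2, 1) if j % 2 == 0 else (-1, 2, -1)
        for offset, value in zip((-1, 0, 1), taps):
            F[(j + offset) % n][j] += value
    return F

def invert(M):
    """Exact Gauss-Jordan inverse using rational arithmetic."""
    n = len(M)
    aug = [[Fraction(x) for x in row] + [Fraction(int(i == j)) for j in range(n)]
           for i, row in enumerate(M)]
    for col in range(n):
        # Find a row with a non-zero pivot and move it into place.
        pivot = next(r for r in range(col, n) if aug[r][col] != 0)
        aug[col], aug[pivot] = aug[pivot], aug[col]
        scale = aug[col][col]
        aug[col] = [x / scale for x in aug[col]]
        # Eliminate the pivot column from every other row.
        for r in range(n):
            if r != col and aug[r][col] != 0:
                factor = aug[r][col]
                aug[r] = [x - factor * y for x, y in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]

def nonzeros(M):
    """Number of non-zero entries in a matrix."""
    return sum(1 for row in M for x in row if x != 0)

N = 8
F = build_interleaved(N)
G = invert(F)                                            # expression (8)

f = [3, 7, 1, 8, 2, 9, 4, 6]                             # arbitrary input vector
p = [sum(g * x for g, x in zip(row, f)) for row in G]    # expression (9)
f_back = [sum(F[i][j] * p[j] for j in range(N)) for i in range(N)]  # expression (5)

has_fractions = any(x.denominator != 1 for row in G for x in row)
print("non-zeros in F:", nonzeros(F))
print("non-zeros in G:", nonzeros(G))
print("G contains non-integer entries:", has_fractions)
```

F has only three non-zero integer entries per column, whereas its exact inverse in this sketch has more non-zero entries and they are not integers, which is the behaviour the paragraph above describes.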
The Adelson & Simoncelli approach approximates the reconstruction matrix G with between 15 and 21 non-zero coefficients along the diagonal, represented in floating point precision. The drawbacks to this approach are 1) its use of floating point numbers as opposed to integers, 2) its reliance on approximation, and therefore a loss of precision compared to a more precise and controlled approximation, and 3) its need for 15 to 21 coefficients rather than 3 coefficients, since, for efficient and precise operation, three integer computations are much more desirable than 15 to 21 floating point operations.
Digital Data Compression System Including Zerotree Coefficient Coding, U.S. Pat. No. 5,315,670 to Shapiro, discloses a conventional image compression technique where the results of the decomposition pass are floating point coefficients. In such a case, however, if the floating point coefficients are used in completing the compression, a very large number of floating point compares and adds are required. For example, the number of compares and adds/subtracts in the Shapiro method is on the order of one per bit. Therefore, given a 24-bit color image of 512 × 512 pixels (512 × 512 × 24 = 6,291,456 bits), the Shapiro method would require over 6 million compares and adds. Furthermore, the precision of the coefficients in Shapiro is significantly compromised by approximating the coefficients as integers. While the Shapiro approach may be adequate in some cases, its inherent encoding benefit, whereby successive refinements of the coefficients are used to reproduce the original sub-band coefficient matrix precisely, is lost. Moreover, even this approach requires a large number of floating point conversions.