1. Field of the Invention
This invention relates in general to data processing, and more particularly to improving approximations used in performance-sensitive transformations which contain sub-transforms.
2. Description of the Related Art
Transforms, which take data from one domain (e.g., sampled data) to another (e.g., frequency space), are used in many signal and/or image processing applications, including, but not limited to, data analysis, feature identification and/or extraction, signal correlation, and data compression. Many of these transforms require efficient implementation for real-time and/or fast execution, whether or not compression is used as part of the data processing.
Data compression is desirable in many data handling processes where the volume of data is too large to be used practically. Commonly, compression is used in communication links to reduce transmission time or required bandwidth. Similarly, compression is preferred in image storage systems, including digital printers and copiers, where “pages” of a document to be printed may be stored temporarily in memory. Here the amount of media space on which the image data is stored can be substantially reduced with compression. Generally speaking, scanned images, i.e., electronic representations of hard copy documents, are often large, and thus make desirable candidates for compression.
In data processing, data is typically represented as a sampled discrete function. The discrete representation is either made deterministically or statistically. In a deterministic representation, the point properties of the data are considered, whereas, in a statistical representation, the average properties of the data are specified. In particular examples referred to herein, the terms images and image processing will be used. However, those skilled in the art will recognize that the present invention is not meant to be limited to processing still images but is applicable to processing different data, such as audio data, scientific data, sensor data, video data, etc.
In a digital image processing system, digital image signals are formed by first dividing a two-dimensional image into a grid. Each picture element, or pixel, in the grid has associated therewith a number of visual characteristics, such as brightness and color. These characteristics are converted into numeric form. The digital image signal is then formed by assembling the numbers associated with each pixel in the image into a sequence which can be interpreted by a receiver of the digital image signal.
Signal and image processing frequently require converting the input data into transform coefficients for the purposes of analysis. Often only a quantized version of the coefficients is needed (e.g., JPEG/MPEG data compression or audio/voice compression). Many such applications must run in real time, such as the generation of JPEG data for high speed printers.
The data signal processing industry is under pressure to find the fastest and most effective methods of performing digital signal processing. As in the field of compression generally, research is highly active and competitive in the field of fast transform implementation. Researchers have made a wide variety of attempts to exploit the strengths of the hardware intended to implement the transforms by exploiting properties found in the transform and inverse transform.
One such technique is the ISO 10918-1 JPEG International Standard/ITU-T Recommendation T.81. The draft JPEG standard is reproduced in Pennebaker and Mitchell, JPEG Still Image Data Compression Standard, New York, Van Nostrand Reinhold, 1993, incorporated herein by reference. One image analysis method defined in the JPEG standard, as well as other emerging compression standards, is discrete cosine transform (DCT) coding. With DCT coding, images are decomposed using a forward DCT (FDCT) and reconstructed using an inverse DCT (IDCT). An excellent general reference on DCTs is Rao and Yip, “Discrete Cosine Transform: Algorithms, Advantages and Application”, New York, Academic Press, 1990, incorporated herein by reference. It will be assumed that those of ordinary skill in this art are familiar with the contents of the above-referenced books.
It is readily apparent that if still images present storage problems for computer users and others, motion picture storage problems are far more severe, because full-motion video may require up to 60 images for each second of displayed motion pictures. Therefore, motion picture compression techniques have been the subject of yet further development and standardization activity. Two important standards are ISO 11172 MPEG International Standard and ITU-T Recommendation H.261. Both of these standards rely in part on FDCT coding and IDCT decoding.
DCT is an example of a linear transform algorithm, and in such transforms it is common for floating point constants to be used in multiplication operations. However, floating point multiplication operations are expensive in terms of processor computations, and consequently slow the execution of the transform. As a result, in applications in which the speed of processing is important, such as JPEG/MPEG compression, designers seek to replace these floating point multiplications with integer multiplication operations, which are faster to execute. Current designs demonstrate three general approaches by which this is achieved:
“Development of Integer Cosine Transforms by the Principle of Dyadic Symmetry”, Cham, W.-K., IEE Proceedings, Vol. 136, Pt. 1, No. 4, August 1989, describes replacing the floating point multiplications with multiplications done in fixed precision, i.e., approximating the floating point constant with an integer.
“Multiplierless Approximation of Transforms with Adder Constraint”, Chen, Ying-Jui, Soontorn Oraintara, Trac D. Tran, Kevin Amaratunga, Truong Q. Nguyen, IEEE Signal Processing Letters, Vol. 9, No. 11, November 2002, describes approximating the floating point constant multiplication or integer multiplication with a series of shift and add operations. In this approach, the goal is to implement the multiplication operation in terms of shift and add operations on the multiplicand.
U.S. Pat. No. 6,766,341, “Fast transform using scaled terms”, to IBM Corp., describes approximating the floating point constant by finding a ratio (i.e., an integer numerator and an integer denominator) in which the numerator represents the bit patterns to be used in shift/add operations (as in “Multiplierless Approximation of Transforms with Adder Constraint” above), and the denominator scales the final result to achieve the accuracy of the approximation. Note that in this case, the shifts and adds are done during transform processing, and the denominator (a divide operation or a multiplication by the inverse) is folded into the quantization step.
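The three approaches above can be sketched as follows. This is a minimal illustration only, not code taken from any of the cited references: the constant cos(π/8), the 8-bit fixed precision, the ratio 12/13, the quantization step, and all function names are assumptions chosen for the example.

```python
import math

# Example floating point constant (an assumption for illustration), about 0.92388.
C = math.cos(math.pi / 8)

# Approach 1: fixed precision. Approximate C by an integer over a power of two.
FRAC_BITS = 8
C_FIXED = round(C * (1 << FRAC_BITS))  # integer approximation of C * 256

def mul_fixed(x: int) -> int:
    """Integer multiply, then shift right to drop the fractional bits."""
    return (x * C_FIXED) >> FRAC_BITS

# Approach 2: shifts and adds. The integer constant 237 equals 256 - 16 - 2 - 1,
# so the multiplication can be expressed without a multiplier.
def mul_shift_add(x: int) -> int:
    """Same product as mul_fixed, using only shifts, adds and subtracts."""
    return ((x << 8) - (x << 4) - (x << 1) - x) >> FRAC_BITS

# Approach 3: ratio. Approximate C by NUM / DEN, where NUM has a cheap bit
# pattern (12 = 8 + 4) and DEN is deferred to the quantization step.
NUM, DEN = 12, 13  # 12/13 is about 0.92308

def mul_numerator(x: int) -> int:
    """Transform-time work: two shifts and one add; result is scaled by DEN."""
    return (x << 3) + (x << 2)

def quantize_scaled(y_scaled: int, q: int) -> int:
    """Quantization with the denominator folded into the step size."""
    return round(y_scaled / (DEN * q))
```

In the third approach the transform itself never divides by the denominator; the single division in `quantize_scaled` absorbs both the hypothetical quantization step `q` and `DEN`.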
Further, the strategy of factoring a transform into its sub-transforms is a known technique used to simplify the execution of a transform. For example, “Fast Multiplierless Approximations of the DCT With the Lifting Scheme”, Jie Liang, Trac D. Tran, IEEE Transactions on Signal Processing, Vol. 49, No. 12, December 2001, discloses considering a DCT in terms of sub-transforms and performing the sub-transforms in lifting steps.
Also, “Fast Algorithms for the Discrete W Transform and for the Discrete Fourier Transform”, Zhongde Wang, IEEE Transactions on Acoustics, Speech and Signal Processing, Vol. ASSP-32, No. 4, August 1984, considers factoring a transform into its sub-transforms. These sub-transforms are matrices which are used to reduce the computation required to produce a result.
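As a small illustration of factoring into sub-transforms, the sketch below splits an unnormalized 4-point DCT-II into an addition-only butterfly stage followed by an even and an odd sub-transform. It is a textbook-style example under assumed conventions (no normalization, hypothetical function names), not the specific factorization of either cited paper.

```python
import math

def dct4_direct(x):
    """Unnormalized 4-point DCT-II as a full matrix product (up to 16 multiplies)."""
    return [
        sum(x[n] * math.cos(math.pi * (2 * n + 1) * k / 8) for n in range(4))
        for k in range(4)
    ]

def dct4_factored(x):
    """Same transform, factored into a butterfly stage plus two sub-transforms."""
    c1 = math.cos(math.pi / 8)
    c2 = math.cos(math.pi / 4)
    c3 = math.cos(3 * math.pi / 8)

    # Butterfly stage: additions and subtractions only.
    a0, a1 = x[0] + x[3], x[1] + x[2]
    b0, b1 = x[0] - x[3], x[1] - x[2]

    # Even sub-transform: one multiplication.
    X0 = a0 + a1
    X2 = c2 * (a0 - a1)

    # Odd sub-transform: a 2x2 matrix with four multiplications.
    X1 = c1 * b0 + c3 * b1
    X3 = c3 * b0 - c1 * b1

    return [X0, X1, X2, X3]
```

The factored form uses five multiplications by constants instead of a full matrix product, and each of those constants is a candidate for the approximations described earlier.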
However, replacing floating point operations with fast approximations, and factoring transform equations into sub-transforms, is actually a multi-criteria optimization problem. Criterion one is to find approximations and sub-transforms that are quick to execute. This criterion refers to the “cost” in terms of shifts and adds: the greater the number of shift and add operations, the greater the total cost of executing them. Criterion two (equal in importance to criterion one) is to mitigate any error in the final transform output that results from the approximations. As demonstrated in the prior art, scientists and engineers use different approaches to find fast transforms and good approximations, but in general their approaches rely on heuristics, and sometimes guesses, about what truly constitutes a good balance between speed and accuracy. The result is algorithms in which accuracy is sacrificed in the pursuit of optimal cost.
Accordingly, what is needed is a way of improving the approximations used in performing fast transforms.
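The tension between the two criteria can be made concrete with a short sketch. It enumerates power-of-two fixed-precision approximations of one constant and reports, for each, a cost (the number of adds implied by a plain binary expansion) and the approximation error. The cost model, the function names, and the choice of constant are assumptions for illustration; signed-digit or ratio-based forms would give different costs.

```python
import math

def add_cost(n: int) -> int:
    """Adds needed to multiply by n using its plain binary expansion."""
    return bin(n).count("1") - 1

def candidates(c: float, max_bits: int):
    """List (bits, integer, cost, error) for each precision up to max_bits."""
    out = []
    for bits in range(1, max_bits + 1):
        n = round(c * (1 << bits))
        if n == 0:
            continue  # constant too small to represent at this precision
        err = abs(n / (1 << bits) - c)
        out.append((bits, n, add_cost(n), err))
    return out

def best_within_budget(c: float, max_bits: int, budget: int):
    """Lowest-error candidate whose cost does not exceed the add budget."""
    allowed = [cand for cand in candidates(c, max_bits) if cand[2] <= budget]
    return min(allowed, key=lambda cand: cand[3], default=None)
```

For example, calling `best_within_budget(math.cos(math.pi / 8), 8, budget)` with a larger `budget` can only keep or reduce the error, which is the speed versus accuracy trade-off described above.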