In the context of the present invention, an image is a two-dimensional visual representation, wherein each point within the image may have associated therewith one or more characteristics. For example, for a monochrome image, each point may have associated therewith a luminance value. For a color image, each point may have associated therewith a red intensity, a blue intensity and a green intensity. Common image presentation technologies include printed photographic still images, movie images, television images, and computer images. Computer technology has now begun to open whole new areas of image presentation, such as high realism video games, electronic books, and others yet to reach commercialization. These latter forms of image presentation all use digital images, that is, images which are stored in digital, and usually binary, form.
Digital image signals are formed by first dividing a two-dimensional image into a grid. Each picture element, or pixel, in the grid has associated therewith a number of visual characteristics, such as brightness and color. These characteristics are converted into numeric form. The digital image signal is then formed by assembling the numbers associated with each pixel in the image in a sequence which can be interpreted by a receiver of the digital image signal.
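The assembly of pixel values into a sequence can be illustrated with a minimal sketch. The layout used here (a two-byte width and a two-byte height followed by 8-bit luminance values in row-by-row order) and the names encode_image and decode_image are assumptions made purely for illustration; the text above does not prescribe any particular ordering or header.

```python
def encode_image(pixels):
    """Flatten a grid of 8-bit luminance values into a byte sequence.

    Hypothetical layout: 2-byte width, 2-byte height (big-endian),
    then the pixel values in raster-scan order (row by row).
    """
    height = len(pixels)
    width = len(pixels[0]) if height else 0
    header = width.to_bytes(2, "big") + height.to_bytes(2, "big")
    body = bytes(value for row in pixels for value in row)
    return header + body


def decode_image(signal):
    """Rebuild the pixel grid from a sequence produced by encode_image."""
    width = int.from_bytes(signal[0:2], "big")
    height = int.from_bytes(signal[2:4], "big")
    body = signal[4:]
    return [list(body[r * width:(r + 1) * width]) for r in range(height)]


# A 3-wide, 2-high monochrome grid survives the round trip unchanged.
grid = [[0, 128, 255],
        [64, 192, 32]]
signal = encode_image(grid)
assert decode_image(signal) == grid
```

The receiver can interpret the sequence only because both sides agree on the ordering and on the number of bits per value, which is the role the header plays in this sketch.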
One reason that these emerging technologies have not appeared sooner is that uncompressed digital image signals contain vast amounts of information, requiring vast quantities of storage space. Furthermore, moving uncompressed digital image signals from one user to another requires a large communication bandwidth to accommodate the large amount of information in a reasonable period of time. Suppose that for a monochromatic (e.g., black and white) image 256 shades of gray are sufficient to represent a uniform luminance scale ranging from black to white. Each pixel occupies eight bits (binary digits) of storage. Thus an image created for display on a typical personal computer screen having a resolution of 640×480 pixels occupies a total of 307,200 bytes. That is the storage equivalent of approximately 100 pages of single-spaced text. Extrapolating, a color image can occupy three times that storage space.
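The storage figures above follow from simple arithmetic, reproduced in the short sketch below. The function name uncompressed_size_bytes and its parameters are hypothetical, and the color case assumes three 8-bit samples per pixel.

```python
def uncompressed_size_bytes(width, height, bits_per_sample=8, samples_per_pixel=1):
    """Storage needed for an uncompressed image, rounded up to whole bytes."""
    total_bits = width * height * bits_per_sample * samples_per_pixel
    return (total_bits + 7) // 8


mono = uncompressed_size_bytes(640, 480)                        # 256 gray levels
color = uncompressed_size_bytes(640, 480, samples_per_pixel=3)  # assumed 3 samples
print(mono)   # 307200
print(color)  # 921600
```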
In view of the tremendous pressure that the use of images places on storage requirements, there has been a great deal of research into image compression techniques. A standard known as ISO 10918-1 JPEG Draft International Standard/CCITT Recommendation T.81 has emerged as a result of this research. The standard is reproduced in Pennebaker and Mitchell, "JPEG: Still Image Data Compression Standard," New York, Van Nostrand Reinhold, 1993, incorporated herein by reference. One compression technique defined in the JPEG standard, as well as other emerging compression standards, is Discrete Cosine Transform (DCT) coding. Images compressed using DCT coding are decompressed using an inverse transform known as the inverse DCT (IDCT). An excellent general reference on DCTs is Rao and Yip, "Discrete Cosine Transform," New York, Academic Press, 1990, incorporated herein by reference. It will be assumed that those of ordinary skill in this art are familiar with the contents of the above-referenced books.
It is readily apparent that if still images present storage problems for computer users and others, motion picture storage problems are far more severe, because full-motion video may require up to 60 images for each second of displayed motion pictures. Therefore, motion picture compression techniques have been the subject of yet further development and standardization activity. Important standards include ITU-T Recommendations H.261, H.262 and H.263. The ITU-T Recommendation H.262 is commonly known as the MPEG-2 standard, after the Moving Picture Experts Group which developed it. These standards rely in part on DCT coding and IDCT decoding.
The DCT is applied, in accordance with these standards, to each image or video frame in a blockwise fashion. Block sizes that are powers of 2 (2, 4, 8, 16, etc.) are particularly suitable for computationally attractive, fast algorithms for the DCT. In practice, a block size of 8×8 is almost always used today for image coding.
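Blockwise application of the DCT can be sketched as follows. This is a direct, non-fast evaluation of the orthonormal DCT-II, chosen only for readability; the scaling convention, the function names, and the requirement that the image dimensions be multiples of the block size are all assumptions of this illustration rather than requirements of the standards.

```python
import math


def dct_1d(x):
    """Orthonormal DCT-II of a sequence, computed directly (not a fast algorithm)."""
    n = len(x)
    out = []
    for k in range(n):
        s = sum(x[i] * math.cos(math.pi * (2 * i + 1) * k / (2 * n)) for i in range(n))
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        out.append(scale * s)
    return out


def dct_2d(block):
    """Separable 2D DCT: transform every row, then every column."""
    rows = [dct_1d(row) for row in block]
    cols = [dct_1d(list(col)) for col in zip(*rows)]
    return [list(r) for r in zip(*cols)]  # transpose back to row-major order


def blockwise_dct(image, n=8):
    """Apply dct_2d to each n x n block; assumes dimensions are multiples of n."""
    height, width = len(image), len(image[0])
    blocks = {}
    for top in range(0, height, n):
        for left in range(0, width, n):
            block = [row[left:left + n] for row in image[top:top + n]]
            blocks[(top, left)] = dct_2d(block)
    return blocks


# A flat 16-wide, 8-high image splits into two 8x8 blocks, each of which
# has all of its energy in the lowest-order (DC) coefficient.
flat = [[100] * 16 for _ in range(8)]
coeffs = blockwise_dct(flat)
print(sorted(coeffs))                 # [(0, 0), (0, 8)]
print(round(coeffs[(0, 0)][0][0], 6)) # 800.0
```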
It will be apparent to those skilled in this art that the DCT is closely related to the discrete Fourier transform (DFT). In fact, the DCT can be interpreted as a DFT of the extended block of size 2N×2N, which is obtained by mirroring the original N×N block at its horizontal and its vertical edge. Thus, the DCT coefficients can be interpreted as spectral components of an image block. Low order DCT coefficients correspond to low frequency components in the signal, while high order DCT coefficients correspond to high frequency components.
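The relationship between the DCT and the DFT of a mirrored block can be checked numerically. The sketch below does so in one dimension for brevity, using an unnormalized DCT-II; the function names and the sample values are illustrative assumptions. For a length-N sequence mirrored to length 2N, the DFT coefficient of order k equals twice the DCT coefficient of order k multiplied by a phase factor exp(iπk/(2N)), so removing that phase and halving recovers the DCT.

```python
import cmath
import math


def dct_unnormalized(x):
    """Unnormalized DCT-II, evaluated directly from its definition."""
    n = len(x)
    return [sum(x[i] * math.cos(math.pi * (2 * i + 1) * k / (2 * n)) for i in range(n))
            for k in range(n)]


def dft(y):
    """Direct discrete Fourier transform (no fast algorithm)."""
    m = len(y)
    return [sum(y[i] * cmath.exp(-2j * math.pi * k * i / m) for i in range(m))
            for k in range(m)]


def dct_via_mirrored_dft(x):
    """Compute the same DCT from the DFT of the mirrored, length-2N sequence."""
    n = len(x)
    mirrored = list(x) + list(reversed(x))  # even-symmetric extension
    spectrum = dft(mirrored)
    # Undo the half-sample phase shift and the factor of 2 from the mirroring.
    return [0.5 * (cmath.exp(-1j * math.pi * k / (2 * n)) * spectrum[k]).real
            for k in range(n)]


samples = [52, 55, 61, 66, 70, 61, 64, 73]
direct = dct_unnormalized(samples)
via_dft = dct_via_mirrored_dft(samples)
assert all(abs(a - b) < 1e-9 for a, b in zip(direct, via_dft))
```

The two-dimensional case follows by applying the same argument along rows and then along columns.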
Given that processing digital image signals using DCT coding provides the desired degree of compression, the pressure on industry is now to find the fastest method by which to perform the DCT and IDCT. As in the field of compression generally, research is highly active and competitive in the field of fast DCT and fast IDCT implementation. Researchers have made a wide variety of attempts to exploit the strengths of the hardware intended to implement the DCT and IDCT by taking advantage of symmetries and other properties found in the transform and inverse transform, as they are used in practical systems. For example, the Applicants' own method and apparatus disclosed in their U.S. patent application Ser. No. 08/332,535, filed Oct. 31, 1994, pending, incorporated herein by reference, exploits the statistical properties of the transformed signal.
Sometimes, as discussed therein, the image signal does not require the full spatial resolution that is provided by a DCT based coding scheme. This is often true for the color difference signals. In many coding standards, the color difference signals are transmitted at nominally half the horizontal and vertical resolution compared to the luminance signals. For many natural scenes, a spatial resolution of one quarter of that of the luminance signal, both horizontally and vertically, would be sufficient. Consequently, the bandwidth of the color difference signals can be reduced by filtering, and the higher order coefficients of a DCT applied to this signal would be very small or even zero.
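Transmission of a color difference signal at half the horizontal and vertical resolution can be sketched as below. Averaging each 2×2 neighbourhood is an assumption made for this illustration, acting as a crude low-pass filter followed by decimation; the coding standards referred to above do not necessarily use this filter, and the function name subsample_2x2 is hypothetical.

```python
def subsample_2x2(plane):
    """Halve horizontal and vertical resolution by averaging 2x2 neighbourhoods.

    Assumes the plane has an even number of rows and columns.
    """
    height, width = len(plane), len(plane[0])
    return [[(plane[r][c] + plane[r][c + 1]
              + plane[r + 1][c] + plane[r + 1][c + 1]) / 4.0
             for c in range(0, width, 2)]
            for r in range(0, height, 2)]


# A 4x4 color difference plane is reduced to 2x2, one quarter of the samples.
chroma = [[0, 2, 4, 6],
          [2, 4, 6, 8],
          [10, 10, 20, 20],
          [10, 10, 20, 20]]
print(subsample_2x2(chroma))  # [[2.0, 6.0], [10.0, 20.0]]
```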
The number of computations required to perform a DCT increases with its order N. For N a power of 2, the computational complexity (i.e. the number of multiplications and additions) of the fastest DCT algorithms is proportional to N×log(N). Hence, a one-dimensional DCT of order N=8 is roughly 3 times as complex as a DCT of order N=4. For a 2D DCT, which can be computed as 2N one-dimensional DCTs of order N applied along the rows and columns of the block, an 8×8 DCT is roughly 6 times as complex as a DCT of order 4×4.
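The scaling of computational effort with the order N can be worked through with a simple cost model. The sketch assumes a cost of N·log2(N) per one-dimensional transform, with the proportionality constant ignored, and a row-column evaluation of the 2D transform (N row transforms followed by N column transforms). Both are modelling assumptions for illustration, and other decompositions of the 2D transform would give somewhat different ratios.

```python
import math


def dct_1d_cost(n):
    """Relative cost of a fast 1D DCT of order n, modelled as n * log2(n)."""
    return n * math.log2(n)


def dct_2d_cost(n):
    """Row-column model: n row transforms plus n column transforms of order n."""
    return 2 * n * dct_1d_cost(n)


print(dct_1d_cost(8) / dct_1d_cost(4))  # 3.0
print(dct_2d_cost(8) / dct_2d_cost(4))  # 6.0
```

Under this model, halving the transform order from 8 to 4 in both dimensions removes most of the arithmetic, which is why a signal that needs only its low order coefficients is attractive for fast implementations.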
It is desired to implement these functions in software, because to do so reduces hardware costs. Specialized hardware embodying a software DCT/IDCT could be made more flexible than an all-hardware implementation. Software which could run on a conventional PC, without special hardware, could eliminate the cost of such hardware entirely. This may be especially advantageous in fields such as video teleconferencing, where the participants are already likely to have access to PCs. A video teleconference system could be implemented at a fraction of the cost of current special-purpose hardware. Unfortunately, fast software DCT and IDCT implementations continue to suffer, relative to their hardware cousins, due to the unusual demands placed on the computer by the required arithmetic operations, particularly multiplications.