Various digital applications, such as digital video, involve the processing, storage, and transmission of relatively large amounts of digital data. To reduce the amount of digital data that must be stored and transmitted in conjunction with digital applications, various digital coding techniques, e.g., transform encoding techniques, have been developed. Discrete cosine transform (DCT) encoding is a particularly common form of transform encoding.
One text dedicated to a discussion of DCT encoding and its use is K. R. Rao, P. Yip, Discrete Cosine Transform: Algorithms, Advantages, Applications, Academic Press, Inc. (1990) (hereinafter referred to as "the K. Rao and P. Yip reference").
DCT encoding is finding widespread use in the field of digital signal processing and digital image coding in particular. In fact, at least one standard proposed for the coding of motion pictures, commonly referred to as the MPEG-II standard, described in ISO/IEC 13818-2 (1994) Generic Coding of Moving Picture and Associated Audio Information: Video (hereinafter referred to as the "MPEG-2" reference), relies heavily on the use of DCT coding.
When using DCT encoding in the image signal processing context, a DCT encoding operation is performed on digital data representing an image. This is usually done prior to storage or transmission to thereby reduce data storage and/or transmission requirements. A DCT image encoding operation results in what is frequently referred to as a block of DCT coefficients. One commonly used block size is an 8×8 block of DCT coefficients. In such embodiments, the 8×8 block of DCT coefficients is frequently used to represent a corresponding block of 64 image pixels. FIG. 1A illustrates an 8×8 block of DCT coefficients comprising 8 rows (R1-R8) and 8 columns (C1-C8) of DCT coefficients, where each coefficient is represented by an X.
Prior to displaying an image represented by a block of DCT coefficients, an inverse discrete cosine transform (IDCT) operation is normally performed to restore the image data into a format that is suitable for display and/or additional processing. For example, an IDCT operation is normally used to transform image data in the DCT domain, e.g., a block of DCT coefficient values, into data in the pixel domain, e.g., a block of pixel values.
Known IDCT circuits may be characterized as full order IDCT circuits, full order reduced complexity IDCT circuits, or reduced order IDCT circuits.
A full order IDCT circuit performs an IDCT operation on a block of DCT coefficients having the same number of DCT coefficient values as the original DCT coefficient blocks generated at encoding time. For example, if a DCT encoding operation produced 8×8 blocks of DCT coefficients, a full order IDCT operation would involve performing an 8×8 IDCT operation.
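By way of illustration only, a full order 8×8 IDCT operation may be sketched in software as follows. The sketch assumes an orthonormal DCT-II/DCT-III pair computed directly from its definition; the function names (dct_1d, idct_1d, transform_2d) are hypothetical, and no claim is made that any known IDCT circuit is organized this way.

```python
import math

def dct_1d(x):
    """Orthonormal DCT-II of a sequence (assumed normalization)."""
    n = len(x)
    out = []
    for k in range(n):
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        out.append(scale * sum(
            x[i] * math.cos((2 * i + 1) * k * math.pi / (2 * n))
            for i in range(n)))
    return out

def idct_1d(c):
    """Orthonormal inverse DCT (DCT-III); undoes dct_1d."""
    n = len(c)
    out = []
    for i in range(n):
        total = 0.0
        for k in range(n):
            scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
            total += scale * c[k] * math.cos((2 * i + 1) * k * math.pi / (2 * n))
        out.append(total)
    return out

def transform_2d(block, f):
    """Apply the 1-D transform f to every row, then to every column."""
    rows = [f(list(r)) for r in block]
    cols = [f(list(c)) for c in zip(*rows)]
    return [list(r) for r in zip(*cols)]

# An arbitrary 8x8 block of pixel values used only as test data.
pixels = [[float((r * 8 + c) % 256) for c in range(8)] for r in range(8)]

coeffs = transform_2d(pixels, dct_1d)     # 8x8 block of DCT coefficients
restored = transform_2d(coeffs, idct_1d)  # full order 8x8 IDCT
```

Because the input to the IDCT has the same 8×8 size as the block produced at encoding time, the restored block matches the original pixel block to within floating point rounding.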
Full order reduced complexity IDCT circuits force one or more of the DCT coefficients being processed to assume a value of zero, or treat one or more DCT coefficients of a block of DCT coefficients being processed as having a value of zero. This allows various computational simplifications to be made allowing for a full order reduced complexity IDCT circuit to be implemented with fewer multiplications than a regular full order IDCT. Such an IDCT processing approach is discussed in U.S. Pat. No. 5,635,985, which is assigned to the same assignee as the present application, and which is hereby expressly incorporated by reference. Such an IDCT operation is characterized as a reduced complexity full order IDCT operation because the IDCT calculation is based on a DCT coefficient block of the same size as the input to a full order IDCT circuit, but requires fewer computations to perform than a normal full order IDCT operation since some of the DCT values are treated as zero.
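The computational saving may be illustrated with a one-dimensional sketch. It is not the method of U.S. Pat. No. 5,635,985; it merely assumes an orthonormal 8-point IDCT with precomputed basis values, counts one multiplication per coefficient-basis product, and treats every coefficient with index at or above a hypothetical parameter keep as zero.

```python
import math

def idct_1d_skip_zeros(coeffs, keep):
    """8-point style IDCT that treats coeffs[keep:] as zero.

    The output still has len(coeffs) samples (full order), but the
    products for the coefficients treated as zero are never formed.
    Returns (samples, number_of_multiplications).
    """
    n = len(coeffs)
    out = []
    mults = 0
    for i in range(n):
        total = 0.0
        for k in range(min(keep, n)):
            scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
            basis = scale * math.cos((2 * i + 1) * k * math.pi / (2 * n))
            total += coeffs[k] * basis  # the one multiplication counted
            mults += 1
        out.append(total)
    return out, mults

# Illustrative coefficients whose upper half is already zero.
coeffs = [10.0, 5.0, -3.0, 2.0, 0.0, 0.0, 0.0, 0.0]

full, m_full = idct_1d_skip_zeros(coeffs, 8)  # regular full order IDCT
red, m_red = idct_1d_skip_zeros(coeffs, 4)    # upper four treated as zero
```

Under these assumptions both calls produce eight output samples, while the second forms half as many products as the first.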
As the name implies, reduced order IDCT circuits perform reduced, as opposed to full order, IDCT operations. A reduced order IDCT operation involves performing an IDCT operation using a DCT block size which is smaller than the block size generated at DCT encoding time. The reduction in DCT block size may be achieved by using a zonal filter to select a subblock of DCT coefficients to be used when performing the IDCT operation. For example, if a DCT encoding operation produced an 8×8 block of DCT coefficients, a reduced order IDCT operation would involve performing an IDCT operation on, e.g., a 4×4 subblock of DCT coefficients selected from a larger 8×8 block. The remaining 48 DCT coefficients of the 8×8 block are not used in the IDCT operation. FIG. 1B illustrates an 8×8 block of DCT coefficients from which the 4×4 block of coefficients located in the upper left hand corner have been selected for use when performing a reduced order IDCT operation.
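The subblock selection described above may be sketched as follows. The function name zonal_filter and the placeholder coefficient values are assumptions introduced only for illustration.

```python
def zonal_filter(block, size):
    """Keep the size x size subblock in the upper left hand corner."""
    return [row[:size] for row in block[:size]]

# Placeholder 8x8 block; each entry simply encodes its own position.
block = [[r * 8 + c for c in range(8)] for r in range(8)]

sub = zonal_filter(block, 4)
kept = sum(len(row) for row in sub)
discarded = len(block) * len(block[0]) - kept
```

For an 8×8 input and a 4×4 selection, 16 coefficients are retained and the remaining 48 are not used.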
IDCT operations and various other digital image processing operations generally involve performing a substantial number of multiplication and/or addition operations. These operations are performed to, e.g., implement a data matrix manipulation operation required to perform the desired IDCT or image processing operation.
In many applications, transform encoded images are received, decoded by performing an IDCT operation, and displayed in real time. For real time video, IDCT operations must be performed in a time period that is equal to, or less than, the time used to display the image represented by the transform coded data being decoded. In many cases, achieving real time video processing requires the use of fast and relatively expensive IDCT circuits because of the large amount of data and the large number of multiplications and additions which must be performed in a relatively short amount of time.
Prior to display, in addition to an IDCT operation, image processing operations such as downsampling may be performed. Downsampling is generally used to refer to a video data reduction operation that results in some reduction in image resolution.
Operations performed on DCT coded data prior to fully completing an IDCT operation are normally referred to as being performed in the DCT domain. Operations performed on video data after completion of an IDCT operation are generally referred to as being performed in the pixel domain. Various video processing operations tend to be easier to implement in the pixel domain. For this reason, many video processing operations are performed subsequent to completion of an IDCT operation.
A known full order IDCT circuit 20 followed by a downsampling circuit 22 is illustrated in FIG. 2A. Note that the input to the full order IDCT circuit 20 is an N×N block of DCT coefficients, e.g., the 8×8 block of DCT coefficients illustrated in FIG. 1A. The output of the full order IDCT circuit 20 is an N×N block of pixel values, i.e., a block of pixel values which is the same size as the input block of coefficient values. The downsampling circuit 22 performs a low pass filtering and decimation operation on the N×N block of pixel values input thereto to generate a reduced size block of pixel values. For example, assuming that the input to the IDCT circuit 20 was an 8×8 block of DCT coefficients and the downsampling circuit 22 performed decimation by a factor of 2 in both the horizontal and vertical directions, the output of the downsampling circuit 22 would be a 4×4 block of pixel values.
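A pixel domain downsampling operation of the kind performed after a full order IDCT may be sketched as follows. The particular low pass filter is an assumption: a simple 2×2 averaging filter is used here because the text specifies only that low pass filtering and decimation are performed. The function name downsample_2x2 is hypothetical, and the input is assumed to be a square block with an even number of rows and columns.

```python
def downsample_2x2(pixels):
    """Average each 2x2 neighborhood, then keep one value per neighborhood.

    This combines a crude low pass filter with decimation by a factor
    of 2 in both the horizontal and vertical directions.
    """
    n = len(pixels)
    out = []
    for r in range(0, n, 2):
        row = []
        for c in range(0, n, 2):
            total = (pixels[r][c] + pixels[r][c + 1] +
                     pixels[r + 1][c] + pixels[r + 1][c + 1])
            row.append(total / 4.0)
        out.append(row)
    return out

# Placeholder 8x8 block of pixel values, standing in for the IDCT output.
pixels = [[r * 8 + c for c in range(8)] for r in range(8)]

small = downsample_2x2(pixels)  # 4x4 block of pixel values
```

Note that these additions and divisions are performed in addition to the operations of the IDCT itself.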
When separate additional processing operations, such as downsampling, are performed on digital image data in conjunction with an IDCT operation, the number of math operations that must be performed on a digital image prior to its display is increased beyond the already high number involved in performing the IDCT operation alone. In the context of real time image processing this can result in the use of additional, relatively expensive operators in addition to those already required to perform an IDCT operation.
Generally, images represented by transform coded data may be interlaced or non-interlaced. Interlaced images are formed from a composite of two different images or fields with, e.g., even lines corresponding to a first image and odd lines corresponding to a second image. FIG. 1C illustrates an interlaced image 110 which comprises the combination of a first image 102 and a second image 103. Note that the even lines of pixels R0, R2, R4 and R6 of image 110 are from the first image 102 while the odd lines of pixels R1, R3, R5, and R7 are from the second image 103. In the case where the images correspond to a motion sequence, the first and second images 102, 103 will be different. The combining or averaging of pixel values from different fields of an interlaced image as part of a low pass filtering operation, e.g., during downsampling, can result in distortions being introduced into the image. This is because the averaging involves combining pixel values from two distinct images. In the case of non-interlaced images, where all the pixels of an image correspond to a single image, this problem does not occur. As will be discussed below, one embodiment of the present invention addresses the above discussed problem associated with performing downsampling on interlaced video images.
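The distortion problem may be illustrated with a small numerical sketch. It only demonstrates the problem and is not a description of any embodiment. It assumes an 8 line frame in which every pixel of the first field has the value 100 and every pixel of the second field has the value 200, and it uses averaging of adjacent line pairs as a stand-in for vertical low pass filtering. The function names are hypothetical.

```python
def split_fields(frame):
    """Separate an interlaced frame into its even-line and odd-line fields."""
    return frame[0::2], frame[1::2]

def average_line_pairs(lines):
    """Average each pair of vertically adjacent lines (vertical decimation by 2)."""
    return [
        [(a + b) / 2.0 for a, b in zip(lines[i], lines[i + 1])]
        for i in range(0, len(lines) - 1, 2)
    ]

# Even lines (R0, R2, R4, R6) come from the first image, odd lines from the second.
frame = [[100] * 4 if r % 2 == 0 else [200] * 4 for r in range(8)]

# Frame basis: adjacent lines belong to different fields, so values get mixed.
frame_based = average_line_pairs(frame)

# Field basis: each field is filtered separately, so no mixing occurs.
first_field, second_field = split_fields(frame)
field_based = (average_line_pairs(first_field), average_line_pairs(second_field))
```

Under these assumptions every frame based output value is 150, a value present in neither source image, whereas the field based outputs retain the values 100 and 200 of their respective fields.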
While being a very different operation from performing a full order IDCT operation followed by downsampling, performing a zonal filtering operation on a block of DCT coefficients followed by a reduced order IDCT operation can provide visual results which are similar to the combination of performing a full order IDCT with downsampling.
FIG. 3 illustrates the use of a zonal filter 32 followed by a reduced order IDCT circuit 34. The zonal filter 32 discards all but a selected sub-block of DCT coefficients, e.g., a 4×4 block of coefficients, from the larger N×N, e.g., 8×8, block of coefficients input thereto. Assuming that the block of input DCT coefficients corresponds to the 8×8 block of coefficients illustrated in FIG. 1B, the zonal filter would discard all but the sub-block of coefficients in the upper left hand corner of the input block of coefficients. The retained transform coefficients are supplied to the input of a reduced order, e.g., N/2×N/2, IDCT circuit 34. The reduced order IDCT circuit 34 generates a block of pixel values equal in number to the DCT coefficients input thereto. The use of a system of the type illustrated in FIG. 3 is described at page 143 of the K. Rao and P. Yip reference.
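The arrangement of a zonal filter followed by a reduced order IDCT may be sketched as follows. The sketch assumes an orthonormal DCT normalization and, under that assumption, scales the retained coefficients by size/N so that the average brightness of the block is preserved. The scaling step and all function names are assumptions made for illustration and are not taken from the K. Rao and P. Yip reference.

```python
import math

def idct_1d(c):
    """Orthonormal inverse DCT (DCT-III) of a sequence."""
    n = len(c)
    out = []
    for i in range(n):
        total = 0.0
        for k in range(n):
            scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
            total += scale * c[k] * math.cos((2 * i + 1) * k * math.pi / (2 * n))
        out.append(total)
    return out

def idct_2d(block):
    """Apply idct_1d to every row, then to every column."""
    rows = [idct_1d(list(r)) for r in block]
    cols = [idct_1d(list(c)) for c in zip(*rows)]
    return [list(r) for r in zip(*cols)]

def zonal_filter(block, size):
    """Keep the size x size subblock in the upper left hand corner."""
    return [row[:size] for row in block[:size]]

def reduced_order_idct(block, size):
    """Zonal filter followed by a size x size IDCT."""
    n = len(block)
    scale = size / n  # assumption: compensates for the smaller orthonormal transform
    sub = [[v * scale for v in row] for row in zonal_filter(block, size)]
    return idct_2d(sub)

coeffs = [[0.0] * 8 for _ in range(8)]
coeffs[0][0] = 400.0  # DC term of a flat 8x8 block of value 50 (orthonormal scaling)
coeffs[7][7] = 25.0   # high frequency term that the zonal filter discards

small = reduced_order_idct(coeffs, 4)  # 4x4 block of pixel values
```

The output block contains 16 pixel values, one for each of the 16 retained coefficients, and the discarded high frequency term has no effect on the result.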
Because the approach illustrated in FIG. 3 involves only the selection of a sub-block of received DCT coefficients to process and the performing of an IDCT operation only on the limited number of coefficients included in the selected sub-block, it can be implemented using fewer math operators than the combination of a full order IDCT circuit followed by a downsampling circuit as illustrated in FIG. 2.
Unfortunately, the downsampling ratios which can be achieved using reduced order IDCTs are somewhat limited. In some cases, e.g., in the case of interlaced images which comprise two distinct fields, the use of reduced order IDCTs may result in an undesirable degradation in the quality of a decoded video image generated from the interlaced video data processed by a reduced order IDCT circuit. It should also be noted that the reduced order IDCT is, in general, mathematically different from the downsampled full order IDCT applied to zonally filtered DCT coefficients.
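The mathematical difference noted above may be checked numerically in one dimension. The sketch below assumes an orthonormal DCT, a 1/sqrt(2) scaling of the coefficients supplied to the 4-point IDCT, and averaging of adjacent sample pairs as the downsampling filter; each of these is an illustrative assumption rather than a requirement of any known circuit.

```python
import math

def idct_1d(c):
    """Orthonormal inverse DCT (DCT-III) of a sequence."""
    n = len(c)
    out = []
    for i in range(n):
        total = 0.0
        for k in range(n):
            scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
            total += scale * c[k] * math.cos((2 * i + 1) * k * math.pi / (2 * n))
        out.append(total)
    return out

def average_pairs(x):
    """Average adjacent sample pairs (low pass filter plus decimation by 2)."""
    return [(x[i] + x[i + 1]) / 2.0 for i in range(0, len(x), 2)]

# Eight coefficients with only the first AC term non-zero.
coeffs = [0.0, 8.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]
kept = coeffs[:4]  # zonal filtering: retain the four lowest frequencies

# Path 1: reduced order (4-point) IDCT of the retained coefficients.
reduced = idct_1d([v / math.sqrt(2.0) for v in kept])

# Path 2: full order (8-point) IDCT of the zonally filtered coefficients,
# followed by downsampling in the sample domain.
downsampled = average_pairs(idct_1d(kept + [0.0] * 4))

max_gap = max(abs(a - b) for a, b in zip(reduced, downsampled))
```

Under these assumptions the two paths produce four samples each, but the samples are not identical: for the first AC coefficient, the downsampled full order result equals the reduced order result multiplied by cos(π/16), which is slightly less than one.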
In view of the above, it becomes apparent that there is a need for improved methods and apparatus for implementing IDCT operations particularly in combination with other image processing operations. It is desirable that any new methods require fewer, and/or simpler to implement, math operations than performing a separate full order IDCT operation in conjunction with an additional image processing operation. In the case of IDCT and downsampling operations in particular, it is desirable that a wide range of downsampling ratios be supported and that the DCT coefficients used as part of the operation need not be limited to a contiguous sub-block of a larger set of DCT coefficients representing the image being processed.
In order to provide for maximum image quality, it is also desirable, particularly when processing interlaced images, that portions of an image corresponding to different fields of an interlaced image be capable of being downsampled on a field as opposed to a frame basis.