The present invention relates to multimedia images and digital communications, and more particularly to methods and arrangements for converting a high definition picture or image to a lower definition image using wavelet transforms.
Many different image and/or video sampling techniques are used in the coding, transmission and reproduction of multimedia images and/or signals such as, for example, still and moving pictures, video, and other related data signals such as audio. These techniques allow multimedia information to be properly coded, transmitted and reproduced by known hardware currently in use. Examples of such techniques are well known in the art and many are presented in the Revised Text for ITU-T Recommendation H.262 ISO/IEC 13818-2:1995, Information technology - Generic coding of moving pictures and associated audio information: Video, dated Mar. 31, 1995.
On Nov. 4, 1994, the ISO (International Organization for Standardization) Moving Picture Experts Group (MPEG) adopted a standard for audio/video digital compression known as MPEG-2. This standard allows for consistent digital signal sampling, coding, transmission and reception throughout the world and is well known in the art.
U.S. Pat. No. 5,262,854 issued to Ng on Nov. 16, 1993, entitled Lower Resolution HDTV Receivers, shows a receiver which decimates compressed HDTV digital video signal data to provide lower resolution NTSC images. This system allows high definition signals to be used on lower definition receivers which are currently more commonly in use than high definition receivers.
Similarly, there are many different types of video sampling techniques and digital component video formats commonly used in MPEG video coding. By way of example, there is a high definition 4:4:4 video format which defines the relative relationship between the luminance and chrominance components in a transmitted digital video color signal. In lower definition video sampling formats such as 4:2:2 and 4:2:0 there are fewer chrominance samples per luminance sample in the digital signal. All three of these sampling techniques are well known in the art. The higher definition sampling techniques contain more information and therefore produce higher resolution images.
Regardless of the sampling technique, an appropriate display apparatus, such as a monitor or flat panel display, is required to effectively reproduce the encoded image. Given the current development of higher resolution systems and apparatuses, a display that is capable of reproducing and displaying a higher resolution image can be very expensive. For example, a high definition television (HDTV) apparatus can cost several thousands of dollars. For many consumers, the cost of an HDTV can be prohibitive when compared to that of a standard definition television, such as, for example, an NTSC compatible apparatus which often costs less than a few hundred dollars.
There are similar cost issues for the producers and broadcasters of the video signals. Producing higher resolution images requires state of the art image recording and generating systems, and often requires that additional bandwidth be provided within the transmission channels in order to handle the increase in information (data) being provided to the consumers.
Broadcasters and consumers are also presented with the concern that there may be a period of time in which only a few consumers have higher resolution display apparatuses. This is especially a concern as the technology moves to the next generation of imagery which will incorporate HDTV as the standard.
Thus, there is a need for methods and arrangements that allow the remaining consumers, who possess lower definition television and imaging equipment, to receive the higher definition image data and convert this data to lower definition image data that can be displayed on the lower resolution displays.
HDTV digital video signal decoders are also well known in the art. In conventional MPEG-compatible decoders, there is typically an inverse discrete cosine transform (IDCT) process that is used to decode video-related data that was previously encoded using a discrete cosine transform (DCT) process.
The image data that is encoded/decoded by conventional encoders and decoders typically includes three (3) components per pixel. The components are luminance data (Yc), chrominance data (Uc) and chrominance data (Vc). For example, to display a high definition image of 1920 by 1080 pixels, a typical decoder would output 1920 by 1080 pixels of luminance-related data, and 960 by 540 pixels of chrominance-related data for each of the two chrominance components. In this example, the resulting data provides a 4:2:0 image having 1920 by 1080 pixels.
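By way of illustration only, the relationship between the sampling format and the amount of luminance and chrominance data can be sketched as follows. The function and table names are hypothetical and are not part of any referenced standard or decoder; the subsampling factors are the commonly used ones for the 4:4:4, 4:2:2 and 4:2:0 formats.

```python
# Horizontal and vertical chrominance subsampling divisors per format.
# These names are illustrative assumptions, not taken from the specification.
SUBSAMPLING = {
    "4:4:4": (1, 1),  # chrominance at full resolution
    "4:2:2": (2, 1),  # chrominance halved horizontally
    "4:2:0": (2, 2),  # chrominance halved horizontally and vertically
}


def plane_sizes(width, height, fmt):
    """Return ((luma_w, luma_h), (chroma_w, chroma_h)) for one frame."""
    h_div, v_div = SUBSAMPLING[fmt]
    return (width, height), (width // h_div, height // v_div)


def samples_per_frame(width, height, fmt):
    """Total samples: one luminance plane plus two chrominance planes."""
    (lw, lh), (cw, ch) = plane_sizes(width, height, fmt)
    return lw * lh + 2 * cw * ch


if __name__ == "__main__":
    for fmt in SUBSAMPLING:
        print(fmt, plane_sizes(1920, 1080, fmt), samples_per_frame(1920, 1080, fmt))
```

For a 1920 by 1080 image in the 4:2:0 format, this gives a 1920 by 1080 luminance plane and 960 by 540 chrominance planes, consistent with the example above.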
The known methods and arrangements for decimating or otherwise reducing the amount of image data attempt to create a subset of the image data that can then be displayed on a lower resolution display. To accomplish this "downscaling", the known methods and arrangements typically pre-parse or filter the received encoded image data. For example, these methods use masking techniques that eliminate particular data. The remaining portions of the encoded image data are then decoded, for example using a decoder having an IDCT process. The decoded image data is then filtered and/or decimated to further reduce the image for display on a lower resolution display.
By way of example, the amount of information used for a low definition image in certain decoders is one quarter the amount of information used for the original higher definition image. Thus, for a 1920 by 1080 pixel high definition image, the lower resolution image is 960 by 540 pixels.
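A masking step of the kind mentioned above might, for example, discard the higher frequency coefficients of each encoded block before decoding. The text does not state which coefficients the known methods eliminate, so the choice below (keeping only the low frequency top-left corner of an 8 by 8 coefficient block) is an assumption made purely for illustration, and the function name is hypothetical.

```python
def mask_high_frequencies(coeffs, keep=4):
    """Zero every coefficient outside the top-left keep-by-keep corner.

    `coeffs` is a square block of transform coefficients given as a list of
    rows. Which coefficients a real pre-parser drops is an assumption here.
    """
    return [
        [value if r < keep and c < keep else 0 for c, value in enumerate(row)]
        for r, row in enumerate(coeffs)
    ]


if __name__ == "__main__":
    block = [[1] * 8 for _ in range(8)]
    masked = mask_high_frequencies(block)
    # 16 of the 64 coefficients survive; the other 48 are discarded.
    print(sum(sum(row) for row in masked))
```

The discarded coefficients are not available to the later decoding stages, which is the loss of information before the IDCT process that such decoders exhibit.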
It is important to note that this type of known decoder essentially loses video-related information before and after the IDCT process. One result of losing video-related information is that the symmetry of the resulting decoded image can be adversely affected. The loss of symmetry in the resulting decoded image from this type of known decoder can result in a lower quality image, for example, a non-symmetrical 4:2:0 lower-resolution image.
FIGS. 1 and 2 show block diagram depictions of conventional digital video encoding/decoding transmission systems. FIG. 1 is a block diagram depicting a conventional system 100 having an encoder 102 that encodes an image file 104 containing image data 114. The output of encoder 102, i.e., encoded image data, is transmitted or otherwise provided to a decoder 108 through a transmission link 106.
Transmission link 106 can include one or more communication media and/or systems and supporting apparatuses that are configured to carry the encoded image data from encoder 102 to decoder 108. Examples of transmission link 106 may include, but are not limited to, a telephone system, a cable television system, a direct or an indirect broadcast television system, a direct or an indirect satellite broadcast system, one or more computer networks and/or buses, the Internet, an intranet, and any software, hardware and other communication systems and equipment associated therewith.
Decoder 108 decodes the received encoded image data and outputs an image 110 that is suitable for reproduction through a display 112. In certain conventional systems, encoder 102 and/or decoder 108 may include one or more processors that each are coupled to a memory. The processor(s) respond to computer implemented instructions stored within the memories to encode or decode image data 114, as required. In other conventional systems, encoder 102 and/or decoder 108 include logic that is configured to encode or decode image data 114, as required.
FIG. 2 is a block diagram depicting a conventional encoding/decoding/transmission system 100 that reduces a higher definition image 114 to a lower definition image 124 that can be displayed on a lower resolution display (not shown). System 100 includes an encoder 102 that encodes image data 114 using a DCT algorithm 116. Decoder 108, in FIG. 2, then operates on the coded image signal using a pre-parser algorithm 118, an IDCT algorithm 120 and a post filter algorithm 122, and outputs a reduced image 124. Pre-parser algorithm 118 decimates, filters, masks, and/or otherwise reduces the amount of encoded image data from encoder 102, and outputs a subset of the received encoded image data to the IDCT algorithm 120 for further processing.
The IDCT algorithm 120 then decodes the subset of the encoded image data and outputs the decoded image data to a post filter algorithm 122. Post filter algorithm 122 further processes and configures the decoded image data to produce a reduced image 124.
Post filter algorithm 122 typically decimates, filters and/or otherwise down-samples the decoded data. Reduced image 124 represents a lower definition image that is suitable for display on a lower resolution display.
FIG. 5 depicts conventional matrix operations 200 associated with DCT/IDCT algorithms. Matrix D is an 8 by 8 matrix (e.g., a macroblock) of image data that is multiplied by the 8 by 8 DCT/IDCT coefficient matrices C and CT (the transpose of C) to produce an 8 by 8 data matrix T.
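The matrix operations can be sketched as follows, assuming the conventional separable form in which the forward transform is T = C x D x CT and the inverse is D = CT x T x C, with C taken to be the orthonormal 8 by 8 DCT-II coefficient matrix. This is a plain illustration of the mathematics and is not the fast transform of Table 1; all function names are hypothetical.

```python
import math


def dct_matrix(n=8):
    """Orthonormal DCT-II coefficient matrix C (n by n)."""
    rows = []
    for k in range(n):
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        rows.append([scale * math.cos((2 * j + 1) * k * math.pi / (2 * n))
                     for j in range(n)])
    return rows


def transpose(m):
    return [list(col) for col in zip(*m)]


def matmul(a, b):
    """Plain matrix product of two matrices given as lists of rows."""
    b_cols = transpose(b)
    return [[sum(x * y for x, y in zip(row, col)) for col in b_cols] for row in a]


def dct2(d):
    """Forward 2D DCT of a square block: T = C * D * C^T."""
    c = dct_matrix(len(d))
    return matmul(matmul(c, d), transpose(c))


def idct2(t):
    """Inverse 2D DCT of a square block: D = C^T * T * C."""
    c = dct_matrix(len(t))
    return matmul(matmul(transpose(c), t), c)


if __name__ == "__main__":
    d = [[(r * 8 + c) % 17 for c in range(8)] for r in range(8)]
    restored = idct2(dct2(d))
    print(max(abs(restored[r][c] - d[r][c]) for r in range(8) for c in range(8)))
```

Because C is orthonormal under this assumption, applying the inverse operations to T recovers D up to floating point rounding, and a block of constant value produces a single nonzero coefficient in the top-left position of T.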
The data matrix T in FIG. 5 is eventually provided to the decoder 108 through link 106. Table 1 shows a conventional computer program that includes an IDCT process having an inverse fast discrete cosine transform. As illustrated in Table 1, a section 300 has been included to point out the mathematical steps that implement the inverse fast discrete cosine transform. The algorithms contained within the computer program in Table 1, and in particular the coefficients applied in matrix operations 200, are based on the DCT and IDCT which are defined, for example, within referenced sections 304, 308 and 310 in Table 3. However, in such a conventional system, reduced image 124 has undergone substantial, time-consuming and inefficient processing, and the result is a low quality image.
It is therefore an object of the present invention to provide a system and method for providing a low definition digital video signal from a high definition digital video signal.
It is another object of the present invention to provide a system and method for quickly and efficiently converting a high definition digital video signal into a low definition digital video signal format for display.
These and other objects of the present invention are achieved by incorporating wavelet transforms within the methods and arrangements of the present invention to produce coefficients that are part of discrete wavelet transforms (DWT) and/or inverse discrete wavelet transforms (IDWT).
For example, in accordance with a first preferred embodiment of the present invention, an IDWT process is advantageously included within a decoder to decode and decimate encoded higher definition image data to produce lower definition image data that is suitable for display on a lower resolution display apparatus.
In accordance with one preferred embodiment of the present invention, the decoding and decimation of the DCT encoded image data has been consolidated within the decoding process, and made easier by a decoder having an IDWT process that accomplishes both decoding and decimation. The image data that is decoded by an IDWT configured decoder can be displayed on a lower resolution display as a 4:2:0 video image.
This IDWT decoded 4:2:0 video image is symmetrical because the received encoded image data is not pre-parsed or otherwise filtered prior to being decoded by the IDWT process. Instead, all of the received encoded image data is processed using the IDWT. The IDWT process, as applied to the received encoded image data, inherently decimates or down-samples the amount of video data. The IDWT takes advantage of the reducing capability of one or more wavelet transforms as applied to discrete blocks of received encoded video data through the coefficients of the IDWT.
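The decimating property of a wavelet transform can be illustrated with a short sketch. This passage does not identify the wavelet used or the actual IDWT coefficients, so the Haar wavelet below is an assumption chosen only because it is the simplest case. The function computes the low-pass (approximation) band of a single-level two-dimensional Haar decomposition, scaled so that each output sample is the mean of a 2 by 2 input neighbourhood. The function name is hypothetical.

```python
def haar_lowpass_2d(block):
    """Low-pass band of one 2D Haar level, scaled to the 2x2 mean.

    The orthonormal Haar approximation coefficient is (a + b + c + d) / 2;
    dividing by 4 instead keeps the output in the input's value range.
    The block must have an even number of rows and columns.
    """
    rows, cols = len(block), len(block[0])
    if rows % 2 or cols % 2:
        raise ValueError("block dimensions must be even")
    return [
        [(block[r][c] + block[r][c + 1] + block[r + 1][c] + block[r + 1][c + 1]) / 4.0
         for c in range(0, cols, 2)]
        for r in range(0, rows, 2)
    ]


if __name__ == "__main__":
    block = [[r + c for c in range(16)] for r in range(16)]
    reduced = haar_lowpass_2d(block)
    print(len(reduced), len(reduced[0]))  # 16 by 16 input becomes 8 by 8
```

Every input sample contributes to the output, so nothing is masked or discarded beforehand, yet the output holds one quarter as many samples as the input.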
An additional benefit of the IDWT configured decoder is that, in the case of video, such as MPEG-2 images, motion compensation is accomplished on the decimated output of the IDWT process.
The known decoders typically perform motion compensation on 16 by 16 blocks or matrices of image data. An IDWT configured decoder, in accordance with the first preferred embodiment of the present invention, will reduce the blocks or matrices of image data to one quarter the original size, that is, 8 by 8. These 8 by 8 blocks of image data are then momentarily interpolated to the original size and the same motion vectors as would normally be used in the 16 by 16 blocks are applied, but with a reduced number of operations and increased speed. The reduced size of the image data also reduces the memory requirements of the decoder, such as, for example, a cache memory that supports one or more processors that are included in the decoder.
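The interpolation and motion vector steps can be sketched as follows. The interpolation method is not specified in this passage, so pixel replication is assumed here for simplicity, and the function names, the frame layout (a list of rows) and the absence of boundary handling are likewise illustrative assumptions rather than features of the described decoder.

```python
def upsample_2x(block):
    """Interpolate by pixel replication: each sample becomes a 2x2 patch."""
    out = []
    for row in block:
        wide = [value for value in row for _ in range(2)]
        out.append(wide)
        out.append(list(wide))
    return out


def motion_compensate(reference, top, left, mv_y, mv_x, size=16):
    """Fetch a size-by-size prediction block displaced by the motion vector.

    No clipping is performed, so the displaced block must lie fully inside
    the reference (a simplification of this sketch).
    """
    y0, x0 = top + mv_y, left + mv_x
    return [row[x0:x0 + size] for row in reference[y0:y0 + size]]


if __name__ == "__main__":
    # A stored reduced-size reference region (16 by 16 here) stands in for a
    # 32 by 32 region of the original image.
    reduced = [[r * 16 + c for c in range(16)] for r in range(16)]
    full = upsample_2x(reduced)  # momentarily back at original size
    prediction = motion_compensate(full, top=8, left=8, mv_y=-3, mv_x=5)
    print(len(prediction), len(prediction[0]))
```

In this sketch only the reduced-size reference is kept in memory, and the original motion vector is applied unchanged to the interpolated data.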
Thus, the present invention provides methods and arrangements that allow a consumer to receive high definition image data and convert the data for display on existing television sets, or on less expensive high resolution displays. The methods and arrangements of the present invention can also be used by the producers and/or broadcasters of the signals to produce fairly high-definition image data.