1. Field of the Invention
The present invention relates to the field of data reduction. More particularly, the present invention relates to methods and apparatus for reducing video data.
2. Background
In multimedia-based products for the personal computer, data reduction is a commonly used function when processing and manipulating digital images. Data reduction is useful during the capture and playback cycle of a full-motion video window with a frame buffer memory subsystem. The frame buffer picture elements (pixels) comprise a rectangular grid of image data that is filtered, stored and displayed using multiple color spaces: red, green and blue (RGB) is often used for graphic data, and the luminance/chrominance (YUV) format is often used for full-motion video data. Due to memory bandwidth limitations and differences between source image size and display size, it is desirable to decrease the amount of data processed while maintaining an acceptable image quality.
Current video data reduction techniques have been applied to YUV and RGB data. Such prior art reduction systems typically utilize bilinear interpolation and the dropping of intermediate lines, resulting in relatively poor image quality.
Such prior art reduction systems also typically perform data reduction in one functional module. This is due to real-time constraints, which prevent distributed video data reduction under prior methods. Video data reduction is not done in the background due to limited memory bandwidth. Background processes typically are assigned a low priority for frame memory accesses, creating a bottleneck.
Finally, such reduction systems require interpolation of UV (chrominance) data when converting from the YUV 4:2:0 format to the YUV 4:2:2 format. This requires extra hardware and processor utilization. A need exists to eliminate interpolation in the conversion from the YUV 4:2:0 format to the YUV 4:2:2 format.
A compressed digital video stream is made up of a number of still frames, or pictures. Referring first to FIG. 1, a representation of a frame 10 is shown. Each frame 10 comprises a plurality of horizontal slices 12, each of which includes a plurality of macroblocks 14. Macroblock size is typically 16×16 pixels. Such a macroblock is typically further divided into four blocks 15. Block size is 8×8 pixels. A frame, or picture, resolution of 720×576 is defined by 720×576 pixels which correspond to 45×36 macroblocks, or 90×72 blocks.
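The macroblock and block counts quoted above follow from dividing the frame dimensions by the unit size. The following is a minimal sketch of that arithmetic; the function name `grid_counts` and the constants are illustrative assumptions and do not appear in the original text.

```python
MACROBLOCK_SIZE = 16  # pixels per macroblock side, as stated in the text
BLOCK_SIZE = 8        # pixels per block side, as stated in the text


def grid_counts(width, height, unit):
    """Return (columns, rows) of unit x unit squares tiling a width x height frame.

    Hypothetical helper for illustration only. It assumes the frame
    dimensions are exact multiples of the unit size.
    """
    if width % unit or height % unit:
        raise ValueError("frame dimensions must be a multiple of the unit size")
    return width // unit, height // unit


print(grid_counts(720, 576, MACROBLOCK_SIZE))  # (45, 36)
print(grid_counts(720, 576, BLOCK_SIZE))       # (90, 72)
```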
Many international standards, such as the Moving Picture Experts Group version 2 (MPEG 2), International Standards Organization/International Electrotechnical Commission (ISO/IEC) standard, std. 13818-2:1996, published May 16, 1996, and the MPEG 1 standard, ISO/IEC std. 11172-2:1993, published Aug. 12, 1993, are used for digital video compression and decompression. Each MPEG 2 macroblock comprises a plurality of pixels, each of which is defined by color space components. A color space is a mathematical representation for a color. Different color spaces provide different ways of representing a color which will ultimately be displayed in a video system. For example, the red, green, and blue (RGB) color space is commonly used in computer graphics. Similarly, the YUV color space represents the luminance or "luma" component Y, or black and white portion, as well as the color difference or "chrominance" components U and V. A macroblock in YUV format contains data for all Y, U and V components.
Pixels in each macroblock 14 are traditionally stored in blocks since they are compressed. Three types of macroblocks are available in MPEG 2. Referring to FIG. 2A, the 4:2:0 macroblock consists of four Y blocks 17, one U block 18, and one V block 19. In the 4:2:0 chroma format, for each 16×16 pixel Y block 17, the corresponding U and V blocks have size 8×8 pixels. In other words, for every four Y pixels, one U and one V pixel are shared. Referring to FIG. 3B, the MPEG 2 U and V pixel data is located at half pixel locations in the Y direction. Referring to FIG. 3A, MPEG 1 U and V pixel data is located at half pixel locations in both the X and Y directions. Most MPEG decoders use the 4:2:0 chroma format for internal storage.
Referring to FIG. 2B, a 4:2:2 macroblock consists of four Y blocks 20, two U blocks 21, and two V blocks 22. In the 4:2:2 format, each 16×16 pixel Y block 20 is associated with one U and one V block having size 16×8 pixels. In this format, two Y pixels share one U and one V pixel, as shown in FIG. 3C.
Referring to FIG. 2C, a 4:4:4 macroblock consists of four Y blocks 25, four U blocks 26, and four V blocks 27. Each 16×16 pixel Y block is associated with one U and one V block of size 16×16. Therefore, the 4:4:4 format stores an equal number of Y, U and V pixels, as shown in FIG. 3D.
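The sharing ratios of the three chroma formats can be checked from the block counts given above. The sketch below assumes every block is 8×8 pixels, as stated earlier; the table and function names are illustrative and are not part of the original text.

```python
BLOCK_PIXELS = 8 * 8  # each block is 8x8 pixels, per the text

# (Y blocks, U blocks, V blocks) per macroblock, taken from the description.
BLOCKS_PER_MACROBLOCK = {
    "4:2:0": (4, 1, 1),
    "4:2:2": (4, 2, 2),
    "4:4:4": (4, 4, 4),
}


def samples_per_macroblock(chroma_format):
    """Return the number of Y, U and V samples in one macroblock (hypothetical helper)."""
    y, u, v = BLOCKS_PER_MACROBLOCK[chroma_format]
    return {"Y": y * BLOCK_PIXELS, "U": u * BLOCK_PIXELS, "V": v * BLOCK_PIXELS}


def y_pixels_per_chroma_pixel(chroma_format):
    """Return how many Y pixels share one U (and one V) pixel."""
    samples = samples_per_macroblock(chroma_format)
    return samples["Y"] // samples["U"]


for fmt in BLOCKS_PER_MACROBLOCK:
    print(fmt, samples_per_macroblock(fmt), y_pixels_per_chroma_pixel(fmt))
```

Under these assumptions the ratio is four Y pixels per chroma pixel for 4:2:0, two for 4:2:2, and one for 4:4:4, which agrees with the description of FIGS. 3B through 3D.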
Typically, video data in block format must be scaled during video processing because the source image size may differ from the display size. When reduction is required, it is desirable to create a reduced image while maintaining as much information from the original image as possible. The simplest form of reduction is pixel dropping, where (m) out of every (n) pixels are thrown away both horizontally and vertically. Data is "dropped" when the reduced image excludes pixel information from the original image. For example, a reduction factor of one third (resulting in an image that is one ninth as large as the original) results in two out of every three pixels being discarded in both the horizontal and vertical directions. Reduction using pixel dropping is not recommended if the resulting image is to be further processed, because it introduces aliasing components. A "decimation filter" can be used, which bandwidth-limits the image horizontally and vertically before decimation. However, each scaling factor requires different filter coefficients.
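Pixel dropping as described above can be sketched in a few lines. This is an illustrative sketch only: the function name `drop_pixels`, the list-of-rows image representation, and the choice to keep the first pixel of each group are assumptions, not details from the original text.

```python
def drop_pixels(image, keep_every):
    """Keep one pixel out of every `keep_every`, horizontally and vertically.

    `image` is assumed to be a list of rows, each row a list of pixel values.
    With keep_every=3, two out of every three pixels are discarded in each
    direction, so the result holds one ninth of the original pixels.
    """
    if keep_every < 1:
        raise ValueError("keep_every must be at least 1")
    return [row[::keep_every] for row in image[::keep_every]]


# A 6x6 test image whose pixel value encodes its position (10 * row + column).
image = [[10 * r + c for c in range(6)] for r in range(6)]
reduced = drop_pixels(image, 3)
print(reduced)  # [[0, 3], [30, 33]]
```

The discarded pixels contribute nothing to the output, which is why this approach loses information and can introduce aliasing.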
An improvement in video quality of scaled images is possible using linear interpolation. Bilinear interpolation combines the linear interpolation process in both the horizontal and vertical directions. When an output sample falls between two input samples (horizontally or vertically), the output sample is computed by linearly interpolating between the two input samples. However, scaling to images smaller than one half of the original may result in dropped data.
Linear interpolation may be performed on the YUV data. For example, the Y (luminance) value for the new reduced pixel is calculated using the following equation:
$$I_n = (F_n \cdot P_n) + (F_{n+1} \cdot P_{n+1}), \qquad F_n + F_{n+1} = 1$$
where $F_n$ and $F_{n+1}$ are weight factors for neighboring pixels $P_n$ and $P_{n+1}$ of the new reduced pixel $I_n$. The weight factors are calculated from the distance from $I_n$ to the neighboring pixel. However, those of ordinary skill in the art will recognize that alternative weight factor criteria are possible.
Although linear interpolation was illustrated in one dimension, those of ordinary skill in the art will recognize the reduction method may be applied in two dimensions.
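A one-dimensional reduction by linear interpolation might be sketched as follows. The sketch assumes that output sample i sits at input position i × (input length / output length) and that each weight is one minus the distance to the corresponding neighbor; both choices, and the function name `reduce_line_linear`, are assumptions made for illustration, since the text notes that other weight criteria are possible.

```python
def reduce_line_linear(pixels, out_len):
    """Reduce one line of pixel values to out_len samples by linear interpolation.

    Each output is f_n * P_n + f_n1 * P_n1 with f_n + f_n1 = 1, where the
    weights are derived from the distance to the two neighboring input pixels.
    """
    if not 1 <= out_len <= len(pixels):
        raise ValueError("out_len must be between 1 and len(pixels)")
    step = len(pixels) / out_len  # assumed spacing of output samples
    last = len(pixels) - 1
    out = []
    for i in range(out_len):
        pos = i * step                # position of the output sample
        n = min(int(pos), last)       # left neighbor P_n
        n1 = min(n + 1, last)         # right neighbor P_n1
        f_n1 = pos - n                # closer to P_n1 means a larger weight
        f_n = 1.0 - f_n1
        out.append(f_n * pixels[n] + f_n1 * pixels[n1])
    return out


line = [0, 10, 20, 30, 40, 50]
print(reduce_line_linear(line, 4))  # [0.0, 15.0, 30.0, 45.0]
print(reduce_line_linear(line, 2))  # [0.0, 30.0]
```

In the second call the reduction scale is 3:1, and the input pixels at indices 2 and 5 are never read, while those at indices 1 and 4 receive zero weight. Under the assumptions above, this shows how data is dropped once the reduction scale exceeds 2:1.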
Other approaches include higher order filters. Generally, the higher the order of the interpolation, N, the better the overall response. Nth order filters, where N is greater than one, allow reduction scales up to (N+1):1 without dropping data. This is illustrated in Table 1 below.
Higher order filters require significantly more hardware and memory bandwidth than pixel dropping or linear interpolation. The hardware required to implement such prior art reducers is shown in Table 2 below. The drop pixel and nearest neighbor methods require a minimum amount of hardware, but yield relatively low quality images. Linear interpolation requires additional hardware and yields better images, but data is dropped at reduction scales greater than 2:1. Nth order filters yield significantly better images, but require much more hardware. A need exists for a method and apparatus for creating reduced video images having a reduction scale greater than 2:1, without dropping data, and with a minimal amount of hardware.
A block within a macroblock within a frame is received from a digital video data stream. The macroblock comprises a plurality of color space components, each color space component having at least one block. Each block comprises a plurality of lines, with each line comprising a plurality of pixels. The macroblock has a width defined by a plurality of pixels. The block is reduced by a power of two and stored to memory.