The present invention relates to computers, and more particularly to methods and apparatus for processing a Digital Versatile Disk (DVD) data stream using a computer.
The emergence of DVD (Digital Versatile Disk) technology presents a tremendous market growth opportunity for the personal computer (PC). It also presents a significant technical challenge to the highly cost-competitive PC market, namely providing a cost effective PC architecture that provides the digital video performance and quality that the user demands while also remaining flexible enough to support a range of other PC applications.
As is known, DVD technology presents a significant leap forward for today's multimedia PC environment. In addition to providing backward compatibility to CD-ROM, current DVDs provide a storage capacity of between 4.7 GB and 17 GB, which is at least about 8 times the storage capacity of a typical CD. To support this increased storage capacity, DVD devices, such as DVD-ROM drives, typically provide bandwidths in excess of 10 Mb/s. By combining DVD technologies with video compression technologies, such as MPEG-2 video compression techniques, and audio compression technologies, such as MPEG-2 and AC-3 audio techniques, a PC can deliver better-than-broadcast quality television (TV) to a video display device and an audio reproduction device.
DVD also presents an avenue for PC technology to migrate to various new market segments. DVD is being embraced not only by the PC industry, but also by the entertainment and consumer electronics industries. As such, many PC manufacturers and software developers consider DVD to represent the next step in turning desktop PCs into full-fledged entertainment appliances. For example, new products, described as everything from entertainment PCs to set-top PCs and PC-TVs, are beginning to be promoted. By way of example, manufacturers such as Gateway and Compaq are beginning to ship products tailored specifically for delivering video and computer-based entertainment in the home. Additionally, Philips has recently announced its DVX8000 Multimedia Home Theatre product that is targeted for the living room and based on the PC architecture. Recognizing and promoting this trend, Microsoft is attempting to define a unique set of platform requirements for this new breed of "Entertainment PC".
While the future looks very bright for DVD on various PC platforms, there's the immediate problem of how to make the technology work within the constraints of today's PC architecture as well as the extremely cost-sensitive reality of the PC marketplace. MPEG-2 standards present an especially difficult problem, because of the amount of processing that is required to decode and decompress the typical 5 Mb/second MPEG-2 video signal into a displayable video signal. Additionally, the accompanying audio signal also needs to be decoded and possibly decompressed. Consequently, PC architectures having DVD capabilities tend to be too costly for the mainstream market and/or lack the necessary performance to perform adequately.
To achieve its goals of quality, storage and data bit-rate, the DVD video standard leverages several existing audio and video compression and transmission standards, including MPEG-2 video and both AC-3 and MPEG-2 audio. By way of example, FIG. 1 depicts a typical DVD processing pipeline in which a DVD data stream is received, for example, from a DVD-ROM drive and/or from a remote device, and converted into a decoded and decompressed digital video signal and corresponding digital audio signal(s).
A DVD data stream consists of sequential data packets, each of which typically includes various system information, video information and audio information. The DVD video decode pipeline 10 depicted in FIG. 1 has been broken down into three high-level processing stages, namely a system stream parsing stage 12, a video processing stage 14, and an audio processing stage 16. Additional information regarding these processing stages and others, and regarding the DVD and MPEG-2 standards, is provided in the DVD specification, entitled DVD Specification, Version 1.0, August 1996, and in the MPEG-2 specification ISO/IEC 13818-1, 2, 3, which is available from ISO/IEC Copyright Office, Case Postale 56, CH 1211, Genève 20, Switzerland, each of which is incorporated herein, in its entirety and for all purposes, by reference.
In system stream parsing stage 12, the incoming DVD data stream is split or demultiplexed and/or descrambled, for example using CSS decryption techniques, into three independent streams: an MPEG-2 video stream 15, an MPEG-2 (or AC-3) audio stream 17, and a sub-picture stream 13. By way of example, in certain embodiments, the MPEG-2 video stream 15 can have a bit-rate as high as approximately 9 Mb per second, and the audio stream 17 (MPEG-2 or AC-3) can have a bit-rate as high as approximately 384 Kb per second. The sub-picture stream 13 tends to have a relatively lower bit-rate, and includes sub-picture information that can be incorporated into the final digital video signal as on-screen displays (OSDs), such as menus or closed captioning data. The MPEG-2 video stream 15 and sub-picture stream 13 are then provided to video processing stage 14 for additional processing. Similarly, the audio stream 17 is provided to audio processing stage 16 for further processing.
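By way of illustration only, the splitting performed in system stream parsing stage 12 can be sketched as routing packets to per-stream buffers. The packet representation used here, a (kind, payload) pair with the text labels "video", "audio" and "subpicture", is a hypothetical simplification; actual DVD packets carry binary stream identifiers and may require descrambling, neither of which is modeled.

```python
from collections import defaultdict

# Hypothetical stream labels; real DVD packets carry binary stream IDs.
VIDEO, AUDIO, SUBPICTURE = "video", "audio", "subpicture"


def demultiplex(packets):
    """Split a sequence of (kind, payload) packets into per-kind byte streams.

    Packet order within each output stream matches the input order.
    """
    streams = defaultdict(bytearray)
    for kind, payload in packets:
        if kind in (VIDEO, AUDIO, SUBPICTURE):
            streams[kind] += payload
        # Packets of any other kind (e.g., system information) are skipped.
    return {kind: bytes(streams[kind]) for kind in (VIDEO, AUDIO, SUBPICTURE)}
```

Each of the three returned byte strings would then be handed to the corresponding downstream stage.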
Video processing stage 14, as depicted in FIG. 1, includes three sub-stages. The first sub-stage is a DVD sub-picture decode stage 18 in which the sub-picture stream 13 is decoded in accordance with the DVD specification. For example, DVD allows up to 32 sub-picture streams, each of which can be decoded into a bitmap sequence composed of colors from a palette of sixteen colors. As mentioned above, the decoded sub-pictures are typically OSDs, such as menus, closed captions and sub-titles. In accordance with the DVD specification, the sub-picture(s) are intended to be blended with the video for a true translucent overlay in the final digital video signal.
The second sub-stage of video processing stage 14 is an MPEG-2 decode sub-stage 20 in which the MPEG-2 video stream is decoded and decompressed and converted to a YUV 4:2:2 digital video signal. In accordance with the MPEG-2 specification, MPEG-2 decode sub-stage 20 conducts a Variable Length Decode (VLD) 22, an inverse quantization (IQUANT) 24, an Inverse Discrete Cosine Transform (IDCT) 26, motion compensation 28, and a planar YUV 4:2:0 to interleaved 4:2:2 conversion 30. These processing sub-stages are necessary because the MPEG-2 specification provides that certain pictures, called I frames or pictures, are "intra" coded such that the entire picture is broken into 8×8 blocks which are processed via a Discrete Cosine Transform (DCT) and quantized to a compressed set of coefficients that, alone, represent the original picture. The MPEG-2 specification also allows for intermediate pictures, between "I" pictures, which are known as predicted pictures ("P" pictures) and/or bidirectionally-interpolated pictures ("B" pictures). In these intermediate pictures, rather than encoding all of the blocks via DCT, motion compensation information is used to exploit the temporal redundancy found in most video footage. By using motion compensation, MPEG-2 dramatically reduces the amount of data storage required, and the associated data bit-rate, without significantly reducing the quality of the image. Thus, for example, motion compensation allows for a 16×16 "macroblock" in a P or B picture to be "predicted" by referencing a macroblock in a previous or future picture. By encoding prediction pointers, called motion vectors, MPEG-2 is able to achieve high compression ratios while maintaining high quality.
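By way of illustration only, the IDCT 26 applied to one 8×8 block can be sketched as a direct evaluation of the standard two-dimensional inverse DCT formula. This floating-point form is an assumption chosen for clarity; practical decoders typically use fast, fixed-point algorithms. The function name idct_8x8 is hypothetical, and the preceding variable length decode and inverse quantization steps are not modeled.

```python
import math

N = 8  # MPEG-2 transforms 8x8 blocks


def idct_8x8(coeffs):
    """Return the 8x8 spatial block for an 8x8 block of DCT coefficients.

    coeffs[v][u] holds the coefficient for vertical frequency v and
    horizontal frequency u; the result is indexed as block[y][x].
    """
    def c(k):
        # Normalization factor for the zero-frequency (DC) basis function.
        return 1.0 / math.sqrt(2.0) if k == 0 else 1.0

    block = [[0.0] * N for _ in range(N)]
    for y in range(N):
        for x in range(N):
            total = 0.0
            for v in range(N):
                for u in range(N):
                    total += (c(u) * c(v) * coeffs[v][u]
                              * math.cos((2 * x + 1) * u * math.pi / 16)
                              * math.cos((2 * y + 1) * v * math.pi / 16))
            block[y][x] = total / 4.0
    return block
```

With this normalization, a block whose only non-zero coefficient is the DC term decodes to a flat block whose sample value is one eighth of that coefficient.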
The resulting YUV 4:2:2 and decoded sub-picture digital video signals are then provided to the third sub-stage 21 of video processing stage 14, in which the YUV 4:2:2 and decoded sub-picture digital video signals are blended together in an alpha blend process 32 to produce a translucent overlay, as described above and in detail in the DVD specification. Next, the blended digital video signal is provided to a YUV-to-RGB conversion process 34, in which the blended digital video signal is converted from a YUV format into a corresponding red-green-blue (RGB) format. The resulting RGB digital video signal is then provided to an image scaling process 36, in which the RGB digital video signal is scaled to a particular size for display. The resulting final digital video signal is then ready to be displayed on a display device, or otherwise provided to other devices, such as video recording or forwarding devices. For example, the final digital video signal can be displayed on a monitor or CRT by further converting the final digital video signal (which is in RGB format) to an analog RGB video signal.
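By way of illustration only, the alpha blend process 32 and the YUV-to-RGB conversion process 34 can be sketched per sample as follows. Two assumptions are made that the text above does not specify: the blend weight is expressed on a 0 to 255 scale, and the color conversion uses the commonly cited ITU-R BT.601 studio-range coefficients. The function names are hypothetical, and image scaling process 36 is not modeled.

```python
def blend_sample(video, overlay, alpha, alpha_max=255):
    """Blend one sub-picture sample over one video sample.

    alpha = 0 leaves the video sample unchanged and alpha = alpha_max shows
    only the overlay. The 0..alpha_max weight scale is an assumption.
    """
    weighted = overlay * alpha + video * (alpha_max - alpha)
    return (weighted + alpha_max // 2) // alpha_max  # rounded integer divide


def clamp8(value):
    """Round to the nearest integer and limit to the 8-bit range 0..255."""
    return max(0, min(255, int(round(value))))


def yuv_to_rgb(y, u, v):
    """Convert one studio-range YUV sample (Y 16..235, U/V centered on 128).

    The coefficients are widely used BT.601 approximations (an assumption).
    """
    c, d, e = y - 16, u - 128, v - 128
    r = 1.164 * c + 1.596 * e
    g = 1.164 * c - 0.391 * d - 0.813 * e
    b = 1.164 * c + 2.018 * d
    return clamp8(r), clamp8(g), clamp8(b)
```

In this sketch each of the Y, U and V samples of a pixel would be blended with the same weight before the blended pixel is converted to RGB.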
The processing stages/sub-stages associated with DVD processing pipeline 10 tend to be extremely compute intensive. The MPEG-2 video format, which is the most compute intensive portion of pipeline 10, was chosen for DVD technologies because it provides the best quality playback across a range of differing display formats, and is well suited to DVD's higher bit-rates and storage capacity. For example, MPEG-2 video is flexible and scalable and can be used to support a wide range of display formats and aspect ratios, from standard interlaced NTSC to high-definition, 16:9 progressive scans. One example of a compute intensive MPEG-2 display format is the Main-Profile, Main-Level (MPML) MPEG-2 format, which supports a 720×480 pixel display operating at 60 fields/sec or 30 frames per second (fps).
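To give a sense of scale, the following rough calculation uses the 720×480, 30 fps figures given above together with the typical 5 Mb/second compressed rate cited earlier, read here as 5,000,000 bits per second. It assumes 8-bit samples in YUV 4:2:0 format, i.e., 1.5 bytes per pixel; it is an estimate of decoded data volume, not a measurement of any particular decoder.

```python
WIDTH, HEIGHT, FPS = 720, 480, 30   # MPML figures from the text
COMPRESSED_BPS = 5_000_000          # typical compressed rate, assumed to mean 5e6 bits/s

pixels_per_second = WIDTH * HEIGHT * FPS

# 4:2:0 with 8-bit samples: one luma byte per pixel plus two chroma planes
# at quarter resolution, i.e., 1.5 bytes per pixel (an assumption).
bytes_per_frame_420 = WIDTH * HEIGHT * 3 // 2
decoded_bps = bytes_per_frame_420 * FPS * 8

print(pixels_per_second)             # 10368000 pixels per second
print(decoded_bps)                   # 124416000 bits per second
print(decoded_bps / COMPRESSED_BPS)  # about 24.9 times the compressed rate
```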
Referring back to FIG. 1, the audio stream is provided by system stream parsing stage 12 to audio processing stage 16. Audio processing stage 16 decodes either Dolby AC-3, with 6 channels (e.g., 5.1 channels) of audio for high-quality surround sound reproduction, as specified for use in NTSC compliant devices, or MPEG-2 (up to 7.1 channels), as specified for use in PAL and SECAM compliant devices. The resulting final digital audio signal is capable of being reproduced, for example, by an audio reproduction device, such as a sound generating device that converts the digital audio signal to an analog signal, amplifies or otherwise conditions the analog signal, and provides the signal to one or more speakers. As would be expected, decoding the audio stream tends to be much less compute intensive than decoding the video stream.
A vital consideration for PC manufacturers and consumers alike, in providing DVD capabilities, is cost. Because the DVD processes outlined above are compute intensive, there is a need to deliver cost-effective solutions that reduce the costs associated with the various stages/sub-stages of the DVD processing pipeline. The currently available solutions can be grouped into one of three basic types.
The first type of solution places the DVD processing task entirely on the processor within the computer, and as such is a software-only solution. By completing all of the DVD pipeline via software (e.g., computer instructions) running on the PC's processor, there is basically no need to add additional "DVD" related hardware components in most PC architectures. However, in order to complete the DVD processing, the PC's processor would need to be sufficiently powerful (e.g., in operating speed). Currently, the latest Intel Pentium II processor based platforms are only able to provide frame rates up to about 24 frames per second (fps). To provide greater than about 24 fps, the Pentium II based platforms require additional hardware support, typically to complete the motion compensation process 28. However, given the improvements in processor performance in the past and expected in the future, it appears that it will soon be possible to implement full frame rate DVD decoding via a PC's processor. The cost associated with such state-of-the-art processors may, nonetheless, be prohibitive for many PC consumers. Additionally, DVD playback may place such a burden on the PC's processor and associated bus(es) and memory that the PC is able to do little else during the playback. For many users, this limitation may prove unacceptable. It is also possible, as witnessed recently, that certain short cuts may be taken by a software-only solution that are not in accord with the DVD specification. For example, some software-only solutions simplify the alpha blend process 32 by simply selecting, on a pixel by pixel basis, to display either the sub-picture pixel or the MPEG derived pixel, rather than actually blending the two pixels together to provide a translucent effect. Again, short cuts such as these tend to diminish the DVD capabilities and can result in non-compliant devices.
The second type of solution places the DVD processing task entirely on the PC's hardware, without requiring the processor. This hardware-only solution tends to free up the processor. However, providing such specialized circuitry (e.g., a DVD decoder) can be very expensive and result in significantly increased costs, which can be devastating in the highly competitive PC market. In some PC architectures, the specialized circuitry can also reduce the performance of the PC by requiring access to the PC's bus(es), interfaces and memory components.
The third type of solution is a hybrid of the first two types of solutions, and requires that the DVD processing tasks be distributed between the PC's processor (i.e., software) and specialized circuitry (e.g., a decoder) that is configured to handle a portion of the processing. The hybrid solution is flexible, in that it allows for different configurations that can be fine-tuned or modified for a given PC architecture/application. However, there is still an additional expense associated with the specialized circuitry, which can increase the consumer's cost.
There is a need for cost-effective, improved, and compliant methods and apparatus for providing DVD playback capabilities in a computer, such as, for example, a PC.
The present invention provides an improved and cost effective hybrid solution in the form of methods and apparatus that allow DVD data streams to be played back in a computer system. In accordance with one aspect of the present invention, the methods and apparatus allow for compliant DVD and/or MPEG-2 video playback by conducting specific decoding processes in a graphics engine that is also capable of generating graphics based on command signals.
Thus, in accordance with one embodiment of the present invention, an apparatus is provided for use in a computer system having a processor to support graphics generation and digital video processing. The apparatus includes a set-up engine, a converter and a texture mapping engine. The set-up engine is responsive to at least one command signal from the processor and converts vertex information within the command signal into corresponding triangle information. The triangle information describes a triangle in a three dimensional space. The converter determines digital pixel data for the triangle based on the triangle information. The texture mapping engine modifies the digital pixel data based on the triangle information and at least one digital texture map. As such, the apparatus supports graphics generation. The texture mapping engine also generates motion compensated digital image data based on at least one digital image map and at least one motion vector to support digital video processing.
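By way of illustration only, the step in which digital pixel data is determined for a triangle can be sketched with a simple edge-function coverage test. The sketch assumes that the triangle's vertices have already been projected to two-dimensional screen coordinates and that pixels are sampled at their centers; depth, color and texture coordinates are omitted. It is a generic rasterization technique with hypothetical function names, and is not asserted to be the design of the converter described above.

```python
import math


def edge(a, b, p):
    """Signed area test: which side of the directed edge a->b the point p is on."""
    return (b[0] - a[0]) * (p[1] - a[1]) - (b[1] - a[1]) * (p[0] - a[0])


def covered_pixels(v0, v1, v2):
    """Return (x, y) pixels whose centers lie inside or on the triangle."""
    xs = (v0[0], v1[0], v2[0])
    ys = (v0[1], v1[1], v2[1])
    pixels = []
    # Only the triangle's bounding box needs to be visited.
    for y in range(math.floor(min(ys)), math.ceil(max(ys))):
        for x in range(math.floor(min(xs)), math.ceil(max(xs))):
            p = (x + 0.5, y + 0.5)  # pixel center
            w0 = edge(v1, v2, p)
            w1 = edge(v2, v0, p)
            w2 = edge(v0, v1, p)
            # Accept either winding order: all weights share one sign.
            if (w0 >= 0 and w1 >= 0 and w2 >= 0) or (w0 <= 0 and w1 <= 0 and w2 <= 0):
                pixels.append((x, y))
    return pixels
```

In a fuller pipeline the three weights would also serve to interpolate per-vertex attributes, such as texture coordinates, for each covered pixel.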
In accordance with certain embodiments of the present invention, the digital image map is a macroblock containing digital pixel data from an MPEG generated I and/or P picture. In accordance with further embodiments of the present invention, the texture mapping engine includes at least one bilinear interpolator that determines interpolated digital pixel data based on a first and a second digital pixel data. As such, the bilinear interpolator is used to perform a bilinear filtering of a macroblock that is on sub-pixel sample points to generate one predicted macroblock that is on pixel sample points. In still other embodiments, the texture mapping engine performs a first bilinear filtering based on a first motion vector and a second bilinear filtering based on a second motion vector, and averages the results of the first bilinear filtering and the results of the second bilinear filtering to generate one predicted macroblock. In certain embodiments, the apparatus is configured to add an IDCT coefficient to the digital pixel data as generated by the texture mapping engine. As such, certain embodiments of the present invention are capable of supporting MPEG-2 motion compensation processing.
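By way of illustration only, the bilinear filtering, averaging and IDCT addition described above can be sketched in software as follows. The sketch assumes that motion vectors are expressed in half-sample units, that pictures are lists of rows of 8-bit samples, and that all fetched samples lie inside the reference picture. The function names are hypothetical, and the code models the arithmetic only, not the texture mapping engine hardware.

```python
def predict_block(ref, x2, y2, size=16):
    """Fetch a size x size prediction from ref at a half-sample position.

    (x2, y2) is the block's top-left position in half-sample units, so odd
    values fall between reference samples and are bilinearly interpolated.
    """
    x0, y0 = x2 >> 1, y2 >> 1   # whole-sample part of the position
    hx, hy = x2 & 1, y2 & 1     # half-sample flags
    block = []
    for j in range(size):
        row = []
        for i in range(size):
            a = ref[y0 + j][x0 + i]
            b = ref[y0 + j][x0 + i + hx]
            c = ref[y0 + j + hy][x0 + i]
            d = ref[y0 + j + hy][x0 + i + hx]
            row.append((a + b + c + d + 2) // 4)  # rounded average
        block.append(row)
    return block


def average_blocks(first, second):
    """Average two predictions, e.g., from a first and a second motion vector."""
    return [[(a + b + 1) // 2 for a, b in zip(r1, r2)]
            for r1, r2 in zip(first, second)]


def reconstruct(prediction, idct_values):
    """Add IDCT output to the prediction and clamp to the 8-bit range."""
    return [[max(0, min(255, p + e)) for p, e in zip(rp, rv)]
            for rp, rv in zip(prediction, idct_values)]
```

When both half-sample flags are zero the four fetched samples coincide, so the same routine also covers whole-sample motion vectors.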
In accordance with certain other embodiments of the present invention, the apparatus is further configured to generate a YUV 4:2:2 formatted picture by providing vertical upscaling, and interleaving of a YUV 4:2:0 formatted picture.
The above stated needs and others are also met by a computer system, in accordance with one embodiment of the present invention, that is capable of providing video playback of an encoded data stream. The computer system includes a processor, a data bus mechanism, a primary memory, a display device, and a graphics engine that is configured to generate digital image data based on at least one command signal from the processor, generate motion compensated digital image data based on at least one digital image and at least one motion vector, convert a YUV 4:2:0 formatted picture to a YUV 4:2:2 formatted picture, convert the YUV 4:2:2 formatted picture to an RGB formatted picture, scale the RGB formatted picture, and convert the RGB formatted picture to an analog signal that can be displayed on the display device.
A method is provided, in accordance with the present invention, for generating graphics and processing digital video signals in a computer system. The method includes using a graphics engine to generate digital image data based on at least one command signal by converting vertex information within the command signal into corresponding triangle information, determining digital pixel data for the triangle based on the triangle information, and modifying the digital pixel data based on the triangle information and at least one digital texture map. The method further includes using the same graphics engine to generate motion compensated digital image data based on at least one digital image map and at least one motion vector.
In accordance with certain embodiments of the present invention, the method further includes using the same graphics engine to convert a YUV 4:2:0 formatted picture to a YUV 4:2:2 formatted picture by offsetting at least a portion of the YUV 4:2:0 formatted picture and selectively mapping samples of the YUV 4:2:0 formatted picture to a corresponding destination picture to provide a vertical upscaling, and selectively arranging byte data of the destination picture to interleave the byte data and generate the YUV 4:2:2 formatted picture.
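By way of illustration only, a planar YUV 4:2:0 to interleaved YUV 4:2:2 conversion can be sketched as follows. Two simplifying assumptions are made: the vertical upscaling of the chroma planes is done by reusing each chroma line for two luma lines, which is the simplest possible mapping and may differ from the offsetting and selective mapping described above, and the interleaved byte order is taken to be Y0, U, Y1, V. The function name is hypothetical.

```python
def yuv420_to_yuyv422(y_plane, u_plane, v_plane):
    """Convert planar 4:2:0 to interleaved 4:2:2 rows in Y0 U Y1 V order.

    y_plane is H rows of W samples; u_plane and v_plane are H/2 rows of W/2
    samples. W and H are assumed to be even.
    """
    rows = []
    for j, y_row in enumerate(y_plane):
        # Vertical upscaling: each chroma line serves two luma lines.
        u_row = u_plane[j // 2]
        v_row = v_plane[j // 2]
        out = []
        for i in range(0, len(y_row), 2):
            # Interleave two luma samples with their shared chroma pair.
            out += [y_row[i], u_row[i // 2], y_row[i + 1], v_row[i // 2]]
        rows.append(bytes(out))
    return rows
```

Each output row holds two bytes per pixel, compared with an average of 1.5 bytes per pixel in the 4:2:0 source.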