The present invention relates to the digital processing of video to be displayed on a video display, and more particularly, to control of the display pipeline on a reduced instruction set processor between decoded digital video and a display output.
Techniques for digital transmission of video promise increased flexibility, higher resolution, and better fidelity. Recent industry collaborations have brought digital video closer to reality; digital video transmission and storage standards have been generated, and consumer digital video products have begun to appear. The move toward digital video has been encouraged by the commercialization of digital technologies in general, such as personal computers and compact discs, both of which have increased consumer awareness of the possibilities of digital technology.
Personal computers, which have recently become common and inexpensive, contain much of the computing hardware needed to produce digital video, including a microprocessor/coprocessor for performing numeric calculations, input and output connections, and a large digital memory for storing and manipulating image data. Unfortunately, personal computers are not suitable for consumer digital video reception, because the microprocessor in a personal computer is a general purpose processor, and typically cannot perform the calculations needed for digital video fast enough to produce full-motion, high definition video output.
Accordingly, special purpose processors, particularly suited for performing digital video-related calculations, have been developed for use in digital video receivers for consumer applications. The first attempts in the early 1990s included separate application specific integrated circuits (ASICs) for audio and for video processing. In addition, these early ASICs performed only low-level functions, and thus burdened a host processor with most of the management of the audio and video processing. These ASICs relied on standard audio/video synchronization and simple error concealment techniques, all to be performed by the host processor.
Thereafter, some audio/video processing components were introduced that provided some integration of audio and video decoding with some primitive levels of features. However, these components largely shared the same drawbacks as the early ASICs in that host processors largely managed the audio and video processing.
Other audio/video processing components attempted to provide more features in a cost effective way by combining more firmware functionality onto the same integrated circuit (IC). However, such inflexible approaches narrowed the applications for which such ICs could be used and narrowed the functionality available when used. Design choices made in firmware constricted the Application Programming Interface (API).
A more flexible approach has been made by providing a specific processor with a high-speed architecture which allows programming flexibility with its open, multi-level Application Programming Interface (API). This specific processor is disclosed in commonly-assigned, copending U.S. patent application Ser. No. 08/865,749, entitled SPECIAL PURPOSE PROCESSOR FOR DIGITAL AUDIO/VIDEO DECODING, filed by Moshe Bublil et al. on May 30, 1997, which is hereby incorporated by reference herein in its entirety, and a memory controller for use therewith is disclosed in commonly-assigned, copending U.S. patent application Ser. No. 08/846,590, entitled “MEMORY ADDRESS GENERATION FOR DIGITAL VIDEO”, filed by Edward J. Paluch on Apr. 30, 1997, which is hereby incorporated herein in its entirety.
The above-referenced U.S. patent applications describe an application specific integrated circuit (ASIC) for performing digital video processing, which is controlled by a reduced instruction set CPU (RISC CPU). The RISC CPU controls computations and operations of other parts of the ASIC to provide digital video reception. As is typical of CPUs of many varieties, the CPU described in the above-referenced U.S. patent applications supports flow control instructions such as BRANCH, CALL and RETURN, as well as providing hardware interrupt services.
Due to the limitations of the RISC CPU, a number of functions are provided in the operating system rather than in hardware. A specific operating system of this kind is disclosed in commonly-assigned, copending U.S. patent application Ser. No. 08/866,419, entitled TASK AND STACK MANAGER FOR DIGITAL VIDEO DECODING, filed by Taner Ozcelik et al. on May 30, 1997, which is hereby incorporated by reference herein in its entirety; and software running under control of this operating system for controlling high-level digital video decoding functions is described in U.S. patent application Ser. No. 09/177,214 entitled “COMMAND MANAGER” filed by Cem I. Duruoz et al. on Oct. 22, 1998, which is hereby incorporated by reference herein in its entirety; and U.S. patent application Ser. No. 09/177,261 entitled METHOD AND APPARATUS FOR A VIRTUAL SYSTEM TIME CLOCK FOR DIGITAL/AUDIO/VIDEO PROCESSOR filed by Cem I. Duruoz et al. on Oct. 22, 1998, which is hereby incorporated by reference herein in its entirety. Thus, certain functions like scheduling audio/video processing and synchronizing such processes are handled by a digital audio/video processor, unburdening a host processor, while providing intimate control of such processes by the host when desirable.
One aspect of the aforementioned digital audio/video processor is accommodating various digital video formats. For instance, the industry sponsored Moving Picture Experts Group (MPEG) chartered by the International Organization for Standardization (ISO) has specified a format for digital video and two channel stereo audio signals that has come to be known as MPEG-1, and, more formally, as ISO-11172. MPEG-1 specifies formats for representing data inputs to digital decoders, or the syntax for data bitstreams that will carry programs in digital formats that decoders can reliably decode. In practice, the MPEG-1 standards have been used for recorded programs that are usually read by software systems. The program signals include digital data of various programs or program components with their digitized data streams multiplexed together by parsing them in the time domain into the program bitstreams. The programs include audio and video frames of data and other information. MPEG-1 recordings may be recorded on an optical disk and referred to as a Video Compact Disc, or VCD.
An enhanced standard, known colloquially as MPEG-2 and more formally as ISO-13818, has more recently been agreed upon by the ISO MPEG. Products using MPEG-2 are often provided on an optical disk referred to as a Digital Video Disc, or DVD. This enhanced standard has grown out of needs for specifying data formats for broadcast and other higher noise applications, such as high definition television (HDTV), where the programs are more likely to be transmitted than recorded and more likely to be decoded by hardware than by software. The MPEG standards define structure for multiplexing and synchronizing coded digital video and audio data, for decoding, for example, by digital television receivers and for random access play of recorded programs. The defined structure provides syntax for the parsing and synchronizing of the multiplexed stream in such applications and for identifying, decoding and timing the information in the bitstreams.
The MPEG video standard specifies a bitstream syntax designed to improve information density and coding efficiency by methods that remove spatial and temporal redundancies. For example, the transformation of blocks of 8×8 luminance pels (pixels) and corresponding chrominance data using Discrete Cosine Transform (DCT) coding is contemplated to remove spatial redundancies, while motion compensated prediction is contemplated to remove temporal redundancies. For video, MPEG contemplates Intra (I) frames, Predictive (P) frames and Bidirectionally Predictive (B) frames. The I-frames are independently coded and are the least efficiently coded of the three frame types. P-frames are coded more efficiently than are I-frames and are coded relative to the previously coded I- or P-frame. B-frames are coded the most efficiently of the three frame types and are coded relative to both the previous and the next I- or P-frames. The coding order of the frames in an MPEG program is not necessarily the same as the presentation order of the frames. Headers in the bitstream provide information to be used by decoders to properly decode the time and sequence of the frames for the presentation of a moving picture.
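The difference between coding order and presentation order can be illustrated with a short Python sketch. It assumes the common arrangement in which each B-frame is transmitted after both of the reference frames it depends on; the function name and the (type, display index) tuples are hypothetical and chosen only for illustration, and a real decoder would use the header information in the bitstream rather than this simplified rule.

```python
def presentation_order(coded_frames):
    """Reorder frames from coding order to presentation order.

    Each frame is a (frame_type, display_index) tuple. Assumption: B-frames
    arrive after both of their reference frames, so a reference frame
    (I or P) is held back until the next reference frame arrives.
    """
    held_reference = None
    display = []
    for frame in coded_frames:
        frame_type = frame[0]
        if frame_type == "B":
            # B-frames are never used as references, so show them at once.
            display.append(frame)
        else:
            # A new I- or P-frame releases the previously held reference.
            if held_reference is not None:
                display.append(held_reference)
            held_reference = frame
    if held_reference is not None:
        display.append(held_reference)
    return display


coded = [("I", 0), ("P", 3), ("B", 1), ("B", 2), ("P", 6), ("B", 4), ("B", 5)]
print([f"{t}{n}" for t, n in presentation_order(coded)])
# prints ['I0', 'B1', 'B2', 'P3', 'B4', 'B5', 'P6']
```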
The video bitstreams in MPEG systems include a Video Sequence Header containing picture size and aspect ratio data, bit rate limits and other global parameters. Following the Video Sequence Header are coded groups-of-pictures (GOPs). Each GOP usually includes only one I-picture and a variable number of P- and B-pictures. Each GOP also includes a GOP header that contains presentation delay requirements and other data relevant to the entire GOP. Each picture in the GOP includes a picture header that contains picture type and display order data and other information relevant to the picture within the picture group.
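The nesting of sequence, group-of-pictures and picture described above can be sketched as a small set of Python data classes. All class and field names here are hypothetical and cover only the items named in the text; they do not reproduce the actual MPEG header syntax.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Picture:
    picture_type: str   # "I", "P" or "B", carried in the picture header
    display_order: int  # position of the picture within its GOP


@dataclass
class GroupOfPictures:
    pictures: List[Picture] = field(default_factory=list)


@dataclass
class VideoSequence:
    # Global parameters carried by the Video Sequence Header.
    width: int
    height: int
    aspect_ratio: float
    bit_rate_limit: int
    gops: List[GroupOfPictures] = field(default_factory=list)


def count_picture_types(gop: GroupOfPictures) -> Dict[str, int]:
    """Count the I-, P- and B-pictures in one GOP."""
    counts = {"I": 0, "P": 0, "B": 0}
    for picture in gop.pictures:
        counts[picture.picture_type] += 1
    return counts


gop = GroupOfPictures([Picture("I", 2), Picture("B", 0), Picture("B", 1), Picture("P", 5)])
sequence = VideoSequence(720, 480, 1.33, 8_000_000, [gop])
print(count_picture_types(sequence.gops[0]))
```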
Each MPEG picture is divided into a plurality of macroblocks (MBs), not all of which need be transmitted. Each MB is made up of 16×16 luminance pels, or a 2×2 array of four 8×8 transformed blocks of pels. MBs are coded in Slices of consecutive variable length strings of MBs, running left to right across a picture. Slices may begin and end at any intermediate MB position of the picture but must respectively begin or end whenever a left or right margin of the picture is encountered. Each Slice begins with a Slice Header that contains information of the vertical position of the Slice within the picture, information of the quantization scale of the Slice and other information such as that which can be used for fast-forward, fast reverse, resynchronization in the event of transmission error, or other picture presentation purposes.
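As a worked example of the macroblock arithmetic, the following Python sketch computes how many 16×16 macroblocks cover a picture. The function name is hypothetical, and rounding partial macroblocks up at the right and bottom edges is an assumption made for pictures whose dimensions are not multiples of 16.

```python
def macroblock_grid(width, height, mb_size=16):
    """Return (MBs per row, MB rows, total MBs) for a picture in luminance pels.

    Partial macroblocks at the right or bottom edge are rounded up
    (an assumption for dimensions that are not multiples of mb_size).
    """
    mbs_wide = -(-width // mb_size)   # ceiling division
    mbs_high = -(-height // mb_size)
    return mbs_wide, mbs_high, mbs_wide * mbs_high


print(macroblock_grid(720, 480))  # prints (45, 30, 1350)
print(macroblock_grid(720, 576))  # prints (45, 36, 1620)
```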
The macroblock is the basic unit used for MPEG motion compensation. Each MB contains an MB header, which, for the first MB of a Slice, contains information of the MB's horizontal position relative to the left edge of the picture, and which, for subsequently transmitted MBs of a Slice, contains an address increment. Not all of the consecutive MBs of a Slice are transmitted with the Slice.
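The address increment mechanism can be sketched in Python as follows. The sketch assumes that the horizontal position and the increments have already been decoded into plain integers; the function names are hypothetical, and the variable length coding of these values in the actual bitstream is not modeled.

```python
def macroblock_columns(first_column, increments):
    """Return the column of every transmitted MB in one Slice.

    first_column: horizontal position of the first MB of the Slice.
    increments: address increment of each subsequently transmitted MB;
    an increment greater than 1 means intervening MBs were not transmitted.
    """
    columns = [first_column]
    for increment in increments:
        if increment < 1:
            raise ValueError("address increment must be at least 1")
        columns.append(columns[-1] + increment)
    return columns


def skipped_columns(columns):
    """Return the columns between the first and last MB that were not transmitted."""
    transmitted = set(columns)
    return [c for c in range(columns[0], columns[-1] + 1) if c not in transmitted]


columns = macroblock_columns(2, [1, 1, 3, 1])
print(columns)                   # prints [2, 3, 4, 7, 8]
print(skipped_columns(columns))  # prints [5, 6]
```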
Video images to be viewed by a user are normally produced in a known manner by a scanning process across a video display. The choice of a particular scanning process to be used is generally a design trade-off among contradictory requirements of bandwidth, flicker, and resolution. For normal television viewing, generally, an interlaced scanning process uses frames that are composed of two fields sampled at different times. Lines of the two fields are interleaved such that two consecutive lines of a frame, that is, a full display, belong to alternate fields. An interlaced scanning process represents a trade-off between vertical spatial resolution and temporal resolution. Thus, slow moving objects are perceived with higher vertical detail, while fast moving objects are perceived with a higher temporal rate, although at half the vertical resolution.
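The interleaving of two fields into one frame can be sketched in Python. The sketch treats a frame as a list of lines and assumes that the field containing line 0 is called the top field; the function names are hypothetical.

```python
def split_fields(frame_lines):
    """Split a frame into its two fields (even-numbered and odd-numbered lines)."""
    top_field = frame_lines[0::2]
    bottom_field = frame_lines[1::2]
    return top_field, bottom_field


def weave_fields(top_field, bottom_field):
    """Interleave two fields so that consecutive frame lines alternate fields."""
    frame = [None] * (len(top_field) + len(bottom_field))
    frame[0::2] = top_field
    frame[1::2] = bottom_field
    return frame


frame = ["line0", "line1", "line2", "line3", "line4", "line5"]
top, bottom = split_fields(frame)
print(top)     # prints ['line0', 'line2', 'line4']
print(bottom)  # prints ['line1', 'line3', 'line5']
print(weave_fields(top, bottom) == frame)  # prints True
```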
The presentation of MPEG video involves the display of video frames at a rate of, for example, twenty-five or thirty frames per second (depending on the national standard used, PAL or NTSC, for example). Thirty frames per second corresponds to presentation time intervals of approximately 33 milliseconds. Thus, MPEG-2 video decoders must decode signals with interleaved video in what has been called the CCIR-601 (and which has also been called the ITU-R) color video format, where each pixel is coded as a luminance 8 bit value sampled at a 13.5 MHz rate along with a red chrominance value and a blue chrominance value, 8 bits each and sampled at a 6.75 MHz rate. In this format, the video frames are 720 pels per line, and either 480 lines per frame at 30 frames per second or 576 lines per frame at 25 frames per second.
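The timing and sampling figures above can be checked with a short Python calculation. The function names are hypothetical; the sampling rates and bit depths are those stated in the text. A rate of thirty frames per second gives a frame interval of about 33.3 milliseconds, and the stated sampling rates give a raw (uncompressed) rate of 216 Mbit/s.

```python
def frame_interval_ms(frames_per_second):
    """Presentation interval of one frame, in milliseconds."""
    return 1000.0 / frames_per_second


def raw_bit_rate(luma_hz=13.5e6, chroma_hz=6.75e6, bits_per_sample=8):
    """Raw bit rate of one luminance and two chrominance sample streams."""
    samples_per_second = luma_hz + 2 * chroma_hz
    return samples_per_second * bits_per_sample


print(round(frame_interval_ms(30), 1))  # prints 33.3
print(frame_interval_ms(25))            # prints 40.0
print(raw_bit_rate() / 1e6)             # prints 216.0
```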
It is also known, pursuant to the MPEG-2 standard, that different video formats may be utilized in order to reduce the amount of data required. MPEG-2 video coding is optimized for the CCIR-601 4:2:2 interlaced format and, therefore, the 4:2:2 interlaced format is normally used in decoding video signals. In an MPEG-2 4:2:0 video format, the number of samples of each chrominance component, Cr or Cb, is one-half the number of samples of luminance, both horizontally and vertically. In contrast, with the MPEG-2 4:2:2 video format, in each frame of video, the number of samples per line of each chrominance component, Cr or Cb, is one-half of the number of samples per line of luminance. However, the chrominance resolution is full vertically, that is, it is the same as that of the luminance resolution vertically. In the normal course of video signal processing, the 4:2:0 format is used, and that format is interpolated to a 4:2:2 format for the video display monitor.
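The chrominance dimensions of the two formats, and the simplest possible conversion from 4:2:0 to 4:2:2, can be sketched in Python. The function names are hypothetical, and the conversion shown merely repeats each chrominance line. This is an assumption made for brevity: a practical interpolator would filter between lines and take the chroma sample locations of the source into account.

```python
def chroma_plane_size(luma_width, luma_height, video_format):
    """Return (samples per line, lines) of one chrominance component, Cr or Cb."""
    if video_format == "4:2:2":
        return luma_width // 2, luma_height       # half horizontally, full vertically
    if video_format == "4:2:0":
        return luma_width // 2, luma_height // 2  # half in both directions
    raise ValueError("unsupported format: " + video_format)


def upsample_420_to_422(chroma_lines):
    """Double the number of chrominance lines by repeating each line."""
    output = []
    for line in chroma_lines:
        output.append(list(line))
        output.append(list(line))
    return output


print(chroma_plane_size(720, 480, "4:2:0"))  # prints (360, 240)
print(chroma_plane_size(720, 480, "4:2:2"))  # prints (360, 480)
print(upsample_420_to_422([[10, 20], [30, 40]]))
# prints [[10, 20], [10, 20], [30, 40], [30, 40]]
```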
In addition to the above variations, a video signal processor must be able to process video that has been derived from a wide range of sources. For example, the program material may be derived from 16 mm, 35 mm, or 70 mm film, cinemascope film, or wide screen film. Each of those film sources has a different display size, which is often calibrated in terms of its image aspect ratio, that is, the ratio of picture width to height. For example, the aspect ratios of 16 mm film, wide screen film, 70 mm film, and cinemascope film are 1.33, 1.85, 2.10, and 2.35, respectively. The aspect ratio of NTSC, PAL, and SECAM TV is 1.33, whereas the aspect ratio for HDTV is 1.78. Given those variations in aspect ratio in combination with different sizes of video displays, it is often required to adjust the horizontal width or vertical height of the displayed image. Thus, the video signal processor must be capable of driving display monitors such that images having different aspect ratios may be displayed.
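The adjustment of image width or height for a given display can be illustrated with a Python sketch. It assumes that the whole source image is to be shown without distortion, leaving unused bars at the top and bottom or at the sides; the function name and parameters are hypothetical and do not describe the control method of the invention.

```python
def fit_to_display(source_ar, display_ar, display_pels, display_lines):
    """Return (active pels per line, active lines) for an undistorted image.

    source_ar and display_ar are width-to-height aspect ratios. A wider
    source uses the full display width and fewer lines; a narrower source
    uses all lines and fewer pels per line.
    """
    if source_ar >= display_ar:
        active_lines = round(display_lines * display_ar / source_ar)
        return display_pels, active_lines
    active_pels = round(display_pels * source_ar / display_ar)
    return active_pels, display_lines


# Cinemascope film (2.35) on a 1.33 television with 720 pels and 480 lines.
print(fit_to_display(2.35, 1.33, 720, 480))  # prints (720, 272)
# A 1.33 source on a 1.78 HDTV display with the same raster size.
print(fit_to_display(1.33, 1.78, 720, 480))  # prints (538, 480)
```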
Known devices for controlling the “display pipeline” between the video as decoded and as displayed have generally required a separate specialized processor, additional memory, as well as significant manipulation of the data. Such processors add cost, require too much overhead and introduce too much delay into the video processing function. Moreover, such processors have no flexibility to accommodate the range of possible input and output formats.
In accordance with the principles of the present invention, these difficulties are overcome by a novel display master control method and apparatus for controlling a digital video display data pipeline within a reduced instruction set processor to advantageously control spatial conversions based on format of a program signal input, user preferences, and/or video display format.
Specifically, the display master control sets aspect ratio control, display base address control for appropriate spatial positioning, resizing control for MPEG-1 SIF (defined in both VCD and DVD specifications), interpolation control for handling different chroma locations for various program signal inputs, fade and blending control, and television system conversion control (e.g., NTSC, PAL).
The above and other objects and advantages of the present invention shall be made apparent from the accompanying drawings and the description thereof.