Many types of digital video playback systems (also known as digital content servers) are based on multiple video decoding engines or graphics processors that provide video streams to one or more users. Such systems may provide different video streams to different users, or different types of video data to the same user or group of users. They may also provide redundant functionality so that when one engine or processor fails or shuts down, another processor may take over and provide the video stream to the user without interruption.
Digital video systems may employ one or more graphics processing units (GPUs) to process video data. A GPU is a graphics and video rendering device for computers, workstations, game consoles, and similar digital processing devices. A GPU is usually implemented as a coprocessor to the central processing unit (CPU) of the computer, and may be provided as an add-in card (e.g., a video card), as a separate coprocessor, or as functionality that is integrated directly into the motherboard of the computer or into other devices (for example, Northbridge devices and CPUs).
Given the amount of data that may be present in a video stream, most digital graphics processing systems employ some form of compressed or coded representation of moving picture and audio data to reduce the storage and processing overhead associated with video data processing. One of the most popular worldwide standards is the MPEG (Moving Picture Experts Group) standard, which comprises several different variations for both audio and video data. MPEG-based systems encode the original data sequence and then decode the encoded sequence upon playback, which reduces the amount of data that needs to be stored and processed compared to simple storage of each frame of a video sequence. Rather than storing complete pixel information for every frame, MPEG encoding stores the movement of objects within images, thus taking advantage of the fact that much information in a video sequence is redundant and only small parts of an image typically change from frame to frame. In processing a video stream, the MPEG encoder produces three types of coded frames. The first type of frame is called an “I” frame or intra-coded frame. This is the simplest type of frame and is a coded representation of a still image. I-frames are coded without reference to any other frame; their purpose is to provide the decoder with a starting point for decoding the next set of frames. The next type of frame is called a “P” frame or predicted frame. Upon decoding, P-frames are created from information contained within the previous P-frames or I-frames. The third type of frame, and the most common type, is the “B” frame or bi-directional frame. B-frames are both forward and backward predicted and are constructed from the last and the next P-frame or I-frame. Both P-frames and B-frames are inter-coded frames.
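The frame dependencies described above can be illustrated with a minimal sketch. It assumes a simplified model in which a group of frames is written as a string of frame types in display order (for example, "IBBPBBP"), each P-frame references only the nearest preceding I-frame or P-frame, and each B-frame references the nearest I-frame or P-frame on either side. The function names `reference_frames` and `decode_order` are hypothetical and are not part of any MPEG library; real decoders handle additional cases that this sketch omits.

```python
from typing import Dict, List


def reference_frames(gop: str) -> Dict[int, List[int]]:
    """Map each frame index (display order) to the frames it is predicted from.

    Simplified model: 'I' has no references, 'P' uses the previous I/P frame,
    'B' uses the previous and the next I/P frame.
    """
    anchors = [i for i, t in enumerate(gop) if t in "IP"]
    refs: Dict[int, List[int]] = {}
    for i, frame_type in enumerate(gop):
        prev_anchor = max((a for a in anchors if a < i), default=None)
        next_anchor = min((a for a in anchors if a > i), default=None)
        if frame_type == "I":
            refs[i] = []
        elif frame_type == "P":
            if prev_anchor is None:
                raise ValueError(f"P-frame {i} has no preceding I/P frame")
            refs[i] = [prev_anchor]
        elif frame_type == "B":
            if prev_anchor is None or next_anchor is None:
                raise ValueError(f"B-frame {i} lacks an I/P frame on one side")
            refs[i] = [prev_anchor, next_anchor]
        else:
            raise ValueError(f"unknown frame type {frame_type!r} at index {i}")
    return refs


def decode_order(gop: str) -> List[int]:
    """Return an order in which every frame is decoded after its references."""
    refs = reference_frames(gop)
    decoded = set()
    order: List[int] = []
    pending = list(range(len(gop)))
    while pending:
        # Pick the earliest pending frame whose references are all decoded.
        for i in pending:
            if all(r in decoded for r in refs[i]):
                order.append(i)
                decoded.add(i)
                pending.remove(i)
                break
    return order


if __name__ == "__main__":
    gop = "IBBPBBP"
    print(reference_frames(gop))
    print(decode_order(gop))
```

In this model, a B-frame cannot be decoded until the later I-frame or P-frame it depends on has been decoded, so the decode order differs from the display order. The decoded I-frames and P-frames must also be kept available while dependent frames are processed.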
Using such compression techniques, substantial savings can be gained in file sizes, and typical MPEG systems provide video compression ratios in the range of 8:1 to 30:1. Although data storage requirements are greatly reduced for encoded data streams, a certain amount of processing is required to decode the video data in an MPEG system. This decoding, as well as subsequent rendering of video frames, is often performed by decoding circuitry in a GPU. In a typical multi-GPU based system, each GPU has its own dedicated memory that stores partially decoded video frames. In certain cases, a playback stream may need to be migrated from one GPU to another, such as when a GPU experiences a processing fault or requires maintenance. Similarly, in load-balancing systems, playback streams may be moved from heavily loaded processors to less heavily loaded processors. In many present digital video systems, migration of playback streams from one processor to another requires moving the code from one GPU memory to another GPU memory. If the migration is done immediately, a large amount of state data, which may include previously decoded frames, typically must be moved to the new processor's memory. This type of failure recovery or redundancy system is inefficient since it can consume significant time and processing bandwidth. For example, the same amount of memory may be consumed for the second processor as for the first processor to store the migrated video data, and the second processor may need to decode the same frame data. What is needed, therefore, is an efficient video data migration system for decoding frame data on multiple graphics processing devices.