1. Field of the Invention
The present invention relates to processing and storage of compressed visual data, and in particular the processing and storage of compressed visual data for slow-forward playing, transmission, or editing of an MPEG data stream.
2. Background Art
It has become common practice to compress audio/visual data in order to reduce the capacity and bandwidth requirements for storage and transmission. One of the most popular audio/video compression techniques is MPEG. MPEG is an acronym for the Moving Picture Experts Group, which was set up by the International Organization for Standardization (ISO) to work on compression. MPEG provides a number of different variations (MPEG-1, MPEG-2, etc.) to suit different bandwidth and quality constraints. MPEG-2, for example, is especially suited to the storage and transmission of broadcast quality television programs.
For the video data, MPEG provides a high degree of compression (up to 200:1) by encoding 8×8 blocks of pixels into a set of discrete cosine transform (DCT) coefficients, quantizing and encoding the coefficients, and using motion compensation techniques to encode most video frames as predictions from or between other frames. In particular, the encoded MPEG video stream comprises a series of groups of pictures (GOPs), and each GOP begins with an independently encoded (intra) I frame and may include one or more following P frames and B frames. Each I frame can be decoded without information from any preceding and/or following frame. Decoding of a P frame requires information from a preceding frame in the GOP. Decoding of a B frame requires information from both a preceding and a following frame in the GOP. To minimize decoder buffer requirements, the transmission order differs from the presentation order for some frames, so that all the information of the other frames required for decoding a B frame will arrive at the decoder before the B frame.
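The relationship between presentation order and transmission order described above can be illustrated with a minimal sketch. The function name `to_transmission_order` and the frame labels (a type letter followed by a presentation index) are hypothetical and serve only for illustration; the sketch assumes that each run of B frames depends on the anchor (I or P) frame that follows it in presentation order.

```python
def to_transmission_order(presentation):
    """Reorder frame labels so each anchor (I or P) precedes the B frames
    that come before it in presentation order."""
    transmission = []
    pending_b = []  # B frames waiting for their following anchor frame
    for frame in presentation:
        if frame[0] == "B":
            pending_b.append(frame)
        else:
            # Send the anchor first, then the B frames that depend on it.
            transmission.append(frame)
            transmission.extend(pending_b)
            pending_b = []
    # Trailing B frames with no following anchor are appended unchanged.
    transmission.extend(pending_b)
    return transmission


presentation = ["B0", "B1", "I2", "B3", "B4", "P5", "B6", "B7", "P8"]
print(to_transmission_order(presentation))
# ['I2', 'B0', 'B1', 'P5', 'B3', 'B4', 'P8', 'B6', 'B7']
```

In this illustration, every B frame is transmitted only after both of the frames it references, which is the property that allows the decoder to operate with limited buffering.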
A GOP can be “open” or “closed.” A GOP is closed if no prediction is allowed from any frame in a previous GOP. In other words, there are no B or P frames that require any information outside the GOP for decoding. A GOP is open if prediction is allowed from a frame in a previous GOP. In other words, there is a B or P frame that requires information in a frame outside of the GOP for decoding. In the typical case of an open GOP, the transmission order of the GOP begins with an I frame and has at least one B frame following the I frame. In the presentation order, this B frame precedes the first I frame in the GOP, and this B frame requires, for decoding, the last frame of a preceding GOP.
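The typical open-GOP pattern described above can be recognized with a minimal sketch such as the following. The function name `has_leading_b_frames` and the representation of a GOP as a list of (frame type, presentation index) pairs in transmission order are assumptions made for illustration. The sketch detects only the typical pattern in which a B frame follows the first I frame in transmission order but precedes it in presentation order; it is not a complete test of whether a GOP is open or closed.

```python
def has_leading_b_frames(gop):
    """Return True if some B frame is presented before the first I frame.

    gop: list of (frame_type, presentation_index) pairs in transmission order.
    """
    if not gop or gop[0][0] != "I":
        raise ValueError("a GOP in transmission order must begin with an I frame")
    i_presentation_index = gop[0][1]
    return any(
        frame_type == "B" and index < i_presentation_index
        for frame_type, index in gop[1:]
    )


# Typical open GOP: B0 and B1 follow I2 in transmission order
# but precede it in presentation order.
open_gop = [("I", 2), ("B", 0), ("B", 1), ("P", 5), ("B", 3), ("B", 4)]
# A GOP in which no B frame is presented before the I frame.
other_gop = [("I", 0), ("P", 3), ("B", 1), ("B", 2)]

print(has_leading_b_frames(open_gop))   # True
print(has_leading_b_frames(other_gop))  # False
```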
In addition to the motion compensation techniques for video compression, the MPEG standard provides a generic framework for combining one or more elementary streams of digital video and audio, as well as system data, into single or multiple program transport streams (TS) which are suitable for storage or transmission. The system data includes information about synchronization, random access, management of buffers to prevent overflow and underflow, and time stamps for video frames and audio packetized elementary stream packets embedded in video and audio elementary streams as well as program description, conditional access and network related information carried in other independent elementary streams. The standard specifies the organization of the elementary streams and the transport streams, and imposes constraints to enable synchronized decoding from the audio and video decoding buffers under various conditions.
The MPEG-2 standard is documented in ISO/IEC International Standard (IS) 13818-1, “Information Technology-Generic Coding of Moving Pictures and Associated Audio Information: Systems,” ISO/IEC IS 13818-2, “Information Technology-Generic Coding of Moving Pictures and Associated Audio Information: Video,” and ISO/IEC IS 13818-3, “Information Technology-Generic Coding of Moving Pictures and Associated Audio Information: Audio,” which are incorporated herein by reference. A concise introduction to MPEG is given in “A Guide to MPEG Fundamentals and Protocol Analysis (Including DVB and ATSC),” Tektronix Inc., 1997, incorporated herein by reference.
One application of MPEG-2 coded video is video-on-demand (VOD). In a VOD application, the video is stored in a server as MPEG-2 coded video. The server streams MPEG-2 coded video in real time to a subscriber's decoder. The subscriber may operate a remote control providing well-known classical videocassette recorder (VCR) functions including play, stop, fast-forward, fast-reverse, pause, slow-forward and slow-reverse.
Another application of MPEG-2 coded video is an MPEG-2 VCR. In an MPEG-2 VCR application, the video is stored on a digital cassette in MPEG-2 coded video format. The MPEG-2 VCR streams MPEG-2 coded video in real time to an MPEG-2 decoder. The operator may operate a control providing well-known classical VCR functions including play, stop, fast-forward, fast-reverse, pause, slow-forward and slow-reverse.
A third application of MPEG-2 coded video is an MPEG-2 based video editing station. In an MPEG-2 based video editing station, all video materials are stored in MPEG-2 coded video format on tapes or disks. Operators may compile and edit the MPEG-2 coded video in order to create a final broadcast version. One typical operation is to slow down the play speed of some portions of the video in order to show the details of action in the scene.
In the case of non-compressed video, the VOD server or VCR or video editing station responds to a slow-forward command by repeating each frame n times to generate playback that is n times slower. In the case of I-frame only coded MPEG-2 video, the system may respond to this command by a similar operation of repeating each compressed frame. In the case of IP or IPB coded video, however, simply repeating coded frames will result in decoding errors (creating wrong images) and display order errors. In the following text, IP coded MPEG video is considered as a particular case of IPB coded video. In a wider sense, I-frame only coded video is also a special case of IPB coded video.
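The frame repetition described above for non-compressed video can be sketched as follows. The function name `slow_forward_uncompressed` and the frame labels are hypothetical; the sketch applies only to sequences in which every frame is independent of the others (non-compressed frames or, by the same operation, I-frame only coded frames), and not to IP or IPB coded video.

```python
def slow_forward_uncompressed(frames, n):
    """Return the frame sequence with every frame repeated n times in place."""
    if n < 1:
        raise ValueError("n must be a positive integer")
    return [frame for frame in frames for _ in range(n)]


frames = ["F0", "F1", "F2"]
print(slow_forward_uncompressed(frames, 2))
# ['F0', 'F0', 'F1', 'F1', 'F2', 'F2']
```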
In a typical implementation of the slow-forward function in the case of MPEG-2 IPB compressed video, the system contains an MPEG-2 decoder and an MPEG-2 encoder. To respond to a slow-forward command, the system must decode the MPEG-2 video frames, repeat each uncompressed frame n times, and then encode the resulting sequence of frames into MPEG-2 video. This implementation, however, has some disadvantages. The implementation needs at least an MPEG-2 decoder and an MPEG-2 encoder. For real time transmission, the number of decoder/encoder pairs is proportional to the number of simultaneously served streams. This may become very expensive in terms of monetary cost and space. Moreover, each pair of decoding and re-encoding operations may accentuate encoding artifacts, introducing additional picture quality degradation.
The slow-forward play function could be achieved by decoder side operations. A receiver could receive a normally coded video stream, decode it and display the decoded pictures at a slower speed. This would require a special decoder or display device. In the VOD environment, a typical set-top box does not have such a function. Moreover, with the exception of a file-transfer environment, the data flow of the normally coded video stream must be reduced or periodically interrupted to account for the slow-motion display of the frames from the normally coded video. Therefore, there may be issues of synchronization between the video server or VCR or editing station and the decoder or display device.
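The required reduction in data flow can be illustrated with simple arithmetic: if the decoder displays frames n times slower than normal, it consumes the normally coded stream at, on average, 1/n of the normal rate. The following minimal sketch assumes a constant normal delivery rate; the function name `average_delivery_rate` and the 4 Mbit/s figure are hypothetical values chosen for illustration.

```python
def average_delivery_rate(normal_rate_bps, n):
    """Average rate at which a normally coded stream can be delivered
    when the decoder displays it n times slower than normal."""
    if n < 1:
        raise ValueError("n must be a positive integer")
    return normal_rate_bps / n


# Illustrative figures: a 4 Mbit/s stream played four times slower.
print(average_delivery_rate(4_000_000, 4))  # 1000000.0
```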