1. Field of the Invention
The present invention relates to the processing of compressed visual data, and in particular to the processing and storage of compressed visual data for playing, transmission, or editing of an MPEG data stream.
2. Background Art
It has become common practice to compress audio/visual data in order to reduce the capacity and bandwidth requirements for storage and transmission. Some of the most popular audio/video compression techniques are defined by the MPEG family of standards. MPEG is an acronym for the Moving Picture Experts Group, which was set up by the International Organization for Standardization (ISO) to work on the compression of audio/visual information. MPEG provides a number of different variations (MPEG-1, MPEG-2, etc.) to suit different bandwidth and quality constraints. MPEG-2, for example, is especially suited to the storage and transmission of broadcast quality television programs.
For the video data, MPEG provides a high degree of compression (up to 200:1) by transforming 8×8 blocks of pixels into a set of discrete cosine transform (DCT) coefficients, quantizing and encoding the coefficients, and using motion compensation techniques to encode most video frames as predictions from or between other frames. In particular, the encoded MPEG video stream consists of a series of groups of pictures (GOPs), and each GOP begins with an independently encoded (intra) I-frame and may include one or more following P-frames and B-frames. Each I-frame can be decoded without information from any preceding or following frame. Decoding of a P-frame in general requires information from a preceding (I or P) frame in the same GOP. Decoding of a B-frame in general requires information both from a preceding (I or P) frame in the previous or the same GOP and from a following (I or P) frame in the same GOP.
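By way of illustration only, the transform and quantization steps described above may be sketched as follows. This is a minimal sketch, not an MPEG-compliant encoder: the function names dct_2d and quantize are hypothetical, and the single uniform step size of 16 is an assumption standing in for the quantization matrices and scale factors that an actual MPEG encoder would apply.

```python
import math

N = 8  # MPEG transforms 8x8 blocks of pixels


def dct_2d(block):
    """Forward 2-D DCT-II of an N x N block, with orthonormal scaling."""
    def scale(k):
        return math.sqrt(1.0 / N) if k == 0 else math.sqrt(2.0 / N)

    out = [[0.0] * N for _ in range(N)]
    for u in range(N):
        for v in range(N):
            acc = 0.0
            for x in range(N):
                for y in range(N):
                    acc += (block[x][y]
                            * math.cos((2 * x + 1) * u * math.pi / (2 * N))
                            * math.cos((2 * y + 1) * v * math.pi / (2 * N)))
            out[u][v] = scale(u) * scale(v) * acc
    return out


def quantize(coeffs, step=16):
    """Uniform quantization with one step size (an illustrative assumption)."""
    return [[int(round(c / step)) for c in row] for row in coeffs]


# A flat gray block: all of its energy falls into the DC coefficient.
flat_block = [[128] * N for _ in range(N)]
quantized = quantize(dct_2d(flat_block))
nonzero = [(u, v) for u in range(N) for v in range(N) if quantized[u][v] != 0]
print(quantized[0][0], nonzero)
```

For the flat block in this sketch, only the DC coefficient at position (0, 0) survives quantization, which suggests why smooth image regions can be encoded with very few bits.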
The MPEG-2 standard is documented in ISO/IEC International Standard (IS) 13818-1, “Information Technology-Generic Coding of Moving Pictures and Associated Audio Information: Systems,” ISO/IEC IS 13818-2, “Information Technology-Generic Coding of Moving Pictures and Associated Audio Information: Video,” and ISO/IEC IS 13818-3, “Information Technology-Generic Coding of Moving Pictures and Associated Audio Information: Audio,” which are incorporated herein by reference. A concise introduction to MPEG is given in “A Guide to MPEG Fundamentals and Protocol Analysis (Including DVB and ATSC),” Tektronix Inc., 1997, incorporated herein by reference.
One application of MPEG-2 coded video is video-on-demand (VOD). In a VOD application, the video is stored in a server as MPEG-2 coded video. The server streams MPEG-2 coded video in real time to a subscriber's decoder. The subscriber may operate a remote control providing well-known classical videocassette recorder (VCR) functions including play, stop, fast-forward, fast-reverse, pause, slow-forward and slow-reverse.
Another application of MPEG-2 coded video is an MPEG-2 VCR. In an MPEG-2 VCR application, the video is stored on a digital cassette in MPEG-2 coded video format. The MPEG-2 VCR streams MPEG-2 coded video in real time to an MPEG-2 decoder. The operator may operate a control providing well-known classical VCR functions including play, stop, fast-forward, fast-reverse, pause, slow-forward and slow-reverse.
A third application of MPEG-2 coded video is an MPEG-2 based video editing station. In an MPEG-2 based video editing station, all video materials are stored in MPEG-2 coded video format on tapes or disks. The operators may compile and edit the MPEG-2 coded video in order to create a final broadcast version.
In the above applications of MPEG-2 coded video, it would be desirable to provide an automatic method of detecting scene changes or identifying certain objects in the (visual) scenes. For example, in lieu of a conventional fast-forward function, the viewer could be provided with a function to skip forward to a next scene or skip back to a previous scene, or a function to successively display new scenes in a forward or reverse direction. Such scene display functions would omit the display of repetitious and therefore irrelevant video frames so that the viewer can more quickly find a new scene from which regular speed forward-play may commence. The detection of a scene change, however, involves a comparison of video information between successive frames, and conventional methods for performing such a comparison are computationally intensive. For real-time detection of scene changes, there is a need for a fast, computationally efficient, and reasonably accurate method of detecting scene changes.
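By way of illustration only, a comparison of video information between successive frames may be sketched as a luminance histogram difference computed on fully decoded frames. This is a generic example of the kind of pixel-domain comparison referred to above, and is not the method of the present invention. The function names, the choice of 16 histogram bins, and the threshold of 0.5 are all assumptions made for the sketch, and all frames are assumed to have the same dimensions.

```python
def histogram(frame, bins=16):
    """Histogram of 8-bit luminance values in a frame (a list of pixel rows)."""
    counts = [0] * bins
    for row in frame:
        for pixel in row:
            counts[pixel * bins // 256] += 1
    return counts


def histogram_difference(a, b, bins=16):
    """Normalized sum of absolute bin differences, ranging from 0 to 1."""
    ha, hb = histogram(a, bins), histogram(b, bins)
    total = sum(ha)
    return sum(abs(x - y) for x, y in zip(ha, hb)) / (2 * total)


def detect_scene_changes(frames, threshold=0.5):
    """Return each index i at which frame i differs sharply from frame i - 1."""
    return [i for i in range(1, len(frames))
            if histogram_difference(frames[i - 1], frames[i]) > threshold]


# Tiny synthetic sequence: two dark frames, two bright frames, one dark frame.
dark = [[20] * 8 for _ in range(8)]
bright = [[220] * 8 for _ in range(8)]
frames = [dark, dark, bright, bright, dark]
cuts = detect_scene_changes(frames)
print(cuts)
```

Because this sketch visits every pixel of every decoded frame, its cost grows with both the frame size and the frame rate, which illustrates why such comparisons are burdensome for real-time operation.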