1. Field of the Invention
The present invention relates generally to the compression, cataloging and viewing of full motion videos and, more particularly, to the processing of compressed video data.
2. Description of Related Art
The infrastructure and process required to create and operate a video archive in the digital domain are well known in the broadcast video industry. The archiving process generally begins by digitizing and compressing the analog video using MPEG-1 or MPEG-2 compression, then moving the compressed video file to long-term storage. To preserve the contribution quality of the video, broadcasters generally select a high compressed bitrate (e.g., 15–40 Mbps), which allows the original video to be recovered with relatively high fidelity in spite of the lossiness of the MPEG compression scheme.
The high bitrate of the compressed video, however, presents considerable problems for the broadcaster's local area network and computer workstation infrastructure when the video must be distributed for viewing and post-production work. The high network bandwidth and the amount of time required to transfer the assets throughout the plant place an upper limit on the number of concurrent transfers and severely constrain productivity. In response to this bandwidth problem, broadcasters create an additional copy of the video at a much lower compressed bitrate (e.g., 1.5–4 Mbps). This low bitrate file, referred to as a ‘proxy’ or ‘browse’ file, enables users to quickly download the video or to view it directly on computer monitors by utilizing a streaming video server. To facilitate the viewing of video assets outside the local area network, a second proxy file is often encoded at a very low bitrate (56–1000 Kbps), for streaming over low speed terrestrial lines.
After ingestion of the video, the next step in the archiving process is to create an entry for the video in the video library catalog. This entry contains metadata, which is information pertinent to the video. The contents and format of a video catalog record, normally broadcaster unique, facilitate the search and retrieval of video clips within the broadcaster's video library. Presently, there are commercially available video catalog applications (catalogers) that will automatically extract metadata from an MPEG-1 or MPEG-2 video file, such as closed caption text and the text of the actual audio program, obtained via speech recognition technology. Catalogers further extract metadata from the video by performing scene change analysis and creating a bitmap of the first frame after each cut or major scene transition. These bitmaps, referred to individually as a ‘thumbnail’ or collectively as a storyboard, are considered essential metadata because they enable the end user to determine the video content very quickly. Absent the storyboard, the end user is forced to view the video or, at a minimum, fast forward through a video to find the desired video segment.
A common feature of video catalog applications is to allow the end user, when viewing the catalog record and metadata, to play the proxy file by double clicking on any thumbnail. The MPEG player, nested inside the application's window, begins to play the video from the timecode associated with the thumbnail. The player performs this function by sending to the streaming video server a play-from-offset request. A limitation of the MPEG syntax permits a video to be randomly accessed only at the Group of Pictures (GOP) header level. More specifically, in order to start playing at an arbitrary point within an MPEG file, the player must decode a pack header to obtain system timing, a system header to identify the audio and video streams, a sequence header to parse the video format, a sequence extension for further video format information, and a GOP header to begin decoding on an ‘I’ frame.
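The header sequence listed above can be located by scanning for MPEG start codes, each of which begins with the byte prefix 00 00 01. The following is only a minimal sketch under simplifying assumptions: it treats the file as a flat byte buffer, ignores video start codes that straddle packet boundaries in a multiplexed stream, and does not parse any header contents. The function names `find_start_codes` and `first_gop_offset` are hypothetical and not part of any player.

```python
# Start-code identifiers (the byte following the 00 00 01 prefix)
# as defined by the MPEG systems and video syntax.
START_CODES = {
    0xBA: "pack_header",
    0xBB: "system_header",
    0xB3: "sequence_header",
    0xB5: "extension",
    0xB8: "gop_header",
}

PREFIX = b"\x00\x00\x01"


def find_start_codes(data: bytes):
    """Yield (byte_offset, name) for each recognised start code in data."""
    pos = data.find(PREFIX)
    while pos != -1 and pos + 3 < len(data):
        name = START_CODES.get(data[pos + 3])
        if name is not None:
            yield pos, name
        # Resume the search just past the prefix that was examined.
        pos = data.find(PREFIX, pos + 3)


def first_gop_offset(data: bytes):
    """Return the offset of the first GOP header, or None if absent."""
    for offset, name in find_start_codes(data):
        if name == "gop_header":
            return offset
    return None
```

Because frame sizes vary, a scan of this kind has to read the file sequentially, which is why a player cannot jump straight to a given GOP without additional information.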
Unlike other block based compression algorithms, the frame size in MPEG is variable, so frames must be located by sequentially reading the file. To implement the play-from-offset capability, MPEG players commonly have to perform a rough calculation based on the multiplex bitrate. For example, to start playing at offset 00:00:10:15 (10 seconds, 15 frames) into an 8 Mbps video, the player will use the following formula:
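\[
\text{byte offset} = \left(\frac{\text{bytes}}{\text{second}} \times \text{seconds}\right) + \left(\frac{\text{bytes}/\text{second}}{\text{frames}/\text{second}} \times \text{number of frames}\right)
\]

\[
10{,}500{,}000 = \left(\frac{8{,}000{,}000}{8} \times 10\right) + \left(\frac{8{,}000{,}000/8}{30} \times 15\right)
\]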
Since this formula is only an approximation, and the player has no knowledge of where the GOP boundaries are, the player will arbitrarily subtract a number of bytes from this result in order to ensure that playing begins prior to the target frame. This crude method is aesthetically unpleasing because up to 15 frames of corrupted video may be played out before the decoder is able to construct a full, error-free frame of video. Additionally, this formula assumes the video was compressed at a constant bitrate. If a variable bit rate compression scheme was used, there is no correlation between bitrate, file size and video duration.
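The rough calculation and the arbitrary back-off can be sketched as follows. This is an illustration only: the function name `estimate_byte_offset` and the `backoff_bytes` parameter are hypothetical, the frame rate of 30 frames per second is taken from the worked example, and integer arithmetic is used so the result matches the figure above exactly.

```python
def estimate_byte_offset(bitrate_bps, seconds, frames, fps=30, backoff_bytes=0):
    """Estimate a file byte offset for a time offset, assuming constant bitrate.

    bitrate_bps   -- multiplex bitrate in bits per second
    seconds       -- whole seconds of the requested offset
    frames        -- remaining frames of the requested offset
    fps           -- frame rate (assumed 30, as in the worked example)
    backoff_bytes -- arbitrary number of bytes subtracted so that playback
                     starts before the target frame (hypothetical parameter)
    """
    total_frames = seconds * fps + frames
    # bytes/frame = (bitrate / 8) / fps; multiply first to stay in integers.
    estimate = (bitrate_bps * total_frames) // (8 * fps)
    return max(0, estimate - backoff_bytes)


# Offset 00:00:10:15 into an 8 Mbps video, with no back-off applied.
offset = estimate_byte_offset(8_000_000, seconds=10, frames=15)
```

Nothing in this calculation knows where a GOP header actually sits, so the returned offset generally lands in the middle of a GOP, and for variable bit rate video the estimate is not meaningful at all.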
Another conventional technique of implementing play-from-offset is for the video server to build an index file containing the offset of each GOP in the video. The player then passes the streaming server a time offset, and the server performs a table lookup to determine the file byte offset. The drawback of this method is that the server must build and manage an additional file for each video. If the video MPEG file is migrated to near-line tape storage, both files must be written to and restored from tape. This also complicates the inter-server transfer of videos commonly performed to maintain load balancing or import video files from other content providers.
Yet another problem with a server-based implementation of play-from-offset is that the user-requested frame will most likely not coincide with a GOP boundary. The player has no knowledge of the starting frame timecode, so it cues on the first picture of the GOP and the end user is forced to single step to the intended frame. This behavior is not acceptable for the broadcast video industry. Non-linear video devices are expected to cue on the target frame with minimal delay.
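The table lookup of the index-file technique, and the gap it leaves between the GOP boundary and the requested frame, can be sketched as below. The index layout (a sorted list of first-frame number and byte offset pairs), the function names, and the non-drop-frame 30 frames per second timecode are all assumptions made for illustration; the source does not specify an index format.

```python
from bisect import bisect_right


def timecode_to_frames(timecode, fps=30):
    """Convert a non-drop-frame 'HH:MM:SS:FF' timecode to a frame count."""
    hh, mm, ss, ff = (int(part) for part in timecode.split(":"))
    return ((hh * 60 + mm) * 60 + ss) * fps + ff


def lookup_gop(index, target_frame):
    """Find the GOP containing target_frame.

    index -- list of (first_frame_number, byte_offset), sorted by frame
    Returns (byte_offset, frames_to_step), where frames_to_step is how far
    the target lies beyond the first picture of that GOP.
    """
    starts = [first_frame for first_frame, _ in index]
    i = bisect_right(starts, target_frame) - 1
    if i < 0:
        raise ValueError("target frame precedes the first indexed GOP")
    first_frame, byte_offset = index[i]
    return byte_offset, target_frame - first_frame


# Hypothetical index: 15-frame GOPs, each assumed to occupy 500,000 bytes.
index = [(n * 15, n * 500_000) for n in range(40)]
byte_offset, frames_to_step = lookup_gop(index, timecode_to_frames("00:00:10:20"))
```

In this sketch the server can return the byte offset, but unless the player also learns the timecode of the first picture in the GOP, it has no way to compute `frames_to_step` and therefore cannot cue on the target frame by itself.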
Therefore, there is a need for a system and method for automatic insertion of file index information within an existing MPEG video file, so that an MPEG player has a priori knowledge of GOP SMPTE timecodes and file offsets, for precise play-from-offset capability. This system and method should be able to operate without the use of a separate index file, in a manner that ensures the MPEG video file will continue to be decodable, without error, by any compliant MPEG decoder engine.