1. Field of the Invention
The present invention relates generally to the compression, cataloging and viewing of full motion videos and, more particularly, to the processing of compressed video data.
2. Description of Related Art
The infrastructure and process required to create and operate a video archive in the digital domain are well known in the broadcast video industry. The archiving process generally begins by digitizing and compressing the analog video using MPEG-1 or MPEG-2 compression, then moving the compressed video file to long-term storage. To preserve the contribution quality of the video, broadcasters generally select a high compressed bitrate (e.g., 15-40 Mbps), which allows the original video to be recovered with relatively high fidelity in spite of the lossiness of the MPEG compression scheme.
The high bitrate of the compressed video, however, presents considerable problems to the broadcaster's local area network and computer workstation infrastructure when the video must be distributed for viewing and post-production work. The high network bandwidth and the amount of time required to transfer the assets throughout the plant place an upper limit on the number of concurrent transfers and severely constrain productivity. In response to this bandwidth problem, broadcasters create an additional copy of the video at a much lower compressed bitrate (e.g., 1.5-4 Mbps). This low bitrate file, referred to as a 'proxy' or 'browse' file, enables users to quickly download the video or to view it directly on computer monitors by utilizing a streaming video server. To facilitate the viewing of video assets outside the local area network, a second proxy file is often encoded at a very low bitrate (56-1000 Kbps), for streaming over low speed terrestrial lines.
After ingestion of the video, the next step in the archiving process is to create an entry for the video in the video library catalog. This entry contains metadata, which is information pertinent to the video. The contents and format of a video catalog record, normally broadcaster unique, facilitate the search and retrieval of video clips within the broadcaster's video library. Presently, there are commercially available video catalog applications (catalogers) that will automatically extract from an MPEG-1 or MPEG-2 video file metadata, such as closed caption text and the text of the actual audio program, obtained via speech recognition technology. Catalogers further extract metadata from the video by performing scene change analysis and creating a bitmap of the first frame after each cut or major scene transition. These bitmaps, referred to individually as a 'thumbnail' or collectively as a storyboard, are considered essential metadata because they enable the end user to determine very quickly the video content. Absent the storyboard, the end user is forced to view the video or, at a minimum, fast forward through a video, to find the desired video segment. An additional feature of prior art catalogers is the capability to randomly access and play the proxy video file by double clicking on a storyboard thumbnail.
Further productivity gains can be achieved if the proxy file is a replica of the high-resolution video, where both files begin on the same video frame and have equal duration. When the browse file is a true proxy, a video production engineer is able to import several proxy files into a video editor and produce a program, creating an edit decision list (EDL). This EDL is subsequently exported to a high quality video editing suite that downloads the high-resolution version of the videos from the archive and executes the EDL to produce the air-ready material. Ideally, the broadcast editing suite retrieves from the broadcast server or archive only those segments of the high-resolution file that are specified in the EDL.
Producing a high-resolution video and one or more frame accurate proxy files is problematic because two or more MPEG encoders and a source playout device must be started frame accurately, and the encoders must be capable of extracting SMPTE timecode from the vertical blanking interval and storing the timecode in the MPEG Group of Pictures (GOP) header, although some broadcasters may allow the encoders to encode alternately the locally produced house SMPTE timecode. Moreover, the encoders must not drop or repeat any frames during the encoding process, and the encoders must stop on the same video frame.
Although there are commercially available MPEG encoders that are capable of producing such proxy files, these encoders are very expensive and are not economical for a broadcaster planning to operate many ingest stations. Moreover, these high-end encoders store the MPEG data in a vendor proprietary elementary stream format, which makes them non-interoperable with other MPEG decoders. Thus, video files sent to another broadcast facility must first be remultiplexed into an MPEG compliant format. Moreover, it is undesirable from a business perspective to use a nonstandard storage format. Furthermore, video quality and reliability are the normal criteria for selecting an encoder vendor. Clearly, a need exists to create proxy files using good quality, but less capable, MPEG encoders. An encoder that fails to store SMPTE time in the GOP header, for example, should not be eliminated from consideration, if it meets all other broadcaster requirements.
There is obviously a need for recording SMPTE timecodes. However, there are problems that occur when dealing with recording timecodes. There are two timecodes associated with every video: an absolute and a relative timecode. The absolute timecode is the SMPTE timecode recorded as the video is being shot. It usually reflects the actual time of day, but if the camera operator fails to properly set the SMPTE timecode generator on the camera, it may indicate any random clock time. Reporters and producers taking notes will record the SMPTE timecode while filming, to enable them to quickly find important footage during post-production. It is for this reason that many archive librarians insist on preserving the absolute timecode as essential metadata when compressing and cataloging video. However, the absolute timecode on a source video tape can be anomalous (e.g., missing, discontinuous, jumping backwards in time, non-incrementing, in non-drop frame mode, etc.).
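The anomalies listed above can be illustrated with a minimal sketch that scans a per-frame list of absolute timecodes and labels each irregular transition. The function names, the labels, and the assumption of simple non-drop 30 fps counting are illustrative only and are not part of the described invention.

```python
from typing import List, Optional, Tuple

def tc_to_frames(tc: str, fps: int = 30) -> int:
    """Convert HH:MM:SS:FF to a frame count (assumes non-drop counting)."""
    hh, mm, ss, ff = (int(part) for part in tc.split(":"))
    return ((hh * 60 + mm) * 60 + ss) * fps + ff

def find_anomalies(timecodes: List[Optional[str]], fps: int = 30) -> List[Tuple[int, str]]:
    """Return (frame index, label) for every irregular timecode in the list."""
    anomalies = []
    expected = None  # frame count we expect to see on the next frame
    for index, tc in enumerate(timecodes):
        if tc is None:
            anomalies.append((index, "missing"))
            if expected is not None:
                expected += 1  # the tape still advanced by one frame
            continue
        current = tc_to_frames(tc, fps)
        if expected is not None and current != expected:
            if current == expected - 1:
                anomalies.append((index, "non-incrementing"))
            elif current < expected - 1:
                anomalies.append((index, "backwards"))
            else:
                anomalies.append((index, "discontinuous"))
        expected = current + 1
    return anomalies
```

A tape whose timecode repeats, drops out, jumps back, and then jumps forward would produce one entry per irregular frame, while the frames in between pass silently.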
The relative timecode is a timecode that is relative to the start of the video, and is often referred to as elapsed time. Many producers prefer to use relative timecode instead of absolute timecode during editing sessions, because it can simplify the arithmetic associated with calculating video clip duration. More importantly, it is more dependable than the absolute timecode.
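As an illustration of why relative timecode simplifies duration arithmetic, the following sketch derives a relative SMPTE drop-frame timecode directly from a frame's position in the file. It assumes 29.97 fps NTSC drop-frame counting (two frame numbers skipped each minute, except every tenth minute); the function name is hypothetical.

```python
def frames_to_dropframe(frame_index: int) -> str:
    """Relative SMPTE drop-frame timecode (HH:MM:SS:FF) for a 0-based frame index.

    Assumes 29.97 fps NTSC material counted in drop-frame mode.
    """
    frames_per_10_minutes = 17982  # 10 * 60 * 30 minus 9 * 2 dropped numbers
    frames_per_minute = 1798       # 60 * 30 minus 2 dropped numbers
    tens, remainder = divmod(frame_index, frames_per_10_minutes)
    # Add back the frame numbers that drop-frame counting skips.
    number = frame_index + 18 * tens
    if remainder > 1:
        number += 2 * ((remainder - 2) // frames_per_minute)
    ff = number % 30
    ss = (number // 30) % 60
    mm = (number // 1800) % 60
    hh = (number // 108000) % 24
    return f"{hh:02d}:{mm:02d}:{ss:02d}:{ff:02d}"
```

Because the relative timecode is a pure function of the frame index, the duration of a clip is simply the difference of two frame indices, with no dependence on what was recorded on the source tape.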
The syntax and semantics of MPEG-2 are described in detail in the Moving Picture Experts Group (MPEG) standard entitled Coding of Moving Pictures and Associated Audio ITU Recommendation H.262, which is incorporated herein by reference. One of the shortcomings of the MPEG standard is that only one timecode is recorded, and this timecode is placed in the GOP header that typically occurs every 12-15 frames. Thus, if the absolute timecode abruptly changes between two GOP headers, the change in SMPTE time is not registered until the next GOP header, and therefore the MPEG file does not accurately reflect the absolute timecode of the source. This mismatch in SMPTE time would result in EDL errors, if absolute timecode were to be used when editing with the proxy file. Some vendor MPEG encoders are capable of recording the timecode of each frame in a user defined data field within the video data. However, there is no standard for formatting these data, and only the vendor's own decoder is capable of decoding the user data packets. Therefore, there is a present need for encoding both absolute and relative timecode into a proxy file on a frame basis, which will accurately reflect the timecodes of the associated high-resolution video file.
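For reference, the single timecode that the standard does provide can be read with a short sketch such as the one below. It follows the 25-bit time_code field of the H.262 group_of_pictures_header (drop_frame_flag, hours, minutes, a marker bit, seconds, pictures); the helper names and the NamedTuple are illustrative assumptions, and pack_gop_header exists only to build test input.

```python
from typing import NamedTuple, Optional

GOP_START_CODE = b"\x00\x00\x01\xb8"  # group_start_code in H.262

class GopTimecode(NamedTuple):
    drop_frame: bool
    hours: int
    minutes: int
    seconds: int
    pictures: int

def parse_gop_timecode(data: bytes, offset: int = 0) -> Optional[GopTimecode]:
    """Decode the time_code of the first GOP header found at or after offset."""
    pos = data.find(GOP_START_CODE, offset)
    if pos < 0 or pos + 8 > len(data):
        return None
    bits = int.from_bytes(data[pos + 4:pos + 8], "big")
    return GopTimecode(
        drop_frame=bool((bits >> 31) & 1),
        hours=(bits >> 26) & 0x1F,
        minutes=(bits >> 20) & 0x3F,
        # bit 19 is the marker bit
        seconds=(bits >> 13) & 0x3F,
        pictures=(bits >> 7) & 0x3F,
    )

def pack_gop_header(tc: GopTimecode) -> bytes:
    """Hypothetical helper: build a GOP header carrying tc (flags left at 0)."""
    bits = ((int(tc.drop_frame) << 31) | (tc.hours << 26) | (tc.minutes << 20)
            | (1 << 19) | (tc.seconds << 13) | (tc.pictures << 7))
    return GOP_START_CODE + bits.to_bytes(4, "big")
```

Since this value appears only once per GOP, the timecode of every other frame has to be inferred by counting frames from the header, which is exactly where a mid-GOP discontinuity on the source tape is lost.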
There is also a need for recording timecodes in non-proxy files. Many broadcasters have an established video archive of medium-to-low resolution MPEG files in various formats for which there are no matching high-resolution files. These standalone video files are used to browse or search a video collection which is maintained on analog/digital tape. In order to locate the original source, the MPEG browse file must contain the absolute timecode. It would be cost prohibitive for a broadcaster with hundreds or thousands of hours of tape to re-encode the archived video in order to insert proper timecode. Accordingly, there is a need to process existing MPEG assets and retrofit them with accurate timecode information.
Moreover, to satisfy industry requirements, an MPEG player must be configurable to display both absolute and relative SMPTE timecodes that are accurate to the video frame. Even though an MPEG browse file may contain an absolute timecode in the GOP header, it is the relative timecode that is needed for building EDLs. Conventional MPEG players access the presentation timestamp (PTS) in the packetized elementary stream (PES) headers to calculate elapsed time. However, this PTS is not the true SMPTE drop-frame time, expressed in SMPTE HH:MM:SS:FF format, where "FF" indicates a frame number. Thus, the PTS must be converted to SMPTE, which requires the player to be cognizant of the frame rate and the frame counting mode, which may not be correctly set in the MPEG stream. Additionally, the PTS value is not accurate, since it is a snapshot of the system clock reference (SCR), which is started a few hundred milliseconds prior to the first frame.
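The dependence of the PTS conversion on the frame rate can be seen in a small sketch. It assumes the standard 90 kHz, 33-bit PTS clock and subtracts the PTS of the first frame to remove the SCR start-up offset; the function name and parameters are illustrative, not taken from any particular player.

```python
from fractions import Fraction

PTS_CLOCK_HZ = 90_000   # MPEG presentation timestamps tick at 90 kHz
PTS_MODULUS = 1 << 33   # PTS is a 33-bit counter and wraps around

def pts_to_frame_index(pts: int, first_pts: int, frame_rate: Fraction) -> int:
    """Frame index of a picture, counted from the first frame of the file."""
    elapsed_ticks = (pts - first_pts) % PTS_MODULUS
    return round(Fraction(elapsed_ticks, PTS_CLOCK_HZ) * frame_rate)

NTSC = Fraction(30000, 1001)  # 29.97 fps, 3003 ticks per frame
PAL = Fraction(25, 1)         # 25 fps, 3600 ticks per frame
```

The same PTS difference maps to different frame indices under NTSC and PAL, so a player that assumes the wrong rate, or the wrong counting mode when the index is later formatted as HH:MM:SS:FF, will display a timecode that is off by some number of frames.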
Although there are vendor encoders that place timecode data in user data fields, these data are proprietary. Furthermore, an encoder with this timecode insertion feature may not offer optimum compression. Moreover, the conventional encoders fail to address the need for inserting SMPTE timecode into MPEG encoded files, created by any vendor MPEG encoder, for the purpose of obtaining a frame accurate timecode identification. Additionally, any technique used to encode timecode information must ensure that the timecode data can be extracted by the MPEG decoder when operating in trick mode, or when randomly accessing the video file. Also, no prior art system has provided a method of processing MPEG files to embed video frame timing data in a manner that does not alter the original presentation timing, while ensuring error-free decoding.
Therefore, a need exists for the post-encoding insertion of absolute and relative, frame accurate, timecodes into MPEG files in each frame, wherein the timecodes are the true SMPTE drop-frame timecodes expressed in HH:MM:SS:FF format. It is also desirable to encode both absolute and relative timecode into a proxy file on a frame basis, which will accurately reflect the timecodes of the associated high-resolution video file.
The foregoing and other objects, features, and advantages of the present invention will be apparent from the following detailed description of the preferred embodiments which makes reference to several drawing figures.
One preferred embodiment of the present invention is a method of processing a previously encoded MPEG video file for frame accurate timecode identification of each individual video frame. The method has the following steps:
(a) for each video frame of the MPEG video file, creating a compressed timecode packet having an identifying signature, an absolute timecode of the frame, a relative timecode of the frame, a picture type and a picture reference, wherein the timecodes have the SMPTE timecode format HH:MM:SS:FF; and
(b) modifying the MPEG video file by inserting in a header of each video frame of the MPEG video file the corresponding compressed timecode packet, while maintaining the MPEG video file's original frame presentation timing,
thereby preserving the MPEG compliance and compressed audio/video data of the video file.
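Steps (a) and (b) can be sketched on a single picture of a video elementary stream as follows. The byte layout of the packet, the signature value, and all function names are hypothetical and are not the format defined by the invention; only the start code values (picture 0x00, user data 0xB2, slices 0x01 through 0xAF) come from H.262. Every payload byte has its high bit set so that the payload cannot emulate a start code prefix, and the sketch ignores the PES and pack layers that a multiplexed file would also require updating.

```python
USER_DATA_START_CODE = b"\x00\x00\x01\xb2"
PICTURE_START_CODE = b"\x00\x00\x01\x00"
SIGNATURE = b"\xaa\xaa"  # hypothetical identifying signature

def _tc_bytes(tc: str) -> bytes:
    """Pack HH:MM:SS:FF into four bytes, each with the high bit set."""
    return bytes(0x80 | int(part) for part in tc.split(":"))

def build_timecode_packet(absolute_tc: str, relative_tc: str,
                          picture_type: int, picture_reference: int) -> bytes:
    """Step (a): one user data packet per frame (hypothetical 17-byte layout)."""
    if not 0 <= picture_reference < 1024:
        raise ValueError("picture reference must fit in 10 bits")
    reference = bytes((0x80 | (picture_reference >> 7),
                       0x80 | (picture_reference & 0x7F)))
    payload = (SIGNATURE + _tc_bytes(absolute_tc) + _tc_bytes(relative_tc)
               + bytes((0x80 | picture_type,)) + reference)
    return USER_DATA_START_CODE + payload

def find_first_slice(data: bytes, start: int) -> int:
    """Offset of the first slice start code at or after start, or -1."""
    pos = data.find(b"\x00\x00\x01", start)
    while pos >= 0 and pos + 3 < len(data):
        if 0x01 <= data[pos + 3] <= 0xAF:
            return pos
        pos = data.find(b"\x00\x00\x01", pos + 3)
    return -1

def insert_timecode_packet(picture: bytes, packet: bytes) -> bytes:
    """Step (b): place the packet after the picture header, before the slices."""
    header = picture.find(PICTURE_START_CODE)
    if header < 0:
        raise ValueError("no picture start code found")
    slice_pos = find_first_slice(picture, header + 4)
    if slice_pos < 0:
        raise ValueError("no slice start code found")
    return picture[:slice_pos] + packet + picture[slice_pos:]
```

Because the packet is added as user data and the compressed slices are copied through unchanged, removing the packet again yields the original picture bytes.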
The timecode packet is automatically inserted in a user data packet of the video frame, between a picture start header and a first slice header. The step of inserting the timecode packet preferably includes a step of periodically removing unused data bytes from the MPEG video file, equal in number to the inserted timecode packet bytes, for preserving the original size and multiplex bitrate of the MPEG video file. Alternatively, the step of inserting the timecode packet includes a step of increasing the original multiplex bitrate of the MPEG video file, to compensate for the additional timecode packet bytes inserted into the MPEG video file.
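The size of the bitrate adjustment in the alternative approach can be estimated with simple arithmetic, sketched below. The 20-byte packet size used in the example is an assumed figure, the function names are hypothetical, and rounding up to 400 bit/s steps reflects the 50 bytes/second unit in which an MPEG program stream expresses its multiplex rate.

```python
import math
from fractions import Fraction

MUX_RATE_UNIT_BPS = 400  # program stream mux rate unit: 50 bytes/second

def added_bits_per_second(packet_bytes: int, frame_rate: Fraction) -> Fraction:
    """Extra bits per second when one packet is added to every frame."""
    return packet_bytes * 8 * frame_rate

def compensated_mux_rate(original_bps: int, packet_bytes: int,
                         frame_rate: Fraction) -> int:
    """Smallest multiplex rate, in whole units, that absorbs the added bytes."""
    needed = original_bps + added_bits_per_second(packet_bytes, frame_rate)
    return math.ceil(needed / MUX_RATE_UNIT_BPS) * MUX_RATE_UNIT_BPS
```

For a 1.5 Mbps proxy at 29.97 fps, a 20-byte packet per frame adds roughly 4.8 Kbps, which suggests why the overhead is small relative to the proxy bitrate under these assumptions.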
Another preferred embodiment of the present invention is an apparatus implementing the above-mentioned method embodiment of the present invention.
Yet another preferred embodiment of the present invention is a program storage device readable by a computer tangibly embodying a program of instructions executable by the computer to perform method steps of the above-mentioned method embodiment of the present invention.