The present invention relates generally to generating hypermedia documents, and more specifically to automatically creating hypermedia documents from conventional transcriptions of television programs.
Today, several broadcasters publish transcriptions of their television programs on their web sites. Some manually augment the transcripts to include still images or audio clips (e.g., www.pbs.org, www.cnn.com). However, the amount of manual labor required to generate these hypermedia documents limits the number of programs that can be converted to web content. Methods useful in generating pictorial transcripts are disclosed in a patent application entitled “Method for Providing a Compressed Rendition of a Video Program in a Format Suitable for Electronic Searching and Retrieval,” U.S. Pat. No. 6,098,082, filed Jul. 16, 1996, and in “Method and Apparatus for Compressing a Sequence of Information-Bearing Frames Having at Least Two Media Components,” U.S. Pat. No. 6,271,892 B1, the disclosures of which are incorporated herein by reference in their entirety.
A method for converting closed-captioned video programs into hypermedia documents automatically within minutes after the broadcast of the program is described in Shahraray, B., and Gibbon, D., “Automated Authoring of Hypermedia Documents of Video Programs,” Proc. Third Int. Conf. on Multimedia (ACM Multimedia '95), November 1995. However, the quality of the resulting pictorial transcript depends on the skill of the closed caption operator, and there are many errors of omission, particularly during periods of rapid dialog. Further, since the caption is typically transmitted in upper case, an automatic case restoration process must be performed. This process is complex since it requires dynamically updated databases of proper nouns, as well as higher-level processing to handle ambiguous cases. Conventional transcripts of television programs, however, are of higher quality since the time has been taken to assure that the dialog is accurately represented, and, of course, case restoration is unnecessary.
The present invention is an apparatus, method and computer program product for producing an enriched time-referenced text stream using a time-referenced text stream and an enriched text stream. The method includes the steps of receiving the time-referenced text stream and the enriched text stream; aligning the text of the enriched text stream with the text of the time-referenced text stream; and transferring time references from the time-referenced text stream to the enriched text stream based on the alignment to produce an enriched time-referenced text stream. In one embodiment, the time-referenced text stream is a closed-captioned text stream associated with a media stream, and the enriched text stream is a transcript associated with the media stream.
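The aligning and transferring steps can be illustrated with a minimal sketch. It assumes word-level time references and uses the generic sequence matcher from the Python standard library as a stand-in for the alignment; the disclosure does not prescribe this particular algorithm. The function names, the data layout, and the sample words are hypothetical.

```python
from difflib import SequenceMatcher


def normalize(word):
    # Captions are typically upper case and punctuation may differ,
    # so compare lower-cased alphanumeric characters only.
    return "".join(ch for ch in word.lower() if ch.isalnum())


def transfer_time_references(captions, transcript_words):
    """Align transcript words with caption words and copy the time references.

    captions: list of (time_in_seconds, word) pairs (hypothetical layout).
    transcript_words: list of words from the higher-quality transcript.
    Returns a list of (time_or_None, transcript_word) pairs.
    """
    caption_keys = [normalize(word) for _, word in captions]
    transcript_keys = [normalize(word) for word in transcript_words]
    matcher = SequenceMatcher(None, caption_keys, transcript_keys, autojunk=False)

    times = [None] * len(transcript_words)
    for block in matcher.get_matching_blocks():
        for offset in range(block.size):
            # Copy the caption time onto the matching transcript word.
            times[block.b + offset] = captions[block.a + offset][0]
    return list(zip(times, transcript_words))


# Illustrative data: the caption operator omitted the word "everyone,".
captions = [(1.0, "GOOD"), (1.4, "EVENING"), (2.1, "FROM"), (2.5, "WASHINGTON")]
transcript = ["Good", "evening", "everyone,", "from", "Washington."]
enriched = transfer_time_references(captions, transcript)
```

In this sketch, transcript words with no caption counterpart keep a time of None; a fuller implementation might interpolate a time from the neighboring matched words.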
The method further includes the steps of receiving a multimedia stream; extracting the closed-captioned text stream from the multimedia stream; receiving a portion of a media stream of the multimedia stream; and linking a portion of the enriched time-referenced text stream with the portion of the media stream based on the time references to produce a hypermedia document.
In one embodiment, the method includes the steps of receiving a user request to generate a hypermedia document; and generating a hypermedia document in response to the user request using a selected template. The selected template can be specified by the user.
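As a rough illustration of linking time-referenced text to a media stream and rendering it with a selected template, the following sketch assumes that each text segment carries a start time in seconds and that the media is addressed by a URL with a `#t=` time fragment. The template names, the HTML markup, and the segment layout are hypothetical and are not taken from the disclosure.

```python
import html
from string import Template

# Hypothetical templates; a user request would select one by name.
TEMPLATES = {
    "text_only": Template('<p><a href="$media#t=$time">$text</a></p>'),
    "text_with_stills": Template(
        '<p><img src="$image"> <a href="$media#t=$time">$text</a></p>'
    ),
}


def generate_document(segments, media_url, template_name="text_only"):
    """Render time-referenced text segments as a simple hypermedia page.

    segments: list of dicts with "time" (seconds), "text", and optionally "image".
    """
    template = TEMPLATES[template_name]
    parts = [
        template.substitute(
            image=segment.get("image", ""),
            media=media_url,
            time=segment["time"],
            text=html.escape(segment["text"]),  # keep transcript text safe in HTML
        )
        for segment in segments
    ]
    return "<html><body>\n" + "\n".join(parts) + "\n</body></html>"


# Illustrative request: one segment rendered with the default template.
page = generate_document(
    [{"time": 12.5, "text": "Good evening & welcome"}], "news.mp4"
)
```

Keeping the templates separate from the time-referenced text means the same enriched stream can be rendered in different layouts, depending on which template the user specifies.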
Further features and advantages of the present invention, as well as the structure and operation of various embodiments of the present invention, are described in detail below with reference to the accompanying drawings. In the drawings, like reference numbers indicate identical or functionally similar elements. Additionally, the left-most digit(s) of a reference number identifies the drawing in which the reference number first appears.