The present invention is in the field of video broadcasting, and pertains more particularly to methods and apparatus for multiplexing separately-authored metadata for coordination with a main video data stream.
With continuing development of new and better ways of delivering television and other video presentations to end users, and parallel development of computerized information systems, such as the Internet and the associated World Wide Web (WWW), there have been concerted efforts to integrate various systems to provide enhanced information delivery and entertainment systems. For example, developers are introducing integrated systems combining TVs with computer subsystems, so a TV may be used as a WEB browser, or a PC may be used for enhanced TV viewing.
In some systems computer elements, such as a CPU, memory, and the like, are built into the familiar chassis of a TV set. In such a system, the TV screen becomes the display monitor in the computer mode. In such a system, conventional TV elements and circuitry are incorporated along with the computer elements, and capability is provided for a user to switch modes, or to view recorded or broadcast video with added computer interaction. One may thus, with a properly equipped system, select to view analog TV programs, digital TV programs, conventional cable TV, satellite TV, pay TV from various sources, and browse the WWW as well, displaying WEB pages and interacting with on-screen fields and relational systems for jumping to related information, databases, and other WEB pages. The capabilities are often integrated into a single display, that is, one may view a broadcast presentation and also have a window on the display for WEB interaction.
In some other systems, computer elements are provided in an enclosure separate from the TV, often referred to in the art as a set-top box. Set-top box systems have an advantage for providers in that they may be connected to conventional television sets, so end users don't have to buy a new TV along with the computer elements.
In such integrated systems, whether in a single enclosure or as set-top box systems, user input is typically through a hand-held device quite similar to a familiar remote controller, usually having infra-red communication with the set-top box or a receiver in the integrated TV. For computer modes, such as WEB browsing, a cursor is displayed on the TV screen, and cursor manipulation is provided by buttons or other familiar pointer apparatus on the remote. Select buttons are also provided in the remote to perform the familiar function of such buttons on a pointer device, like a mouse or trackball more familiar to computer users.
Set-top boxes and computer-integrated TVs adapted as described above typically have inputs for sources such as a TV antenna (analog), cable TV (analog or digital), and more recently direct-satellite TV (digital), and may also connect to video cassette recorders and to mass storage devices such as hard disk drives and CD-ROM drives to provide a capability for uploading video data from such devices and presenting the dynamic result as a display on the TV screen.
The present inventors have noted that with the coupling of computer technology with TV, many capabilities familiar to computer users have been made available to TV users. For example, the ability to provide text annotation for TV presentations is considerably enhanced. Computer techniques such as Pix-on-Pix are now available, wherein separate TV presentations may be made in separate windows, or overlaid windows on the display screen. Separate windows may also support display from separate sources, such as an analog TV program in one window, a computer game in another, and a video conference in a third.
With the technologies described above becoming more available in the marketplace, it has become desirable to further integrate the technologies described so that a user viewing a video presentation might be enabled to gather additional information about a specific image entity or entities portrayed in a video through an interactive method. An ultimate goal is to provide a means for advertisers to promote and sell products through user interaction in a way that minimizes the steps required by such a user to access additional information regarding products traditionally advertised through commercials and the like.
In typical prior art video authoring systems, end users receive a single video stream that contains the video data and any added annotated data such as subtitling, sponsor logos, information blocks, and the like. However, it is desirable to build upon the goal stated in the preceding paragraph by having separate streams, one containing video data and the other containing annotative data, that may arrive at an end user's location via different delivery media and be displayed synchronously on a suitable display screen.
An authoring system, known to the inventor, may provide image tracking coordinates along with various further annotation, and may deliver separate streams via separate carriers to an end user. Also known to the inventor is a system for providing a means of applying a signature and associative frame identification to the separate streams respectively before broadcast so that both streams may later be re-synchronized at the user's end. Such a system is likewise described under the cross-referencing section.
In current art commercial programming, various companies may purchase advertising blocks or time slots from a content provider. The content provider then edits-in such commercials to the appropriate slots before broadcasting. Typically, such commercial ads may be local to an area of broadcast and are limited in profiling to those general demographics associated with a range or geography of local viewers. For example, in a broadcast football game, commercials may be geared to appealing to a general profile of a sports fan. For a cable channel carrying exclusively women's programming, advertisements would be geared more toward women in general. The profiling or focusing of advertisement a company can do is thus quite limited.
A system known to the inventors and disclosed in this patent application under the sub-heading below titled “Personalized and Interactive Ad System/Network” provides in one embodiment an Internet-connected subscription server running an ad-engine in the form of a software application that has ability to select video ads according to user profile and to stream such ads to a user along with a main video data stream. In some cases the ads are interactive. In systems wherein the main video and such video ads are sent by a common carrier, such as an Internet connection, the ads are inserted in the main video stream in the form of video metadata.
It is desirable that more than one authoring station or system may be used when creating metadata for delivery to an end user, because there are a variety of functions that may be implemented through metadata. For example, it is desirable that separate authoring stations will be used in hyper-video authoring, such as in providing object tracking coordinates, creating hot spots (hyperlinks) in a video, providing interactive regions for tracked objects, inserting URLs, providing review markers by scene authoring, and so on. Scene authoring based on scene-change-detection-technology (SCDT) has several purposes, such as providing thumbnails as bookmarks for users to select and review particular portions of video presentations, and for markers for ad insertion or insertion of other information. In addition, separate ad information may be authored by yet additional authors and provided as metadata for delivery to an end user.
While combination of live video and live annotation streams is treated herein concerning hyper-video authoring and delivery, it is generally understood that in the live case, annotation streams may be timed to run in sync alongside or over a main video stream. This process is performed at the provider's end. However, the presence of possibly two or more separately-authored annotation data-sets, wherein the method of delivery is not necessarily in real time, requires a more comprehensive approach.
What is clearly needed is a method and apparatus for merging separately-authored sets of metadata such that the metadata is associated appropriately to a correct frame location in a main video. Such a method and apparatus would serve to add flexibility to the authoring process and to simplify delivery methods.
In a preferred embodiment of the present invention an authoring system for interactive video is provided, comprising a video feed providing a main video presentation stream; two or more authoring stations coupled to the video feed providing authoring functions creating metadata for enhancing the main video stream; and a multiplexer for coordinating authored metadata with the main video stream. The authoring stations may note a presentation time stamp (PTS) of video frames or any other time stamp and incorporate it in the authored metadata for matching the metadata with the main video presentation stream.
In various embodiments there is a multiplexer for combining authored metadata with the main video data stream, and the multiplexer places the metadata in relation to the main video data stream according to the PTS. The multiplexer in some cases receives multiple video streams as well as the authored metadata, and time clocks are monitored for separate stream sources and clocks are adjusted to compensate for real-time differences in sources. One or more of the stream sources may be from a stored source. In some embodiments PTS values are rewritten in one or more streams to compensate for perceived time differences. Also in some embodiments PTS-enhanced metadata is streamed over the Internet to an end user, and in others the PTS-enhanced metadata is inserted into video blanking intervals (VBI) of an analog stream according to the PTS. In still other embodiments the PTS-enhanced metadata is stored to be downloaded as needed by a user.
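The placement of metadata relative to the main video data stream according to the PTS may be illustrated by the following minimal sketch. It is not the multiplexer of the invention; it merely assumes that each input stream is already ordered by PTS and merges the inputs into one PTS-ordered output. The names Packet and multiplex, and the sample PTS values, are hypothetical and chosen only for illustration.

```python
import heapq
from dataclasses import dataclass
from typing import Any, Iterable, Iterator


@dataclass(frozen=True)
class Packet:
    """Hypothetical container for one unit of video or authored metadata."""
    pts: int        # presentation time stamp, in arbitrary clock ticks
    kind: str       # "video" or "metadata"
    payload: Any


def multiplex(video: Iterable[Packet],
              *metadata_streams: Iterable[Packet]) -> Iterator[Packet]:
    """Merge PTS-ordered inputs into a single PTS-ordered stream.

    Assumes every input is already sorted by PTS. heapq.merge is stable
    across its inputs, so on equal PTS a video packet is emitted before
    metadata packets, because the video stream is passed first.
    """
    return heapq.merge(video, *metadata_streams, key=lambda p: p.pts)


# Example inputs: one main video stream and two separately authored streams.
video = [Packet(0, "video", "frame0"),
         Packet(3000, "video", "frame1"),
         Packet(6000, "video", "frame2")]
hyper = [Packet(3000, "metadata", "hot spot")]
ads = [Packet(6000, "metadata", "ad marker")]

merged = list(multiplex(video, hyper, ads))
```

In this sketch each metadata packet lands immediately after the video frame bearing the same PTS, which is one simple reading of placing metadata in relation to the main stream according to the PTS.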
In various embodiments of the invention the authoring stations may include one or more of scene authoring, hyper-video authoring, and ad authoring stations. At the user end the user system is enhanced with software for displaying the main video data stream and the authored metadata according to the PTS.
In another aspect of the invention a method for coordinating authored video metadata with a main video data stream is provided, comprising steps of (a) ensuring the main video data stream has a presentation time stamp (PTS); (b) feeding the digital main video data stream to authoring stations; (c) authoring metadata at the authoring stations; and (d) marking the metadata with presentation time stamps (PTS) from the main video data stream.
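Steps (a) through (d) may be sketched as follows, purely by way of example. The sketch assumes a 90 kHz PTS clock, as used in MPEG systems, and a fixed frame rate of 30 frames per second from which a missing PTS is derived; neither assumption is taken from the present description. The names Frame, Metadata, ensure_pts, and author are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional

PTS_CLOCK_HZ = 90_000  # assumption: 90 kHz PTS clock, as in MPEG systems


@dataclass
class Frame:
    """Hypothetical video frame; pts is None when the source carries no PTS."""
    index: int
    pts: Optional[int] = None


@dataclass(frozen=True)
class Metadata:
    """Hypothetical authored annotation marked with the PTS of its frame."""
    pts: int
    station: str
    body: str


def ensure_pts(frames: List[Frame], fps: int = 30) -> List[Frame]:
    """Step (a): give every frame a PTS, derived from its index if absent."""
    ticks_per_frame = PTS_CLOCK_HZ // fps
    for frame in frames:
        if frame.pts is None:
            frame.pts = frame.index * ticks_per_frame
    return frames


def author(frame: Frame, station: str, body: str) -> Metadata:
    """Steps (c) and (d): create metadata and mark it with the frame's PTS."""
    if frame.pts is None:
        raise ValueError("frame has no PTS; run ensure_pts first")
    return Metadata(pts=frame.pts, station=station, body=body)


# Step (b) is represented here by handing the same frames to two stations.
frames = ensure_pts([Frame(0), Frame(1), Frame(2, pts=9000)])
scene_mark = author(frames[1], "scene authoring", "scene change")
ad_mark = author(frames[2], "ad authoring", "ad insertion point")
```

A frame that already carries a PTS keeps it, so metadata authored at different stations against the same frame receives the same time stamp.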
This method may further comprise a step for multiplexing authored metadata with the main video data stream, wherein the multiplexer places the metadata in relation to the main video data stream according to the PTS. There may also be multiple sources of video fed to the multiplexer as well as multiple metadata streams for a video, and a step as well for compensating for real-time differences between the multiple sources. In the compensating step, presentation time stamps (PTS) may be amended according to source time differences.
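The amendment of presentation time stamps according to source time differences may be sketched as below. The sketch assumes that each source reports a clock reading taken at the same instant as a reference clock, that the difference between the two is applied as a constant offset, and that PTS values wrap at 33 bits as in MPEG systems. The function name amend_pts and the tuple layout are hypothetical.

```python
from typing import Any, List, Tuple

PTS_MODULO = 1 << 33  # assumption: 33-bit PTS counter, as in MPEG systems


def amend_pts(packets: List[Tuple[int, Any]],
              source_clock: int,
              reference_clock: int) -> List[Tuple[int, Any]]:
    """Rewrite PTS values so a source's time base matches the reference.

    packets is a list of (pts, payload) pairs from one source. The offset
    is the difference between the reference clock and the source clock,
    both read at the same instant. Results wrap modulo PTS_MODULO.
    """
    offset = reference_clock - source_clock
    return [((pts + offset) % PTS_MODULO, payload) for pts, payload in packets]


# A source running 450 ticks behind the reference is shifted forward.
shifted = amend_pts([(0, "a"), (3000, "b")],
                    source_clock=1000, reference_clock=1450)

# A source running ahead is shifted back; a negative result wraps around.
wrapped = amend_pts([(100, "x")], source_clock=500, reference_clock=0)
```

A constant offset is the simplest possible compensation; a source whose clock drifts over time would need the offset to be re-measured periodically.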
In some cases the PTS-enhanced metadata is streamed over the Internet to an end user. In other cases the PTS-enhanced metadata is inserted into video blanking intervals (VBI) of an analog stream according to the PTS. In still other cases the PTS-enhanced metadata is stored to be downloaded as needed by a user. The authoring stations may include one or more of scene authoring, hyper-video authoring, and ad authoring stations, and analog streams may be accommodated in some embodiments by conversion to a digital format before authoring and multiplexing, and in others by integrating a PTS with the analog stream. Also, at the final user's end, there is software for rendering the main video data stream and authored metadata according to PTS.
In yet another aspect of the invention a digital video multiplexing system is provided comprising metadata inputs from video authoring stations; an input for a main digital video data stream; and an output to a video transport interface. The multiplexer notes presentation time stamps associated with authored metadata, and places the authored metadata relative to the main video data stream for transport to end users. The multiplexing system may have multiple video data stream inputs, and one or more of the inputs may be from a stored source. There may also be multiple video data stream inputs from multiple sources, and the multiplexer monitors real time clocks of the sources and uses the information to compensate one or both of the multiple streams. In case of real-time differences the multiplexer compensates incoming streams by buffering one or more of the streams. The multiplexer may also compensate incoming streams by amending the presentation time stamps of one or more of the streams.
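Compensation by buffering may be pictured with the following minimal sketch, in which a stream that arrives early is held for a fixed delay before release. It assumes the delay has already been determined from the monitored source clocks; how that measurement is made is not shown. The class name DelayBuffer and its methods are hypothetical.

```python
from collections import deque
from typing import Any, Deque, List, Tuple


class DelayBuffer:
    """Hold packets from an early stream for a fixed number of clock ticks."""

    def __init__(self, delay_ticks: int) -> None:
        self.delay_ticks = delay_ticks
        # Each entry is (release_time, packet), kept in arrival order.
        self._queue: Deque[Tuple[int, Any]] = deque()

    def push(self, arrival_time: int, packet: Any) -> None:
        """Store a packet together with the time at which it may be released."""
        self._queue.append((arrival_time + self.delay_ticks, packet))

    def pop_ready(self, now: int) -> List[Any]:
        """Return, in arrival order, all packets whose release time has come."""
        ready = []
        while self._queue and self._queue[0][0] <= now:
            ready.append(self._queue.popleft()[1])
        return ready


# Metadata arriving 200 ticks ahead of the video is held back by 200 ticks.
buffer = DelayBuffer(delay_ticks=200)
buffer.push(0, "m0")
buffer.push(100, "m1")
early = buffer.pop_ready(150)    # nothing is due yet
first = buffer.pop_ready(200)    # "m0" is released
second = buffer.pop_ready(400)   # "m1" is released
```

Buffering leaves the time stamps untouched and delays delivery instead, whereas amending the PTS changes the time stamps and leaves delivery timing alone; the two approaches may be chosen per stream.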
In embodiments of the invention as taught in enabling detail below, for the first time it is possible to annotate one or more main video streams, either analog or digital streams, and to enhance the streams with authored metadata in a manner that multiple inputs may be made and fully coordinated to be completely useful when finally delivered to the end user, and many interactive functions not previously known in the art are provided.