The present invention is in the field of video broadcasting and editing and pertains more particularly to methods and apparatus for receiving separate video and video enhancement data-streams from different sources and combining them to be displayed synchronously.
With continuing development of new and better ways of delivering television and other video presentations to end users, and parallel development of computerized information systems, such as the Internet and the associated World Wide Web (WWW), there have been concerted efforts to integrate various systems to provide enhanced information delivery and entertainment systems. For example, developers are introducing integrated systems combining TVs with computer subsystems, so a TV may be used as a WEB browser, or a PC may be used for enhanced TV viewing.
In some systems computer elements, such as a CPU, memory, and the like, are built into the familiar chassis of a TV set. In such a system, the TV screen becomes the display monitor in the computer mode. In such a system, conventional TV elements and circuitry are incorporated along with the computer elements, and capability is provided for a user to switch modes, or to view recorded or broadcast video with added computer interaction. One may thus, with a properly equipped system, select to view analog TV programs, digital TV programs, conventional cable TV, satellite TV, pay TV from various sources, and browse the WWW as well, displaying WEB pages and interacting with on-screen fields and relational systems for jumping to related information, databases, and other WEB pages. The capabilities are often integrated into a single display, that is, one may view a broadcast presentation and also have a window on the display for WEB interaction.
In some other systems, computer elements are provided in an enclosure separate from the TV, often referred to in the art as a set-top box. Set-top box systems have an advantage for providers in that they may be connected to conventional television sets, so end users don't have to buy a new TV along with the computer elements.
In such integrated systems, whether in a single enclosure or as set-top box systems, user input is typically through a hand-held device quite similar to a familiar remote controller, usually having infra-red communication with the set-top box or a receiver in the integrated TV. For computer modes, such as WEB browsing, a cursor is displayed on the TV screen, and cursor manipulation is provided by buttons or other familiar pointer apparatus on the remote. Select buttons are also provided in the remote to perform the familiar function of such buttons on a pointer device, like a mouse or trackball more familiar to computer users.
Set-top boxes and computer-integrated TVs adapted as described above typically have inputs for sources such as a TV antenna (analog), cable TV (analog or digital), and, more recently, direct-satellite TV (digital). They may also connect to video cassette recorders and to mass storage devices such as hard disk drives and CD-ROM drives, providing a capability for uploading video data from such devices and presenting the dynamic result as a display on the TV screen.
The present inventors have noted that with the coupling of computer technology with TV, many capabilities familiar to computer users have been made available to TV users. For example, the ability to provide text annotation for TV presentations is considerably enhanced. Computer techniques such as Pix-on-Pix are now available, wherein separate TV presentations may be made in separate windows, or overlaid windows on the display screen. Separate windows may also support display from separate sources, such as an analog TV program in one window, a computer game in another, and a video conference in a third.
With the technologies described above becoming more available in the marketplace, it has become desirable to further integrate them so that a user viewing a video presentation might be enabled to gather additional information about a specific image entity or entities portrayed in a video through an interactive method. An ultimate goal is to provide a means for advertisers to promote and sell products through user interaction in a way that minimizes the steps required of such a user to access additional information regarding products traditionally advertised through commercials and the like.
In typical prior art video authoring systems, end users receive a single video stream that contains the video data and any added annotated data such as subtitling, sponsor logos, information blocks, and the like. However, it is desirable to build upon the goal stated in the preceding paragraph by having separate streams, one containing video data and the other containing annotative data, that may arrive at an end user's location via different delivery media and be displayed synchronously on a suitable display screen.
Although the inventor knows of an authoring system that may deliver separate streams via separate media, as described above with respect to co-pending patent applications listed under the Cross-Reference to Related Documents section and provided herein as reference, a problem exists with respect to the unpredictable nature of latency conditions inherent to separate media networks that may be chosen to deliver such data streams.
A typical broadcast system may experience a variable latency rate in the broadcast of a video stream of up to several hundred milliseconds. This latency, defined as a variable delay period of signal transmission from the point of broadcast to the end point (end user), is experienced at the user's end. Quality of lines, connections, and other interferences may affect such latency conditions.
Internet delivery systems, which transmit data using packet-switched technology, also experience unpredictable latency similar to that described above, as well as competition from a host of other data transfer events because, generally speaking, bandwidth must be shared. While measures may be taken at the user's end to improve downloading capabilities, such as employing a better modem or using an integrated services digital network (ISDN) connection, unpredictable latency is still a problem.
Because the latency factor regarding such delivery or broadcast methods cannot be reliably predicted, the prospect of sending separate data or video streams over different networks and then re-synchronizing them to be displayed as one stream on a user's display system is a formidable challenge.
What is clearly needed is a method and apparatus that would allow a user receiving two separate data-streams via separate and unrelated delivery systems to re-synchronize and combine the streams into one stream, containing both the video data and the annotation data, for the purpose of displaying the combined and synchronous stream on a suitable display monitor for viewing. Such a method and apparatus would also allow product advertisers more options with regard to personalizing advertisements for target end users.
In a preferred embodiment of the present invention a system for marking a first data stream relative to a second data stream is provided, the two streams synchronized, comprising a pipeline having an input for each data stream; a reader noting selected data in the pipeline from the second data stream; and a writer writing the selected data to the first data stream in the pipeline.
In a preferred embodiment the selected data comprises numbers identifying frames in the second data stream, and there may also be timing markers placed by the writer in the first data stream. Typically the first data stream is a live video data stream and the second data stream is an annotation data stream authored in synchronization with the first data stream. The annotation data stream may include tracking data derived from tracking an entity in the first data stream.
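By way of illustration only, the marking pipeline described above may be sketched in Python as follows. The class names VideoFrame and AnnotationFrame, the field annotation_frame_id, and the function mark_stream are hypothetical and do not appear in the specification; the sketch assumes the two streams enter the pipeline already in lockstep, one annotation frame per video frame.

```python
from dataclasses import dataclass
from typing import Iterable, Iterator, Optional, Tuple


@dataclass
class VideoFrame:
    """One frame of the first (video) data stream."""
    pixels: bytes
    # Hypothetical slot standing in for wherever the mark is carried.
    annotation_frame_id: Optional[int] = None


@dataclass
class AnnotationFrame:
    """One frame of the second (annotation) data stream."""
    frame_id: int
    payload: str


def mark_stream(
    video: Iterable[VideoFrame],
    annotations: Iterable[AnnotationFrame],
) -> Iterator[Tuple[VideoFrame, AnnotationFrame]]:
    """Reader/writer stage: note each annotation frame number and write it
    into the video frame that is in the pipeline at the same moment."""
    for video_frame, annotation_frame in zip(video, annotations):
        noted = annotation_frame.frame_id          # reader
        video_frame.annotation_frame_id = noted    # writer
        yield video_frame, annotation_frame
```

After this stage, each video frame carries the number of the annotation frame it was authored with, so the two streams may leave the pipeline over different delivery media.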
In some cases the numbers identifying frames in the second data stream are written into vertical blanking intervals (VBI) in the first data stream. In other instances the numbers identifying frames in the second data stream are written into horizontal blanking intervals (HBI) in data for individual frames in the first data stream. In still other instances the numbers identifying frames in the second data stream are coded into the first data stream by altering pixel data for at least one pre-selected pixel in one or more frames, and wherein the numbers identifying frames are associated with timing marks in the first data stream.
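As a hedged illustration of the pixel-coding alternative, the following Python sketch writes a frame number into the least significant bit of a set of pre-selected pixels and reads it back. The least-significant-bit scheme, the choice of the first sixteen pixels, the 8-bit pixel values, and the names MARK_PIXELS, encode_frame_number, and decode_frame_number are all assumptions made for the example; the specification states only that pixel data for at least one pre-selected pixel is altered.

```python
from typing import List, Sequence

# Assumption: the first 16 pixels of a frame carry a 16-bit frame number.
MARK_PIXELS: Sequence[int] = tuple(range(16))


def encode_frame_number(
    pixels: List[int], number: int, positions: Sequence[int] = MARK_PIXELS
) -> List[int]:
    """Return a copy of `pixels` with `number` coded into the lowest bit
    of each pre-selected pixel (lowest-order bit of the number first)."""
    if not 0 <= number < (1 << len(positions)):
        raise ValueError("frame number does not fit in the selected pixels")
    marked = list(pixels)
    for bit_index, position in enumerate(positions):
        bit = (number >> bit_index) & 1
        marked[position] = (marked[position] & ~1) | bit
    return marked


def decode_frame_number(
    pixels: Sequence[int], positions: Sequence[int] = MARK_PIXELS
) -> int:
    """Recover the frame number from the lowest bit of each selected pixel."""
    number = 0
    for bit_index, position in enumerate(positions):
        number |= (pixels[position] & 1) << bit_index
    return number
```

In this sketch each marked pixel changes by at most one intensity level, which is one possible way of keeping the mark unobtrusive; it would not be expected to survive lossy compression without further measures.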
Timing marks in some cases are at intervals of a number of frames by convention. In others the timing marks are binary numbers inserted into the first data stream by the writer. The timing marks are, in some cases by convention, scene changes in the first data stream.
In another aspect of the invention a method for marking a first data stream relative to a second data stream, while the streams are in synchronization, for later synchronizing the two data streams when out of synchronization is taught, comprising steps of (a) entering the two data streams in a pipeline; (b) noting selected data in the pipeline from the second data stream; and (c) writing the selected data to the first data stream in the pipeline. In step (b), the selected data may comprise code identifying color pixels in the video data stream, or numbers identifying frames.
In some embodiments the first data stream is a live video data stream and the second data stream is an annotation data stream authored in synchronization with the first data stream. The annotation data stream may include tracking data derived from tracking an entity in the first data stream.
In another aspect of the invention a system for synchronizing a first data stream with a second data stream is provided, comprising a first controllable dynamic buffer reading the first data stream for inserted frame identifiers identifying frames from the second data stream to be displayed with frames from the first data stream to accomplish synchronization; a second controllable dynamic buffer reading frame identifiers in the second data stream; and a control module controlling the dynamic buffers, adjusting the relative position of the two dynamic streams to accomplish synchronization according to the data read from the two data streams.
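The arrangement of two buffers under a control module may be pictured with the simplified Python sketch below. The names DynamicBuffer, ControlModule, and next_pair are hypothetical. The sketch also assumes that frame identifiers increase monotonically, and that advancing the lagging buffer past frames with no counterpart is acceptable; it is an illustration under those assumptions, not the control logic of the specification.

```python
from collections import deque
from typing import Deque, Optional, Tuple


class DynamicBuffer:
    """Holds (frame_id, payload) entries for one stream, in arrival order."""

    def __init__(self) -> None:
        self._queue: Deque[Tuple[int, str]] = deque()

    def push(self, frame_id: int, payload: str) -> None:
        self._queue.append((frame_id, payload))

    def head_id(self) -> int:
        return self._queue[0][0]

    def pop(self) -> Tuple[int, str]:
        return self._queue.popleft()

    def __len__(self) -> int:
        return len(self._queue)


class ControlModule:
    """Compares identifiers at the head of each buffer and adjusts the
    relative position of the streams until the heads agree."""

    def __init__(self, first: DynamicBuffer, second: DynamicBuffer) -> None:
        self.first = first    # video; frame_id is the inserted identifier
        self.second = second  # annotation; frame_id is its own identifier

    def next_pair(self) -> Optional[Tuple[str, str]]:
        while len(self.first) and len(self.second):
            wanted = self.first.head_id()
            have = self.second.head_id()
            if wanted == have:
                return self.first.pop()[1], self.second.pop()[1]
            if have < wanted:
                self.second.pop()  # second stream is behind: advance it
            else:
                self.first.pop()   # first stream is behind: advance it
        return None  # one buffer is empty: wait for more data
```

The stream that is ahead is simply held in its buffer, which is the delay, while the other buffer is advanced to the matching frame.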
In a preferred embodiment the frame identifiers read from the first data stream identifying frames from the second data stream are binary numbers read from vertical blanking intervals (VBI) in the first data stream. In an alternative embodiment the frame identifiers read from the first data stream identifying frames from the second data stream are binary numbers read from horizontal blanking intervals (HBI) in the first data stream. In yet another embodiment the frame identifiers read from the first data stream identifying frames from the second data stream are binary numbers decoded from pixel data in one or more frames in the first data stream.
In some embodiments timing markers associated with the frame data are read from the first data stream, and the control module utilizes the timing markers in adjusting the relative positions of the data streams. Relative positioning of the data streams is accomplished by delaying one or the other of the data streams, such as by repeating frames in the stream to be delayed. Preferably adjustment is made gradually toward synchronization each time an adjustment is made.
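Delay by frame repetition, applied gradually, may be sketched as follows. The function name delay_gradually and the parameters lag and spacing are hypothetical; in particular, spreading the repeats so that at most one frame is repeated every `spacing` frames is an assumption chosen to illustrate gradual adjustment, not a rate given in the specification.

```python
from typing import Iterable, Iterator, TypeVar

Frame = TypeVar("Frame")


def delay_gradually(
    frames: Iterable[Frame], lag: int, spacing: int = 2
) -> Iterator[Frame]:
    """Yield `frames`, repeating one frame every `spacing` frames until
    `lag` extra frames have been inserted, so the stream falls back by
    `lag` frames in small steps rather than all at once."""
    if lag < 0 or spacing < 1:
        raise ValueError("lag must be >= 0 and spacing must be >= 1")
    remaining = lag
    for index, frame in enumerate(frames):
        yield frame
        if remaining > 0 and index % spacing == 0:
            yield frame  # repeated frame: delays the stream by one frame
            remaining -= 1
```

For example, delaying the frames A through F by two frames with a spacing of two repeats A and C, leaving the later frames untouched once the lag has been absorbed.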
In another aspect a method for synchronizing a first data stream with a second data stream is provided, comprising steps of (a) reading the first data stream in a first controllable dynamic buffer for inserted frame identifiers identifying frames from the second data stream to be displayed with frames from the first data stream to accomplish synchronization; (b) reading frame identifiers from the second data stream in a second controllable dynamic buffer; and (c) controlling the dynamic buffers by a control module, adjusting the relative position of the two dynamic streams to accomplish synchronization according to the data read from the two data streams.
In this method, in step (a), the frame identifiers read from the first data stream identifying frames from the second data stream may be binary numbers read from vertical blanking intervals (VBI) in the first data stream. In another embodiment the frame identifiers read from the first data stream identifying frames from the second data stream may be binary numbers read from horizontal blanking intervals (HBI) in the first data stream. In yet another embodiment the frame identifiers read from the first data stream identifying frames from the second data stream may be binary numbers decoded from pixel data in one or more frames in the first data stream. Further, timing markers associated with the frame data may be read from the first data stream, and the control module may utilize the timing markers in adjusting the relative positions of the data streams.
In the various aspects of the invention, taught in enabling detail below, for the first time apparatus and methods are provided allowing data streams to be marked while operated in synchronization, and to then be delivered by different networks having different latency effects, such that the streams are not synchronous as received, but may be re-synchronized using the marks provided while the streams were synchronous.