The present invention relates to a video tracking and overlay system controlled by video input.
Input devices for computing systems have not been investigated to the same degree as output devices. In many ways, the traditional keyboard from decades ago remains the primary means of entering user input into a computer. The advent of the mouse, joystick and touch-screen has augmented keyboard input, but the vast majority of input data is still entered by keyboard. All of these devices are disadvantageous because they define only a limited set of input data that can be entered into the computer. The input is tied to a predetermined syntactic context. For example, a modern computer keyboard may include 101 keys. These keys may be used only in a finite number of combinations, thus limiting the amount of data that can be entered into the computer. In the last few years, however, microphones and video cameras have begun to be shipped with new computers, enabling a fundamental change in how computers can perceive the world.
In modern computers, cameras are becoming ubiquitous thanks in large part to the proliferation of video conferencing and imaging applications. Most video processing applications involve the capture and transmission of data. Accordingly, most video technologies for the PC reside in codecs, conferencing, and television/media display. The amount of intelligent, semantic-based processing applied to the video stream typically is negligible. Further, very little has been done to integrate semantic-based processing with computer operation.
There exists a need in the art for a human-machine interface that shifts away from the literal, “touch”-based input devices that have characterized computers for so long. Humans view the world associatively through visual and acoustical experience. The integration of video cameras and microphones now enables computers to perceive their physical environments in the manner in which humans already do. Accordingly, computers that perceive their environment visually will start to bridge the perceptual gap between human beings and traditional computers.
Also, in modern computing systems, interactive displays are becoming increasingly common. “Interactive displays” refers generally to a class of devices in which a viewer of the display may control at least a portion of the information presented by the display. Display data may be organized into layers of data which are selectively activated. The interactive display, therefore, may receive layers of audiovisual data that may be displayed to a user on a selective basis. A base layer of data may include data that is continually present to a user unless obscured by metadata. Metadata refers generally to ancillary or supplementary data that a user may select for display. The form, format and content of the base layer data and metadata, of course, depend upon the applications for which the interactive display is used.
Interactive displays have broader application than the traditional PC-style computer. In fact, some believe that PC-style computers may converge upon traditional domestic television services, wherein television viewers may interact with programming content. Consider, by way of example, a sports broadcast. In conventional sports programming, when play focuses on a particular player, broadcast networks commonly superimpose printed statistics relating to the player's performance. An interactive display might permit a viewer to determine when (or if) to display a player's statistics through a selection and command process that resembles the “point and click” of traditional PC graphical selection techniques.
As is known, viewers favorably receive interactive display systems that are intuitive and easy to use. They will avoid any system that is cumbersome or requires excessive training before the system may be used for its intended purpose. Further, particularly in the field of television viewing, viewers will not tolerate interactive display controls that are tethered to a control console through a cable or the like. Accordingly, there is a need in the art for an interactive display that is easy to use and that does not require training or manipulation of complicated remote devices.
Embodiments of the present invention provide a video overlay method that identifies a token from input video data and resolves the token's position in the input video data to a position in output video data. The method determines whether the token's resolved position implicates metadata. If so, the method includes the metadata in the output.
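By way of illustration only, the method summarized above may be sketched in Python as follows. The sketch assumes that the token has already been located in the input frame, that the input position maps to the output position by simple linear scaling, and that metadata is associated with rectangular regions of the output. The names `MetadataRegion`, `resolve_position` and `select_metadata` are hypothetical and are not taken from the specification.

```python
from dataclasses import dataclass
from typing import Optional, Sequence, Tuple

Point = Tuple[int, int]
Size = Tuple[int, int]


@dataclass
class MetadataRegion:
    """Hypothetical: a rectangle in output coordinates tied to one metadata item."""
    x: int
    y: int
    width: int
    height: int
    label: str

    def contains(self, px: int, py: int) -> bool:
        return (self.x <= px < self.x + self.width
                and self.y <= py < self.y + self.height)


def resolve_position(token_xy: Point, input_size: Size, output_size: Size) -> Point:
    """Map a token position in the input frame to the output frame.

    Assumption: plain linear scaling between the two frame sizes.
    """
    (tx, ty), (iw, ih), (ow, oh) = token_xy, input_size, output_size
    return (tx * ow // iw, ty * oh // ih)


def select_metadata(token_xy: Optional[Point],
                    input_size: Size,
                    output_size: Size,
                    regions: Sequence[MetadataRegion]) -> Optional[str]:
    """Return the metadata implicated by the token, or None if there is none."""
    if token_xy is None:  # no token was identified in this frame
        return None
    px, py = resolve_position(token_xy, input_size, output_size)
    for region in regions:
        if region.contains(px, py):
            return region.label  # this metadata would be included in the output
    return None


# Example: a 160x120 camera frame controlling a 1280x720 display.
regions = [MetadataRegion(600, 300, 200, 100, "player statistics")]
selected = select_metadata((80, 60), (160, 120), (1280, 720), regions)
```

In this example the token at (80, 60) in the input frame resolves to (640, 360) in the output frame, which falls inside the single region, so its metadata is selected. A practical embodiment might use a different mapping, for example one that mirrors the camera image horizontally.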