The present invention relates generally to overlaying objects in a video and, more particularly, to autonomic positioning of overlays within streaming data.
In the analog world, where video pictures are transmitted on an analog carrier waveform, the moving picture is transmitted as a sequence of “fields”, each describing a static image to be painted on the screen by a receiver device. The image to be rendered by the receiver is therefore “flat” in the sense that it is a collection of pixels without any associated meaning. There is no concept of layering, nor of objects that might be manipulated by the receiver in some fashion. Any secondary object to be displayed as an overlay of the picture is generated at the transmitting end and incorporated into the transmitted picture as an integral part of it. The receiver device in this scenario cannot alter the overlaid object in any fashion whatsoever.
Analog video broadcasting developed the ability to carry an embedded stream within the broadcast video picture as a mechanism for transmitting textual representations of the spoken words or sounds contained in the broadcast, for the benefit of the deaf and hard of hearing. Receivers enabled to decode and display the contents of this embedded stream allow the viewer to toggle the display of the captioned information on or off. The positioning of the overlaid object is not under the control of the viewer or the receiver; rather, the positioning is encoded at the source and is part of the data stream.
Within the realm of analog video broadcasts, the concept of picture-in-picture (PiP) has also been developed. To achieve this functionality, two tuner mechanisms are needed to present the information to specially designed receivers. The output of one tuner is displayed by the receiver full screen, and the second tuner's output is displayed as an overlaid picture on top of the primary picture. In this case, the receiver usually enables the viewer to select the position of the PiP window from a set of pre-selected positions on the screen.
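By way of illustration only, the selection of a PiP window position from a set of pre-selected positions might be sketched as follows. The preset names, the quarter-size scale, and the 16-pixel margin are assumptions made for this sketch; they are not taken from any particular receiver.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rect:
    x: int
    y: int
    width: int
    height: int


# Hypothetical preset names; an actual receiver defines its own set.
PRESETS = ("top_left", "top_right", "bottom_left", "bottom_right")


def pip_rect(screen_w, screen_h, preset, scale=0.25, margin=16):
    """Return the rectangle of the PiP window for one pre-selected position.

    scale and margin are illustrative defaults, not values from the source.
    """
    if preset not in PRESETS:
        raise ValueError(f"unknown preset: {preset!r}")
    width = int(screen_w * scale)
    height = int(screen_h * scale)
    # Horizontal placement: against the left or the right edge.
    x = margin if preset.endswith("left") else screen_w - width - margin
    # Vertical placement: against the top or the bottom edge.
    y = margin if preset.startswith("top") else screen_h - height - margin
    return Rect(x, y, width, height)
```

The viewer's choice is limited to the names in `PRESETS`; any other value is rejected, which mirrors the restriction to pre-selected positions described above.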
In the digital world, video pictures are transmitted as a data stream within the framework of a CODEC, that is, a device or program capable of transforming a data stream or signal and interpreting instructions within the data stream to present a displayed object. Some CODECs support the embedding of one secondary data stream within a primary data stream. In this case, the embedded secondary data stream contains an object to be displayed within the frame of the primary display, and usually contains positioning information for the object as well.
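As a minimal sketch of the arrangement just described, a primary stream frame carrying one embedded secondary object with optional positioning information might be modeled as follows. All type and field names are hypothetical and do not correspond to any actual CODEC; the fallback to the top-left corner when no position is encoded is likewise an assumption.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class OverlayObject:
    """Object carried by the embedded secondary stream (hypothetical model)."""
    payload: bytes
    # (x, y) position encoded at the source; None if the stream omits it.
    position: Optional[Tuple[int, int]] = None


@dataclass
class PrimaryFrame:
    """One frame of the primary stream with at most one embedded overlay."""
    width: int
    height: int
    overlay: Optional[OverlayObject] = None


def overlay_position(frame: PrimaryFrame) -> Optional[Tuple[int, int]]:
    """Return where the overlay would be drawn, or None if there is no overlay."""
    if frame.overlay is None:
        return None
    if frame.overlay.position is None:
        # Assumed fallback for this sketch: the top-left corner.
        return (0, 0)
    return frame.overlay.position
```

In this model the receiver simply uses the position supplied by the source, which reflects the fact that the positioning information travels with the embedded stream.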