1. Field of the Invention
The invention relates to video signal processing. In particular, the invention relates to methods, computer programs and apparatuses for detecting the presence of additional graphics in a video signal.
2. Description of the Related Art
Today, it is common for television broadcasts to add information in the form of graphics to the video signal(s) being shot by the television camera(s). For example, television broadcasts of sports events are typically produced so that additional information in the form of graphics is provided to the viewer to help him/her to understand the game. This additional information is added on top of the video images, and the additional information may be e.g. a game clock, a logo of the broadcasting television station (typically added to the upper right corner), a player's name, etc.
In a broadcast environment, the additional graphics are typically formed by a graphics system and mixed by a vision mixer. A vision mixer (also called video switcher, video mixer or production switcher) is a device that mixes different video sources into output feeds. Typically a vision mixer can be found in a professional television production environment such as a television studio, a cable broadcast facility, a commercial production facility, a remote truck/outside broadcast van (OB van), or a linear video editing bay. For example, a vision engineer located in an outside broadcast van listens to instructions from the director of a sports broadcast and selects the camera to be shown. The director also instructs the vision engineer to turn certain graphics on or off in order to produce an aesthetic experience for the viewer and/or to make the program more informative to the viewer. For example, the game clock is typically hidden during a replay after a goal, and the name of a player is displayed together with the image of the player.
The vision mixer gets different inputs, such as camera inputs, recorder inputs, and feeds from a graphics system. These inputs to the vision mixer are typically independent of the director's instructions. For example, on one of the inputs, the clock is on all the time.
Typically, graphics to be added on top of the video images are partly transparent, and the graphics are arranged on top of the video images by means of two auxiliary signals: a graphics signal and a mask signal. The graphics signal includes the graphics to be added (such as a logo of a broadcasting station, the name of a player in a sports broadcast, or a game clock), and the mask signal (also known as a key signal) defines the transparency (also known as alpha) of the pixels of its associated graphics signal. A mask signal is typically a monochrome signal, in which completely black areas correspond to completely transparent pixels and completely white areas correspond to opaque pixels. Areas between completely black and completely white correspond to various degrees (e.g. in percentages) of transparency. Typically but not always, each graphics signal/mask signal pair corresponds to a single graphics element, and there may be several of these graphics signal/mask signal pairs per video signal. At any given time, one or more of these graphics signal/mask signal pairs may be mixed on or off the video signal (e.g. by the vision engineer using the vision mixer), so that the corresponding graphics will or will not be visible as required. In the art, the video signal without the added graphics is often called a clean feed, and the video signal with the added graphics is called a dirty feed.
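The role of the mask signal can be illustrated with a minimal sketch. It assumes 8-bit samples and a simple linear keying model, which is a common convention but is not stated in this description; the function names key_pixel and key_frame are hypothetical and do not correspond to any particular vision mixer.

```python
def key_pixel(clean, graphic, mask):
    """Mix one 8-bit graphics sample over one clean-feed sample.

    Assumed convention: mask 0 (black) means the graphic is fully
    transparent, mask 255 (white) means the graphic is opaque.
    """
    alpha = mask / 255.0
    return round(alpha * graphic + (1.0 - alpha) * clean)


def key_frame(clean, graphic, mask):
    """Apply key_pixel to every sample of equally sized 2-D frames."""
    return [
        [key_pixel(c, g, m) for c, g, m in zip(clean_row, graphic_row, mask_row)]
        for clean_row, graphic_row, mask_row in zip(clean, graphic, mask)
    ]


# One row of three samples: transparent, semi-transparent, opaque.
clean_feed = [[100, 100, 100]]
graphics_signal = [[200, 200, 200]]
mask_signal = [[0, 128, 255]]
dirty_feed = key_frame(clean_feed, graphics_signal, mask_signal)
```

Under this assumed model, a black mask sample leaves the clean feed untouched, a white mask sample replaces it with the graphic, and intermediate mask values blend the two.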
Typically, a prior art vision mixer outputs only the clean feed and the dirty feed, and at least the dirty feed is then forwarded in the broadcast signal transmission chain until it finally reaches the viewers. That is, prior art vision mixers are not configured to output any specific information about which combination of the input graphics signals is on (i.e. mixed into the clean feed to create the dirty feed) at any given time. The presence of transparency (i.e. the mask signals) means that one cannot just compare the clean feed and the dirty feed pixel-by-pixel to try to determine the added graphics based on the differences, since it is not trivial to determine the color and the transparency of a pixel in such a case.
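Why a pixel-by-pixel comparison is insufficient can be sketched as follows, again assuming a simple linear keying model that is not stated in this description. The function mix and all sample values are hypothetical. The point is that different combinations of graphic color and transparency can produce the same dirty-feed value from the same clean-feed value, so the difference between the feeds alone does not identify the graphic.

```python
def mix(clean, graphic, alpha):
    """Assumed linear keying: alpha 0.0 is transparent, 1.0 is opaque."""
    return alpha * graphic + (1.0 - alpha) * clean


clean = 100.0

# An opaque mid-grey graphic and a half-transparent brighter graphic
# give the same dirty-feed sample over the same clean-feed sample.
opaque_grey = mix(clean, graphic=150.0, alpha=1.0)
half_bright = mix(clean, graphic=200.0, alpha=0.5)

# A graphic whose color happens to match the background produces no
# difference at all, even though it is mixed on.
hidden = mix(clean, graphic=100.0, alpha=0.5)
```

With one observed difference per pixel and two unknowns (color and transparency), the clean and dirty feeds by themselves leave the added graphic underdetermined in this model.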
Yet, there are situations in which it would be useful to be able to detect which graphics are added to a video signal, such as a television broadcast signal, at any given time. For example, the present applicant's earlier patent application WO 2009/074710 describes a method for modifying the content of a television image by inserting substitutive content into specific areas of a television image. Information about which graphics are added to the television image at any given time facilitates such insertion of the substitutive content.
As described above, conventional vision mixers do not output this information, so in the prior art obtaining information about which graphics are added to the video signal at any given time has required e.g. modifying the vision mixers so that they can provide it. However, this is a major disadvantage, since the modification must be carried out by the owner of the vision mixer, and modifications to expensive existing systems are risky.
Therefore, an object of the present invention is to alleviate the problems described above and to introduce a solution that allows detecting which graphics are added to the video signal at any given time by utilizing only the various signals provided by e.g. a vision mixer and/or a graphics system, i.e. without requiring any modifications to existing conventional hardware.