The present description relates generally to a device that determines added graphics layers within a video image signal. Such a device is useful, for example, in the field of image content replacement, in which an apparatus detects a target area in one or more regions of an image and may replace the target area with alternate content. In some examples, a dynamic image content replacement system is described that is suitable for use with live television broadcasts.
In the related art, one or more target areas within a video image signal are defined and then replaced with alternate images appropriate to specific viewer groups or geographical regions. For example, billboards at a ground or arena of a major sporting event are observed as part of a television broadcast, and these target areas are electronically substituted by alternate images that are more appropriate for a particular country or region. In particular, such a system is useful to create multiple television feeds each having different electronically generated advertisement content which is tailored according to an intended audience. For example, a number of feeds are produced each having differing content (e.g. a billboard in the original images is modified to carry advert 1 for region 1, while advert 2 is added for region 2, and so on). This situation is particularly relevant for an event of worldwide interest which is to be broadcast to a large number of countries or regions and where it is desired to dynamically modify the video images as appropriate to each specific audience.
A difficulty arises in that television feeds typically have multiple image layers which are mixed together. For example, camera images of a sports event are often overlaid with one or more graphics layers. These graphics layers may be used, for example, to provide additional information for the viewer, such as the broadcaster, current score, teams, athletes or various statistics. Different graphics layers may be applied at different times during a transmission or event. The graphics layers are often semi-transparent, allowing the original image also to be partly viewed. Further, various transformation functions may be applied during mixing (e.g. animations or fading of graphics layers, etc.). Thus the composite video signal after mixing is a complex combination of the original image signal with the graphics layers.
Typically, the graphics layers are provided from multiple sources and are added by a vision mixer device to form the composite video signal during a live transmission. However, gaining accurate information about those added graphics layers is difficult. For example, it has previously been necessary to separately monitor each graphics signal which is input into the vision mixer device, such as by running cables from inputs of the mixer to a monitoring station. Often, a large number of cables or connections are needed, considering that each graphics layer may need two signal inputs (often termed the ‘fill’ and ‘key’), and the required number of connections may exceed a capacity of the monitoring station (i.e. the monitoring hardware has only a finite number of inputs). Connecting the monitoring cables is sometimes intrusive and labour-intensive. The monitoring cables are subject to being misconnected, or may become incorrectly assigned following changes at the mixer device. The cables sometimes suffer damage during use (e.g. a cable break), which is highly disruptive during a live transmission. Further, the monitoring station typically needs to be precisely calibrated relative to each graphics layer which is to be used during transmission (e.g. precisely aligning the content of each graphics layer as inputs to the vision mixer compared with how those image components appear in the produced composite video signal).
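By way of illustration only, the mixing of a semi-transparent graphics layer into a camera image can be sketched as simple linear keying, in which the ‘key’ acts as a per-pixel opacity applied to the ‘fill’. This is a minimal sketch under stated assumptions and is not taken from any particular vision mixer device: the function names `mix_pixel` and `mix_frame` are hypothetical, pixel values are single-channel numbers, the key is normalised to the range 0 to 1, and the fill is assumed not to be pre-multiplied by the key. A real mixer may differ in each of these respects.

```python
def mix_pixel(background, fill, key):
    """Linear keying of one pixel value (assumed model, not a mixer's actual API).

    key = 0.0 leaves the background untouched, key = 1.0 shows only the fill,
    and intermediate values give a semi-transparent graphic.
    """
    if not 0.0 <= key <= 1.0:
        raise ValueError("key must lie in the range 0 to 1")
    return key * fill + (1.0 - key) * background


def mix_frame(background, fill, key):
    """Mix one graphics layer over a frame given as rows of pixel values."""
    return [
        [mix_pixel(b, f, k) for b, f, k in zip(b_row, f_row, k_row)]
        for b_row, f_row, k_row in zip(background, fill, key)
    ]


# Hypothetical 2x2 single-channel frames, for illustration only.
camera = [[100.0, 100.0], [100.0, 100.0]]
fill = [[200.0, 200.0], [200.0, 200.0]]
key = [[0.0, 0.5], [1.0, 0.25]]

composite = mix_frame(camera, fill, key)
print(composite)  # [[100.0, 150.0], [200.0, 125.0]]
```

Under this assumed model, each graphics layer contributes two inputs per pixel, which is consistent with the need for two signal connections per layer noted above. Several layers would be mixed by applying the same step repeatedly, each time using the previous result as the background.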
Considering the related art, there is still a difficulty in providing a reliable and effective device for determining one or more graphics layers which have been included within a composite video image. Also, it is desired to be able to dynamically modify the composite video image signals in a way which is accurate and photo-realistic for the viewer, which would be enhanced by determining the added graphics layers at each moment in time. Further, there is an ongoing desire to improve the flexibility for configuring the system, so that the system may be installed and commissioned more readily alongside other existing video processing equipment, which may well be owned or operated by different parties.
It is now desired to provide an apparatus and method which will address these, or other, limitations of the current art. As will be appreciated from the discussion herein, at least some of the example embodiments allow graphics layers within a composite video to be detected or derived indirectly, i.e. without receiving an explicit definition of the added graphics layers. Further, in some examples, many of the other difficulties of the previous approaches are also alleviated.