The present invention relates to virtual insertion systems for television video and, more particularly, to a “midlink” system which enables the virtual insertion system to be positioned downstream of the originating site in the chain of distribution of a television program.
The term virtual insertion system is used herein to describe systems which replace, in a video sequence of a scene (i.e., as obtained by a video camera or recorder), a target region or area in the video image with a matched replacement pattern adapted to be inserted into the target region, such as a representation of a model stored in memory. For example, the video sequence may be of a soccer match or a stock car race wherein commercial billboards are part of the scene. The virtual insertion process involves replacement of the “target,” i.e., a particular billboard advertising a first product, in the scene with a representation of a different billboard advertising a second product, so that a different commercial product is advertised in the scene. Using existing techniques, this can be done in such a way that the substituted portion fits substantially seamlessly into the basic image so as not to be noticeable to a viewer as a replacement or substitute.
Briefly considering existing virtual insertion systems, a representation of the target, i.e., the selected part of the scene intended for replacement or addition, is memorized, i.e., stored in memory. The position, size and perspective of the target are computed. The stored pattern is then transformed geometrically according to the estimated size and perspective of the corresponding target in the current scene image. The pattern representation is also modified in accordance with the radiometric properties of the target. Finally, the transformed pattern is inserted into the current scene image to replace the target. It will be understood that the transformed pattern need not be a sample image but can instead be a two-dimensional or three-dimensional graphic element (which may or may not be animated). Systems of this general type are disclosed, for example, in U.S. Pat. No. 5,264,933 (Rosser), U.S. Pat. No. 5,363,392 (Luquet et al), U.S. Pat. No. 5,436,672 (Medioni et al) and U.S. Pat. No. 5,543,856 (Rosser) as well as French Patent No. 94-05895, to which patents reference is made for a more complete description of the virtual insertion process and the subject matter of which patents is hereby incorporated by reference.
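By way of illustration only, the transform-adjust-insert sequence described above might be sketched as follows. This is a deliberately simplified sketch and is not taken from any of the cited patents: it assumes a grayscale frame held as a list of rows, an axis-aligned rectangular target (so perspective is ignored), nearest-neighbour resampling, and a linear gain/offset standing in for the adjustment of the pattern to the target's appearance. The function name `insert_pattern` and all of its parameters are hypothetical.

```python
def insert_pattern(frame, pattern, top, left, height, width,
                   gain=1.0, offset=0.0):
    """Return a copy of `frame` with `pattern` resized to height x width
    and pasted at (top, left) after a linear brightness adjustment.

    Assumptions (not from the source): 8-bit grayscale pixels stored as
    lists of rows, and a target that is an axis-aligned rectangle.
    """
    src_h, src_w = len(pattern), len(pattern[0])
    out = [row[:] for row in frame]  # leave the input frame untouched
    for r in range(height):
        for c in range(width):
            # Geometric step: nearest-neighbour resampling of the stored pattern.
            sr = r * src_h // height
            sc = c * src_w // width
            # Appearance step: linear gain/offset, clamped to the 8-bit range.
            value = max(0, min(255, round(gain * pattern[sr][sc] + offset)))
            rr, cc = top + r, left + c
            # Insertion step: overwrite only pixels that fall inside the frame.
            if 0 <= rr < len(out) and 0 <= cc < len(out[0]):
                out[rr][cc] = value
    return out
```

A practical system would replace the rectangle with a full perspective warp and would leave occluding foreground pixels unmodified; both are omitted here for brevity.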
There are two basic types of virtual insertion systems, instrumented camera systems and image recognition systems. The process used to obtain an estimation of the position, size and perspective of a target depends on whether the camera is instrumented or not. In an instrumented system, sensors are used to measure the camera operating parameters such as pan, tilt, focus and zoom, and the location, size and perspective of the target are determined from the sensor outputs. If the cameras are not instrumented and thus information from sensors is not available, an image recognition system is used to detect and track the relevant area or areas of the current scene images in order to obtain the required parameters and the area or areas are replaced in real time.
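As a rough illustration of how sensor outputs can yield a target location in an instrumented system, the following sketch projects a fixed target into the image from pan, tilt and zoom readings. It is an assumption-laden simplification and not a method disclosed in the source: pan and tilt are treated independently, the zoom reading is assumed to have already been converted to a focal length in pixels, the focus reading and lens distortion are ignored, and the target direction is assumed to be known from calibration. The name `project_target` and its parameters are hypothetical.

```python
import math


def project_target(pan_deg, tilt_deg, focal_px,
                   target_az_deg, target_el_deg,
                   center=(960.0, 540.0)):
    """Approximate image position (u, v) of a fixed target.

    Simplified model (an assumption): horizontal and vertical offsets are
    computed independently from the angular differences between the target
    direction and the camera's pan/tilt readings.
    Returns None when the target is outside the forward hemisphere.
    """
    d_az = math.radians(target_az_deg - pan_deg)
    d_el = math.radians(target_el_deg - tilt_deg)
    if abs(d_az) >= math.pi / 2 or abs(d_el) >= math.pi / 2:
        return None
    u = center[0] + focal_px * math.tan(d_az)
    v = center[1] - focal_px * math.tan(d_el)  # image rows grow downward
    return (u, v)
```

When the camera points straight at the target, the function returns the image center; increasing the focal length (zooming in) moves an off-axis target further from the center, which is the behaviour expected of a pinhole model.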
Referring to FIG. 1, the typical chain of distribution of a television program is indicated in a schematic manner. A plurality of cameras 10 are focussed on a scene S and transmit what is referred to as a “clean clean” feed, i.e., a feed without graphics or special effects, to a mobile control room or van (“truck”) 12 which is generally located at the venue, i.e., at the site of the event. Control room 12 selects the image that will be broadcast, using a multiplexer or switcher unit. The multiplexer unit also generates a coded signal, referred to as a “tally” signal or “tally,” to identify the specific camera being used to produce that particular image. For economic and aesthetic reasons, only certain broadcast cameras are instrumented with sensors and the tally closure of the cameras reflects which camera is active or on air at any given time. Signals can also be generated which reflect whether a given graphic layer or special effect is in use at any given time. In the terminology generally used, a “clean-clean feed” contains only the camera signals whereas a “clean feed” contains one graphic layer and/or special effect (e.g., a slow motion replay). Using standard video equipment, the control room can add graphic layers and/or special effects to produce the final image. A so-called “dirty feed,” i.e., a feed containing the camera image plus all of the graphic layers, special effects, etc., is then sent to the network studio 16 via a satellite indicated at 14. The principal role of the network studio is to broadcast the images, via a satellite 17, to daughter stations 18 and these stations, in turn, broadcast the images to the public, as indicated by individual television receivers 19.
In some present commercial systems, cameras are used in a switched mode wherein image processing is carried out “before” the multiplexer or switcher. For example, with these prior art systems, the director in the mobile control room has two signals from camera A from which to choose, signal A and signal A′, wherein signal A′ is a signal from camera A which has been previously processed at the venue and which is thus delayed with respect to signal A.
A number of different approaches to providing virtual insertion have been used, or are potentially usable, with respect to the location at which virtual insertion takes place. A first approach, which will be referred to as an uplink monocamera system and which is illustrated in FIG. 2, concerns a system or configuration wherein the virtual insertion system is located on-site, i.e., wherein the video images (and the sensor data, if applicable) are processed locally at the venue, i.e., are sent to the mobile control room or outside broadcaster van of the broadcaster located at the venue and processed there. This is the approach typically used in some commercial virtual insertion systems.
In FIG. 2, cameras 10a, 10b, 10c are connected to a multiplexer 31 and an image processing system 21 is located between the cameras and the multiplexer 31. It will be understood that FIG. 2 is intended to cover the generic case, i.e., both instrumented and uninstrumented cameras, and that for instrumented cameras, both an image signal and a sensor output signal would be provided for each camera. Further, although only a single image processing system is shown, typically there would be an imaging processor for each camera. A virtual insertion device or unit 22 of the type described above replaces the relevant part, i.e., the target region, of the video image with the desired advertising pattern or the like. Again, in the commercial implementation, a separate virtual insertion unit 22 is individually associated with each camera, regardless of whether the camera is on air or not, in order to produce a different feed for use in the rest of the chain. The virtual insertion units 22 obviously must be on-site and must also be attached to each camera, where more than one camera is to be used. As mentioned above, the director in the control room has the choice of two duplicate images, a “clean clean” image directly from the camera and a delayed image from the camera after processing by the image processing system 21 and the virtual insertion unit 22, and the multiplexer 31 can be used to switch between the two. The multiplexer 31 is located in a mobile control room or van 30 along with standard video equipment indicated at 32. The video equipment 32 is used to add graphic layers, special effects and non-camera generated effects such as camera replay to the output images from the multiplexer 31. The images are sent to the network studio 40 and, from there, are relayed to daughter station(s) 50.
Among the disadvantages of this approach are that one virtual insertion or replacement system is necessary for each camera and the virtual insertion operation must be performed on-site which requires that a relatively large number of technical people be on-site.
Referring to FIG. 3, wherein elements corresponding to those shown in FIG. 2 have been given the same reference numerals, what will be referred to as an uplink multicamera configuration is shown. In this configuration, which has been used commercially by the assignee of the present application since 1995, the virtual insertion device 22 is located in a van (e.g., an EPSIS™ truck), on-site, and accepts inputs from multiple cameras (e.g., cameras 10a, 10b and 10c) and processes the “clean feed” of the active camera, as identified by the tally signal from the mobile control room 30. In one embodiment, represented schematically in FIG. 3, a pattern recognition module of the image processing system 21 is used to determine the target area to be replaced while, in alternative embodiments, instrumented cameras are used, and data signals from camera sensors, i.e., pan, tilt, zoom and focus signals, are sent directly from the cameras 10a, 10b and 10c to the virtual insertion device 22. The modified video stream produced by virtual insertion device or system 22 is then sent back to the mobile control room or van 30 and the video equipment 32 inserts graphic layers and special effects or, alternatively, the virtual insertion device uses graphics layers and special effects from the control room 30 to generate a new “dirty feed” to the network station or studio 40.
Referring to FIG. 4, a system which will be referred to as an uplink/downlink system is shown. Again, corresponding units have been given the same reference numerals as in FIG. 2. In this system, the processing required for virtual insertion is split into two parts. If image processing is to be performed, it is carried out on-site as indicated by image processing unit 21. All of the other required steps are performed at the mobile control room or van 30 except for the actual insertion. All of the information necessary to perform the insertion step (e.g., target location, occluded pixels, etc.) is encoded at the mobile control room 30 and is transmitted to the network studio 40. The virtual insertion is performed at the studio 40 or downstream thereof as indicated by the location of insertion system 22. At the daughter station(s) 50, the insertion pattern can be different for each of the daughter stations, if desired. A system of this type is disclosed in French Patent No. 94-05895, referred to above. Methods for protecting the encoded information are described in one of the above-mentioned Rosser patents (U.S. Pat. No. 5,543,856) along with a “master”-“slave” system wherein the master system does the image recognition and detection and provides information pertaining to the precise location of the inserted image and the slave system carries out the insertion operation.
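Purely as an illustration of the uplink/downlink split, the per-frame insertion information could be packaged on-site and unpacked downstream along the following lines. The record layout, the field names and the use of JSON are all assumptions made for this sketch; the source does not specify an encoding, and the cited patents may use an entirely different one.

```python
import json
from dataclasses import dataclass, asdict


@dataclass
class InsertionInfo:
    """Hypothetical per-frame record sent from the venue to the studio."""
    frame_index: int
    target_corners: list   # four [x, y] pairs outlining the target
    occluded_pixels: list  # [x, y] pairs that must not be overwritten


def encode(info):
    """Uplink side: serialise the record to bytes for transmission."""
    return json.dumps(asdict(info), separators=(",", ":")).encode("utf-8")


def decode(payload):
    """Downlink side: rebuild the record before performing the insertion."""
    return InsertionInfo(**json.loads(payload.decode("utf-8")))
```

Because only this compact record and the video travel downstream, each daughter station could in principle apply its own replacement pattern to the same decoded target location.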
In accordance with the invention, a “midlink” system is provided wherein the required input and control data is collected at the venue, i.e., on-site, and transported to an off-site location at which virtual insertion is performed on the “dirty feed” broadcast from the venue.
In accordance with one aspect of the invention a television system is provided wherein a target region in successive video images is replaced by a matching pattern adapted to be inserted into the target region, the system comprising:
at least one television camera for producing a sequence of video images of a scene;
broadcast image processing means for receiving the video images and for selectively adding layers of graphics and special effects to the video images to produce a broadcast feed; and
virtual insertion means, located off-site from the broadcast image processing means, for receiving the broadcast feed and for modifying the broadcast feed by replacing a target portion of the video images with a replacement pattern adapted to be inserted into the target portion.
According to a further aspect of the invention, a television system is provided wherein a target region in successive video images is replaced by a matching representation pattern adapted to be inserted into the target region, the system comprising:
at least one television camera for producing a sequence of video images of a scene;
a mobile control room located on-site with said at least one camera and including broadcast image processing means for receiving said video images and for adding layers of graphics and special effects to said video images to produce corresponding video images and means for outputting the corresponding video images in digital form as a broadcast feed; and
virtual insertion means, located off-site from said at least one camera and said mobile control room, for receiving said broadcast feed and for modifying the video images thereof by replacing a target portion of said processed images with a replacement pattern adapted to be inserted into the target portion.
Preferably, the system includes a plurality of cameras which are adapted to be active and means for determining which one of the plurality of cameras is presently active and for producing a corresponding output, and the virtual insertion means replaces a target portion of a video image from the active camera based on said output.
Advantageously, router means are provided which are housed separately from said mobile control room and which, during calibration of the system, receive the broadcast feed and individual direct feeds from each of said cameras and selectively output one of said feeds. The router means is used to facilitate the selection of a target or targets during a calibration process for each camera prior to broadcast wherein, e.g., keyed levels are adjusted.
In the embodiment wherein the system includes a plurality of cameras, there are preferably provided means for generating camera closure signals for indicating which of said plurality of cameras is active, and means for monitoring said camera closure signals to determine if a camera closure signal has been received for the camera whose video image is currently being received by the virtual insertion means.
Preferably, the system further comprises monitoring means for monitoring the graphics and special effects added to produce the video images of the broadcast feed and for producing an output indicating that a video image received by said virtual insertion means should not be modified thereby based on the nature of the graphics and special effects that have been added to the received video image. In one preferred implementation, the monitoring means produces said output when any of the added special effects is incompatible with the replacement pattern. In a further preferred implementation, the monitoring means produces said output when any special effect has been added to the received image to be processed. Advantageously, the monitoring means produces said output when any layer of the added graphics is inconsistent with the replacement pattern.
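The decision made by the monitoring means can be sketched, under stated assumptions, as a simple predicate. In this sketch graphics layers and special effects are identified by arbitrary string names, and the sets of incompatible names are supplied by the operator; these representations, the function name `should_bypass_insertion` and the `any_effect_blocks` flag are hypothetical and serve only to show the two implementations described above side by side.

```python
def should_bypass_insertion(effects, layers,
                            incompatible_effects=frozenset(),
                            incompatible_layers=frozenset(),
                            any_effect_blocks=False):
    """Return True when the received image should pass through unmodified.

    effects / layers: names of the special effects and graphics layers
    that were added to the image currently being received.
    any_effect_blocks: when True, the presence of any special effect at
    all suppresses insertion (the stricter implementation).
    """
    effects = set(effects)
    layers = set(layers)
    if any_effect_blocks and effects:
        return True
    if effects & set(incompatible_effects):
        return True
    # A graphics layer blocks insertion only if it is flagged as incompatible.
    return bool(layers & set(incompatible_layers))
```

For example, a score overlay that does not cover the target would leave insertion enabled, whereas a slow motion replay listed as incompatible would cause the image to be passed through untouched.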
In a preferred embodiment, the at least one camera comprises a plurality of instrumented cameras each including sensor means associated therewith for producing operational data with respect to the corresponding camera, and the system further comprises means for sending said operational data to said virtual insertion means for use in replacing the target portion of a video image of the broadcast feed with a replacement pattern consistent with the operational data. Advantageously, the operational data includes camera pan, tilt, focus and zoom.
In accordance with yet another aspect of the invention, a television system is provided wherein a target region in successive video images is replaced by a matching representation pattern adapted to be inserted into said target region, the system comprising:
a plurality of television cameras for, when active, producing a sequence of video images of a scene;
sensor means for each of said cameras for sensing a plurality of operational parameters associated with the corresponding camera and for producing a respective data output;
a mobile control room located on-site with said cameras and including image processing means for receiving said video images and for adding layers of graphics and special effects to said video images to produce resultant video images, and means for outputting the resultant video images in digital form as a broadcast feed;
local control means located on-site with said mobile control room for receiving the data outputs of said sensor means and for outputting a data signal; and
virtual insertion means, located off-site from the cameras, control room and local control means, for receiving the broadcast feed and the data signal and for modifying the video images of the broadcast feed by replacing a target portion of said video images with a replacement pattern adapted to be inserted into the target portion.
Preferably, the local control means further comprises means for determining which one of said plurality of cameras is presently active and for producing a corresponding output and the virtual insertion means receives a control signal based on that output and responsive thereto, replaces the target portion of a video image of the broadcast feed from the camera indicated to be active.
Advantageously, the local control means further comprises router means for, during calibration of the system, receiving said broadcast feed and individual direct feeds from each of said plurality of cameras and for selectively outputting one of said feeds.
In a preferred implementation, the mobile control room includes means for generating camera closure signals for indicating which of said plurality of cameras is active and the local control means includes logic control means for monitoring said camera closure signals to determine if a camera closure signal has been received for the camera whose video image is currently being received by said virtual insertion means and for sending a corresponding control signal to said virtual insertion means. Advantageously, the logic control means produces an output indicating that the second in time of two cameras is active when closure signals for a first in time camera and the second in time camera are received at the same time.
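The closure-signal rule just described can be illustrated with a short sketch. It assumes, hypothetically, that each currently asserted closure signal is available as a pair of camera identifier and the time at which the closure began; neither this representation nor the function names come from the source.

```python
def active_camera(closures):
    """Return the camera treated as active, or None if no closure is asserted.

    closures: list of (camera_id, closure_start_time) pairs for the closure
    signals that are asserted right now. When two closures overlap, the
    second in time (the most recently started one) is taken as active.
    """
    if not closures:
        return None
    return max(closures, key=lambda closure: closure[1])[0]


def insertion_enabled(closures, received_camera):
    """True when a closure has been received for the camera whose image
    the virtual insertion means is currently receiving."""
    return active_camera(closures) == received_camera
```

Preferring the later closure reflects the situation during a cut from one camera to another, where the outgoing camera's closure may still be asserted for a short time after the incoming camera has gone on air.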
The local control means preferably further comprises logic control means for monitoring the graphics and special effects added to produce the video images of the broadcast feed and for producing an output indicating that a video image received by said virtual insertion means should not be modified thereby based on the nature of the graphics and special effects that have been added to the received processed image. As discussed above, in one embodiment, the logic control means produces said output when any of the added graphics or special effects is incompatible with the replacement pattern. Preferably, the logic control means produces said output when any special effect has been added to the received video image to be processed or when any layer of the added graphics is incompatible with the replacement pattern.
Further features and advantages of the present invention will be set forth in, or apparent from, the detailed description of preferred embodiments thereof which follows.