Systems for inserting electronic images into live video signals, such as described in U.S. Pat. Nos. 5,264,933 and 5,543,856 to Rosser, et al. and U.S. Pat. No. 5,491,517 to Kreitman et al., which are hereby incorporated by reference in their entirety, have been developed and used commercially for the purpose of inserting advertising and other indicia into video sequences, including live broadcasts of sporting events. These systems are called Live Video Insertion Systems (“LVIS”). With varying degrees of success, these systems seamlessly and realistically incorporate indicia into the original video in real time. Realism is maintained even as the original scene is zoomed, panned, or otherwise altered in size or perspective.
Live video insertion of indicia requires several steps. The event video must be recognized and tracked to obtain the camera perspective. The indicia must be adjusted to match this camera perspective, and potential occluding objects must be detected prior to the actual insertion. In the systems discussed in U.S. Pat. No. 5,264,933 by Rosser, et al. and U.S. Pat. No. 5,491,517 by Kreitman et al., it was assumed that the broadcaster would perform the complete process, including recognition, tracking, creating an occlusion mask, warping inserts to correctly match the current image, and correctly mixing the original video, the warped insert, and the occlusion mask.
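The final mixing step named above can be illustrated with a minimal sketch. It is not the method of the cited patents; it assumes, purely for illustration, grayscale pixel values in the range 0.0 to 1.0, an occlusion mask whose value is 1.0 where a foreground object (e.g., a player) covers the insert and 0.0 where the insert is fully visible, and hypothetical function names `mix_pixel` and `mix_frame`. It covers only pixels inside the footprint of the warped insert.

```python
def mix_pixel(video, insert, occlusion, opacity=1.0):
    """Blend one pixel of the warped insert into the original video.

    Assumed convention (not taken from the cited patents):
    occlusion = 1.0 means a foreground object hides the insert,
    occlusion = 0.0 means the insert is fully visible.
    """
    key = opacity * (1.0 - occlusion)  # how much of the insert shows
    return key * insert + (1.0 - key) * video


def mix_frame(video, warped_insert, occlusion_mask, opacity=1.0):
    """Apply mix_pixel to every pixel of equally sized 2-D frames (lists of rows)."""
    return [
        [mix_pixel(v, i, m, opacity) for v, i, m in zip(v_row, i_row, m_row)]
        for v_row, i_row, m_row in zip(video, warped_insert, occlusion_mask)
    ]
```

With this convention, a pixel covered by a player keeps the original video value, so the player appears to walk over the logo rather than behind it.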
FIG. 1 shows a simplified illustration of a conventional multi-camera system employing an LVIS. In this example, the multi-camera system is televising a tennis match. A plurality of cameras 12 is deployed around the tennis court 10 in order to provide different views of the match. The output of each camera 12 is supplied to a video production switcher 14 located on-site (e.g., in a production truck at the sporting venue), where a director creates a broadcast program from the feeds provided by the cameras 12. As the tennis match progresses, the director switches between camera feeds to create the broadcast program. For instance, when a player is serving, the director may select the feed from one of the cameras, and at a later point in the match, perhaps as the players are preparing for the next serve, the director may cut to the feed of a different camera to provide the viewer with a different view of the court 10.
Each camera 12 is provided with its own LVIS 13, an example of which is illustrated in FIG. 2. The various operations of these LVIS components are described in U.S. Pat. Nos. 5,264,933 and 5,543,856, as well as U.S. Pat. Nos. 5,892,554 and 5,953,076. Each LVIS 13 inserts indicia into the scene that is being filmed by its associated camera 12. Thus, as the director composes the broadcast of the tennis match, the feeds the director is working with already include the indicia inserted at their proper locations.
Next, the indicia-enhanced program is transmitted 30 to the end user 42 via, for example, satellite, cable, or fiber. On a television or any other display device 58, the viewer sees an advertisement logo AD1 on the tennis court 10. Although in reality the logo is not on the tennis court, the logo AD1 appears on the television screen seamlessly and realistically, as if it had actually been placed on the court, because it has been warped to account for the perspective of the specific camera view being shown, and because the players occlude the logo (instead of being covered by it) as they run over it during their match.
As explained above, the broadcast program is a sequence of different scene cuts. The program may show the view of a first camera, then cut to a second camera, return to the first camera, and cut to a third camera. The specific sequence of camera views is at the discretion of the director, who is provided at the production switcher 14 with feeds of all the cameras. Since the camera feeds already include the inserted indicia when they arrive at the production switcher 14, a scene cut from a current view to a new view will not delay the appearance of the logo in the new view. The appearance of the logo in the new scene will be seamless.
FIG. 1 is an example of a system in which the video processing for indicia insertion occurs upstream of the production switcher. Since one LVIS is employed for each camera, the camera identification is known and constant during the broadcast. Thus, in the composed program created by the director from the various enhanced video feeds, the inserted logos will appear immediately after every scene cut. The immediate appearance of an inserted logo at every scene cut preserves the intended verisimilitude of the logo appearing in the video image as if it were actually a part of the real-world surface being filmed. However advantageous such an approach may be for preventing logo insertion delays at every scene cut, it is sometimes more desirable to insert the indicia downstream of the production switcher 14, for reasons such as reducing system and operational footprint or cost, or for other reasons unrelated to logo insertion.
In contrast to the system of FIG. 1, FIG. 2 shows an arrangement where the LVIS 230 is downstream of the production switcher 220. In a downstream logo insertion system, the cameras 210a-c do not have their own respective LVISs. Instead, one LVIS 230 is employed downstream of the production switcher 220. In this downstream system, the camera feeds are supplied, without any logo insertion, to the production switcher 220, where the director composes a broadcast program consisting of a plurality of scene cuts switching among the various camera feeds. For instance, the scene cuts may transition from one camera to another in order to follow the action occurring on the field. Downstream of the production switcher 220 (either at the sporting event site or in the central production facility, to which the composed program has been sent before being broadcast to the end viewers), a single LVIS 230 handles the indicia insertions for every scene cut in the program. For instance, if the program cuts from a camera filming the sidelines to another filming the baseline in a tennis match, the LVIS 230 must insert any required logos into the current video frame according to the perspective of the camera that produced that frame. Before a logo can be inserted into a current video image frame by the LVIS 230, the camera 210a-c that filmed the current video image must be identified, as each camera 210a-c may have different data associated with it. Camera-specific data may include the camera's location, field of view, occlusion parameters, and insertion parameters (e.g., camera-specific artwork or insertion location). If the program is a “live” broadcast, the operator of the LVIS must ensure that the logo insertions are turned on as quickly as possible after a scene cut so as not to destroy the impression of seamlessness and realism necessary to create the illusion that the logos are actually part of the playing surface or stadium wall.
This task becomes very difficult for programs that include many rapid scene cuts, such as those of rugby games, which require the director to cut frequently in order to capture the unpredictable, chaotic action unfolding on the field.
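The dependence of insertion on camera identification can be sketched as a small data structure. The following is only an illustration: the class names `CameraData` and `DownstreamInserter`, the field names, and the string camera identifiers are all hypothetical and are not taken from the cited patents. It models a single downstream inserter that holds one record of camera-specific data per camera and cannot insert after a scene cut until the source camera has been identified.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, Optional, Tuple


@dataclass(frozen=True)
class CameraData:
    """Hypothetical record of the camera-specific data listed above."""
    camera_id: str
    location: Tuple[float, float, float]
    field_of_view_deg: float
    occlusion_params: Dict[str, float]
    artwork: str
    insertion_location: Tuple[float, float]


class DownstreamInserter:
    """Single inserter placed after the switcher (illustrative only)."""

    def __init__(self, cameras: Iterable[CameraData]) -> None:
        self._cameras = {c.camera_id: c for c in cameras}
        self.active: Optional[CameraData] = None

    def on_scene_cut(self) -> None:
        # After a cut the source camera is unknown, so insertion is off.
        self.active = None

    def identify(self, camera_id: str) -> CameraData:
        # Corresponds to the identification step; raises KeyError if unknown.
        self.active = self._cameras[camera_id]
        return self.active

    def can_insert(self) -> bool:
        return self.active is not None
```

Every field between `on_scene_cut` and the next `identify` call is a field broadcast without the logo, which is why the speed of identification matters.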
Thus, in this system using a single LVIS 230 for inserting indicia into downstream video images originating from a plurality of cameras 210a-c, as the program cuts from one scene to another, an operator at the typical downstream logo insertion system must first manually identify the camera associated with the newest scene cut, and must do so as quickly as possible. This is shown as user input 240 in FIG. 2. At a typical video frame rate of 30 frames (60 fields) per second, a viewer may detect an insertion delay as short as one field (1/60 second). It would be very difficult for an operator to manually switch the camera ID without causing a noticeable insertion delay (a few seconds). As a result, the logo is faded into the video to soften the sudden “pop” effect. There is typically a cushion of a few seconds during which the newly selected camera is held steady before it moves, which makes the late insertion less visually disturbing.
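The timing figures above can be made concrete with a short calculation. This is a sketch under stated assumptions: the 2-second operator delay and the 30-field fade length are illustrative values chosen here, not figures from the source, and the function names `fields_missed` and `fade_in_alpha` are hypothetical. The linear ramp is one simple way to fade a logo in; the source does not specify the fade profile.

```python
FIELDS_PER_SECOND = 60  # 30 frames per second, 2 fields per frame


def fields_missed(delay_seconds, fields_per_second=FIELDS_PER_SECOND):
    """Number of fields broadcast without the logo during an insertion delay."""
    return round(delay_seconds * fields_per_second)


def fade_in_alpha(field_index, fade_fields):
    """Logo opacity for a linear fade-in (assumed profile).

    field_index counts fields from the start of the fade, beginning at 0.
    """
    if fade_fields <= 0:
        return 1.0
    return min(1.0, max(0.0, (field_index + 1) / fade_fields))


# Illustrative values only: a 2-second manual delay spans 120 fields,
# far more than the single field a viewer may already notice.
operator_delay_fields = fields_missed(2.0)
```

Under these assumptions, a half-second fade (30 fields) raises the logo opacity by 1/30 per field, trading an abrupt appearance for a gradual one.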
Therefore, there is a need to reduce or eliminate such delays, which shatter the illusion of realism that the LVIS is intended to instill in the end viewer, particularly in video programs composed of a long sequence of rapid scene cuts.