1. Field of the Invention
A system and method facilitate the insertion of dynamic and static images and other indicia into live broadcast video images on a real-time basis so that they appear to be part of the original broadcast.
2. Description of Related Art
The present invention represents a significant improvement over various prior art approaches to the problem of inserting images into a live video broadcast. In particular, the prior art techniques are unable to rapidly detect and track landmarks and to insert a dynamic or static image into a live video broadcast in a realistic manner. Moreover, many prior art techniques are computationally intensive and require cumbersome and complicated computer systems to achieve their goals.
An early approach to video insertion is described in U.S. Pat. No. 4,539,585 entitled "PREVIEWER" and issued on Sep. 3, 1985 to Spackova, et al. According to that teaching, artificial landmarks, in the form of triangles, are placed on an individual. By lining up the artificial landmarks with corresponding points on an insertable image, it is possible to superimpose a variety of different inserts into the field of view. For example, it is possible, using the artificial triangle landmarks, to virtually place a variety of different clothing items onto a human model, presumably a prospective customer, so that he or she can preview the way he or she would look wearing that particular item of clothing. While the use of artificial landmarks may be acceptable in certain contexts, it does not work well where the background scene is a large sports arena or the like, because the landmarks must be large in order to be seen and, therefore, are cumbersome to install and may look strange in the context of a sporting event.
Another approach to the same problem is to place X and Y sensors on a camera. As the camera pans across a scene, the X and Y sensors track the position and movement of the camera. This technique has limited success in relatively small quarters, but if the field of view is a sports arena or the like, the inherent error, or "jitter," in the X and Y sensors produces a noticeable, and unacceptable, error in the placement of the inserted image. This "jitter" is particularly objectionable during occlusion processing. U.S. Pat. No. 4,084,184 issued to David W. Crain on Apr. 11, 1978 demonstrates an early approach for using data obtained by sensors placed on or about a camera to aid in tracking images within a scene. In Crain, sensor means such as gyro compasses, potentiometers, inertial navigation instruments, and inclinometers are used to generate information regarding camera tilt angles, aperture angles, and the like. The use of X and Y encoders in the context of a video insertion system has also been described, among other places, in Patent Abstracts of Japan, "Picture Synthesizer," Vol. 15, No. 8 (E-1042) 8 Mar. 1991 and JP-A-02 306 782 (Asutoro Design K. K.) 20 Dec. 1990. It is also believed that X and Y sensors have previously been used in Europe to assist in the placement of inserts into live video broadcasts.
More recently, efforts have been made to take advantage of pattern recognition techniques to identify landmarks that are naturally occurring within an insert target area. One of the earliest efforts to take advantage of improved pattern recognition techniques to identify natural landmarks on the edge of or around an insert target area is described in U.S. Pat. No. 5,264,933 entitled "TELEVISION DISPLAYS HAVING SELECTED INSERTED INDICIA" issued on Nov. 23, 1993 to Rosser, et al. U.S. Pat. No. 5,264,933 was based, in part, on British Patent Application Serial No. 9102995.5 filed on Feb. 13, 1991, which was based on an earlier British Provisional Patent Application filed Feb. 14, 1990, which was further related to British Patent Application Serial No. 9019770.8 filed on Sep. 10, 1990 by Roy J. Rosser. U.S. Pat. No. 5,264,933 discusses, in detail, a method for placing a logo or other indicia into, for example, a tennis court during a live broadcast. In U.S. Pat. No. 5,264,933, a target zone is pre-selected for receiving insertable images into the broadcast image. The target zone is spatially related to certain landmarks that represent distinguishable characteristics of the background scene being captured by the camera. The system always looks for landmarks in the target zone, but the patent also discloses that landmarks outside of the target zone can be employed. Landmarks identified by the processor during broadcast are compared against a reference set of landmarks identified in a reference image. When sufficient verification has occurred, the operator inserts an image into the pre-selected target zone of the broadcast image. For example, in a football game the target zone could be the space between the uprights of a goalpost. Or, in a baseball game, the target zone could be a portion of the wall behind home plate. A relatively exhaustive description of the prior art up to that date is set forth in U.S. Pat. No. 5,264,933 and the references cited therein.
Some of the more relevant patent references cited in the foregoing patent include U.S. Pat. Nos. 3,731,188; 4,442,454; 4,447,886; 4,523,230; 4,692,806 and 4,698,843.
Rosser, et al., U.S. Pat. No. 5,264,933 describes, among other things, how the boundaries of a tennis court can be identified and used as landmarks for the purpose of inserting a commercial logo into a live broadcast. The landmarks are identified by means of a "Burt Pyramid." The Burt Pyramid technique is discussed in a number of patents, such as U.S. Pat. Nos. 4,385,322; 4,674,125; 4,692,806; 4,703,514 and 5,063,603, as well as in publications such as "Fast Algorithms For Estimating Local Image Properties," by Peter J. Burt, Computer Vision, Graphics, and Image Processing, 21 pp. 368-382, 1983, and "Pyramid-Based Extraction of Local Image Features with Application to Motion and Texture Analysis" by Peter J. Burt, SPIE, Vol. 360, pp. 114-124. See also "Pyramidal Systems for Computer Vision," V. Cantoni and S. Levialdi, NATO ASI Series F, Vol. 25, Springer-Verlag, 1986; "Multiresolution Image Processing and Analysis," A. Rosenfeld, editor, Springer-Verlag 1984, and "Object Tracking With a Moving Camera: An Application of Dynamic Analysis" by P. J. Burt, et al., "Proceedings of the Workshop on Visual Motion," Irvine, Calif., Mar. 20-22, 1989. The Burt Pyramid technique described above and known in the prior art involves the reduction of an image into decimated, low-resolution versions which permit the rapid location and identification of prominent features, generally referred to as landmarks. The Burt Pyramid is one of several well-known prior art techniques that can be employed to identify landmark features in an image for the purpose of replacing a portion of the image with an insert in the context of a live video broadcast.
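By way of illustration only, the reduction of an image into decimated, low-resolution versions can be sketched as follows. The sketch assumes a 5-tap binomial smoothing kernel and clamping at the image borders, neither of which is specified above, and the names `reduce_level` and `build_pyramid` are hypothetical. It is not represented to be the implementation used in any of the cited references.

```python
# Illustrative Gaussian (Burt-style) pyramid on a 2-D grayscale image
# stored as a list of rows.
# Assumptions: 5-tap binomial kernel and border clamping; neither is
# taken from the cited references.

KERNEL = (1 / 16, 4 / 16, 6 / 16, 4 / 16, 1 / 16)


def _blur_rows(image):
    """Apply the 1-D kernel along each row, clamping at the borders."""
    out = []
    for row in image:
        n = len(row)
        out.append([
            sum(w * row[min(max(x + k - 2, 0), n - 1)]
                for k, w in enumerate(KERNEL))
            for x in range(n)
        ])
    return out


def _transpose(image):
    """Swap rows and columns so the row filter can be reused on columns."""
    return [list(col) for col in zip(*image)]


def reduce_level(image):
    """Low-pass filter the image, then keep every second row and column."""
    blurred = _transpose(_blur_rows(_transpose(_blur_rows(image))))
    return [row[::2] for row in blurred[::2]]


def build_pyramid(image, levels):
    """Return [full resolution, 1/2, 1/4, ...] containing `levels` images."""
    pyramid = [image]
    for _ in range(levels - 1):
        pyramid.append(reduce_level(pyramid[-1]))
    return pyramid
```

A coarse-to-fine search would typically look for a prominent feature in the smallest image first and then refine its position in each larger image, which is what makes the location of landmarks rapid.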
Luquet, et al., U.S. Pat. No. 5,353,392, discloses a system that is limited to modifying the same zone, referred to as a target zone, in successive images. Thus, by limiting the insertion operation to a pre-determined target area, Luquet '392 suffers from some of the same drawbacks as Rosser '933, namely, that the inserted image is tied to a fixed location, or target zone, within the overall image. The present invention, as discussed in the "Detailed Description of the Preferred Embodiment" later in this disclosure, is capable of inserting an image virtually anywhere within the overall broadcast scene independent of the identification of a specific insertion or target zone.
Thus the basic concept for many recent prior art inventions, such as set forth in U.S. Pat. Nos. 5,264,933 and 5,353,392 described above, is to replace a preselected region of the current image or an existing advertisement or target zone in the current image.
U.S. Pat. No. 5,107,252 entitled "VIDEO PROCESSING SYSTEM" and issued on Apr. 21, 1992, naming as inventors Michael J. Traynar and Ian McNiel and assigned to Quantel Limited, Newbury, United Kingdom, is similar to these prior art approaches in that the edges of the insertion area itself are specifically identified with a stylus and thereby fixed in the scene.
Another system that is primarily directed towards the identification of at least some landmarks within a designated insertion area is described in PCT Application PCT/US92/07498 entitled "VIDEO MERGING EMPLOYING PATTERN-KEY INSERTION" claiming a U.S. priority date of Sep. 18, 1991 and an international filing date of Sep. 10, 1992 and listing as inventors Keith James Hanna and Peter Jeffrey Burt.
Zoom correction and occlusion processing are discussed in PCT application PCT/US94/11527 assigned to ORAD, Inc. According to that system, sensors are placed on the periphery of the camera zoom lens. The sensors mechanically detect the rotation of the zoom lens and calculate a corresponding zoom factor. The zoom factor is then fed to a computer system to correct the size of the intended insert. Systems of this type suffer from mechanical drawbacks, such as jitter, which may introduce an error factor rendering the size of an insertable image unacceptably variable. The present invention overcomes such mechanical drawbacks by determining the changed positions of landmarks within the current image and automatically applying a corresponding zoom factor to the insertable image. The present invention relies on landmark positions within the current image and not on external factors subject to motion or jitter. Thus, any sudden, unwanted camera motion or lens movement will not affect the zoom adjustment calculations.
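One simple way in which a zoom factor might be derived from landmark positions alone is sketched below for illustration. The sketch assumes that the same landmarks have been located in a reference image and in the current image, and it takes the zoom factor to be the ratio of the landmark spread in the two images. That formula, and the names `estimate_zoom` and `scale_insert`, are assumptions made for this example and are not drawn from the disclosure.

```python
import math


def centroid(points):
    """Mean (x, y) position of a list of landmark coordinates."""
    n = len(points)
    return (sum(x for x, _ in points) / n, sum(y for _, y in points) / n)


def spread(points):
    """Root-mean-square distance of the landmarks from their centroid."""
    cx, cy = centroid(points)
    total = sum((x - cx) ** 2 + (y - cy) ** 2 for x, y in points)
    return math.sqrt(total / len(points))


def estimate_zoom(reference, current):
    """Ratio of landmark spread in the current image to the reference image.

    Measuring spread about the centroid makes the estimate insensitive
    to pure translation (pan or tilt) of the landmarks.
    """
    ref_spread = spread(reference)
    if ref_spread == 0:
        raise ValueError("reference landmarks must not all coincide")
    return spread(current) / ref_spread


def scale_insert(size, zoom):
    """Scale an insert's (width, height) in pixels by the zoom factor."""
    width, height = size
    return (round(width * zoom), round(height * zoom))


# Hypothetical landmark positions: the current image shows the same four
# landmarks magnified 1.5 times and shifted by (+30, -20) pixels.
reference = [(100, 100), (200, 100), (200, 200), (100, 200)]
current = [(180, 130), (330, 130), (330, 280), (180, 280)]
zoom = estimate_zoom(reference, current)
insert_size = scale_insert((80, 40), zoom)
```

Because the estimate depends only on where the landmarks appear in the image, no lens-mounted sensor reading enters the calculation.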
Other patents of possible relevance to the foregoing might include the following:
U.S. Pat. Nos. 4,528,589; 4,792,972; 4,817,175; 5,099,319; 5,142,576; 5,233,423; 5,309,174; 5,436,672; and PCT/GB90/00925.
Although '933 discloses insertion of video images in the insert location, the above prior art is generally directed towards the insertion of a static image, i.e., a non-moving image, into a live video broadcast. Therefore, being able to identify the boundaries of a particular insertion, or "target," area may be important. The situation becomes much more difficult if it is desired to place a static image someplace other than in the "target zone" or to insert a dynamic image, i.e., one that can move, into a live video scene. The insertable image may be dynamic either in the sense that the image moves across the scene or the image itself changes from frame to frame, or both. Imagine, for example, the difficulties of superimposing onto a live video broadcast a rabbit that beats a drum while simultaneously moving across the field of view.
Insofar as understood, none of the prior art described above, nor any known to the applicants, can efficiently and satisfactorily solve the problem of inserting static and/or dynamic images into a live video scene in as realistic a manner as the present invention.