Today, there exist multiple examples of AR inserts within the domain of broadcast television. For example, a staple of many current television broadcasts of football games in the U.S. is the display of a virtual line on the playing field that marks the yard line the offensive team must cross in order to achieve a first down. Another example of an AR insert during a sports broadcast is the placement of virtual advertisements into the stadium or arena where the game is being played. During the television broadcast of a baseball game, for instance, a virtual advertising billboard may be placed onto the backstop behind home plate. The content of these virtual advertisements will typically be changed each inning in order to support multiple sponsors during the game. Another common example of an AR insert, within the domain of news broadcasts, is the creation of a virtual studio. Virtual studios typically involve the display of walls, desks, screens, and other studio equipment around a newscaster in order to give the impression that a full studio set has been constructed.
It should be noted that the overlay of an AR insert onto either static or moving objects is supported by the present invention. For example, a logo may be placed onto the hood of a moving car during an automobile race. The display of such a moving AR insert requires a system and method to support dynamic motion throughout the scene. The present invention includes such a method.
Referring to the Glossary section above, real world space is defined as the three dimensional physical space of the scene. Locations (coordinates) are defined within real world space, such as coordinates relative to the location of the broadcast camera. The units of measurement within the real world space coordinate system are required to be real world units, such as millimeters, feet, etc. A view modeling method may be considered “real world space dependent” if the method depends on knowledge of any locations or measurements within real world space; i.e., in real world units in the x, y and z directions, such as those relative to the camera.
The problems with a real world space dependent view modeling approach are related to the fact that collecting and maintaining three dimensional real world space location and measurement information are often imposing or even impractical tasks. With respect to the area of information collection, the gathering of highly accurate real world location and measurement information often involves the use of specialized and expensive equipment, such as GPS systems, survey equipment, laser planes, or inertial navigation systems (e.g., see U.S. Pat. No. 4,084,184 to Crain and U.S. Pat. No. 6,266,100 to Gloudemans, et al.). The use of such equipment implies that special training must be given to technicians who will be setting up and calibrating this equipment on-site at the broadcast venue. This limits the usefulness of such AR systems when used within a broadcast environment where television personnel who have not received special training will be required to set up and operate the AR system. Furthermore, the gathering of location and measurement information using such equipment is often time consuming. This means that AR systems which depend on this equipment may be impractical within a television broadcast setup environment where production costs have been trimmed by limiting on-site setup time for the television crew.
Maintenance of location and measurement information is also a problem with a real world space dependent view modeling approach. Consider a situation where the camera is accidentally moved (e.g., bumped by the operator) during a broadcast. Since, as discussed above, knowledge of the location of the camera relative to objects within the scene in three dimensional real world space is a required element of a real world space dependent solution, it will be required at that point to entirely reassess and recalibrate the location of the camera and perhaps any other required objects within the scene. This is potentially a very time consuming process, and is likely to be impractical during an actual live event. In an analogous situation, the camera may be deliberately moved either just before or during a broadcast in order to obtain a better view of the event. This situation presents similarly dire consequences for the real world space dependent view modeling method.
An early method for view modeling that is based solely on camera sensor data is presented within U.S. Pat. No. 4,084,184 to Crain. Crain presents a method for transforming the location of an object within three dimensional real world space into a set of values which represent the location of the object within a TV raster signal generated by a camera. The following information is required to be known for the Crain method to function: (a) the precise three dimensional real world space location of the camera, (b) the precise three dimensional real world space location of the object, and (c) pan, tilt and zoom values for the camera. The means for obtaining (a) is stated to be an inertial navigation system, while the means for obtaining (b) is stated to be a set of surveying instruments. Pan, tilt and zoom information is obtained via sensors attached to a broadcast camera. Given requirements (a) and (b), the view modeling method disclosed in Crain is clearly real world space dependent, and thus exhibits the general real world space dependency problems that were outlined within the previous section above. The view modeling methodology within the present invention addresses all of these problems because the present methodology is real world space independent.
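The kind of transformation described above, from a three dimensional real world space location to a raster location, may be illustrated with a minimal sketch. The sketch below is not the method disclosed in Crain; it is an ordinary pinhole projection under stated assumptions: a right-handed world frame with the z axis pointing up, pan measured counterclockwise from the +x axis, tilt measured as elevation above the horizontal, no camera roll, no lens distortion, and zoom expressed directly as a focal length in pixels. The function names and parameters (camera_axes, world_to_raster, focal_px) are hypothetical and are introduced only for illustration.

```python
import math


def camera_axes(pan, tilt):
    """Return (forward, right, up) unit vectors for pan/tilt given in radians.

    Assumed convention: z is up, pan is measured counterclockwise from +x,
    tilt is elevation above the horizontal plane, and there is no roll.
    """
    cp, sp = math.cos(pan), math.sin(pan)
    ct, st = math.cos(tilt), math.sin(tilt)
    forward = (ct * cp, ct * sp, st)
    right = (sp, -cp, 0.0)
    up = (-st * cp, -st * sp, ct)
    return forward, right, up


def dot(a, b):
    """Dot product of two 3-vectors."""
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]


def world_to_raster(obj, cam, pan, tilt, focal_px, width, height):
    """Project a real world point onto the raster of an ideal pinhole camera.

    obj, cam : (x, y, z) locations in real world units (inputs (b) and (a))
    pan, tilt: camera angles in radians (part of input (c))
    focal_px : zoom, expressed as focal length in pixels (an assumption)
    Returns (column, row), or None if the point is behind the camera.
    """
    offset = tuple(o - c for o, c in zip(obj, cam))
    forward, right, up = camera_axes(pan, tilt)
    depth = dot(offset, forward)
    if depth <= 0.0:
        return None  # the object is not in front of the camera
    column = width / 2.0 + focal_px * dot(offset, right) / depth
    row = height / 2.0 - focal_px * dot(offset, up) / depth
    return (column, row)
```

Because the camera location enters the computation directly, any error in it propagates to the raster location. Under the assumptions above, with focal_px set to 1000 and an object 10 units in front of the camera, a sideways camera displacement of 0.1 units shifts the projected point by 10 pixels, which illustrates why an accidental bump of the camera forces recalibration in a real world space dependent method.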
Another method for view modeling is presented within U.S. Pat. No. 6,266,100 to Gloudemans, et al. This method relies on the use of pan, tilt and zoom data originating from sensors attached to a broadcast camera, in combination with a three dimensional model of the scene. The method is real world space dependent due to the fact that three dimensional locations of objects within the environment space are measured, computed and utilized within the method. The preferred embodiment described in Gloudemans, et al. determines the location of the camera by (a) determining the real world space locations of at least three “fiducials” (landmarks), using a laser plane or other suitable method, (b) pointing the optical center of the camera to these landmarks and (c) using geometric equations, based on recorded pan, tilt and zoom values, to calculate the (x,y,z) location of the camera. Thus, due to this real world space dependency, the method also exhibits the general real world space dependency problems that were outlined within the previous section above.
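The dependence of such a camera location step on surveyed landmark locations may be illustrated with a minimal sketch. The sketch below does not reproduce the geometric equations of Gloudemans, et al.; it substitutes a generic least-squares intersection of sight lines. It assumes that the pan and tilt readings recorded while the optical center points at each fiducial are already expressed in the world frame (z up, pan counterclockwise from +x, tilt as elevation), and it ignores zoom. The names direction, det3 and locate_camera are hypothetical.

```python
import math


def direction(pan, tilt):
    """Unit sight-line vector for pan/tilt in radians (z up, pan from +x)."""
    return (math.cos(tilt) * math.cos(pan),
            math.cos(tilt) * math.sin(pan),
            math.sin(tilt))


def det3(m):
    """Determinant of a 3x3 matrix given as a list of rows."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))


def locate_camera(fiducials, bearings):
    """Estimate the camera (x, y, z) from surveyed fiducials and sight lines.

    fiducials: list of (x, y, z) landmark locations in real world units
    bearings : list of (pan, tilt) readings taken while aimed at each one
    The camera lies on every line through a fiducial along its sight
    direction d, so it satisfies (I - d d^T)(C - P) = 0 for each pair.
    Summing these conditions gives a 3x3 linear system A C = b.
    """
    a = [[0.0] * 3 for _ in range(3)]
    b = [0.0] * 3
    for p, (pan, tilt) in zip(fiducials, bearings):
        d = direction(pan, tilt)
        for i in range(3):
            for j in range(3):
                m_ij = (1.0 if i == j else 0.0) - d[i] * d[j]
                a[i][j] += m_ij
                b[i] += m_ij * p[j]
    det = det3(a)
    if abs(det) < 1e-12:
        raise ValueError("sight lines are parallel; location is undetermined")
    # Cramer's rule: replace column k of A with b to solve for coordinate k.
    solution = []
    for k in range(3):
        replaced = [[b[i] if j == k else a[i][j] for j in range(3)]
                    for i in range(3)]
        solution.append(det3(replaced) / det)
    return tuple(solution)
```

Every input to locate_camera other than the sensor readings is a surveyed real world location, so the result is only as good as the survey, and the whole procedure must be repeated whenever the camera is moved.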
Other real world space dependent view modeling methods within the prior art include: U.S. Pat. No. 6,384,871 to Wilf, et al., U.S. Pat. No. 5,912,700 to Honey, et al., U.S. Pat. No. 6,154,250 to Honey, et al., U.S. Pat. No. 6,100,925 to Rosser, et al., U.S. Pat. No. 6,208,386 to Wilf, et al., and U.S. Pat. No. 6,201,579 to Tamir, et al., each of which is incorporated by reference herein.
The present invention addresses these real world space dependent view modeling issues by offering a real world space independent view modeling approach.
Many view modeling methods within the prior art are based on pattern recognition techniques. These pattern recognition based view modeling methods have many potential drawbacks. Distortion of the video signal, occlusion of landmarks (due to foreground activity within the scene), and changing environmental conditions (which may affect the appearance of landmarks) each may dramatically decrease view modeling accuracy. Delays due to significant processing overhead may also occur.
One example of a pattern recognition based view modeling method is presented within U.S. Pat. No. 5,808,695 to Rosser, et al. Pattern recognition techniques are used to track the motion of an object within a camera view. Template correlation is used to track fixed (background) landmarks within the camera view in order to provide positional information for objects within the camera view. The algorithms that comprise the method utilize only two dimensional camera view space; thus, the method appears to be real world space independent. However, since the method is based on pattern recognition, the problems listed above may occur.
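Template correlation of the general kind referred to above may be illustrated with a minimal sketch. The sketch below is not the algorithm of Rosser, et al.; it is a plain normalized cross-correlation search over a small grayscale image held as a list of rows, and the function names ncc_score and find_template are hypothetical. It operates purely in two dimensional image coordinates, and it also shows how an occluded landmark degrades the match.

```python
import math


def ncc_score(image, template, top, left):
    """Normalized cross-correlation of the template with one image window.

    Returns a value in [-1, 1]; 0.0 is returned for a flat window or a flat
    template, where the correlation is undefined.
    """
    rows, cols = len(template), len(template[0])
    window = [image[top + r][left + c] for r in range(rows) for c in range(cols)]
    pattern = [template[r][c] for r in range(rows) for c in range(cols)]
    mean_w = sum(window) / len(window)
    mean_p = sum(pattern) / len(pattern)
    num = sum((w - mean_w) * (p - mean_p) for w, p in zip(window, pattern))
    var_w = sum((w - mean_w) ** 2 for w in window)
    var_p = sum((p - mean_p) ** 2 for p in pattern)
    denom = math.sqrt(var_w * var_p)
    return num / denom if denom > 0.0 else 0.0


def find_template(image, template):
    """Exhaustively search the image; return (row, col, score) of best match."""
    rows, cols = len(template), len(template[0])
    best = (0, 0, -2.0)  # any real score exceeds -2.0
    for top in range(len(image) - rows + 1):
        for left in range(len(image[0]) - cols + 1):
            score = ncc_score(image, template, top, left)
            if score > best[2]:
                best = (top, left, score)
    return best


# A landmark (the 2x2 pattern) embedded in an otherwise empty background.
landmark = [[9, 1],
            [1, 9]]
scene = [[0, 0, 0, 0, 0, 0],
         [0, 0, 0, 0, 0, 0],
         [0, 0, 9, 1, 0, 0],
         [0, 0, 1, 9, 0, 0],
         [0, 0, 0, 0, 0, 0]]
# The same scene with one landmark pixel hidden by foreground activity.
occluded = [row[:] for row in scene]
occluded[3][3] = 0

clear_match = find_template(scene, landmark)
occluded_match = find_template(occluded, landmark)
```

In the unobstructed scene the search finds the landmark at row 2, column 2 with a score of essentially 1.0. In the occluded scene no window correlates strongly with the template, so a tracker relying on this score would either lose the landmark or lock onto the wrong location. The exhaustive search over every window also hints at the processing overhead noted above.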
Other pattern recognition based view modeling methods include those described in: U.S. Pat. No. 6,384,871 to Wilf, et al., U.S. Pat. No. 5,912,700 to Honey, et al., U.S. Pat. No. 6,154,250 to Honey, et al., U.S. Pat. No. 6,100,925 to Rosser, et al., U.S. Pat. No. 6,208,386 to Wilf, et al., U.S. Pat. No. 6,201,579 to Tamir, et al., U.S. Pat. No. 5,808,695 to Rosser, et al., U.S. Pat. No. 5,892,554 to DiCicco, et al., U.S. Pat. No. 5,627,915 to Rosser, et al., U.S. Pat. No. 5,903,317 to Sharir, et al., U.S. Pat. No. 5,264,933 to Rosser, et al., U.S. Pat. No. 5,436,672 to Medioni, et al., U.S. Pat. No. 5,515,485 to Luquet, et al., U.S. Pat. No. 6,181,345 to Richard, U.S. Pat. No. 6,304,298 to Steinberg, et al., U.S. Pat. No. 5,917,553 to Honey, et al., and U.S. Pat. No. 6,141,060 to Honey, et al., each of which is incorporated by reference herein.
The present invention addresses these pattern recognition issues by offering a view modeling solution which does not utilize any pattern recognition techniques.