Television news broadcasts, and weather broadcasts in particular, commonly employ chroma-key or virtual set technology as part of the presentation. Such technology enables a presenter in a studio to appear to be in a more complex environment. Using this technology, scenes including images from a number of image sources, such as live video and computer graphics, can be created and combined together into a three-dimensional virtual presentation.
In many television programs which are broadcast from television studios, live video is combined with background images which were prepared in advance. The typical technology employed to create such combined images is called “chroma-key”. The background images used in this technology can be still photographs, videotaped material, computer generated graphics, or any other image or compilation of images.
In generating a presentation using chroma-key technology, a presenter (a newscaster, weather forecaster, etc.) stands in front of a colored or patterned screen in the studio. A television camera shoots both the presenter (live video) and the screen. The resulting picture is then transferred to a chroma-keyer for processing. At the same time, a background picture from a different source (such as another camera, pre-taped video, or computer graphics) is transferred to the chroma-keyer.
Both pictures, the live and the background picture, are combined in the chroma-keyer and broadcast as one picture which shows the live video on top of or in front of the background. For example, the final result can be a weather forecaster standing in front of a computer generated virtual weather map which cannot be seen at all in the physical studio.
The chroma-keyer differentiates between the live video image of the presenter and the screen on a pixel-by-pixel basis (a pixel being a picture element). Wherever a pixel from the live video image of the presenter is identified, it is transferred to the combined broadcast picture. Wherever a pixel from the screen is identified, the corresponding pixel from the background is substituted for it in the broadcast picture. In this way, a new picture is created and broadcast using the background which was chosen.
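The per-pixel substitution described above can be sketched as follows. This is a minimal illustration only, assuming a solid green screen, a hard threshold on RGB distance, and images stored as nested lists; the names `is_screen_pixel`, `chroma_key`, `key_color`, and `tolerance` are hypothetical and do not come from any particular chroma-keyer, which would typically work in a different color space and produce soft edges.

```python
from typing import List, Tuple

Pixel = Tuple[int, int, int]   # (R, G, B), each 0-255
Image = List[List[Pixel]]      # rows of pixels


def is_screen_pixel(pixel: Pixel,
                    key_color: Pixel = (0, 255, 0),
                    tolerance: int = 60) -> bool:
    """Return True when the pixel is close enough to the screen color.

    Assumption: 'close enough' means Euclidean RGB distance <= tolerance.
    """
    distance_sq = sum((c - k) ** 2 for c, k in zip(pixel, key_color))
    return distance_sq <= tolerance ** 2


def chroma_key(live: Image, background: Image,
               key_color: Pixel = (0, 255, 0),
               tolerance: int = 60) -> Image:
    """Combine live video and background into one picture, pixel by pixel."""
    if len(live) != len(background) or any(
            len(a) != len(b) for a, b in zip(live, background)):
        raise ValueError("live and background must have the same dimensions")
    return [
        # Screen pixel: take the background pixel; otherwise keep the presenter.
        [bg if is_screen_pixel(fg, key_color, tolerance) else fg
         for fg, bg in zip(live_row, bg_row)]
        for live_row, bg_row in zip(live, background)
    ]
```

Calling `chroma_key(live, background)` on two equally sized frames returns a new frame in which every pixel judged to belong to the screen has been replaced by the background pixel at the same position.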
Many broadcast or other video presentations involve the use of three-dimensional graphics or animations. In some situations, it is desirable to place live or pre-recorded two-dimensional video elements into a three-dimensional scene that has been rendered by a computer. Chroma-keying and virtual set technology may be used to generate such a video presentation. For example, a multi-layer video presentation may be created where a first layer consists of a computer-generated graphics background, a second layer includes live video, e.g., of a presenter, hiding parts of the first layer, and a third layer consists of additional computer graphics, hiding parts of both the first and second layers. The generation of such a three-dimensional multi-layer presentation is typically accomplished in real time.
In order to accomplish this effect in real time and fully automatically, the location of live video objects, e.g., the presenter, in the virtual space must be known. The three-dimensional location of a live video object or presenter in the three-dimensional virtual space of a multi-layer video presentation to be generated may be derived using three cameras positioned in a triangle and pointing toward the center of a chroma-key stage, to capture the contour of the live video object or presenter from three different directions. One of the three video cameras may be designated the main camera. The virtual environment, or three-dimensional set database, is created using a computer. For each frame in the video, the virtual image is calculated according to the position of the main camera. Each of the three cameras sees the presenter as a two-dimensional image. After filtering out the screen background using a chroma-keyer, the contour image of the presenter remains. This shape represents the physical volume from the camera's point of view to the stage surface.
By utilizing the inputs of all three cameras in the triangle, the approximate location of the presenter within the three-dimensional virtual image can be obtained using the cross-section (that is, the intersection) of the overlapping volumes. The cross-section of the overlapping volumes represents the object's volumetric image. By obtaining the presenter's three-dimensional volumetric shape, the depth location of the presenter on the stage can be obtained. The depth location allows a depth value (Z value) to be assigned to each pixel of the presenter's image. Once the depth location of the presenter in the set is known, it can be calculated which virtual objects will appear behind the presenter and which virtual objects will appear in front of him. Thus, a multi-layer video presentation may be generated.
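The idea of locating the presenter from the overlap of three camera volumes can be illustrated with a simplified top-down sketch. It assumes, purely for illustration, that each camera's contour has been reduced to a horizontal wedge on the stage floor (a bearing and a half-width) and that the stage is sampled on a square grid; the functions `silhouette_wedge`, `in_wedge`, and `locate_presenter` and their parameters are hypothetical and are not taken from the systems described here.

```python
import math


def silhouette_wedge(camera, center, radius):
    """Wedge (bearing, half-width) in which a camera sees a disk-shaped presenter.

    Assumption: used here only to synthesize test input from a known position.
    """
    dx, dy = center[0] - camera[0], center[1] - camera[1]
    distance = math.hypot(dx, dy)
    return math.atan2(dy, dx), math.asin(radius / distance)


def in_wedge(camera, wedge, point):
    """Return True when a floor point lies inside the camera's wedge."""
    bearing, half_width = wedge
    angle = math.atan2(point[1] - camera[1], point[0] - camera[0])
    # Wrap the angular difference into [-pi, pi) before comparing.
    diff = (angle - bearing + math.pi) % (2 * math.pi) - math.pi
    return abs(diff) <= half_width


def locate_presenter(cameras, wedges, extent=3.0, step=0.05):
    """Centroid of the floor cells that every camera's wedge covers.

    Returns None when the wedges do not overlap inside the sampled area.
    """
    n = int(round(2 * extent / step))
    cells = []
    for i in range(n + 1):
        for j in range(n + 1):
            point = (-extent + i * step, -extent + j * step)
            if all(in_wedge(c, w, point) for c, w in zip(cameras, wedges)):
                cells.append(point)
    if not cells:
        return None
    return (sum(p[0] for p in cells) / len(cells),
            sum(p[1] for p in cells) / len(cells))
```

A full system would intersect three-dimensional volumes rather than floor wedges, but the principle is the same: only the region covered by all three views can contain the presenter, and its position yields the depth value used for layering.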
Other techniques have also been developed for generating a three-dimensional multi-layer virtual presentation in a chroma-key/virtual set technology environment. However, such systems typically involve generating the multi-layer presentation in real time by a host computer using two variables which are provided to the process in real time: the position and pointing direction of a video camera, and the position, in real three-dimensional space, of the presenter. The live presenter is placed in the chroma-key set, and the position information for both the camera and the presenter is fed in real time to a host computer, which generates a virtual three-dimensional scene. The rendered three-dimensional scene and the live video are then keyed together to form the multi-layer presentation.
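How the two real-time inputs, camera pose and presenter position, can determine the layering may be sketched as below. The sketch assumes that depth is measured as distance along the camera's pointing direction and that layers are drawn back to front; the function names and the dictionary of virtual objects are hypothetical illustrations, not part of any system described here.

```python
import math


def depth_along_view(camera_pos, view_dir, point):
    """Distance of a point from the camera, measured along the view direction."""
    length = math.sqrt(sum(d * d for d in view_dir))
    if length == 0:
        raise ValueError("view_dir must be a non-zero vector")
    return sum((p - c) * d / length
               for p, c, d in zip(point, camera_pos, view_dir))


def layer_order(camera_pos, view_dir, presenter_pos, virtual_objects):
    """Return layer names from farthest to nearest (back-to-front draw order).

    virtual_objects maps a name to an (x, y, z) position; the presenter is
    inserted into the ordering under the name 'presenter'.
    """
    layers = [(depth_along_view(camera_pos, view_dir, pos), name)
              for name, pos in virtual_objects.items()]
    layers.append((depth_along_view(camera_pos, view_dir, presenter_pos),
                   "presenter"))
    layers.sort(key=lambda item: item[0], reverse=True)
    return [name for _, name in layers]
```

Objects listed before `"presenter"` in the result lie behind the presenter and are hidden by the live video; objects listed after it lie in front and hide parts of the live video.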
Such real-time methods for generating multi-layer three-dimensional virtual presentations, combining a computer-generated three-dimensional scene and a two-dimensional video image, have several significant limitations. Current methods of this kind utilize a great deal of expensive hardware and software to insert two-dimensional video elements into the three-dimensional scene in real time (30 fps). Elaborate systems, such as that described above, are used to determine the location of a person in front of a chroma-key wall. The video of the person is then rendered into the three-dimensional scene in real time using high-end computer systems.
Since the rendering of the virtual three-dimensional scene is done in real time, the quality of the rendered scene is limited. For example, in the production of a live weather segment for broadcast news, it may be desired to render complex three-dimensional weather scenes. Even with high-end computers, however, the number of polygons (i.e., the complexity) in the three-dimensional scene that can be rendered in real time is limited by the rendering power of the computer. For this reason, extremely detailed and complex three-dimensional objects cannot be created. Furthermore, the process described above for generating a multi-layer presentation combining a computer-generated virtual three-dimensional scene and a live video presenter is complex and therefore difficult to control in the short production times typical of broadcast news. Also, in such systems, there is usually a delay of several frames through the system, which makes it difficult for the live presenter to match his movements to objects in the computer-generated virtual scene.
The challenge is, therefore, how to incorporate a two-dimensional video element into an extremely complex three-dimensional scene without the use of very expensive and elaborate hardware set-ups.
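The real-time constraints mentioned above come down to simple arithmetic, sketched here. The functions are hypothetical helpers, and the rendering throughput passed to them is an assumed figure chosen for illustration, not a measurement of any real system.

```python
def frame_budget_ms(fps: float) -> float:
    """Time available to render one frame, in milliseconds."""
    return 1000.0 / fps


def polygons_per_frame(polygons_per_second: float, fps: float) -> int:
    """Upper bound on scene complexity for a given rendering throughput."""
    return int(polygons_per_second // fps)


def pipeline_delay_ms(delay_frames: int, fps: float) -> float:
    """Latency the presenter experiences for a delay of several frames."""
    return delay_frames * frame_budget_ms(fps)
```

At 30 fps each frame must be rendered in roughly 33 ms. With an assumed throughput of 3,000,000 polygons per second, the scene is capped at 100,000 polygons per frame, and a delay of three frames corresponds to about 100 ms between the presenter's movement and its appearance in the composite picture.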