The present invention pertains generally to chroma-key and virtual set technology and in particular to systems and methods for generating multi-layer virtual presentations incorporating three-dimensional graphics or animations with a live or pre-recorded two-dimensional video image of a presenter or other object positioned between foreground and background layers of the three-dimensional image.
Television news and, in particular, weather broadcasts, commonly employ chroma-key or virtual set technology as part of the news and/or weather presentation. Such technology enables a presenter in a studio to appear as if he is in a more complex environment. Using this technology, scenes including images from a number of image sources, such as live video and computer graphics, can be created and combined together into a three-dimensional virtual presentation.
In many television programs which are broadcast from television studios, live video is combined with background images which were prepared in advance. The typical technology employed to create such combined images is called “chroma-key”. The background images used in this technology can be still photographs, videotaped material, computer generated graphics, or any other image or compilation of images.
In generating a presentation using chroma-key technology, a presenter (the news caster, weather forecaster, etc.) stands in front of a colored or patterned screen in the studio. A television camera shoots both the presenter (live video) and the screen. The resulting picture is then transferred to a chroma-keyer for processing. At the same time, a background picture from a different source (such as another camera, pre-taped video, or computer graphics) is transferred to the chroma-keyer.
Both pictures, the live and the background picture, are combined in the chroma-keyer and broadcast as one picture which shows the live video on top of or in front of the background. For example, the final result can be a weather forecaster standing in front of a computer generated virtual weather map which cannot be seen at all in the physical studio.
The chroma-keyer differentiates between the live video image of the presenter and the screen according to the pixels (picture elements). Wherever a pixel from the live video image of the presenter is identified, it is transferred to the combined broadcast picture. Wherever a pixel from the screen is identified, the appropriate pixel from the background is placed in its place in the broadcast picture. In this way, a new picture is created and broadcast using the background which was chosen.
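The per-pixel selection described above can be illustrated with a short Python sketch. It is a minimal illustration only: the function name chroma_key, the RGB distance test, and the tolerance parameter are assumptions introduced here, and a practical chroma-keyer would use more refined color discrimination and soft key edges.

```python
def chroma_key(live, background, key_color, tolerance=60):
    """Combine two equally sized images, given as rows of (R, G, B) tuples.

    A live pixel close to key_color is treated as screen and replaced by
    the background pixel at the same position; any other live pixel is
    passed through to the combined picture.
    """
    def is_screen(pixel):
        # Hypothetical test: squared RGB distance to the key color.
        distance_sq = sum((a - b) ** 2 for a, b in zip(pixel, key_color))
        return distance_sq <= tolerance ** 2

    return [
        [bg if is_screen(lv) else lv for lv, bg in zip(live_row, bg_row)]
        for live_row, bg_row in zip(live, background)
    ]


# One row, two pixels: pure green screen, then a skin-tone presenter pixel.
live = [[(0, 255, 0), (200, 150, 120)]]
background = [[(10, 10, 10), (20, 20, 20)]]
combined = chroma_key(live, background, key_color=(0, 255, 0))
```

In this example the first pixel of the combined picture comes from the background and the second from the live video.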
Many broadcast or other video presentations involve the use of three-dimensional graphics or animations. In some situations, it is desirable to place live or pre-recorded two-dimensional video elements into a three-dimensional scene that has been rendered by a computer. Chroma-keying and virtual set technology may be used to generate such a video presentation. For example, a multi-layer video presentation may be created where a first layer consists of a computer generated graphics background, a second layer includes live video, e.g., of a presenter, hiding parts of the first layer, and a third layer consists of additional computer graphics, hiding parts of both the first and second layers. The generation of such a three-dimensional multi-layer presentation is typically accomplished in real time. In order to accomplish this effect in real time and fully automatically, the location of live video objects, e.g., the presenter, in the virtual space must be known.
The three-dimensional location of a live video object or presenter in the three-dimensional virtual space of a multi-layer video presentation to be generated may be derived using three cameras positioned in a triangle pointing to the center of a chroma-key stage, to capture the contour of the live video object or presenter from three different directions. One of the three video cameras may be designated the main camera. The virtual environment, or three-dimensional set database, is created using a computer. For each frame in the video, the virtual image is calculated according to the main camera position image. Each of the three cameras sees the presenter as a two-dimensional image. After filtering out the screen background using a chroma-keyer, the contour image of the presenter remains. This shape represents the physical volume from the camera's point of view to the stage surface. By utilizing the inputs of all three cameras in the triangle, the approximate location of the presenter within the three-dimensional virtual image can be obtained using the cross-section of the overlapping volumes. The cross-section of the overlapping volumes represents the object volumetric image. By obtaining the presenter's three-dimensional volumetric shape, the depth location of the presenter on the stage can be obtained. The depth location allows a depth value (Z value) to be assigned to each pixel of the presenter's image. Once the depth location of the presenter in the set is known, it can be calculated which virtual objects will appear behind the presenter and which virtual objects will appear in front of him. Thus, a multi-layer video presentation may be generated.
Other techniques have also been developed for generating a three-dimensional multi-layer virtual presentation in a chroma-key/virtual set technology environment. However, all such systems typically involve generating the multi-layer presentation in real time by a host computer using two variables which are provided to the process in real time, the position and pointing direction of a video camera, and a position, in real three-dimensional space, of the presenter. The live presenter is placed in the chroma-key set, and the position information of the location of both the camera and the actor is fed in real time to a host computer, which generates a virtual three-dimensional scene. The rendered three-dimensional scene and the live video are then keyed together to form the multi-layer presentation.
Such real-time methods for generating multi-layer three-dimensional virtual presentations, combining a computer generated three-dimensional scene and a two-dimensional video image, have several significant limitations. All such current methods utilize a great deal of expensive hardware and software to insert two-dimensional video elements into the three-dimensional scene in real time (30 fps). Elaborate systems, such as that described above, are used to determine the location of a person in front of a chroma-key wall. The video of the person is then rendered into the three-dimensional scene in real time using high-end computer systems. Since the rendering of the virtual three-dimensional scene is done in real time, the quality of the rendered scene is limited. For example, in the production of a live weather segment for broadcast news, it may be desired to render complex three-dimensional weather scenes. Even with high-end computers, however, the number of polygons (i.e., the complexity) in the three-dimensional scene that can be rendered in real time is limited by the rendering power of the computer. For this reason, extremely detailed and complex three-dimensional objects cannot be created. Furthermore, the complexity of the process described above for generating a multi-layer presentation combining a computer generated virtual three-dimensional scene and a live video presenter is difficult to control in the normally short production times typical of broadcast news. Also, in such systems, there is usually a delay of several frames through the system, which makes it difficult for the live presenter to match his movements to objects in the computer-generated virtual scene. The challenge is, therefore, how to incorporate a two-dimensional video element into an extremely complex three-dimensional scene without the use of very expensive and elaborate hardware set-ups.
The present invention provides a simplified system and method for providing a high quality virtual three-dimensional presentation, for live weather segments of broadcast news, and the like. A virtual three-dimensional presentation in accordance with the present invention includes three layers: a background layer, a live or recorded video layer, and a foreground layer. The background and foreground may form, for example, a computer-generated three-dimensional scene, such as a weather scene. The background and foreground may be pre-rendered, allowing as much time as needed to produce a high quality and complex video animation for the background and foreground, without need for the most powerful and expensive computers. The pre-rendered background and foreground are combined, for example at the time of broadcast, with live or recorded two-dimensional video, e.g., of an actor or presenter, to provide a high quality virtual three-dimensional presentation with the presenter having the three-dimensional background scene behind him and the three-dimensional foreground scene in front of him.
To generate a high quality virtual three-dimensional presentation in accordance with the present invention, a user defines a three-dimensional scene to be generated by, e.g., a computer graphics system. A surface, called a Z-sphere surface, is defined within the virtual space of the three-dimensional scene. The Z-sphere is defined by the aim point of a virtual camera looking at the three-dimensional scene to be created in virtual space. The Z-sphere surface extends in all directions. Its size is determined by the distance from the virtual camera to the aim point.
The Z-sphere, defined in virtual space with reference to the aim point of a virtual camera, splits the virtual three-dimensional scene into two parts. Those parts of the three-dimensional scene that are behind the Z-sphere (with respect to the virtual camera) form the three-dimensional background. Those parts in front of the Z-sphere form the three-dimensional foreground. The background and foreground of the virtual three-dimensional scene are pre-rendered separately. Since the background and foreground are pre-rendered, a system in accordance with the present invention can take as long as necessary to render high quality and complex background and foreground scenes. For example, a computer may be employed to render high quality complex animated weather scenes to appear as the background and foreground in a virtual three-dimensional weather presentation.
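The division of the scene by the Z-sphere can be sketched in Python as follows. This is an illustrative sketch under stated assumptions: the Z-sphere is taken to be centered on the virtual camera with a radius equal to the camera-to-aim-point distance, each scene object is reduced to a single reference point, and the names z_sphere_radius and split_scene are hypothetical. A practical renderer would divide scene geometry that straddles the surface rather than assigning whole objects to one side.

```python
import math


def z_sphere_radius(camera, aim_point):
    """Radius of the Z-sphere: distance from the virtual camera to its aim point."""
    return math.dist(camera, aim_point)


def split_scene(objects, camera, aim_point):
    """Assign each (name, position) pair to the foreground or the background.

    Assumption: an object nearer to the camera than the Z-sphere radius is
    foreground; an object on or beyond the surface is background.
    """
    radius = z_sphere_radius(camera, aim_point)
    foreground, background = [], []
    for name, position in objects:
        if math.dist(camera, position) < radius:
            foreground.append(name)
        else:
            background.append(name)
    return foreground, background


camera = (0.0, 0.0, 0.0)
aim_point = (0.0, 0.0, 10.0)
scene = [
    ("cloud", (0.0, 0.0, 4.0)),
    ("map", (0.0, 0.0, 25.0)),
    ("banner", (6.0, 0.0, 6.0)),
]
foreground, background = split_scene(scene, camera, aim_point)
```

Here the cloud and the banner fall inside the Z-sphere and would be rendered into the foreground, while the map lies beyond it and would be rendered into the background.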
Some or all of the elements in the three-dimensional scene may have some form or degree of transparency. This transparency allows background objects to be seen through foreground objects, depending upon how transparent the objects are. For a two-dimensional video element to be placed realistically in such a three-dimensional scene, it is important that this transparency appear as the human eye would expect it to. For example, objects in a two-dimensional video appearing behind the foreground in the virtual three-dimensional scene should be visible through transparent objects in the foreground. Thus, the foreground scene may have a key signal (image) rendered with it, which represents the transparency of three-dimensional foreground elements in the foreground. Using the foreground transparency key, objects in the foreground can be made to appear transparent to two-dimensional video elements, such as a presenter, behind the foreground objects in the virtual three-dimensional presentation.
The pre-rendered three-dimensional background and foreground scenes are combined with a live or pre-recorded two-dimensional video insert layer to form a complete virtual three-dimensional presentation. A presenter, or other object, is positioned in a chroma-key set, preferably having both a back wall and a floor painted with the key color. A camera is directed on the presenter in the chroma-key set to obtain live or recorded video thereof. The two-dimensional video scene thus generated includes a key signal that is used to isolate the subject, e.g., the presenter, from the set.
The live or recorded two-dimensional video scene is combined with playback of the pre-rendered three-dimensional background and foreground scenes to form the complete virtual three-dimensional presentation. This may be accomplished using any appropriate method for compositing images. For example, this may be accomplished by first compositing the two-dimensional insert video and key signal over the pre-rendered three-dimensional background, and then compositing the pre-rendered three-dimensional foreground video and key on top of the first composite, or vice versa. In the complete virtual three-dimensional presentation, the background scene will appear behind the presenter positioned at the Z-sphere in the scene, and foreground layer objects will appear in front of the presenter. Thus, the illusion of a two-dimensional video element (e.g., a presenter) existing within a three-dimensional scene is created without the use of expensive hardware set-ups.
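The two-stage compositing order described above can be sketched per pixel in Python. This is a minimal sketch, assuming key values in the range 0.0 to 1.0 and a simple linear blend; the function names over and composite_pixel are hypothetical, and practical keyers operate on whole video frames in hardware or optimized software.

```python
def over(top, key, bottom):
    """Blend one (R, G, B) pixel: key 1.0 shows top only, key 0.0 shows bottom only."""
    return tuple(key * t + (1.0 - key) * b for t, b in zip(top, bottom))


def composite_pixel(background, insert, insert_key, foreground, foreground_key):
    """Layer one pixel: insert video over background, then foreground over both."""
    first = over(insert, insert_key, background)   # presenter over background
    return over(foreground, foreground_key, first)  # foreground on top


background = (0, 0, 100)
presenter = (200, 0, 0)
foreground = (0, 255, 0)

# Presenter pixel with no foreground object in front of it.
clear = composite_pixel(background, presenter, 1.0, foreground, 0.0)
# Same pixel behind a half-transparent foreground object.
veiled = composite_pixel(background, presenter, 1.0, foreground, 0.5)
```

A foreground key of 0.5 mixes the foreground object equally with the presenter behind it, which is how a foreground transparency key lets the presenter show through semi-transparent foreground elements.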
In compositing the live or recorded two-dimensional video scene with the pre-rendered three-dimensional foreground and background scenes, it is important that the resulting combined scene appear as a single scene, without any distortion between the scene layers. Depending upon the compositing process employed, rounding and other errors may cause a perceptible distortion between the composited three-dimensional foreground and background layers. This may be minimized by rendering the three-dimensional background scene to incorporate both the three-dimensional background scene and the three-dimensional foreground scene, i.e., the entire three-dimensional scene. The three-dimensional foreground scene is rendered, as described above, as that part of the three-dimensional scene which appears in front of the Z-sphere. When the live or recorded two-dimensional video scene is composited with the pre-rendered three-dimensional foreground and background scenes, the pre-rendered three-dimensional background scene is used for the entire part of the scene behind the Z-sphere and the part of the scene in front of the Z-sphere which is not in front of objects in the live or recorded two-dimensional video layer. Only the portion of the three-dimensional foreground scene which is to appear in front of objects in the two-dimensional video layer is employed in the composite.
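One possible reading of this variant is sketched below in Python. It is an assumption-laden illustration, not a definitive implementation: the background input is taken to be a render of the entire scene, key values are taken to lie between 0.0 and 1.0, and the foreground key is multiplied by the insert key so that the separately rendered foreground contributes only where the two-dimensional video element is present. The names over and composite_with_full_scene are hypothetical.

```python
def over(top, key, bottom):
    """Blend one (R, G, B) pixel: key 1.0 shows top only, key 0.0 shows bottom only."""
    return tuple(key * t + (1.0 - key) * b for t, b in zip(top, bottom))


def composite_with_full_scene(full_scene, insert, insert_key,
                              foreground, foreground_key):
    """Composite one pixel when the background render holds the whole scene.

    Assumption: the foreground render is applied only where the insert
    video is present, so everywhere else the full-scene render is shown
    unaltered and no seam between separately rendered layers can appear.
    """
    first = over(insert, insert_key, full_scene)
    masked_key = foreground_key * insert_key  # foreground only over the insert
    return over(foreground, masked_key, first)


full_scene = (10, 20, 30)
presenter = (200, 0, 0)
foreground = (0, 255, 0)

# No presenter at this pixel: the full-scene render passes through.
untouched = composite_with_full_scene(full_scene, presenter, 0.0, foreground, 1.0)
# Presenter behind an opaque foreground object: the foreground is shown.
hidden = composite_with_full_scene(full_scene, presenter, 1.0, foreground, 1.0)
# Presenter with nothing in front of him: the presenter is shown.
visible = composite_with_full_scene(full_scene, presenter, 1.0, foreground, 0.0)
```

Because pixels without any insert video are taken directly from the single full-scene render, rounding differences between the separately rendered layers can only arise in the region covered by the two-dimensional video element.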
An additional technique may be employed in accordance with the present invention when, e.g., the virtual three-dimensional presentation contains a curved three-dimensional floor upon which a two-dimensional video element, e.g., a presenter, is to be placed. Such a technique may be employed, for example, to create a three-dimensional video presentation of a person walking on a curved part of the earth's surface. To create such an illusion, the two-dimensional insert video and key signals are processed before being composited with the pre-rendered three-dimensional foreground and background. The two-dimensional insert video and key signals are distorted so that the sides of the two-dimensional image defined by the video signal are perpendicular to the arc of a sphere or other shape upon which the two-dimensional video element is to be placed in the three-dimensional scene. The top and bottom of the two-dimensional image are also distorted to match the shape of the virtual three-dimensional floor in the scene. When the two-dimensional insert video and key signal are then composited with the three-dimensional foreground and background, a two-dimensional video element, e.g., a person, will appear to follow the curvature of the floor in the virtual three-dimensional scene as it moves from side to side in the two-dimensional video insert.
A system for generating a virtual three-dimensional presentation in accordance with the present invention includes a chroma-key set, a video camera, and other equipment for providing a live or recorded two-dimensional video scene including, e.g., a presenter, plus the chroma-derived key from the set, play-back devices for playing back pre-rendered background and foreground scenes, and keyer devices for combining the two-dimensional video with the background and foreground scenes to form the virtual three-dimensional presentation. A high-quality virtual three-dimensional presentation generated in accordance with the present invention may be broadcast live, or recorded, e.g., to videotape, for storage and/or later broadcast.
Further objects, features, and advantages of the invention will be apparent from the following detailed description taken in conjunction with the accompanying drawings.