Computer-Generated Imagery is increasingly present in film and TV production. Dedicated techniques are needed to ensure seamless compositing and interaction between the virtual and real elements of a scene.
In a typical scenario, the performance of real actors is composited with a virtual background. This is, for instance, the situation in a virtual TV studio, where the news presenter is filmed against a green background, and the furniture and background of the studio are inserted later as virtual elements. Chroma keying is used to matte out the silhouette of the journalist for compositing with the virtual elements in the scene.
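As an illustration of the chroma keying step described above, the following is a minimal sketch, not a production keyer. It assumes an RGB frame held in a NumPy array; the function names and the `green_dominance` threshold are hypothetical and chosen only for this example.

```python
import numpy as np

def chroma_key_matte(frame, green_dominance=40):
    """Return a boolean foreground matte for an RGB frame (H x W x 3, uint8).

    A pixel is treated as green screen when its green channel exceeds both
    red and blue by more than `green_dominance` (a hypothetical threshold).
    """
    rgb = frame.astype(np.int16)  # widen to avoid uint8 wrap-around on subtraction
    r, g, b = rgb[..., 0], rgb[..., 1], rgb[..., 2]
    is_screen = (g - np.maximum(r, b)) > green_dominance
    return ~is_screen  # True where the real element (e.g. the presenter) is

def composite(frame, background, matte):
    """Keep the real frame where the matte is True, the virtual background elsewhere."""
    return np.where(matte[..., None], frame, background)

# Tiny 1 x 2 example: one pure-green pixel and one skin-tone pixel.
frame = np.array([[[0, 255, 0], [200, 150, 120]]], dtype=np.uint8)
background = np.array([[[10, 20, 30], [10, 20, 30]]], dtype=np.uint8)
matte = chroma_key_matte(frame)
result = composite(frame, background, matte)
```

In this toy frame the green pixel is replaced by the virtual background while the skin-tone pixel is kept. Real keyers typically produce a soft (fractional) matte and handle green spill, which this sketch does not attempt.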
It may also be that all the elements in the scene are virtual, but the animated parts (humans, creatures) are obtained from the performance of actors in a TV or film shooting studio.
A TV or film shooting studio is usually equipped with an optical motion capture system, which consists of a camera setup and an acquisition system.
The camera setup consists of a set of calibrated cameras placed around a capture volume. Typically, the actors wear dedicated suits where physical markers are placed at the location of the main body articulations. The actors play the role of the film characters or virtual creatures inside the capture volume, as defined by the scenario.
The optical motion capture system tracks the locations of the physical markers in the images captured by the cameras. This data is fed into animation and rendering software that generates the appearance of virtual characters or creatures at each frame of the target production.
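The text above does not say how the tracked image locations are turned into 3D marker positions. One common approach with calibrated cameras is linear triangulation, sketched below under that assumption. The projection matrices, the marker position and the helper names `project` and `triangulate` are all hypothetical values and names introduced for this example.

```python
import numpy as np

def project(P, X):
    """Project a 3D point X with a 3 x 4 camera projection matrix P."""
    x = P @ np.append(X, 1.0)
    return x[:2] / x[2]

def triangulate(projections, pixels):
    """Linear (DLT) triangulation of one marker seen in two or more calibrated views."""
    rows = []
    for P, (u, v) in zip(projections, pixels):
        # Each view contributes two linear constraints on the homogeneous point.
        rows.append(u * P[2] - P[0])
        rows.append(v * P[2] - P[1])
    # The solution is the right singular vector of the smallest singular value.
    _, _, vt = np.linalg.svd(np.asarray(rows))
    X = vt[-1]
    return X[:3] / X[3]

# Two hypothetical cameras: one at the origin, one shifted by 1 unit along x.
P1 = np.hstack([np.eye(3), np.zeros((3, 1))])
P2 = np.hstack([np.eye(3), np.array([[-1.0], [0.0], [0.0]])])

marker = np.array([0.5, 0.2, 4.0])            # assumed true marker position
pixels = [project(P1, marker), project(P2, marker)]
estimate = triangulate([P1, P2], pixels)      # recovers the marker position
```

With noise-free observations the estimate matches the assumed marker position up to floating-point error. An actual capture system would also have to detect the markers, associate them across views and cope with measurement noise.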
In the simplest situations, there is no interaction at all between the real and virtual elements in the scene, and the spatial separation between these elements is easy to achieve. This is for instance the case in a virtual TV news studio, where the only virtual element is the background located behind the presenter.
Even in the absence of interaction between the real and virtual elements, compositing becomes more complex when real elements are partially occluded by virtual elements placed in front of them, as seen by the camera. Some form of real-time depth keying is then required to manage the occlusions properly, so that, say, the leg of the presenter that should be masked by a virtual table in front of him or her does not appear in front of the table in the composited image.
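The depth keying mentioned above can be sketched as a per-pixel depth comparison. This is a minimal illustration that assumes a per-pixel depth map is available for the real scene (for example from a depth sensor) alongside the rendered depth of the virtual elements; the function name and the sample values are hypothetical.

```python
import numpy as np

def depth_key_composite(real_rgb, real_depth, virt_rgb, virt_depth):
    """At each pixel, keep whichever element is closer to the camera."""
    real_in_front = real_depth < virt_depth
    return np.where(real_in_front[..., None], real_rgb, virt_rgb)

# 1 x 2 example: the presenter stands 3 m from the camera at both pixels.
real_rgb = np.array([[[200, 150, 120], [200, 150, 120]]], dtype=np.uint8)
real_depth = np.array([[3.0, 3.0]])

# Virtual table at 2 m covers the first pixel; virtual backdrop at 10 m the second.
virt_rgb = np.array([[[90, 60, 30], [10, 20, 30]]], dtype=np.uint8)
virt_depth = np.array([[2.0, 10.0]])

result = depth_key_composite(real_rgb, real_depth, virt_rgb, virt_depth)
```

In the first pixel the virtual table masks the presenter; in the second the presenter remains visible in front of the virtual backdrop.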
Interactions between real and virtual elements are even more difficult to manage. Imagine, for instance, that a news presenter is asked to lay his or her hand on a virtual table. The table is not physically present when the presenter is filmed making the hand gesture in the green-screen environment. A marking on the floor of the virtual studio may tell the presenter where to stand in order to be correctly positioned with respect to the table, but indicating exactly where the hand should be placed so that it rests on the surface of the table after the table has been inserted in the picture would require a marker “floating in air”. This is impractical.
Arguably, a misplacement of the presenter's hand in this case could be fixed during the compositing phase by tweaking the viewpoint of the virtual camera. However, this solution would not be applicable to multiple interactions occurring with elements of a rigid virtual layout, since the adjustments would need to be different for each interaction.
The complexity of managing interactions between real and virtual elements is maximal when both are moving. An example of such a situation would be a film character, played by a real actor, attempting to step into a virtual train that is already in motion. The actor filmed in the green-screen environment would need to simulate grasping a handle on the door of a carriage while this door is translating and accelerating. Adjusting the desired location of the actor's hand would require following, over time, some marking of the predefined trajectory of the carriage door handle in 3D space. No solution to this problem other than ad-hoc fixes in the compositing phase was found in the prior art.
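To make the moving-target difficulty concrete, the sketch below computes where a translating, accelerating door handle would be at a given time. It assumes, purely for illustration, straight-line motion with constant acceleration; the function name and all numeric values are hypothetical.

```python
def handle_position(p0, v0, a, t):
    """Position of the handle at time t under constant acceleration.

    p0, v0 and a are 3D tuples (initial position, initial velocity,
    acceleration); the motion model p0 + v0*t + 0.5*a*t^2 is an assumption.
    """
    return tuple(p + v * t + 0.5 * acc * t * t for p, v, acc in zip(p0, v0, a))

# Handle starts at (0, 1, 2) m, moving along x at 1 m/s and accelerating at 0.5 m/s^2.
p0 = (0.0, 1.0, 2.0)
v0 = (1.0, 0.0, 0.0)
a = (0.5, 0.0, 0.0)

target = handle_position(p0, v0, a, 2.0)  # where the hand should be after 2 s
```

After 2 seconds the handle in this example has moved 3 m along x, so the target location for the actor's hand changes continuously and cannot be indicated by a fixed marking.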