In cinematography it is common to combine computer-generated graphics with live-action footage. For example, the live-action footage may be shot against a monochrome screen for a chroma key compositing process. For post-production effects artists to position virtual objects correctly relative to the live-action footage, knowledge is required of the motion and orientation of the camera system relative to the filmed scene (known as “motion tracking”), as well as of the optical settings of the cinema camera lens assembly (commonly referred to as the “lens”) that were used during the filming. With this knowledge, a 3D virtual scene can be viewed as a 2D projection from the standpoint of a virtual camera that follows a path corresponding to the path that the camera took in filming the live-action footage. This enables virtual objects to be viewed in the correct 2D projection for insertion into the live footage with the correct perspective, scale, orientation and motion, relative to the objects that are shown in the live-action footage.
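By way of illustration only, the 2D projection of a point of a 3D virtual scene from the standpoint of a virtual camera may be sketched as follows. The sketch assumes an ideal pinhole camera with no lens distortion, a camera that looks along its own +z axis with x to the right and y upwards, and pixel coordinates with the origin at the top-left of the image. The function names `rotation_y` and `project_point`, and all parameter names, are hypothetical and are not taken from any particular camera system or software package.

```python
import math


def rotation_y(angle_rad):
    """Camera-to-world rotation for a pan of angle_rad about the vertical (y) axis."""
    c, s = math.cos(angle_rad), math.sin(angle_rad)
    return [[c, 0.0, s],
            [0.0, 1.0, 0.0],
            [-s, 0.0, c]]


def project_point(point_world, cam_position, cam_rotation,
                  focal_length_mm, sensor_width_mm,
                  image_width_px, image_height_px):
    """Project a 3D world point to 2D pixel coordinates (assumed pinhole model).

    cam_rotation is the camera-to-world rotation matrix (3x3 nested lists).
    Returns (u, v) in pixels, or None if the point is not in front of the camera.
    """
    # Express the point relative to the camera position.
    d = [p - c for p, c in zip(point_world, cam_position)]
    # Apply the world-to-camera rotation, i.e. the transpose of cam_rotation.
    x, y, z = (sum(cam_rotation[r][k] * d[r] for r in range(3)) for k in range(3))
    if z <= 0.0:
        return None  # point is behind (or level with) the camera
    # Convert the focal length from millimetres to pixels.
    f_px = focal_length_mm * image_width_px / sensor_width_mm
    # Perspective divide; image v increases downwards, hence the minus sign.
    u = image_width_px / 2.0 + f_px * x / z
    v = image_height_px / 2.0 - f_px * y / z
    return (u, v)


if __name__ == "__main__":
    # A virtual object 5 units ahead and 1 unit to the right of the start position.
    virtual_object = (1.0, 0.0, 5.0)
    # Hypothetical per-frame camera path: the camera tracks to the right.
    for frame, cam_x in enumerate([0.0, 0.5, 1.0]):
        uv = project_point(virtual_object, (cam_x, 0.0, 0.0), rotation_y(0.0),
                           focal_length_mm=35.0, sensor_width_mm=35.0,
                           image_width_px=1920, image_height_px=1080)
        print(frame, uv)
```

Evaluating `project_point` once per frame, with the camera position, orientation and focal length that applied to that frame, yields the path that the virtual object should follow in the image so as to appear fixed within the filmed scene.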
Knowledge of the optical settings of the lens is commonly available, since a cinema camera system typically records the optical settings of the lens assembly for each frame of the recorded live footage, e.g. the zoom length, the focal distance, and the aperture of the diaphragm iris.
Knowledge of the position and orientation of the camera system relative to the live-action scene that was filmed is not so easily available. Whilst it is known, in some cases, to move the camera system along a pre-planned route, or to use a receiver on the camera system to detect the position of the camera system relative to an arrangement of ultrasonic or infrared transmitters or retroreflectors pre-positioned on the film set, such approaches are unattractive during filming. Rather, the 2D recorded images in a sequence of frames are commonly analysed in post-production to construct a 3D image of easily identifiable objects (“tracking markers”) and to determine the relative location and movement of the camera system. However, where the filmed scene lacks clearly identifiable features, such as distinct points and edges that are fixed within the scene, such a procedure is difficult to automate. In some cases the calculation may not be possible at all, so that the movement and orientation of the camera system must instead be estimated from the start and end positions, which is time-consuming and/or lacking in precision.