Techniques for combining a live video captured by a television camera with an electronic image or the like generated by a computer are widely used in recent TV broadcasts. A commonly known technique for synthesizing videos (images) is chromakey synthesis. According to this technique, when an image that is to become the foreground subject of a synthesized image, such as a person, is acquired as a live video, the foreground subject is placed in front of, e.g., a blue canvas (blue background) and captured by a television camera. A key signal for distinguishing the area of the foreground subject from the area of the blue background is then generated from the video signal. In accordance with the key signal, the image in the area of the blue background is replaced with a background image generated by a computer or the like, or with a background image captured at a different time or in a different location, whereby the foreground subject and the background image are merged together.
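The keying and replacement steps described above can be illustrated with a minimal sketch. It assumes images are nested lists of (R, G, B) tuples and uses a hypothetical "blue dominance" rule with an arbitrary threshold to form a hard (binary) key signal. Broadcast keyers generate soft-edged keys in dedicated hardware, so the names and the rule here are illustrative assumptions only.

```python
from typing import List, Tuple

Pixel = Tuple[int, int, int]   # (R, G, B), each component 0..255
Image = List[List[Pixel]]      # rows of pixels


def is_background(pixel: Pixel, threshold: int = 60) -> bool:
    """Hard key signal: True where the pixel looks like the blue backing.

    Assumed rule (not taken from the source): blue must exceed both
    red and green by more than `threshold`.
    """
    r, g, b = pixel
    return b - max(r, g) > threshold


def chromakey(live: Image, background: Image, threshold: int = 60) -> Image:
    """Replace keyed (blue) pixels of `live` with pixels of `background`."""
    if len(live) != len(background) or any(
        len(a) != len(b) for a, b in zip(live, background)
    ):
        raise ValueError("live and background images must have the same size")
    return [
        [bg if is_background(fg, threshold) else fg
         for fg, bg in zip(live_row, bg_row)]
        for live_row, bg_row in zip(live, background)
    ]


if __name__ == "__main__":
    live = [[(10, 20, 230), (200, 150, 120)]]   # blue backing, then skin tone
    studio = [[(1, 2, 3), (4, 5, 6)]]           # computer-generated background
    print(chromakey(live, studio))
```

In the one-row example, the first pixel is keyed out and replaced by the background pixel, while the second is kept as foreground.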
In recent years, in addition to mere merging of videos by means of chromakey synthesis, a video synthesis system (virtual system) called a virtual studio has come into frequent use. In this system, a subject captured by a television camera as a live video is made to appear to actually exist in a virtual space (virtual studio) generated as an electronic video.
In the virtual system, a desired virtual space is generated by a computer or the like, and a virtual camera (imaginary camera) is placed within the virtual space. The virtual camera notionally photographs the inside of the virtual space, so that an electronic video of the virtual space is generated.
Further, in the virtual system, the photographing conditions of the virtual camera are changed in the same manner as the photographing conditions set by camera work, such as zooming, focusing, and pan-and-tilt actions, of the television camera that captures the live video (called a live camera). The virtual camera thus generates an electronic video of the virtual space that follows the camera work of the live camera.
The electronic video of the virtual space generated by the virtual camera is merged with the live video captured by the live camera by means of chromakey synthesis or the like, thereby generating a synthesized image in which the subject of the live video appears to exist in the virtual space.
Many such virtual systems require information about an angle of view, an object distance (i.e., the distance from the taking lens of the live camera to the object, or subject), and a principal point position as photographing data showing the settings of the photographing conditions of the live camera, in order to make the photographing conditions of the virtual camera accurately coincide with those of the live camera. In some related-art virtual systems, information about the zoom position and the focus position of the taking lens detected by a position sensor, such as an encoder, as described in Japanese Patent No. 3478740, is delivered directly to the virtual system. This information, however, does not directly indicate the angle of view, the object distance, or the principal point position.
For this reason, in the virtual system requiring information about an angle of view, an object distance, and a principal point position with respect to a live camera, the angle of view, the object distance, and the principal point position are computed on the basis of the information about the zoom position and the focus position acquired from the taking lens (lens device) of the live camera.
This computation requires lens data unique to the lens device used in the virtual system. The lens data must be generated and registered in the system in advance.
Currently, generation and registration of lens data are performed by the user. For instance, a measurement target used for measuring the angle of view, the object distance, and the principal point position is photographed with, e.g., the television camera that will be used for capturing the live video. Images are acquired while the zoom position and the focus position of the taking lens are changed to various settings. The angle of view, the object distance, and the principal point position at each combination of zoom and focus positions are measured from the captured images. Lens data representing the relationship among the zoom position, the focus position, the angle of view, the object distance, and the principal point position are thus generated through actual measurement. The lens data are then stored in a memory to which the virtual system refers.
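As a rough illustration of one such measurement, the horizontal angle of view can be estimated from how much of a chart of known size fills the frame at a known distance. The sketch below assumes a flat chart perpendicular to the optical axis, a distance measured from the lens's front principal point, and hypothetical function and field names. The source does not specify the measurement procedure, so this is only one plausible way to fill a lens-data table.

```python
import math
from typing import Dict, Tuple


def angle_of_view_deg(visible_width: float, distance: float) -> float:
    """Full horizontal angle of view, in degrees.

    visible_width: chart width spanning the whole frame (same unit as distance)
    distance:      distance from the principal point to the chart (assumed known)
    """
    if visible_width <= 0 or distance <= 0:
        raise ValueError("visible_width and distance must be positive")
    return math.degrees(2.0 * math.atan(visible_width / (2.0 * distance)))


# Hypothetical lens-data table: (zoom position, focus position) -> angle of view.
# Positions are assumed to be encoder readings normalised to 0..1.
LensData = Dict[Tuple[float, float], float]


def record_measurement(table: LensData, zoom: float, focus: float,
                       visible_width: float, distance: float) -> None:
    """Store one measured angle of view for a zoom/focus setting."""
    table[(zoom, focus)] = angle_of_view_deg(visible_width, distance)


if __name__ == "__main__":
    table: LensData = {}
    # A 2 m wide chart filling the frame at 1 m corresponds to a 90 degree view.
    record_measurement(table, zoom=0.0, focus=0.0, visible_width=2.0, distance=1.0)
    print(table)
```

Each zoom/focus setting needs its own photograph and measurement, which is why the number of table entries directly drives the labor involved.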
At the time of actual photographing of a live video, the angle of view, the object distance, and the principal point position corresponding to the zoom position and the focus position reported by the taking lens (lens device) of the live camera are determined from the lens data. The photographing conditions of the virtual camera are set on the basis of these values. Thus, the photographing conditions of the virtual camera change in accordance with the settings of the live camera, and the electronic video of the virtual space generated by the virtual camera changes in association with the camera work of the live camera. When the photographing direction (pan-and-tilt position) of the live camera is changed by means of a pan head or the like, information about the pan-and-tilt position output from the pan head is delivered to the virtual system, whereby the photographing direction of the virtual camera is changed as well.
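The lookup described above can be sketched as follows. The sketch assumes the lens data were measured on a rectangular grid of zoom and focus positions and that values between grid points are obtained by bilinear interpolation. The source does not state which interpolation is used, and all names and the table layout are hypothetical.

```python
from bisect import bisect_right
from typing import List, Sequence, Tuple

# One lens-data entry: (angle of view, object distance, principal point position)
Entry = Tuple[float, float, float]


def _bracket(grid: Sequence[float], x: float) -> Tuple[int, float]:
    """Return (i, t): x lies between grid[i] and grid[i+1] at fraction t.

    `grid` must be sorted ascending with at least two points; x is clamped
    to the measured range.
    """
    if x <= grid[0]:
        return 0, 0.0
    if x >= grid[-1]:
        return len(grid) - 2, 1.0
    i = bisect_right(grid, x) - 1
    return i, (x - grid[i]) / (grid[i + 1] - grid[i])


def lookup(zoom_grid: Sequence[float], focus_grid: Sequence[float],
           table: List[List[Entry]], zoom: float, focus: float) -> Entry:
    """Bilinearly interpolate table[zoom index][focus index] at (zoom, focus)."""
    i, tz = _bracket(zoom_grid, zoom)
    j, tf = _bracket(focus_grid, focus)
    result = []
    for c in range(3):  # interpolate each of the three quantities separately
        v = ((1 - tz) * (1 - tf) * table[i][j][c]
             + tz * (1 - tf) * table[i + 1][j][c]
             + (1 - tz) * tf * table[i][j + 1][c]
             + tz * tf * table[i + 1][j + 1][c])
        result.append(v)
    return (result[0], result[1], result[2])


if __name__ == "__main__":
    zoom_grid = [0.0, 1.0]    # normalised encoder positions (assumed)
    focus_grid = [0.0, 1.0]
    table = [
        [(60.0, 1.0, 0.10), (58.0, 10.0, 0.12)],   # wide end (made-up values)
        [(10.0, 1.0, 0.20), (8.0, 10.0, 0.22)],    # tele end (made-up values)
    ]
    print(lookup(zoom_grid, focus_grid, table, zoom=0.5, focus=0.5))
```

The returned triple would then be applied to the virtual camera each frame, alongside the pan-and-tilt position reported by the pan head.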
However, when the angle of view, the object distance, and the principal point position are computed from information about the zoom position and the focus position detected by the position sensor of the lens device, as in the related-art virtual system, the user must measure the lens data representing the relationship among the zoom position, the focus position, the angle of view, the object distance, and the principal point position, and must store the lens data beforehand. These operations can take anywhere from several hours to several days, and thus require much time and labor. The angle of view, the object distance, and the principal point position must be measured at anywhere from several to tens of zoom and focus positions. If the number of measurement points is reduced to save time, a so-called slippage phenomenon arises, in which the live video and the electronic video become mismatched.
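The trade-off between the number of measurement points and the resulting mismatch can be illustrated numerically. The sketch below uses a purely hypothetical lens model (focal length varying exponentially from 8 mm to 160 mm over the zoom range, with a 9.6 mm wide sensor) and linear interpolation between uniformly spaced samples. None of these figures come from the source. They only show how the worst-case angle-of-view error grows as sample points are removed.

```python
import math


def true_angle_deg(zoom: float, f_wide: float = 8.0, f_tele: float = 160.0,
                   sensor_width: float = 9.6) -> float:
    """Hypothetical 'ground truth' angle of view for zoom in 0..1."""
    focal = f_wide * (f_tele / f_wide) ** zoom
    return math.degrees(2.0 * math.atan(sensor_width / (2.0 * focal)))


def max_interp_error_deg(n_points: int, n_checks: int = 1000) -> float:
    """Worst error of linear interpolation through n_points uniform samples."""
    if n_points < 2:
        raise ValueError("need at least two measurement points")
    xs = [i / (n_points - 1) for i in range(n_points)]
    ys = [true_angle_deg(x) for x in xs]
    worst = 0.0
    for k in range(n_checks + 1):
        z = k / n_checks
        i = min(int(z * (n_points - 1)), n_points - 2)   # segment containing z
        t = (z - xs[i]) / (xs[i + 1] - xs[i])
        estimate = (1 - t) * ys[i] + t * ys[i + 1]
        worst = max(worst, abs(estimate - true_angle_deg(z)))
    return worst


if __name__ == "__main__":
    for n in (3, 5, 10, 20):
        print(n, "points -> max error", round(max_interp_error_deg(n), 3), "deg")
```

Under this model the error falls steadily as points are added, which is consistent with the need for tens of measurement points to avoid visible slippage.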