In order to continuously shoot a subject, it is necessary to keep the line of sight of a camera trained on the subject. This task is often undertaken manually by the photographer, but it is difficult to perfectly track a subject that moves at high speed and irregularly, such as a bouncing ball. For this reason, research into systems for automatically controlling the line of sight direction of a camera mechanically (so-called Active Vision; refer to non-patent publication 1 below) has become widespread in many fields.
With normal Active Vision technology, since the camera itself is moved while being attached to a drive platform, there is a delay in response speed with respect to movement in the line of sight direction. This makes it difficult to track a moving object whose speed changes suddenly (for example, a ball being used in a ball game). Considering that the frame rate of a high speed camera reaches 1,000,000 fps in faster applications, and that image processing is now carried out at high speed by GPUs, it can be said that line of sight control speed is the bottleneck with respect to speed in various tracking systems.
In order to solve this problem, an optical system known as a Saccade Mirror has been proposed, which changes the line of sight of a camera at high speed using small drive mirrors arranged in front of the camera (refer to non-patent publication 2 below). With this technology, using two-axis galvanometer mirrors makes high speed line of sight change possible. Regarding a control system, if it were possible to control the line of sight so as to always keep a physical object in the center of the screen, it can be considered that unprecedented dynamic shooting would become possible.
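As an illustration of what keeping a physical object in the center of the screen involves, the following minimal sketch converts the pixel position of an object into angle corrections for two mirrors. It is not taken from the cited publication: the function name `mirror_correction`, the pinhole camera model, the focal length expressed in pixels, the sign conventions, and the idealization that rotating a mirror by an angle deflects the line of sight by twice that angle on each axis independently are all assumptions made for the example.

```python
import math


def mirror_correction(target_px, image_size, focal_length_px):
    """Return hypothetical (pan, tilt) mirror rotations in radians that would
    bring the target to the image center.

    Assumptions (not from the source): an ideal pinhole camera, independent
    pan and tilt axes, and a line-of-sight deflection equal to twice the
    mirror rotation.
    """
    u, v = target_px
    width, height = image_size

    # Offset of the target from the image center, in pixels.
    dx = u - (width - 1) / 2.0
    dy = v - (height - 1) / 2.0

    # Angle between the optical axis and the ray through the target.
    pan_sight = math.atan2(dx, focal_length_px)
    tilt_sight = math.atan2(dy, focal_length_px)

    # A mirror rotated by theta deflects the reflected ray by 2 * theta,
    # so each mirror needs half of the required line-of-sight change.
    return pan_sight / 2.0, tilt_sight / 2.0


if __name__ == "__main__":
    # Target 100 px to the right of center with a 100 px focal length.
    pan, tilt = mirror_correction((150, 50), (101, 101), 100.0)
    print(pan, tilt)
```

In an actual system the correction would be applied every frame, so the speed at which the mirrors settle, rather than this calculation, determines how quickly the line of sight can follow the object.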
However, in order to track a physical object, it is necessary to extract the physical object from within an image, and to train the line of sight of the camera towards this physical object. Methods for extracting a physical object from within an image include, for example:
(1) a method of, after extracting a feature amount from within an image, identifying a physical object within the image by comparing with learned data that has been acquired by learning beforehand; and
(2) a method of acquiring a background image in advance, and identifying a physical object by comparing with an image (actual image) containing the physical object (the so-called background differencing method).
The method in (1) above has the advantage that it is not necessary to acquire a background image, but since image processing time becomes long, it is ill suited to physical object identification in real time. Also, although it depends on the content of the learned data, this method tends to be inadequate in terms of accuracy of identifying a physical object.
The background differencing method of (2) above has the advantage that high-speed physical object identification is possible. However, background differencing methods that have been proposed conventionally (refer, for example, to non-patent publications 3 and 4 below) assume that an image has been acquired using a fixed viewpoint camera. It is considered difficult to directly apply these techniques to a camera whose viewpoint moves.
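For reference, conventional background differencing with a fixed viewpoint can be sketched as follows. This is a minimal illustration rather than the method of any cited publication: the function names `background_difference` and `centroid`, the grayscale images represented as nested lists, and the threshold value of 30 are all assumptions made for the example.

```python
def background_difference(background, frame, threshold=30):
    """Return a binary mask that is 1 where the frame differs from the
    background by more than `threshold` (a hypothetical value).

    Both images are lists of rows of grayscale intensities and must have
    been taken from the same fixed viewpoint.
    """
    if len(background) != len(frame) or any(
        len(b) != len(f) for b, f in zip(background, frame)
    ):
        raise ValueError("background and frame must have the same shape")
    return [
        [1 if abs(f - b) > threshold else 0 for b, f in zip(bg_row, fr_row)]
        for bg_row, fr_row in zip(background, frame)
    ]


def centroid(mask):
    """Return the (row, column) centroid of the foreground pixels,
    or None when the mask contains no foreground."""
    count = row_sum = col_sum = 0
    for r, row in enumerate(mask):
        for c, value in enumerate(row):
            if value:
                count += 1
                row_sum += r
                col_sum += c
    if count == 0:
        return None
    return row_sum / count, col_sum / count


if __name__ == "__main__":
    # Uniform background acquired in advance.
    background = [[10] * 4 for _ in range(4)]
    # Actual image: the same scene with a bright object in column 2.
    frame = [row[:] for row in background]
    frame[1][2] = 200
    frame[2][2] = 200

    mask = background_difference(background, frame)
    print(centroid(mask))
```

The comparison is made pixel by pixel, which is why it is fast, and also why it presupposes that each pixel of the background image and of the actual image views the same point in the scene. Once the viewpoint moves, that correspondence no longer holds.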