An image monitoring system for estimating the position of an object is described in Patent Document 1. The system described in Patent Document 1 acquires images from each of a plurality of cameras arranged so that portions of their fields of view mutually overlap within the real space (three-dimensional space) in which the position of an object is to be estimated. A moving body region of the target object is then detected using background subtraction processing and frame subtraction processing. Each camera is calibrated in advance with respect to the real space. The system described in Patent Document 1 converts the moving body regions detected from the images of each camera into a planar coordinate system designated within the real space, detects overlapping of the converted moving body regions, and thereby determines whether an object is present in the real space and estimates the position where it is present.
[Patent Document 1] Japanese Patent Application Laid-open No. 2008-15573 (paragraphs 0015 to 0046)
In the system described in Patent Document 1, the moving body regions detected from the images of each camera are converted to a planar coordinate system designated within the real space, and if the converted moving body regions overlap for all cameras, an object is estimated to be present at that position. In this system, the range of space in which the position of an object can be estimated is therefore limited to the region where the fields of view of all cameras overlap. For example, FIG. 19 shows an example of the range over which an object position can be estimated in a system relating to the present invention. In FIG. 19, the arrows represent the range of the field of view of each camera. The region where the fields of view of cameras 101 to 103 overlap is the region indicated with diagonal lines in FIG. 19, and in the system described in Patent Document 1, only the position of an object present within this region can be estimated. Furthermore, in FIG. 19, the real space is shown schematically in two dimensions.
A technique has been considered for expanding the region in which position can be estimated: if the moving body regions obtained from two cameras overlap, an object is determined to be present in that overlapping region. FIG. 20 shows an example of the range over which an object position can be estimated in the case of using this technique. As shown in FIG. 20, the position of an object can be estimated within any range in which the fields of view of two cameras overlap, and the range over which an object position can be estimated is greater than that of FIG. 19. In this case, however, erroneous detection can occur. FIG. 21 shows an example of the occurrence of erroneous detection in the case of expanding the range over which an object position can be estimated. The case of estimating the positions of three objects 111 to 113 shown in FIG. 21 is used as an example. In addition, the broken line arrows shown in FIG. 21 represent the view volumes of the objects. In the example shown in FIG. 21, if the regions of objects obtained from two cameras overlap, an object is determined to be present in that region. Accordingly, the ranges indicated with bold lines become object detection regions, and erroneous detection occurs in these regions. In particular, for those portions of the detection regions on the inner side of the objects 111 to 113 (the regions indicated with diagonal lines), the actual state can be photographed with a camera other than the two cameras used for position detection, yet erroneous detection still results. For example, although the state of the region 115 indicated with diagonal lines can be photographed with the camera 102, an object ends up being determined to be present there based on the images obtained from the cameras 101 and 103.
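The difference between the two overlap criteria described above can be sketched as follows. This is a minimal illustration only: the grid cells stand in for the planar coordinate system, and the camera names and region coordinates are hypothetical values, not taken from Patent Document 1.

```python
# Sketch of the two overlap criteria: intersecting projected moving-body
# regions across ALL cameras (as in Patent Document 1) versus accepting
# any PAIRWISE overlap (the expanded technique). All data are hypothetical.
from itertools import combinations

# Each camera's detected moving body region, already converted to a common
# planar coordinate system and discretized into grid cells.
regions = {
    "cam101": {(2, 2), (2, 3), (3, 2), (3, 3), (4, 3)},
    "cam102": {(3, 2), (3, 3), (4, 3), (5, 5)},
    "cam103": {(2, 3), (3, 3), (4, 3), (5, 5)},
}

# Criterion of Patent Document 1: an object is estimated to be present
# only where the converted regions of all cameras overlap.
all_overlap = set.intersection(*regions.values())

# Expanded criterion: an object is determined to be present wherever the
# regions of any two cameras overlap.
pair_overlap = set()
for a, b in combinations(regions.values(), 2):
    pair_overlap |= a & b

# The pairwise criterion covers additional cells seen by only two cameras,
# which is the source of the erroneous detections described in the text.
print(sorted(all_overlap))
print(sorted(pair_overlap - all_overlap))
```

Cells such as (5, 5), which overlap for only two of the three cameras, are accepted by the pairwise criterion but rejected by the all-camera criterion, mirroring the situation of the region 115 in FIG. 21.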
As has been described above, when an attempt is made to expand the region over which an object position can be estimated, erroneous detection occurs and the accuracy of estimating the object position decreases.
In addition, the accuracy of estimating an object position also decreases if a stationary object that conceals the target object is present. For example, if a stationary object such as a desk, billboard or pillar is present between the target object and a camera, occlusion occurs with respect to the target object, and the accuracy at which the object position is estimated decreases. This is because the target object is concealed by the stationary object, preventing the moving body region of the target object from being specified. Since the moving body region cannot be specified, an erroneous determination is made that no moving body is present even though the target object is actually present.
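The occlusion problem described above can be sketched in the same grid-cell terms. This is a hypothetical illustration: if a stationary object hides the target from even one camera, that camera yields no moving body region at the target cell, so the all-camera overlap test of Patent Document 1 fails even though the target is actually present.

```python
# Sketch (hypothetical data) of occlusion by a stationary object: cam103's
# view of the target is blocked, so its converted moving-body region is
# empty at the target cell, and the all-camera intersection misses the
# target entirely.

target_cell = (3, 3)

# Moving body regions projected onto the common plane (hypothetical).
visible = {
    "cam101": {(3, 3), (3, 4)},
    "cam102": {(3, 3), (2, 3)},
    "cam103": set(),  # target hidden by a stationary object (e.g. a pillar)
}

# All-camera overlap criterion: the occluded camera contributes nothing,
# so the intersection is empty and the target is erroneously judged absent.
detected = set.intersection(*visible.values())
print(target_cell in detected)
```

The intersection is empty because one camera's region is empty, which corresponds to the erroneous determination that no moving body is present.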