Monitoring systems that use a monitoring camera to monitor a given monitoring target area are known. Such a monitoring system is configured to store video images captured by the monitoring camera in a storage device so that a desired video image can be retrieved and reproduced when necessary. However, reproducing all of the video images and manually searching them for a desired video image is extremely inefficient.
Therefore, a technique is known in which metadata of objects moving in the captured video images is detected and stored, so that a video image in which an object performing specific behavior is recorded can be retrieved. Use of such a technique makes it possible to find such a video image among a large number of video images captured by the monitoring camera.
One example of such a technique is disclosed in Non-Patent Document 1. Non-Patent Document 1 discloses a technique in which the shape feature and motion feature of an object and the shape feature of a background are extracted and compiled into a database in advance, and a video image is retrieved by using a query created from a hand-drawn sketch of three elements: the shape of a moving object, the motion of the moving object, and a background. In Non-Patent Document 1, the database is prepared by calculating, for each frame image, the average vector of the optical flows (OFs) in the region of a moving object, and regarding the continuous data obtained by gathering these vectors across the frame images as the motion feature of the moving object. Likewise, continuous data of vectors is extracted from the hand-drawn sketch drawn by the user and regarded as the motion feature of the query.
Non-Patent Document 1: Akihiro Sekura and Masashi Toda, “The Video Retrieval System Using Hand-Drawing Sketch Depicted by Moving Object and Background,” Information Processing Society of Japan, Interaction 2011
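The motion feature described for Non-Patent Document 1 can be illustrated with a minimal sketch. It is not the implementation of Non-Patent Document 1; it only assumes that a per-pixel optical flow field and a mask marking the moving-object region are already available for each frame. The names `mean_flow` and `motion_feature`, the nested-list data layout, and the choice to return a zero vector for an empty region are all assumptions made for illustration.

```python
from typing import List, Sequence, Tuple

Vector = Tuple[float, float]


def mean_flow(flow: Sequence[Sequence[Vector]],
              mask: Sequence[Sequence[bool]]) -> Vector:
    """Average the (dx, dy) optical flow vectors over the masked pixels.

    `flow` is a 2-D grid of flow vectors for one frame; `mask` is a grid
    of the same size that is True inside the moving-object region.
    """
    sum_x = sum_y = 0.0
    count = 0
    for flow_row, mask_row in zip(flow, mask):
        for (dx, dy), inside in zip(flow_row, mask_row):
            if inside:
                sum_x += dx
                sum_y += dy
                count += 1
    if count == 0:
        # Assumption: an empty region contributes a zero vector.
        return (0.0, 0.0)
    return (sum_x / count, sum_y / count)


def motion_feature(flows: Sequence[Sequence[Sequence[Vector]]],
                   masks: Sequence[Sequence[Sequence[bool]]]) -> List[Vector]:
    """Gather the per-frame average vectors into one continuous sequence."""
    return [mean_flow(flow, mask) for flow, mask in zip(flows, masks)]


# Two tiny 2x2 frames: the object moves right, then down.
frame1_flow = [[(1.0, 0.0), (3.0, 0.0)], [(0.0, 0.0), (0.0, 0.0)]]
frame1_mask = [[True, True], [False, False]]
frame2_flow = [[(0.0, 2.0), (0.0, 0.0)], [(0.0, 4.0), (0.0, 0.0)]]
frame2_mask = [[True, False], [True, False]]

feature = motion_feature([frame1_flow, frame2_flow],
                         [frame1_mask, frame2_mask])
```

Because the feature is a sequence of direction vectors, two objects that travel between the same places along different paths produce different sequences.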
In some cases, a monitoring system using a monitoring camera needs to divide a monitoring target area into a plurality of predetermined regions and retrieve a video image of a person or the like moving from one specific region to another specific region, regardless of the movement path taken within the regions. Performing such retrieval with the technique disclosed in Non-Patent Document 1 requires inputting many queries that differ in the movement paths within the regions. This is because, according to the technique disclosed in Non-Patent Document 1, a video image of a person or the like who moves between the same regions but along a movement path different from that of the query may be excluded from the retrieval results.
However, when a variety of movement paths are possible within the regions, the number of possible movement paths between the regions is huge, and it is practically difficult to input queries covering all of them. Therefore, exhaustive retrieval is difficult.
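The difference between matching a movement path and matching a movement between regions can be illustrated as follows. This is only a sketch of the retrieval requirement stated above, not a method taken from Non-Patent Document 1 or from any particular system. It assumes that the regions are axis-aligned rectangles and that a trajectory is a list of (x, y) points; the names `region_of`, `region_sequence`, `moves_between`, and `REGIONS` are hypothetical.

```python
from typing import Dict, List, Optional, Sequence, Tuple

Point = Tuple[float, float]
Rect = Tuple[float, float, float, float]  # (x_min, y_min, x_max, y_max)


def region_of(point: Point, regions: Dict[str, Rect]) -> Optional[str]:
    """Return the name of the region containing the point, if any."""
    x, y = point
    for name, (x0, y0, x1, y1) in regions.items():
        if x0 <= x < x1 and y0 <= y < y1:
            return name
    return None


def region_sequence(path: Sequence[Point],
                    regions: Dict[str, Rect]) -> List[str]:
    """Map a path to the regions it visits, merging consecutive repeats."""
    seq: List[str] = []
    for point in path:
        name = region_of(point, regions)
        if name is not None and (not seq or seq[-1] != name):
            seq.append(name)
    return seq


def moves_between(path: Sequence[Point], regions: Dict[str, Rect],
                  src: str, dst: str) -> bool:
    """True if the path passes directly from region `src` to region `dst`."""
    seq = region_sequence(path, regions)
    return any(a == src and b == dst for a, b in zip(seq, seq[1:]))


# Two adjacent regions and two paths that differ inside the regions.
REGIONS = {"A": (0, 0, 10, 10), "B": (10, 0, 20, 10)}
straight = [(2, 5), (8, 5), (12, 5), (18, 5)]
winding = [(2, 1), (5, 9), (9, 2), (11, 8), (15, 1)]
```

The two paths are different point sequences, so a query built from one of them may not match the other, yet both reduce to the same movement from region A to region B.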