The present invention relates to a video image monitoring method and system for monitoring an object in an image picked up by a television camera. More particularly, the present invention relates to a video image monitoring method and system for automatically detecting, from a video signal of the television camera, an object which intrudes into the image pick-up view field of the television camera, and for automatically tracing the object when the object moves.
A video image monitoring apparatus using a television camera (hereinafter referred to as a TV camera) has long been widely used. Recently, there has been a demand for a system which automatically detects, from an image signal, a moving object such as a human or an automobile which intrudes into the monitoring view field, and which issues a report or an alarm under a predetermined condition, instead of relying on a human to detect and trace the object on a monitor screen.
In order to implement such a system, an input image derived from the TV camera and a previously inputted reference background image, that is, an image in which the object to be detected is not picked up, are compared for each pair of corresponding pixels, and an intensity difference is determined for each pixel. Then, a pixel area having a large difference is detected as an object image. This method is called a differential method and has long been widely used.
An application of the differential method is disclosed in U.S. patent application Ser. No. 08/646018 filed on May 7, 1996 based on JP-A-7-230301.
Referring to FIGS. 1A to 1E, a process by the differential method is explained. First, an input image 601 (FIG. 1A) derived from a TV camera and a previously inputted reference background image 602 (FIG. 1B) are compared for each pair of corresponding pixels to determine an intensity difference for each pixel. Assuming that one pixel comprises eight bits, a binarized image 603 is produced by setting the intensity value of a pixel whose intensity difference is smaller than a predetermined threshold to 0 (all of the eight bits are "0"), and the intensity value of a pixel whose intensity difference is not smaller than the threshold to 255 (all of the eight bits are "1"). In FIG. 1A, numeral 500 denotes an area to be monitored, and it is assumed here that the entire screen area is the area to be monitored. Numeral 504 denotes a gate, numeral 506 denotes a fence, numeral 502 denotes an off-limit area, numeral 508 denotes a line indicating a border of the off-limit area and numeral 510 denotes an object to be monitored, for example, a human. Accordingly, in the binarized image 603, the image 510 of the human picked up in the input image 601 is detected as an image 604.
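The binarization step of the differential method described above may be sketched as follows. This is a minimal illustration only, not the implementation of any particular apparatus. It assumes that the images are given as lists of rows of 8-bit intensity values and that the "intensity difference" is taken as an absolute difference; the function name `binarize_difference` and the sample pixel values are hypothetical.

```python
def binarize_difference(input_image, background, threshold):
    """Return a binarized difference image whose pixels are 0 or 255.

    Assumption: the intensity difference is the absolute difference
    between corresponding pixels of the two images.
    """
    binarized = []
    for in_row, bg_row in zip(input_image, background):
        row = []
        for in_px, bg_px in zip(in_row, bg_row):
            diff = abs(in_px - bg_px)
            # 255 when the difference is not smaller than the threshold,
            # 0 when it is smaller than the threshold.
            row.append(255 if diff >= threshold else 0)
        binarized.append(row)
    return binarized


# Hypothetical 3x4 reference background image (no object present).
background = [
    [10, 10, 10, 10],
    [10, 10, 10, 10],
    [10, 10, 10, 10],
]

# Hypothetical input image in which a bright object covers three pixels.
input_image = [
    [10, 12, 10, 10],
    [10, 90, 95, 10],
    [10, 88, 10, 10],
]

result = binarize_difference(input_image, background, threshold=20)
for row in result:
    print(row)
```

The small intensity fluctuation in the first row (12 against 10) stays below the threshold and is therefore suppressed, while the three object pixels are set to 255.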
Automatic tracing of the detected object is conducted by successively detecting the object by the differential method in the sequentially inputted images and determining movement of the object based on the positions of the object at the respective detection times. For example, in the binarized image 605 of FIG. 1D, it is assumed that the object is detected as an image 606 at a time t0-2, as an image 607 at a time t0-1 and as an image 608 at a time t0. The movement of the object is represented by arrows 613 and 614 which connect centers of gravity 609, 610 and 611 of the binarized images 606, 607 and 608 at the respective times (see image 612 of centers of gravity shown in FIG. 1E).
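The movement represented by the arrows connecting successive centers of gravity may be sketched as follows. This is a minimal illustration only, assuming that the center of gravity of the detected area has already been obtained at each detection time; the function name `movement_vectors` and the coordinate values are hypothetical.

```python
def movement_vectors(centers):
    """Return the displacement between each pair of consecutive centers."""
    return [
        (x1 - x0, y1 - y0)
        for (x0, y0), (x1, y1) in zip(centers, centers[1:])
    ]


# Hypothetical centers of gravity at times t0-2, t0-1 and t0.
centers = [(2.0, 5.0), (4.0, 5.5), (7.0, 6.0)]

vectors = movement_vectors(centers)
print(vectors)
```

Each vector corresponds to one arrow, so three detection times yield two displacement vectors.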
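The center of gravity may be determined by the following formula (1).

$$C = \left( \frac{1}{[B]} \sum_{(x, y) \in B} x,\ \frac{1}{[B]} \sum_{(x, y) \in B} y \right), \qquad B = \{ (x, y) \mid f(x, y) = 255 \} \tag{1}$$

where C is the center of gravity, f(x, y) is the binarized image of the difference (255 when the difference is not smaller than the threshold and 0 when it is smaller than the threshold), B is the set of pixels for which f(x, y)=255, and [B] is the number of pixels in B.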
A center of gravity based on a second-order moment may also be considered, and any other method may be used so long as the binarized area can be represented by a single coordinate.
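The center of gravity described above, that is, the mean coordinate of the pixels whose binarized value is 255, may be computed as in the following sketch. This is an illustration only; the function name `center_of_gravity`, the choice of returning `None` when no pixel is detected, and the sample image are assumptions.

```python
def center_of_gravity(binarized):
    """Return the mean (x, y) of all pixels equal to 255.

    Returns None when no pixel is 255 (an assumed convention for the
    case in which no object is detected).
    """
    sum_x = 0
    sum_y = 0
    count = 0  # number of pixels for which f(x, y) = 255
    for y, row in enumerate(binarized):
        for x, value in enumerate(row):
            if value == 255:
                sum_x += x
                sum_y += y
                count += 1
    if count == 0:
        return None
    return (sum_x / count, sum_y / count)


# Hypothetical binarized image with a 2x2 detected area.
binarized = [
    [0, 255, 255],
    [0, 255, 255],
]

center = center_of_gravity(binarized)
print(center)
```

Here x is the column index and y is the row index, both counted from zero at the upper left corner of the image.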
In the method of tracing the object using such a differential method, the view field changes when the pick-up direction of the TV camera or the zoom ratio is changed, so that the previously prepared background image can no longer be used and the object cannot be stably detected. Further, even if the presence of an object is detected, the shape of the detected object is not compared between detection times. Accordingly, it is not guaranteed that the object detected at each time, and hence the object being traced, is the same object. Consequently, when a plurality of objects are present in the view field and the respective objects are detected and traced, stable tracing is not attained.
A known common template (fixed image) matching technique is now explained. For example, in the prior art pattern matching technique typified by a printed board test machine, a template is registered in advance, and (1) a portion of an image of the printed board under test which has a high degree of matching is detected (and the position of the printed board is corrected). Further, (2) the object is evaluated (for example, the presence or absence of a break is detected) in accordance with the degree of matching. This is briefly explained with reference to FIG. 2.
In the example of FIG. 2, a template group 15 having parts "A", "B", "C" and "D" is registered in advance, and pattern matching is conducted on each of images 141, 142 and 143 under evaluation to examine the states of the respective parts. The image 141 under evaluation is determined to be normal. For the image 142 under evaluation, the part "C" is determined to be positionally deviated (it matches at a position deviated from the normal position), and the part "D" is determined to be missing (no match in the image). For the image 143 under evaluation, the part "C" is determined to have a break or a poorly printed package (low degree of matching). However, in this matching method, only parts whose templates have been registered can be evaluated. Moreover, since a fixed template is used, a correct determination cannot be made when the apparent size of the part changes (for example, when the distance between the image pick-up device and the part changes).
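The fixed-template matching described above may be sketched as follows. This is an illustration only, not the method of any particular test machine. It assumes that the degree of matching is measured by the sum of absolute differences (SAD) between the template and each candidate position, with a lower SAD meaning a higher degree of matching; the text above does not specify the measure. The function name `match_template` and the pixel values are hypothetical.

```python
def match_template(image, template):
    """Slide the template over the image and return (x, y, sad) of the
    position with the lowest sum of absolute differences.

    Assumption: SAD is used as the (inverse) degree of matching.
    """
    img_h, img_w = len(image), len(image[0])
    tpl_h, tpl_w = len(template), len(template[0])
    best = None
    for y in range(img_h - tpl_h + 1):
        for x in range(img_w - tpl_w + 1):
            sad = sum(
                abs(image[y + j][x + i] - template[j][i])
                for j in range(tpl_h)
                for i in range(tpl_w)
            )
            # Keep the first position with the smallest SAD found so far.
            if best is None or sad < best[2]:
                best = (x, y, sad)
    return best


# Hypothetical registered template of one part.
template = [
    [9, 8],
    [7, 9],
]

# Hypothetical image under evaluation containing the part at x=2, y=1.
image = [
    [0, 0, 0, 0],
    [0, 0, 9, 8],
    [0, 0, 7, 9],
    [0, 0, 0, 0],
]

best_x, best_y, best_sad = match_template(image, template)
print(best_x, best_y, best_sad)
```

The matched position can be compared with the expected position to judge positional deviation, and the residual SAD can be compared with a tolerance to judge a defect. Because the template has a fixed size, this sketch shares the limitation noted above: a part that appears larger or smaller than the template will not match well at any position.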
Accordingly, if this method were applied to tracing an automobile, templates for all types of automobiles in the world would have to be prepared.
In the long-used differential method, when the view field of the camera or the zoom ratio is changed, the previously prepared background image and the background of the actually inputted image differ, as described above. In this case, when the difference is computed, a difference also occurs in the background area, and the object cannot be detected by the binarization. Namely, when the apparent size of the object changes, correct determination cannot be attained. Even when the view field of the camera is fixed, if a plurality of objects are picked up in the view field, the shapes of the objects should be compared in order to identify the currently traced object among those objects, that is, to determine whether the object currently being traced is the same as the object traced at a different time.
In the differential process, a technique to sequentially update the image of the template to the latest input image has been proposed. Such a technique is disclosed in "Traffic Flow and Congestion Measuring System by Image Processing", T. Kitamura et al., Papers of Second Image Sensing Symposium Lectures, May 1996, pp. 293-296.
However, in such a method of updating the template, the object cannot be precisely traced when the apparent size of the object, that is, the size of the object on the screen, changes due to a change in the distance between the TV camera and the object or a change in the zoom ratio of the TV camera.