The present invention relates to a picture monitoring system that uses an imaging device such as a television camera. More particularly, the invention relates to a method and an apparatus for sensing an object in such a picture monitoring system, which method and apparatus are suited to cases requiring high reliability, such as sensing an object coming into a dangerous zone.
Recently, picture monitoring systems arranged to use a TV camera have come into public use. Among such monitoring systems, there has been a demand for a system that automatically senses an object, such as a person or a car, coming into the monitored field on the basis of a video signal, independently of human eyes, and that issues a predetermined report and alarm. This type of automatic monitoring system is required to sense an object by processing a video signal produced by a TV camera.
As one of the systems for sensing an object by processing such a video signal, a difference method has been known and widely used, wherein the latest of the pictures coming to the system in sequence is compared with the immediately preceding picture, a difference of luminance is obtained for each corresponding pixel, and an area with a large difference is detected as an object area.
The basic flow of a system that senses an object by the difference method will be described with reference to the flowchart of FIG. 8. As shown, at a step 81, a current picture is captured from a TV camera. At the next step 82, a difference of luminance is obtained between a pre-stored background picture and the captured picture. Then, at a step 83, a binarization process is applied to the difference value. Finally, at a step 84, the resulting digital value is compared with a predetermined threshold value, and whether an object exists is sensed based on the compared result.
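The flow of steps 82 to 84 can be illustrated by the following minimal sketch. It is not taken from the flowchart of FIG. 8 itself: the function name `sense_by_difference`, the two threshold values, and the reading of step 84 as a comparison of the number of changed pixels with a threshold are all assumptions made for illustration. Pictures are represented as nested lists of luminance values from 0 to 255.

```python
def sense_by_difference(background, current, diff_threshold=30, area_threshold=4):
    """Illustrative background-difference sensing (names and thresholds are hypothetical)."""
    # Step 82: absolute luminance difference for each corresponding pixel.
    diff = [[abs(c - b) for c, b in zip(cur_row, bg_row)]
            for cur_row, bg_row in zip(current, background)]
    # Step 83: binarize the difference values.
    binary = [[1 if d >= diff_threshold else 0 for d in row] for row in diff]
    # Step 84 (assumed interpretation): compare the changed-pixel count with a threshold.
    changed = sum(sum(row) for row in binary)
    return binary, changed >= area_threshold


# Step 81 stand-in: a flat background and a current picture holding a bright 2x3 "object".
background = [[100] * 6 for _ in range(4)]
current = [row[:] for row in background]
for y in (1, 2):
    for x in (2, 3, 4):
        current[y][x] = 200

binary, detected = sense_by_difference(background, current)
print(detected)
```

In this sketch the six changed pixels exceed the assumed area threshold, so the object is reported; passing the background as the current picture reports nothing.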
For example, as shown in FIG. 9A, assume that the background picture taken by a TV camera shows a railroad crossing. The picture shown on the left-hand side of FIG. 9B is taken when one car 90 is coming into this crossing. The picture shown on the right-hand side of FIG. 9B is derived by binarizing the difference value, obtained for each pixel, between the background picture of FIG. 9A and the picture shown on the left-hand side of FIG. 9B.
In this type of system, partial breaks or gaps 91 appear in an object to be sensed as indicated in the picture shown in the right hand of FIG. 9B. These breaks or gaps may bring about an error in sensing an object.
However, this drawback may be overcome by applying a dilation and erosion operation to the system.
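As an illustration of how a dilation and erosion operation can fill such breaks or gaps, the following sketch applies a dilation followed by an erosion to a binary picture. The 3x3 neighborhood, the function names, and the handling of picture borders (neighbors outside the picture are simply ignored) are assumptions chosen for brevity, not details given in the text.

```python
def _neighborhood(img, y, x):
    """Values in the 3x3 window around (y, x), clipped at the picture border."""
    h, w = len(img), len(img[0])
    return [img[ny][nx]
            for ny in range(max(0, y - 1), min(h, y + 2))
            for nx in range(max(0, x - 1), min(w, x + 2))]


def dilate(img):
    # A pixel becomes 1 if any pixel in its neighborhood is 1.
    return [[1 if any(_neighborhood(img, y, x)) else 0
             for x in range(len(img[0]))] for y in range(len(img))]


def erode(img):
    # A pixel stays 1 only if every pixel in its neighborhood is 1.
    return [[1 if all(_neighborhood(img, y, x)) else 0
             for x in range(len(img[0]))] for y in range(len(img))]


# A 3x5 object area with a one-pixel gap in its middle, inside a 7x9 picture.
solid = [[1 if 2 <= y <= 4 and 2 <= x <= 6 else 0 for x in range(9)]
         for y in range(7)]
gapped = [row[:] for row in solid]
gapped[3][4] = 0

closed = erode(dilate(gapped))
print(closed == solid)
```

Dilation first grows the object area so that the gap is covered, and the subsequent erosion shrinks the area back to its original outline while leaving the gap filled.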
As an applied example of a system for sensing a moving object by processing a difference, an article entitled "Intelligent Image Handling with Image Recognition Technique" written by Ueda et al. has been published in "O plus E" No. 176, pp. 122 to 136.
The system for sensing a moving object described in this article is arranged to sense an object from three temporally consecutive pictures. As shown in FIG. 11, assume first that three temporally consecutive pictures 111, 112 and 113 are input. Luminance differences between the pictures 111 and 112 and between the pictures 112 and 113 are calculated. These two luminance difference values are binarized and then subjected to the dilation and erosion operation, as a result of which pictures 114 and 115 are obtained. Next, these binary images (pictures) 114 and 115 are ANDed to derive the portions common to both, from which a picture 116 of the object is obtained.
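The differencing and AND steps of this three-picture arrangement can be sketched as follows. This is a simplified illustration rather than the method of the cited article: the dilation and erosion operation is omitted for brevity, and the function names, the threshold, and the synthetic pictures are hypothetical.

```python
def binary_difference(a, b, threshold=30):
    """Binarized absolute luminance difference between two pictures."""
    return [[1 if abs(pa - pb) >= threshold else 0 for pa, pb in zip(ra, rb)]
            for ra, rb in zip(a, b)]


def three_frame_object(f1, f2, f3, threshold=30):
    """AND of the two consecutive binary differences (dilation/erosion omitted)."""
    d12 = binary_difference(f1, f2, threshold)   # counterpart of picture 114
    d23 = binary_difference(f2, f3, threshold)   # counterpart of picture 115
    return [[p & q for p, q in zip(r12, r23)]    # counterpart of picture 116
            for r12, r23 in zip(d12, d23)]


def frame_with_object(start_col, width=8, height=3):
    """Synthetic picture: dark background with a bright 1x2 object in the middle row."""
    frame = [[50] * width for _ in range(height)]
    for x in (start_col, start_col + 1):
        frame[1][x] = 200
    return frame


# The object moves two pixels to the right between consecutive pictures.
f1, f2, f3 = frame_with_object(1), frame_with_object(3), frame_with_object(5)
mask = three_frame_object(f1, f2, f3)
print(mask[1])
```

Each binary difference marks both the old and the new position of the object, and the AND keeps only the region common to the two differences, which in this example is the position of the object in the middle picture.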
In the aforementioned disclosed method, however, the following erroneous sensing may take place. That is, an object may be erroneously sensed although no object is actually coming into the view field. For example, as shown in FIG. 9C, if a change of weather or lighting conditions leads to a change in the brightness of the overall subject being imaged, an object 93 (railroad) may be erroneously sensed though no object is coming into the view field. As shown in FIG. 9D, if an object located out of the view field is so luminous that reflected light 94 appears in the area within the imaging field, the reflected light is erroneously sensed as an object 95. Further, as shown in FIG. 9E, if a shadow 96 of an object located out of the field falls on the area within the imaging field, the shadow may be erroneously sensed as an object 97.
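The first of these failure cases, an overall change of brightness, can be reproduced with a few lines. The sketch below assumes a fixed reference background and a hypothetical per-pixel threshold of 30; neither value comes from the text.

```python
def changed_pixel_count(background, current, diff_threshold=30):
    """Number of pixels whose luminance differs from the background by the threshold or more."""
    return sum(1
               for bg_row, cur_row in zip(background, current)
               for b, c in zip(bg_row, cur_row)
               if abs(c - b) >= diff_threshold)


background = [[100] * 6 for _ in range(4)]
# No object enters the view field; the whole scene merely becomes brighter by 40 levels.
brighter = [[p + 40 for p in row] for row in background]

false_hits = changed_pixel_count(background, brighter)
print(false_hits)
```

Every pixel exceeds the threshold although nothing has entered the view field, which is the kind of erroneous sensing described for FIG. 9C.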
Incidentally, this difference calculation needs a reference background picture (for example, FIG. 9A). Hence, the reference background picture is created in advance and pre-stored in a memory. Two systems have been known for creating the reference background picture. One uses, as the reference background picture, a video signal (normally, a video signal of one frame) that is taken when no object is coming into the view field and is selected by the operator. The other averages a predetermined number of frames at each pixel and uses the averaged result as the reference background picture.
In the former system, the instantaneous picture is selected based on human determination. If any noise happens to be mingled into the video signal, the noise is held in the background picture, so that it is difficult to obtain an exact background picture. Further, in view of the noise conditions in the transmission system, the mingling of noise is not so rare. In practical use, therefore, this is a great obstacle.
In the latter system, to obtain an exact background picture, it is necessary to make the number of frames to be averaged as large as 1500 frames, for example. As a result, a large time gap (about 50 seconds at a frame frequency of 30 Hz) takes place between the time when a picture for creating the reference background picture is input and the time when a difference is processed for sensing the object. Hence, if the imaging field gradually becomes darker at dusk, for example, the time gap makes it impossible to create a reference background picture exact enough to allow the object to be sensed.
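The averaging system and the time gap it implies can be sketched as follows. The function names and the small synthetic frames are hypothetical; only the figures of 1500 frames and 30 Hz come from the text.

```python
def average_background(frames):
    """Per-pixel mean over a list of frames (each frame is a nested list of luminance values)."""
    n = len(frames)
    height, width = len(frames[0]), len(frames[0][0])
    return [[sum(f[y][x] for f in frames) / n for x in range(width)]
            for y in range(height)]


def accumulation_time_s(num_frames, frame_rate_hz):
    """Time needed to collect the frames that are averaged, in seconds."""
    return num_frames / frame_rate_hz


# Ten 2x2 frames of a flat scene, one of which carries a noise spike at pixel (0, 1).
frames = [[[100, 100], [100, 100]] for _ in range(10)]
frames[3][0][1] = 200

reference = average_background(frames)
print(reference[0][1])                  # the spike is attenuated by averaging
print(accumulation_time_s(1500, 30))    # 1500 frames at 30 Hz
```

Averaging reduces the influence of a spike in a single frame in proportion to the number of frames, which is why many frames are needed, while the accumulation time grows by the same proportion: 1500 frames / 30 Hz = 50 seconds.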