The current state of the art in security systems indicates a major need for more intelligent systems. The technology available in the marketplace today does not cope well with the problem of maintaining sensitivity to real intruders while avoiding false alarms. Infrared-based motion detection alarms are easily triggered by changes in lighting conditions which cause temperature changes in the room. Ultrasonic motion detectors are set off by things like air conditioners and ringing telephones. Infrared, ultrasonic and microwave systems are all triggered by events such as curtains moving in a draft and leaves falling off plants, to say nothing of pets and small children moving through the scene. There is some work being done to put multiple sensors together to complement each other, but the combination technique is very naive (i.e. combining microwave and infrared sensors with an AND gate so as to require a positive response from both before the alarm is sounded).
The idea of a security system that uses changes in a video signal is not new. A number of patents use this idea, but they are less effective than the current invention for a variety of reasons.
There has in fact been an obvious progression in the sophistication of video motion detectors designed for security systems. In the older patents, the current video image is compared with the last video image, and if any pixels have changed intensity by more than a specified threshold, the alarm is sounded. These systems have the obvious advantage of simplicity, but are severely lacking in their ability to avoid false alarms. Lighting changes and small movements due to moving drapes or swaying trees all set off this type of video security system. In addition, these systems have no way to distinguish between significant and non-significant movement of animate objects through the scene. They cannot distinguish between the movement of pets or small children in the scene and the movement of real grownup intruders. They also have no way of distinguishing between movement of people in sensitive areas of the scene and movement in legitimately traveled areas. For instance, in an art gallery these systems would be unable to distinguish between a guard or patron walking harmlessly through a gallery and an intruder walking directly up to a valuable exhibit.
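The frame comparison used by these older systems can be illustrated with a minimal sketch. It assumes grayscale images stored as nested lists of integer intensities; the function names and the default threshold of 20 are hypothetical and are not taken from any of the patents discussed.

```python
def changed_pixels(previous, current, threshold):
    """Count pixels whose intensity changed by more than `threshold`."""
    count = 0
    for prev_row, cur_row in zip(previous, current):
        for p, c in zip(prev_row, cur_row):
            if abs(c - p) > threshold:
                count += 1
    return count


def frame_difference_alarm(previous, current, threshold=20):
    """Sound the alarm if any single pixel changed by more than `threshold`."""
    return changed_pixels(previous, current, threshold) > 0


# A 3x3 scene, then the same scene with every pixel brightened by 30
# (for example, a cloud moving away from the sun). Nothing in the scene moved.
scene = [[100, 100, 100], [100, 100, 100], [100, 100, 100]]
brighter = [[v + 30 for v in row] for row in scene]

print(frame_difference_alarm(scene, scene))
print(frame_difference_alarm(scene, brighter))
```

In this sketch the unchanged scene raises no alarm, while the uniformly brightened scene does, even though no object has moved. This is the false alarm behavior under lighting changes described above.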
The next patent in the evolutionary line of video motion alarms sacrificed some of the simplicity of the original systems in an attempt to deal with the problem of distinguishing between movement in sensitive vs. insensitive areas of the scene. Specifically, U.S. Pat. No. 4,458,266 requires the user to specify one or more rectangles in the image, called windows, which are designated as sensitive areas. If a pixel within one of these windows changes by more than a specified threshold, the alarm is sounded. There are a number of problems with this patent from a practical standpoint. First, it requires a rather sophisticated user interface and sophisticated user interaction to configure the system. It requires a way of displaying for the user an image of the scene, and providing a means (which is unspecified in the patent) for allowing the user to indicate the regions in the image which should be considered sensitive areas. Another shortcoming is that this invention is still unable to ignore overall lighting variations in the scene, since a change in the illumination across the whole scene will cause lighting changes within the sensitive windows, which will result in sounding the alarm. The same is true for small and insignificant movements within the sensitive area; a significant change in even a single pixel within a sensitive window will set off the alarm. An even more damaging shortcoming of this patent is that it does not even solve the real problem it tries to address. Specifically, it is not movement within windows of the image which is important, it is movement within areas of the scene. Suppose, for instance, that a sensitivity window were defined as a box around the image location of the valuable exhibit in the art gallery example discussed above. If a patron walked between the camera and the exhibit, the upper part of his body would pass through the sensitivity window and set off the alarm, despite the fact that he is nowhere near the exhibit itself.
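The window test described above can be sketched as follows. This is an illustration only, not the implementation of U.S. Pat. No. 4,458,266: the window representation (top, left, bottom, right in pixel coordinates, with bottom and right exclusive), the function name, and the threshold are all assumptions made for the example.

```python
def window_alarm(previous, current, windows, threshold=20):
    """Return True if any pixel inside any sensitive window changed
    by more than `threshold`.

    `windows` is a list of (top, left, bottom, right) rectangles in
    image coordinates; bottom and right are exclusive.
    """
    for top, left, bottom, right in windows:
        for r in range(top, bottom):
            for c in range(left, right):
                if abs(current[r][c] - previous[r][c]) > threshold:
                    return True
    return False


background = [[50] * 4 for _ in range(4)]
sensitive = [(1, 1, 3, 3)]  # a 2x2 box in the middle of the image

# One pixel changes outside the window.
outside = [row[:] for row in background]
outside[0][0] = 200

# One pixel changes inside the window.
inside = [row[:] for row in background]
inside[1][2] = 200

print(window_alarm(background, outside, sensitive))
print(window_alarm(background, inside, sensitive))
```

The test operates purely on image coordinates. A changed pixel inside the rectangle triggers the alarm regardless of how far the object causing it is from the protected area in the actual scene, which is the shortcoming noted in the art gallery example.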
A more recent patent is U.S. Pat. No. 4,679,077. It sacrifices a great deal more in the area of simplicity and tries to use more complex AI techniques such as object recognition and template matching to distinguish between significant and insignificant image changes. There are three stages to the processing in this system.
The first stage is designed to provide insensitivity to lighting changes by comparing the current image with a large number of reference images for different times of day and illumination levels. If the current image is significantly different from each of the reference images, it sounds the alarm. Obviously, taking care of a wide variety of situations would require a large set of reference images and a time-consuming image comparison process.
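The first-stage comparison can be sketched as below. The difference measure used by U.S. Pat. No. 4,679,077 is not described here, so the mean absolute intensity difference, the threshold value, and the function names are assumptions chosen only to make the logic concrete.

```python
def mean_abs_diff(a, b):
    """Mean absolute intensity difference between two equally sized images."""
    total = 0
    n = 0
    for row_a, row_b in zip(a, b):
        for x, y in zip(row_a, row_b):
            total += abs(x - y)
            n += 1
    return total / n


def differs_from_all_references(current, references, threshold=10.0):
    """Alarm condition: the current image is far from every reference."""
    return all(mean_abs_diff(current, ref) > threshold for ref in references)


# Two hypothetical reference images of an empty room: daytime and night.
day = [[200, 200], [200, 200]]
night = [[40, 40], [40, 40]]
references = [day, night]

# Close to the night reference, so no alarm is expected.
late_evening = [[45, 42], [38, 41]]
# Unlike either reference.
unexpected = [[120, 120], [120, 120]]

print(differs_from_all_references(late_evening, references))
print(differs_from_all_references(unexpected, references))
```

Each call performs one full image comparison per reference image, so the running cost grows linearly with the number of lighting conditions that must be covered.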
The next stage requires sophisticated user interaction. Specifically, in the second stage, for each time of day and illumination level, the user must use a light pen to trace over lines in the image which should be considered significant (like doorways, etc.). These line drawings are converted to a symbolic "list of lines" format and stored as references. When the system is running, it detects edges in the scene using a Roberts operator or similar technique. It then converts the edge image into the symbolic "list of lines" format, and sequentially compares the current list with each of the relevant reference lists. If the current list of lines in the image differs from each of the reference lists (i.e. if important lines are missing or obscured), then an alarm is sounded. This system will be both computationally expensive and prone to errors because of mistakes in edge detection and in converting the edge data to a symbolic format. Furthermore, if someone leaves an object like a box within the camera's field of view, a user will have to reconfigure the system because the significant permanent edges in the image will have changed.
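For reference, the edge detection step can be illustrated with the standard Roberts cross operator, which takes differences along the two diagonals of each 2x2 pixel neighborhood. This is a sketch of the textbook operator, not the patent's own implementation; the sum of absolute values is used as a common approximation of the gradient magnitude, and the conversion of edges into a "list of lines" is not shown.

```python
def roberts_cross(image):
    """Apply the Roberts cross operator to a grayscale image.

    Returns an edge-strength image one row and one column smaller than
    the input, using |gx| + |gy| as the gradient magnitude estimate.
    """
    rows, cols = len(image), len(image[0])
    edges = []
    for r in range(rows - 1):
        edge_row = []
        for c in range(cols - 1):
            gx = image[r][c] - image[r + 1][c + 1]      # main diagonal
            gy = image[r][c + 1] - image[r + 1][c]      # anti-diagonal
            edge_row.append(abs(gx) + abs(gy))
        edges.append(edge_row)
    return edges


# A vertical step edge between the second and third columns.
step = [
    [0, 0, 100, 100],
    [0, 0, 100, 100],
    [0, 0, 100, 100],
]

for row in roberts_cross(step):
    print(row)
```

The operator responds only at the column where the intensity step occurs. Because it works on 2x2 neighborhoods, it is sensitive to pixel noise, which is one source of the edge detection mistakes mentioned above.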
Finally, stage 3 of the invention requires even more complex user interaction and sophisticated processing. The user is required to use the light pen to draw objects (such as people) which should be considered significant, and to indicate the image locations in which their occurrence should result in an alarm. These line drawings are converted to the symbolic format and stored as references. When running, the system appears to attempt object recognition by matching the reference objects with the lines extracted from the current scene. If the object is found in the current scene, and is in a location specified as a sensitive area, the alarm is sounded. Again, the difficulty of the object recognition task being attempted by this invention will severely degrade its performance.
In short, the complexity of the user interaction and processing required by this and other recent video security systems seriously hinders their applicability. The computational requirements alone demand very expensive hardware, and even with the required hardware and a well-trained user to configure it, the system will not perform robustly.