1. Field of the Invention
The present invention generally relates to video surveillance, and more specifically to improved systems and methods for searching for changes in an area-of-interest (AOI).
2. Brief Description of the Prior Art
The current heightened sense of security and declining cost of camera equipment have resulted in increased use of closed circuit television (CCTV) surveillance systems. Such systems have the potential to reduce crime, prevent accidents, and generally increase security in a wide variety of environments.
A simple closed-circuit television system uses a single camera connected to a display device. More complex systems can have multiple cameras and/or multiple displays. One known type of system is the security display in a retail store, which switches periodically between different cameras to provide different views of the store. Higher security installations, such as prisons and military installations, use a bank of video displays each displaying the output of an associated camera. A guard or human attendant constantly watches the various screens looking for suspicious activity.
More recently, inexpensive digital cameras have become popular for security and other applications. In addition, it is now possible to use a web cam to monitor a remote location. Web cams typically have relatively slow frame rates, but are sufficient for some security applications. Inexpensive cameras that transmit signals wirelessly to remotely located computers or other displays are also used to provide video surveillance.
As the number of cameras increases, the amount of raw information that needs to be processed and analyzed also increases. Computer technology can be used to alleviate this raw data processing task, resulting in a new breed of information technology device—the computer-aided surveillance (CAS) system. Computer-aided surveillance technology has been developed for various applications. For example, the military has used computer-aided image processing to provide automated targeting and other assistance to fighter pilots and other personnel. In addition, computer-aided surveillance has been applied to monitor activity in swimming pools. CAS systems may be used to monitor a particular AOI if, for instance, the AOI includes a particularly valuable object.
CAS systems typically operate on individual video frames. In general, a video frame depicts an image of a scene in which people and things move and interact. Each video frame is composed of a plurality of pixels which are often arranged in a grid-like fashion. The number of pixels in a video frame depends on several factors, including the resolution of the camera, the resolution of the display, and the capacity of the storage device on which the video frames are stored. Analysis of a video frame can be conducted either at the pixel level or at the (pixel) group level depending on the processing capability and the desired level of precision. A pixel or group of pixels being analyzed is referred to herein as an “image region”.
Image regions can be categorized as depicting part of the background of the scene or depicting a foreground object. In general, the background remains relatively static in each video frame. However, objects may be depicted in different image regions in different frames. Several methods for separating objects in a video frame from the background of the frame, referred to as object extraction, are known in the art. A common approach is to use a technique called “background subtraction.” Of course, other techniques can be used as well.
Current surveillance systems provide rudimentary techniques for performing area change searches. Such a system may allow a user to specify a specific AOI within the video frame in which to search for a change. The system then searches through each video frame and measures the number of changed pixels within the AOI. If the proportion of changed pixels within the AOI in a particular frame exceeds a specified percentage, then that frame is returned as a positive result in the search. This approach may be referred to as frame-by-frame differencing.
Frame-by-frame differencing, however, has a number of drawbacks. In particular, it may return too many false positive results. These false positive results could be due to obstructions moving in front of the AOI. For example, if a user is interested in searching for the moment when a laptop that was sitting on a desk was stolen, then using this search technique will return all instances when a person walks in front of the desk and occludes the laptop from view (assuming of course that the number of pixels that changed due to the person walking in front of the desk exceeds the specified percentage). In most cases, the person subsequently moves away from the desk and reveals the un-stolen laptop, at which point the search has returned a false positive.
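By way of illustration only, the frame-by-frame differencing search and the occlusion false positive described above may be sketched as follows. The sketch assumes grayscale frames stored as nested lists, an AOI given as (top, left, bottom, right) bounds, and comparison of each frame against the immediately preceding frame. The function names, the per-pixel threshold, and the 50% search threshold are hypothetical choices made for this example and are not taken from any particular system.

```python
def changed_percentage(prev_frame, curr_frame, aoi, pixel_threshold=25):
    """Percentage of AOI pixels whose intensity changed by more than
    pixel_threshold between two frames (threshold is an assumed value)."""
    top, left, bottom, right = aoi  # bottom and right are exclusive
    total = (bottom - top) * (right - left)
    changed = sum(
        1
        for r in range(top, bottom)
        for c in range(left, right)
        if abs(curr_frame[r][c] - prev_frame[r][c]) > pixel_threshold
    )
    return 100.0 * changed / total


def search_aoi_changes(frames, aoi, pixel_threshold=25, percent_threshold=50.0):
    """Return indices of frames whose AOI differs from the previous frame
    by more than percent_threshold percent of its pixels."""
    hits = []
    for i in range(1, len(frames)):
        pct = changed_percentage(frames[i - 1], frames[i], aoi, pixel_threshold)
        if pct > percent_threshold:
            hits.append(i)
    return hits


def make_frame(aoi_value, size=4, fill=10):
    """Build a small test frame whose central 2x2 block holds aoi_value."""
    frame = [[fill] * size for _ in range(size)]
    for r in range(1, 3):
        for c in range(1, 3):
            frame[r][c] = aoi_value
    return frame


# Laptop (intensity 120) is visible, is occluded by a person (intensity 30)
# in frame 2, and is visible again in frame 3. It is never removed.
frames = [make_frame(120), make_frame(120), make_frame(30), make_frame(120)]
hits = search_aoi_changes(frames, aoi=(1, 1, 3, 3))
print(hits)  # [2, 3]
```

In this example both returned frames are false positives: frame 2 is reported when the person occludes the laptop, and frame 3 is reported when the person moves away, although the laptop is present throughout.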
Another approach is to utilize background subtraction to perform the analysis. In a typical background subtraction algorithm, foreground pixels are separated from background pixels by subtracting a video frame from a “background image.” This background image is periodically updated with new data in order to track slow changes to the background (e.g., lighting changes). Typically the background update is performed by averaging newly classified background pixels with the existing background image. Foreground pixels are not averaged with the background to prevent “pollution” of the background image. In this way, the background image adapts to slow or small color changes, and all fast or large color changes are considered foreground. As it is, however, this simple background subtraction algorithm offers little advantage over the frame-by-frame differencing technique described above. That is, it may still provide false positives for searches related to the AOI. This is due to the way in which a search would proceed in a system utilizing this technique. In particular, in searches performed on systems utilizing simple background subtraction, the search for changes in the AOI would return all instances where a pixel changes and that change is not a small or slow change (i.e., the pixel would be classified as a foreground pixel). Such a search may therefore return all instances when, for example, a person walks in front of the AOI, although not all of these occurrences may be of interest.
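By way of illustration only, the simple background subtraction algorithm described above may be sketched as follows. The sketch assumes grayscale frames stored as nested lists and uses a running weighted average as the averaging step. The function name, the difference threshold, and the blending weight alpha are hypothetical values chosen for this example.

```python
def subtract_and_update(background, frame, diff_threshold=30, alpha=0.1):
    """Classify each pixel as foreground or background and update the
    background image in place. Returns a mask of booleans (True = foreground).

    diff_threshold and alpha are assumed values for illustration.
    """
    mask = []
    for r, row in enumerate(frame):
        mask_row = []
        for c, value in enumerate(row):
            # A fast or large change relative to the background is foreground.
            is_foreground = abs(value - background[r][c]) > diff_threshold
            mask_row.append(is_foreground)
            if not is_foreground:
                # Only pixels classified as background are averaged in, so
                # foreground objects do not pollute the background image.
                background[r][c] = (1 - alpha) * background[r][c] + alpha * value
        mask.append(mask_row)
    return mask


background = [[100.0, 100.0]]
frame = [[110, 200]]  # small lighting change on the left, occluder on the right
mask = subtract_and_update(background, frame)
print(mask)  # [[False, True]]
```

Here the left pixel is absorbed into the background image, while the right pixel is classified as foreground and leaves the background unchanged. A search that returns every frame containing foreground pixels in the AOI would therefore still report a person walking in front of the AOI.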