Video cameras, such as Pan-Tilt-Zoom (PTZ) cameras, are omnipresent nowadays, mostly for surveillance purposes. The cameras capture more data (video content) than human viewers can typically process. Automatic analysis of video content is therefore often needed.
An essential step in the automatic processing of video content is the segmentation of video data into foreground objects and a background scene, or background. Such segmentation allows for further analysis, such as detection of specific foreground objects, or tracking of moving objects. Such further analysis may, for example, result in sending an alert to a security guard.
Two further goals of such analysis are of particular interest. The first is the detection of abandoned objects, that is, objects such as items of luggage that have been brought into the scene being monitored and left there. The second is the detection of object removal, that is, noticing that an object previously considered part of the background has been taken out of the scene.
A common approach to foreground/background segmentation is background subtraction. For example, the median pixel value for a position in a scene may be compared against the current pixel value at that position. If the current pixel value is similar to the median pixel value, the pixel is considered to belong to the background; otherwise, the pixel is considered to belong to a foreground object.
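The median comparison described above can be illustrated with a minimal sketch. It assumes greyscale frames stored as nested lists and a fixed similarity threshold; the function names `median_background` and `foreground_mask` and the threshold value of 25 are hypothetical choices made for illustration, not part of any particular system.

```python
from statistics import median


def median_background(frames):
    """Per-pixel median over a list of equally sized greyscale frames."""
    height, width = len(frames[0]), len(frames[0][0])
    return [
        [median(frame[y][x] for frame in frames) for x in range(width)]
        for y in range(height)
    ]


def foreground_mask(background, current, threshold=25):
    """True where the current pixel differs from the background median
    by more than the (assumed) threshold, False otherwise."""
    return [
        [abs(cur - bg) > threshold for bg, cur in zip(bg_row, cur_row)]
        for bg_row, cur_row in zip(background, current)
    ]


# Three 2x2 history frames of a static scene with slight sensor noise.
history = [
    [[10, 12], [200, 198]],
    [[11, 12], [201, 199]],
    [[10, 13], [199, 198]],
]
background = median_background(history)

# In the current frame, a brighter object covers the top-right pixel.
current = [[10, 120], [200, 198]]
mask = foreground_mask(background, current)
print(mask)  # [[False, True], [False, False]]
```

In this toy example only the top-right pixel departs from its median and is classified as foreground. A practical system would typically work on colour or transformed data and update the background over time, which this sketch does not attempt.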
Using a background subtraction approach, abandoned object events and object removal events have similar properties. Both appear as a region where the scene differs from the “remembered background”, but is otherwise not changing. This is the case whether the “remembered background” is represented by a reference frame created when the system is initialised, by a collection of stored mode models corresponding to visual elements in the scene, or by some other means. It is often advantageous to use a common technique to detect when either of these events has occurred, and possibly to raise an alert.
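The shared property of the two events, a region that differs from the remembered background but is otherwise unchanging, can be sketched as a per-pixel counter. This is an illustrative assumption rather than a description of any specific technique: the function name `update_static_counts`, both thresholds, and the three-frame alert duration are hypothetical. Because only the absolute difference from the background is used, the same test fires whether an object has been added to the scene or taken out of it.

```python
def update_static_counts(counts, background, previous, current,
                         diff_threshold=25, motion_threshold=10):
    """Increment a pixel's count while it differs from the remembered
    background yet matches the previous frame; reset it otherwise."""
    new_counts = []
    for y, row in enumerate(current):
        new_row = []
        for x, value in enumerate(row):
            differs = abs(value - background[y][x]) > diff_threshold
            steady = abs(value - previous[y][x]) <= motion_threshold
            new_row.append(counts[y][x] + 1 if differs and steady else 0)
        new_counts.append(new_row)
    return new_counts


# One-row scene: the remembered background is uniformly dark.
background = [[10, 10, 10]]
frames = [
    [[10, 90, 10]],  # an object appears at the middle pixel
    [[10, 90, 10]],  # and stays there
    [[10, 91, 50]],  # the right-hand pixel changes briefly
    [[10, 90, 10]],
]

counts = [[0, 0, 0]]
previous = background
for frame in frames:
    counts = update_static_counts(counts, background, previous, frame)
    previous = frame

# Assumed rule: flag a pixel once it has been static for 3 frames.
alerts = [[count >= 3 for count in row] for row in counts]
print(alerts)  # [[False, True, False]]
```

The counter resets whenever a pixel changes between frames or returns to its background value, so the brief change at the right-hand pixel never accumulates. The same reset is what makes a naive counter of this kind fragile when a stationary object is temporarily occluded.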
There are several challenges for such approaches. One is to make the approach robust to temporary occlusion of the object, for example when someone walks in front of an abandoned item of luggage. Another is to distinguish truly stationary objects from objects that are moving slowly or intermittently. Many techniques are able to handle one of these cases but not the other: the more sensitive a system is to changes in a slowly moving object, the more likely it is to lose track of a stationary object due to occlusion or noise, and vice versa.