Video cameras, such as Pan-Tilt-Zoom (PTZ) cameras, are omnipresent nowadays, and are often utilised for surveillance purposes. The cameras typically capture more data (video content) than human viewers can process. Automatic analysis of video content is therefore needed.
One step often used in the processing of video content is the segmentation of video data into foreground objects and a background scene, or background. Such segmentation allows for further analysis of the video content, such as detection of specific foreground objects, or tracking of moving objects. Such further analysis may, for example, result in sending an alert to a security guard, perhaps upon detection of a foreground object or tracking an object entering or leaving a predefined area of interest.
Two aspects of such analysis are of particular interest. The first is the detection of abandoned objects. An abandoned object is an object, such as an item of luggage, that is brought into a scene being monitored over a sequence of video frames and is subsequently left in the scene. The second is the detection of object removal. Object removal relates to detecting that an object, which was previously considered part of the background of a scene being monitored over a sequence of video frames, has been removed from the scene.
A common approach to foreground/background segmentation is background subtraction. For example, the median pixel value for a position in a scene over a sequence of video frames may be compared against the current pixel value at that position in a current frame of the sequence of video frames. If the difference between the current pixel value and the median pixel value does not exceed a threshold, the two values are considered similar and the pixel is considered to belong to the background. Otherwise, if the difference exceeds the threshold, the pixel is considered to belong to a foreground object.
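The median-based comparison described above can be illustrated with a minimal sketch. It assumes greyscale frames stored as nested lists of integers; the function name `segment_foreground` and the threshold value of 25 are hypothetical choices made for illustration and are not taken from any particular system.

```python
from statistics import median

def segment_foreground(frames, current, threshold=25):
    """Return a mask (rows of bools) that is True where a pixel is foreground.

    frames:    history of greyscale frames, each a list of rows of ints
    current:   the frame to classify, same shape as the history frames
    threshold: hypothetical similarity threshold (an assumption)
    """
    mask = []
    for y, row in enumerate(current):
        mask_row = []
        for x, value in enumerate(row):
            # Background estimate: median of this position over the history.
            background = median(f[y][x] for f in frames)
            # Foreground if the difference exceeds the threshold.
            mask_row.append(abs(value - background) > threshold)
        mask.append(mask_row)
    return mask

history = [
    [[10, 10], [10, 10]],
    [[12, 11], [10, 9]],
    [[11, 10], [200, 10]],  # brief outlier at (1, 0); the median ignores it
]
current = [[11, 10], [10, 180]]
mask = segment_foreground(history, current)
print(mask)  # [[False, False], [False, True]]
```

Only the bottom-right pixel is marked as foreground, because it is the only position whose current value differs from its median by more than the threshold.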
Using a background subtraction approach, abandoned object events and object removal events have similar properties. Both abandoned object events and object removal events appear as a region of the scene that is different from a remembered background, but the region is otherwise not changing. It is advantageous to use a common technique to detect when either an abandoned object event or object removal event has occurred, and raise an alert.
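One way to picture the shared property of both event types, a region that differs from the remembered background but is otherwise not changing, is a per-pixel counter of consecutive "different but stable" frames. The following is a sketch under stated assumptions only: the function name, both thresholds, and the `MIN_STATIC_FRAMES` alert delay are hypothetical.

```python
def update_static_counts(counts, previous, current, background,
                         diff_threshold=25, motion_threshold=5):
    """Update, in place, the number of consecutive frames in which each pixel
    both differs from the background and is unchanged since the last frame."""
    for y, row in enumerate(current):
        for x, value in enumerate(row):
            differs = abs(value - background[y][x]) > diff_threshold
            stable = abs(value - previous[y][x]) <= motion_threshold
            counts[y][x] = counts[y][x] + 1 if (differs and stable) else 0
    return counts

background = [[10, 10]]
frames = [[[10, 10]], [[10, 100]], [[10, 100]], [[10, 101]]]
counts = [[0, 0]]
for previous, current in zip(frames, frames[1:]):
    update_static_counts(counts, previous, current, background)

MIN_STATIC_FRAMES = 2  # hypothetical delay before an alert is raised
alert_pixels = [(y, x)
                for y, row in enumerate(counts)
                for x, c in enumerate(row)
                if c >= MIN_STATIC_FRAMES]
print(counts, alert_pixels)  # [[0, 2]] [(0, 1)]
```

The counter fires in the same way whether the static change was caused by an object being left behind or by an object being taken away, so a single detector can serve both events, but it cannot tell them apart.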
A difficulty of using such an approach to detect abandoned objects and object removal events is that, to the camera, abandoned object events and object removal events appear indistinguishable from each other using only background subtraction. However, it is often desirable for a surveillance system to be able to draw the attention of an operator to such events, and to be able to give different alerts based on the type of event that has occurred. In some cases, abandoned object events may be of greater importance than object removal events, such as in a busy airport. In other cases, object removal events may be of greater importance than abandoned object events, such as in an art gallery.
It is possible to differentiate between the two events, to some degree, through a comparison of the boundary pixels of detected regions of change and a background model. For example, a strong boundary on a region of change would more likely indicate an abandoned object event, and a weak boundary would more likely indicate an object removal event. The drawbacks of these pixel-based methods are high memory usage and long computation time.
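A rough sketch of such a boundary test is given below. It is one possible reading of the idea, not a definitive implementation: boundary strength is taken here to be the mean absolute intensity step, in the current frame, between each pixel on the edge of the changed region and its neighbours just outside the region. The function names and the `edge_threshold` value are hypothetical.

```python
def boundary_strength(frame, mask):
    """Mean absolute intensity step across the edge of the masked region."""
    height, width = len(mask), len(mask[0])
    total, steps = 0, 0
    for y in range(height):
        for x in range(width):
            if not mask[y][x]:
                continue
            # Compare each region pixel with its 4-neighbours outside the region.
            for dy, dx in ((-1, 0), (1, 0), (0, -1), (0, 1)):
                ny, nx = y + dy, x + dx
                if 0 <= ny < height and 0 <= nx < width and not mask[ny][nx]:
                    total += abs(frame[y][x] - frame[ny][nx])
                    steps += 1
    return total / steps if steps else 0.0

def classify_event(frame, mask, edge_threshold=30):
    """Strong boundary suggests 'abandoned'; weak boundary suggests 'removed'."""
    strong = boundary_strength(frame, mask) > edge_threshold
    return "abandoned" if strong else "removed"

mask = [[False, False, False],
        [False, True, False],
        [False, False, False]]
frame_with_object = [[10, 10, 10], [10, 200, 10], [10, 10, 10]]
frame_after_removal = [[10, 10, 10], [10, 12, 10], [10, 10, 10]]
print(classify_event(frame_with_object, mask))    # abandoned
print(classify_event(frame_after_removal, mask))  # removed
```

Even in this small form, the test visits every pixel of the region and its neighbours and needs the full-resolution frame in memory, which hints at the memory and computation costs noted above.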
Thus, a need exists to provide an improved method and system for detecting abandoned object events and object removal events in the monitoring of video frames.