Video surveillance produces a large amount of continuous video data over the course of hours, days, and even months. Such video data includes many long and uneventful portions that are of no significance or interest to a reviewer. In some existing video surveillance systems, motion detection is used to trigger alerts or video recording. However, using motion detection as the only means for selecting video segments for user review may still produce too many video segments that are of no interest to the reviewer. For example, some detected motions are generated by normal activities that routinely occur at the monitored location, and it is tedious and time-consuming for a reviewer to manually scan through all of the normal activities recorded on video to identify the small number of activities that warrant special attention. In addition, when the sensitivity of the motion detection is set too high for the location being monitored, trivial movements (e.g., movements of tree leaves, shifting of the sunlight, etc.) can account for a large amount of the video that is recorded and/or reviewed. On the other hand, when the sensitivity of the motion detection is set too low for the location being monitored, the surveillance system may fail to record and present video data for some important and useful events.
It is a challenge to accurately identify and categorize meaningful segments of a video stream, and to convey this information to a user in an efficient, intuitive, and convenient manner. There is a great need for human-friendly techniques for discovering and categorizing events of interest, and for notifying users of such events.