Areas such as traffic intersections, public places, and spaces within buildings are frequently monitored by video camera for a variety of reasons. For example, businesses such as banks or stores may monitor particular areas of a building interior for security reasons, both to deter illegal activity and to collect information for law enforcement when violations of the law occur. Another application of video monitoring may be to analyze the behavior of mobile objects in a physical area. For example, traffic cameras may be installed to monitor the movement of vehicles through a particular intersection, or cameras may be installed in a museum to monitor the activity of patrons.
In any of these examples, using existing technologies, the recorded video data would need to be manually reviewed and analyzed for its intended purpose. For example, if it is determined that a theft has occurred in a store, it may be necessary to manually review video footage of a particular area to determine when and how the theft occurred. If the theft could have occurred within a period of days or weeks, such a manual review process could consume tens or hundreds of person-hours.
Similarly, if the purpose of monitoring an intersection is to identify frequent traffic patterns, manual review may be required to extract those patterns from the video data. That manual review may involve analyzing many hours of recorded video footage of the intersection, compiling large numbers of observations about the behavior of individual vehicles, and then analyzing those compiled observations to identify frequent traffic patterns, if any.
Such manual review processes are not only tedious and time-consuming, but they are also subject to human error and inflexibility. For example, if human review of video footage does not result in the desired information being found, either due to oversight or due to incorrect review criteria, the review would likely need to be repeated in full just to catch the missed observation or to apply new criteria.
Accordingly, there is a need for automated techniques for mining and analyzing video data to identify specific events and to detect overall frequent patterns or behaviors in a monitored physical area.