Currently, there is an increasing demand for more sophisticated video surveillance systems. This demand is primarily motivated by organizations looking to use surveillance videos not only to enhance security capabilities, but also to increase situational awareness for improving their business operations. For example, retailers and customer-facing branch network operators utilize insights from videos to optimize their operations and better understand customer behaviors. In another example, operators of airports, train stations, and other mass transit facilities monitor videos to facilitate human traffic flow, detect operational incidents, and use predictive modeling to optimize their operations.
With the resulting rapid increase in the installation of video surveillance systems, existing teams of operators for the surveillance systems are unable to efficiently process and maintain the vast quantity of video data that is being generated, which may leave a substantial amount of video footage unseen. As a result, most video surveillance installations are used only for forensic and evidential purposes after the fact. To maximize the insights obtained from video surveillance footage, human analysts are heavily utilized to monitor videos for purposes such as suspicious behavior detection, object recognition, traffic monitoring, incident detection, face matching, safety alerts, anomaly detection, and crowd counting. This manual video processing is inefficient and error prone.
Therefore, it may be desirable to have a system and method that take into account at least some of the issues discussed above, as well as possibly other issues.