Defined spaces may include multiple video cameras and provide multiple video feeds from different locations and points of view.
Some modern video analysis techniques implement computer vision technology that enables a machine, rather than a human, to automatically detect objects in video data. In these implementations, the video analysis technique may include a specific detector for identifying a category of object (e.g., instance-level detection) within video data. In more advanced implementations, a general model may be implemented for a single computer vision task, such as object detection, pose estimation, or scene segmentation, to accomplish that discrete task. While such implementations may enable automated detections within video data, these discrete detection and analysis methods, taken individually, fail to provide comprehensible and actionable detections.
Thus, there is a need in the computer vision and security fields to create a new and useful image data analysis and event detection system for intelligently detecting events of interest and providing a comprehensive interpretation of the detected events. The embodiments of the present application provide such new and useful systems and methods.