Video surveillance systems are common in commercial, industrial, and residential environments. A frequent surveillance activity is to keep track of people as they move from camera to camera, and in particular to keep track of important people or people exhibiting suspicious behavior. Security personnel need to identify activities of interest and determine interrelationships between activities in different video streams from multiple cameras at fixed locations. From these video streams, security personnel need to develop an understanding of the sequence of actions that led to or happened after a particular incident. For example, a video security system in an office building continuously records activity from multiple cameras. If an explosion occurred in the building, security personnel would be asked to analyze data from the video cameras to determine the cause of the explosion. This would require searching through hours of data recorded by multiple cameras before the time of the explosion. For a video stream showing a person of interest from a main camera, the other cameras in whose view that person may appear are of interest to security personnel. These other cameras tend to be geographically near the main camera.
Further, large security installations can include dozens of security cameras. With the decreasing cost of video hardware, the number of video streams per installation is increasing. The limits of human attention and the number of video streams, however, constrain the cost efficiency and effectiveness of such systems. Further, it is often difficult to track activity between cameras because locations such as hallways in office buildings can look quite similar and do not indicate the spatial proximity of the cameras. Consequently, security personnel have great difficulty tracking activity across video streams. Hereinafter, the term “user” will be used instead of “security personnel” and includes but is not limited to security personnel.
Currently, identifying activity of interest within synchronized video streams from a set of security cameras is difficult due to the quantity of video, as well as the lack of authored metadata or indexing of the video streams. Security video is normally observed and interacted with via a camera bank that shows the available cameras. Current multi-channel video players generally have a bank of small video displays and a large video display. Users select cameras from the camera bank to track activity from one view to another. It is difficult for users to predict in which camera view a tracked person might appear after walking out of the main camera view. For many video players, the images in the camera bank tend to be so small that it is difficult for users to locate and recognize a tracked person in those images.
What is needed is a system for monitoring video streams of a person moving in view of one camera to positions in view of other cameras. A way to select segments of video streams having the most activity is needed, as well as a way to select representative keyframes within these segments, where keyframes are frames or snapshots of a point in time in the video streams. In particular, what is needed is a way to present video streams from a main camera along with video streams from other nearby cameras showing activity to facilitate the tracking of events of interest. A map is also needed to show a spatial view of the cameras, as well as video streams alongside the cameras on the map. In addition, what is needed is a way to present video streams from slightly before and after the time being viewed to aid users in determining where people came from and where they go. Further, what is needed is a way for users to browse video by quickly skipping to a different time in the same video stream or switching to another video stream to keep the activity in view. In addition, animation of the displays and map is needed to keep the user oriented when the user switches to another video stream.