1. Field of the Invention
The present invention generally relates to surveillance systems and, more particularly, relates to a method and apparatus for providing immersive surveillance.
2. Description of the Related Art
The objective of a surveillance or visualization display is typically to allow a user to monitor or observe a scene with full awareness of the situation within the scene. Typical surveillance or visualization systems present video to a user from more than one camera on a single display. Such a display allows the user to observe different parts of the scene, or to observe the same part of the scene from different viewpoints. A typical surveillance display, for example, has 16 videos of a scene shown in a 4 by 4 grid on a monitor. Each video is usually labeled by a fixed textual annotation displayed under the video segment to identify the image. For example, the text “lobby” or “front entrance” may be shown. If an event deserving attention is observed in one particular video, then the label can be used to locate the activity in the scene.
This approach for surveillance and visualization has been used widely for many years. However, there are some fundamental problems with this approach.
First, if an event deserving attention occurs in one particular video, then the user does not know how the activity relates to other locations in the scene without referring to or remembering a map or 3D model of the scene. For example, if activity is observed near “elevator 1” and the user knows that a guard is currently at “stairwell 5”, then without a map or 3D model, the user will not know whether the guard is close enough to the activity to intervene. The process of referring to a map or 3D model, either on paper or on a computer, is typically time-consuming and error-prone since a human must associate the camera view with the map or model. The process of remembering a map or 3D model is also error-prone, and is typically impossible when large numbers of cameras are used or when the site is large.
Second, if an event deserving attention occurs in a video and the activity then moves out of the field of view of the video in one particular direction, there are only two ways that the user can predict the new location of the activity. First, the user can remember the orientation (pointing direction) of the camera with respect to a fixed coordinate system (for example, the compass directions). Second, the user can recognize landmarks in the video and can use the landmarks to determine the approximate orientation of the camera with respect to a fixed coordinate system by remembering or referring to a map or 3D model of the scene. These two methods of predicting the new location of activity are error-prone, since the views from cameras are typically shown with respect to many different arbitrary coordinate systems that vary widely from camera to camera depending on how each camera was mounted during installation. As more cameras are added to the system, it becomes more difficult for the user to remember their orientations or to recognize the landmarks in the scene. In addition, some parts of the scene may contain no distinguishing landmarks at all.
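The mental step described above, relating the edge of a camera image to a fixed coordinate system, can be illustrated with a minimal sketch. The function name, the convention of measuring headings in degrees clockwise from north, and the assumption of a level camera with a known horizontal field of view are all illustrative assumptions and do not appear in the description above.

```python
def exit_heading(camera_heading_deg, horizontal_fov_deg, exit_side):
    """Approximate compass heading at which activity leaves the image.

    Assumptions (hypothetical, for illustration only):
    - camera_heading_deg is the pointing direction of the camera,
      in degrees clockwise from north;
    - the camera is level, so the left and right image edges lie
      half the horizontal field of view to either side of that heading;
    - exit_side is "left" or "right".
    """
    half_fov = horizontal_fov_deg / 2.0
    offset = {"left": -half_fov, "right": half_fov}[exit_side]
    # Wrap the result into the range [0, 360).
    return (camera_heading_deg + offset) % 360.0


# Two cameras showing the "same" exit side imply very different directions.
print(exit_heading(350.0, 60.0, "right"))  # 20.0
print(exit_heading(10.0, 60.0, "left"))    # 340.0
```

Even under these simplifying assumptions, the user would need to recall a separate heading for every camera, which is the burden the paragraph above describes.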
Third, as more videos are displayed on a screen, the resolution of each video has to be reduced in order to fit all of them into the display. This makes it difficult to observe the details of any event deserving attention in the video. The current solution to this problem is either to have an additional display that shows one selected video at high resolution, or to switch a single display between a view showing multiple reduced-resolution videos and a view showing a single video at high resolution. However, the problem with this approach is that the user will miss any activity that may be occurring in other videos while focusing on the single high-resolution video.
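The loss of resolution described above follows from simple arithmetic, sketched below. The near-square grid layout, the function name, and the 1920 by 1080 display size are assumptions chosen for illustration; only the 4 by 4 grid of 16 videos comes from the description above.

```python
import math


def tile_resolution(display_w, display_h, num_videos):
    """Return (cols, rows, tile_w, tile_h) for a near-square grid.

    Hypothetical helper: assumes every video receives an equal-sized
    tile and that the grid is as close to square as possible.
    """
    cols = math.ceil(math.sqrt(num_videos))
    rows = math.ceil(num_videos / cols)
    return cols, rows, display_w // cols, display_h // rows


# Assumed 1920x1080 display; pixels available per video shrink quickly.
for n in (1, 16, 64):
    print(n, tile_resolution(1920, 1080, n))
# 1 (1, 1, 1920, 1080)
# 16 (4, 4, 480, 270)
# 64 (8, 8, 240, 135)
```

Under these assumptions, each video in a 4 by 4 grid is shown with one sixteenth of the pixels available to a single full-screen video.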
Therefore, there is a need in the art for a method and apparatus for providing an immersive surveillance system that provides a user with a three-dimensional contextual view of a scene.