Surveillance System
A surveillance system acquires surveillance signals from an environment in which the system operates. The surveillance signals can include images, video, audio and other sensor data. The surveillance signals are used to detect and identify events and objects, e.g., people, in the environment.
As shown in FIG. 1, a typical prior art surveillance system 10 includes a distributed network of sensors 11 connected to a centralized control unit 12 via a network 13. The sensor network 11 can include passive and active sensors, such as motion sensors, door sensors, heat sensors, fixed cameras and pan-tilt-zoom (PTZ) cameras. The control unit 12 includes display devices, e.g., TV monitors, bulk storage devices such as VCRs, and control hardware. The control unit can process, display and store sensor data acquired by the sensor network 11. The control unit can also be involved in the operation of the active sensors of the sensor network. The network 13 can use an internet protocol (IP).
It is desired to measure the performance of a surveillance system, particularly where the control of the sensors is automated.
Scheduling
The scheduling of active sensors, such as the PTZ cameras, impacts the performance of surveillance systems. A number of scheduling policies are known. However, different scheduling policies can perform differently with respect to the performance goals and structure of the surveillance system. Thus, it is important to be able to measure the performance of surveillance systems quantitatively with different scheduling policies.
Surveillance System Performance
Typically, automated surveillance systems have been evaluated only with respect to their component processes, such as image-based object tracking. For example, one can evaluate the performance of moving-object tracking under varying conditions, including indoor/outdoor settings, varying weather conditions and varying cameras/viewpoints. Standard data sets are available to evaluate and compare the performance of tracking processes. Image analysis procedures, such as object classification and behavior analysis, have also been tested and evaluated. However, because not all surveillance systems use these functions and because there is no standard performance measure, that approach has limited utility.
Scheduling policies have been evaluated for routing a packet in a computer or communications network or scheduling a job in multitasking computers. Each packet has a deadline and each class of packets has an associated weight, and the goal is to minimize the weighted loss due to dropped packets (a packet is dropped if it is not served by the router before its deadline). However, in those applications, the serving time usually depends only upon the server, whereas in the surveillance case it depends upon the object itself. In the context of a video surveillance system, “packets” correspond to objects, e.g., people, which have different serving times based on their location, motion, and distance to the cameras. A “dropped packet” in a PTZ-based video surveillance system corresponds to an object departing a site before being observed at a high resolution by a PTZ camera. As a result, each object may have an estimated deadline corresponding to the time it is expected to depart the site. Thus, computer-oriented or network-oriented scheduling evaluation cannot directly be applied to the surveillance problem.
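The packet analogy above can be illustrated with a minimal sketch. It assumes a single PTZ camera, an earliest-deadline-first policy, and hypothetical names (Target, weighted_loss_edf) and numbers that do not come from any cited system. Each object carries its own serving time, which is the property that distinguishes the surveillance case from router scheduling.

```python
from dataclasses import dataclass


@dataclass
class Target:
    """Hypothetical description of one object ("packet") in the scene."""
    name: str
    weight: float        # importance of the object's class
    deadline: float      # estimated time the object departs the site (s)
    service_time: float  # time the camera needs to observe THIS object (s)


def weighted_loss_edf(targets, start_time=0.0):
    """Serve targets with one camera, earliest deadline first.

    A target is "dropped" if its observation cannot finish before its
    deadline. Returns the summed weight of dropped targets and their names.
    """
    clock = start_time
    loss = 0.0
    dropped = []
    for t in sorted(targets, key=lambda t: t.deadline):
        finish = clock + t.service_time
        if finish <= t.deadline:
            clock = finish            # observed in time; camera is busy until finish
        else:
            loss += t.weight          # departs unobserved; camera time is not spent
            dropped.append(t.name)
    return loss, dropped


# Illustrative values only.
targets = [
    Target("A", weight=1.0, deadline=4.0, service_time=3.0),
    Target("B", weight=2.0, deadline=5.0, service_time=3.0),
    Target("C", weight=1.0, deadline=10.0, service_time=2.0),
]
loss, dropped = weighted_loss_edf(targets)
print(loss, dropped)
```

In this example the camera observes A and C, while B departs before it can be observed, so the weighted loss equals the weight of B. Other policies, e.g., ordering by weight, can be compared by changing the sort key.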
Surveillance scheduling policy can also be formulated as a kinetic traveling salesperson problem. A solution can be approximated by iteratively solving time-dependent orienteering problems. However, that would require the assumption that the paths of surveillance targets are known, or predictable with constant velocity and linear paths, which is unrealistic in practical applications. Moreover, it would require the assumption that the motion of a person being observed by a PTZ camera is negligible, which is not true if the observation time, or “attention interval,” is long enough.
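To make the constant-velocity, linear-path assumption concrete, the following sketch estimates when a target would depart a site under that assumption. The rectangular site, the function name time_to_exit, and the numeric values are hypothetical and chosen only for illustration; real targets generally violate the assumption, as noted above.

```python
import math


def time_to_exit(pos, vel, width, height):
    """Time until a point moving at constant velocity leaves the
    axis-aligned site [0, width] x [0, height].

    pos and vel are (x, y) tuples; pos is assumed to be inside the site.
    Returns math.inf for a stationary target.
    """
    times = []
    for p, v, limit in ((pos[0], vel[0], width), (pos[1], vel[1], height)):
        if v > 0:
            times.append((limit - p) / v)  # reaches the far boundary
        elif v < 0:
            times.append(-p / v)           # reaches the near boundary at 0
    return min(times) if times else math.inf


# Illustrative values only: target at (2, 3) m moving at (1, -0.5) m/s
# inside a 10 m x 6 m site.
deadline = time_to_exit((2.0, 3.0), (1.0, -0.5), 10.0, 6.0)
print(deadline)
```

Here the target crosses the lower boundary before the right-hand one, so the estimated deadline is set by its vertical motion. Any change of speed or direction invalidates the estimate, which is why such a prediction is unreliable in practice.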
The ODViS system supports research in tracking for video surveillance. That system provides researchers with the ability to prototype tracking and event recognition techniques using a graphical interface, C. Jaynes, S. Webb, R. Steele, and Q. Xiong, “An open development environment for evaluation of video surveillance systems,” IEEE Workshop on Performance Analysis of Video Surveillance and Tracking (PETS '2002), in conjunction with ECCV, June 2002. That system operates on standard data sets for surveillance systems, e.g., the various standard PETS videos, J. Ferryman, “Performance evaluation of tracking and surveillance,” Empirical Evaluation Methods in Computer Vision, December 2001.
Another method measures image quality for surveillance applications using image fine structure and local image statistics, e.g., noise, contrast (blur vs. sharpness), color information, and clipping, Kyungnam Kim and Larry S. Davis, “A fine-structure image/video quality measure using local statistics,” ICIP, pp. 3535-3538, 2004. That method only operates on real video acquired by surveillance cameras and only evaluates image quality. That method makes no assessment of what is going on in the underlying content of the video or of the particular task that is being performed.
Virtual Surveillance
A system for generating videos of a virtual reality scene is described by W. Shao and D. Terzopoulos, “Autonomous pedestrians,” Proc. ACM SIGGRAPH, Eurographics Symposium on Computer Animation, pp. 19-28, July 2005. That system uses a hierarchical model to simulate a single large-scale environment (Pennsylvania Station in New York City), and an autonomous pedestrian model. Surveillance issues are not considered. That simulator was later extended to include a human-operated sensor network for surveillance simulation, F. Qureshi and D. Terzopoulos, “Towards intelligent camera networks: A virtual vision approach,” Proc. The Second Joint IEEE International Workshop on Visual Surveillance and Performance Evaluation of Tracking and Surveillance, October 2005.
In later work, camera scheduling policies are described, still for the same single Pennsylvania Station environment, F. Z. Qureshi and D. Terzopoulos, “Surveillance camera scheduling: A virtual vision approach,” ACM International Workshop on Video Surveillance and Sensor Networks, 2005. There, the camera controller is modeled as an augmented finite state machine. In that work, the train station is populated with various numbers of pedestrians. Then, that method determines whether or not different scheduling strategies detect the pedestrians. The authors do not describe generalized quantitative performance metrics. Their performance measurement is specific to the single task of active cameras viewing each target exactly once.
It is desired to provide a general quantitative performance metric that can be applied to any surveillance system, i.e., surveillance systems with networks of fixed cameras, manually controlled active cameras, automatically controlled fixed and active cameras, independent of post-acquisition processing steps, and that can be specialized to account for various surveillance goals.