The usefulness of video surveillance systems is becoming increasingly acknowledged as the demand for enhanced safety has increased. Areas commonly covered by such systems include, for example, harbors, airports, bridges, power plants, parking garages, public spaces, and other high-value assets. Traditionally, such camera networks have required labor-intensive deployment and monitoring by human security personnel. Human-monitored systems are, in general, relatively costly and prone to human error. For these reasons, the development of technology to automate the deployment, calibration, and monitoring of such systems has become increasingly important in the field of video surveillance.
For example, in automated video surveillance of sensitive infrastructure, it is desirable to detect an intrusion and raise an alarm when one occurs. To perform such a task reliably, it is often helpful to classify and track detected objects in an effort to discern from their actions and movements whether they pose an actual threat. Detecting and tracking an object are not easy tasks, however. Those functions require powerful video analytics and complex algorithms supporting those analytics. They often require determining which portions of a video or image sequence are background and which are foreground, and then detecting the object in the foreground. Object detection is further complicated when the camera imaging the target moves, either because it is mounted to something that is mobile or because the camera is monitoring a wide field of view by a step-and-stare method of camera movement. Autonomous lock-on-target tracking fills an increasingly important requirement in video surveillance of critical infrastructure, in which a preferred target is continually monitored without interruption until it is attended to in a manner commensurate with the prevailing security policy. In comparison to tracking with fixed video cameras, autonomous lock-on-target tracking offers the advantage of extending the camera field of view without compromising desired resolution, by re-orienting the camera and applying appropriate magnification.
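By way of illustration only, the background/foreground separation mentioned above can be sketched for the simplest case of a stationary camera. The sketch below assumes grayscale frames held as NumPy arrays and uses a running-average background model with a fixed difference threshold. The function names, the blending factor `alpha`, and the `threshold` value are hypothetical choices made for this example and are not taken from any particular system. It does not address the moving-camera case described above, in which the background itself shifts between frames.

```python
import numpy as np


def update_background(background, frame, alpha=0.05):
    """Blend the new frame into a running-average background model.

    alpha is an assumed blending factor; larger values adapt faster.
    """
    return (1.0 - alpha) * background + alpha * frame.astype(np.float64)


def foreground_mask(background, frame, threshold=25.0):
    """Flag pixels whose intensity differs from the background model
    by more than an assumed fixed threshold."""
    return np.abs(frame.astype(np.float64) - background) > threshold


def bounding_box(mask):
    """Return (row_min, row_max, col_min, col_max) enclosing all
    foreground pixels, or None when the mask is empty."""
    rows, cols = np.nonzero(mask)
    if rows.size == 0:
        return None
    return int(rows.min()), int(rows.max()), int(cols.min()), int(cols.max())


if __name__ == "__main__":
    # Synthetic scene: a uniform background with one bright rectangular object.
    background = np.full((20, 20), 10.0)
    frame = np.full((20, 20), 10, dtype=np.uint8)
    frame[5:8, 10:14] = 200

    mask = foreground_mask(background, frame)
    print(bounding_box(mask))
    background = update_background(background, frame)
```

A single bounding box is adequate only when one object is present; separating several objects would require an additional step such as connected-component labeling, which is omitted here.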
Other difficulties exist. Generally, video surveillance systems are unable to determine the actual size of an object, which can make threat detection even more difficult. With actual size detection, benign objects can be better differentiated from real threats. Moreover, the kinematics of an object, such as its velocity and acceleration (from which momentum can be estimated), are much more difficult to analyze when real size is unknown. Additionally, geo-referencing demands landmark-rich scenes, which may not be available in many instances, such as in the surveillance of ports, harbors, or airspace, or when a site is being remotely, and perhaps covertly, monitored and it is not feasible to introduce synthetic landmarks into the scene. An improved system and method for tracking a target is needed.