Video cameras, hereinafter called cameras, are commonly used for security surveillance. A camera provides video data of whatever is within its field of view (FOV). In a typical surveillance scenario, a network of cameras is deployed for the comprehensive coverage of an area. The video data obtained by such a network may then be observed in real time and/or reviewed later by a human operator and/or by an automated system.
Situational awareness of whoever observes the video data from the various cameras may be important. It is particularly supported by high-precision camera calibration, especially by accurate knowledge of each camera's location. Such knowledge may be important in dense urban surveillance areas with narrow, winding streets and busy traffic involving pedestrians and various vehicles.
Camera calibration within a geo-referenced coordinate system refers to the process of obtaining intrinsic and extrinsic camera parameters. In the following, these parameters are referred to as camera calibration parameters. Intrinsic camera calibration parameters are, for example, focal length, image format, principal point, and lens distortion; extrinsic camera parameters are, for instance, the geo-positional location and orientation. Camera calibration is also known as camera resectioning.
In existing surveillance networks, however, cameras are rarely sufficiently calibrated. Often, the camera locations are only roughly associated with street names or estimated from Global Positioning System (GPS) data. Such a location estimate can deviate by more than a hundred meters from the actual geo-positional location.
A camera can be calibrated by placing an object into its FOV. Knowing the geo-positional information of the object and knowing some reference points of the object (in the picture taken by the camera) allows for calculating the camera calibration parameters.
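As an illustration of the calculation described above, the following is a minimal sketch that estimates a 3x4 projection matrix from known reference points using the Direct Linear Transform (DLT). DLT is a common textbook technique chosen here for illustration; it is not necessarily the method used by any particular system. The function names `estimate_projection_matrix` and `project` are hypothetical, NumPy is assumed to be available, and the sketch assumes at least six non-coplanar reference points expressed in a local metric coordinate frame, with lens distortion ignored.

```python
import numpy as np


def estimate_projection_matrix(world_points, image_points):
    """Estimate a 3x4 projection matrix P (up to scale) with the DLT.

    world_points: N x 3 reference points in a local metric frame (assumed).
    image_points: N x 2 pixel coordinates of the same points in the image.
    """
    world = np.asarray(world_points, dtype=float)
    image = np.asarray(image_points, dtype=float)
    if world.shape[0] < 6 or world.shape[0] != image.shape[0]:
        raise ValueError("need at least six matching 3D/2D reference points")

    rows = []
    for (X, Y, Z), (u, v) in zip(world, image):
        Xh = [X, Y, Z, 1.0]  # homogeneous world point
        # Each correspondence contributes two linear equations in the
        # twelve entries of P.
        rows.append([*Xh, 0.0, 0.0, 0.0, 0.0, *(-u * x for x in Xh)])
        rows.append([0.0, 0.0, 0.0, 0.0, *Xh, *(-v * x for x in Xh)])

    # The solution is the right singular vector that belongs to the
    # smallest singular value of the stacked system.
    _, _, vt = np.linalg.svd(np.array(rows))
    return vt[-1].reshape(3, 4)


def project(P, world_point):
    """Project a 3D point into pixel coordinates with projection matrix P."""
    x = P @ np.append(np.asarray(world_point, dtype=float), 1.0)
    return x[:2] / x[2]
```

The resulting matrix combines the intrinsic and extrinsic parameters; separating them would need a further decomposition step that is not shown. With noisy measurements, the input coordinates would typically be normalized first and the estimate refined afterwards, both of which this sketch omits.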
The following references address the calibration of multiple cameras:
Chen, Davis, and Slusallek, “Wide Area Camera Calibration Using Virtual Calibration Objects”. In: Proceedings of Computer Vision and Pattern Recognition, pp. 520-527, 2000.
Rahimi, Dunagan, and Darrell, “Simultaneous Calibration and Tracking with a Network of Non-Overlapping Sensors”. In: Proceedings of Computer Vision and Pattern Recognition, pp. 187-194, 2004.
In a network with multiple cameras, an object can be moved through the area while its geo-positional information is logged. The geo-positional information then has to be matched to the corresponding video data. The matching can be achieved, for example, by a clock that is synchronized across the cameras and the logged geo-positional information. In large surveillance networks, however, such synchronization is rarely available. Hence, a manual readjustment has to be performed, which can be a tedious and expensive task, if it is possible at all.
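The clock-based matching described above can be sketched as a nearest-timestamp lookup. This is a minimal illustration that assumes the camera clock and the position log really do share a common time base. The function name `match_positions_to_frames`, the `(timestamp, position)` log layout, and the `max_gap` tolerance are hypothetical choices made for this example.

```python
from bisect import bisect_left


def match_positions_to_frames(frame_times, gps_log, max_gap=0.5):
    """Return, for each frame timestamp, the nearest logged position or None.

    frame_times: frame timestamps in seconds.
    gps_log: list of (timestamp, position) tuples sorted by timestamp.
    max_gap: largest accepted time difference in seconds (assumed value).
    """
    log_times = [t for t, _ in gps_log]
    matches = []
    for ft in frame_times:
        i = bisect_left(log_times, ft)
        # Only the log entries directly before and after the frame time
        # can be the nearest ones.
        candidates = [j for j in (i - 1, i) if 0 <= j < len(log_times)]
        best = min(candidates, key=lambda j: abs(log_times[j] - ft), default=None)
        if best is None or abs(log_times[best] - ft) > max_gap:
            matches.append(None)  # no log entry close enough in time
        else:
            matches.append(gps_log[best][1])
    return matches
```

If the clocks are offset from each other by an unknown amount, this lookup pairs frames with the wrong positions, which is why missing synchronization leads to the manual readjustment mentioned above.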