In this application the following definitions are used:
The position of a camera is its location in some three-dimensional coordinate system, typically the location on the surface of the earth and the elevation.
The pose of a camera is its three-dimensional orientation with respect to the target it is imaging. The pose is measured in terms of a pan angle and a tilt angle, which define the direction of the straight line from the camera to the scene center, and the zoom, which is related to the field of view.
A target is an object, for example a vehicle or person, being tracked by a PTZ camera.
When a PTZ camera is used in a surveillance system to track a target automatically, the pose of the camera must be known at all times so that the system can generate the most appropriate motion commands to control the camera. For example, following an object that advances to the right in the image may require only a pan command in one configuration, while in another configuration a tilt command is also necessary.
When working with cameras mounted on moving platforms, both position and pose must be determined; for fixed PTZ surveillance cameras, however, only the pose need be determined. For fixed PTZ surveillance cameras there are two cases to consider:
The PTZ camera does not support feedback, and hence the pose must be calculated entirely.
The PTZ camera supports feedback (sending its pose over a communication wire). For reasons that will be discussed hereinbelow, the feedback is of limited accuracy; hence in this case also the algorithm needs to refine the pose given by the feedback in order to estimate the exact pose of the camera in each frame.
Many different methods are known in the prior art for determining the position and pose of cameras. The most common methods for determining pose begin by identifying a fixed reference mark or object in one of the frames of video images captured by the camera. The selected frame is referred to as the reference frame. The pose of the camera when taking the reference frame is measured or assigned arbitrary values. The same object is then located in the next frame (referred to as the current frame) and image processing techniques are used to calculate the motion of the object from the reference frame to the current frame. It is to be noted that the reference object does not move from frame to frame, but there is apparent motion caused by the motion of the camera. The calculated motion is used to estimate the pose of the camera when taking the current frame relative to the pose in the reference frame. An iterative process is now carried out in which the current frame is made the “reference” frame and a new “current” frame is selected. The motion of the object between the new reference and current frames is determined and used to calculate the pose of the camera when taking the new current frame. The process is continued on a frame by frame basis until the camera ceases tracking the target.
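The frame-by-frame procedure described above can be sketched as follows. This is only an illustrative sketch under simplifying assumptions, not an implementation of any particular prior art system: each "frame" is reduced to the angular offset (in degrees) of the fixed reference object from the optical axis, zoom is ignored, and the names `Pose`, `estimate_camera_motion` and `chain_poses` are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Pose:
    pan: float   # degrees
    tilt: float  # degrees


def estimate_camera_motion(reference, current):
    """Hypothetical motion estimator for the toy frame model.

    A frame is (pan_offset, tilt_offset) of the fixed reference object.
    The object does not move, so its apparent motion is the negative of
    the camera motion between the two frames.
    """
    d_pan = -(current[0] - reference[0])
    d_tilt = -(current[1] - reference[1])
    return d_pan, d_tilt


def chain_poses(initial_pose, frames):
    """Return one pose per frame by chaining frame-to-frame estimates."""
    poses = [initial_pose]
    reference = frames[0]
    for current in frames[1:]:
        d_pan, d_tilt = estimate_camera_motion(reference, current)
        previous = poses[-1]
        poses.append(Pose(previous.pan + d_pan, previous.tilt + d_tilt))
        reference = current  # the current frame becomes the new reference
    return poses


# Toy data: the reference object drifts left and up in the image,
# i.e. the camera pans right and tilts down.
frames = [(0.0, 0.0), (-1.0, 0.0), (-3.0, 0.5)]
poses = chain_poses(Pose(pan=10.0, tilt=5.0), frames)
```

Because every pose is computed from the previous one, any error in a single motion estimate is carried into all later poses.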
The major difficulty with this method is that any image motion estimation algorithm inherently contains some error. This error is added to the error of the previous iterations and accumulates as the procedure is repeated. Consequently, after a short period of tracking, the resulting pose estimate contains a large error. The tracking quality degrades and tracking may even fail, since the actual result of a given tracking command differs from the expected result.
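The accumulation of error can be illustrated with a small simulation. The error model here is an assumption made only for illustration: each frame-to-frame estimate is taken to have a constant bias plus independent zero-mean Gaussian noise, and the function name and parameter values are hypothetical. Under this model the bias contribution grows linearly with the number of frames, and the random contribution grows roughly with its square root.

```python
import math
import random


def accumulated_pan_error(num_frames, sigma_deg, bias_deg=0.0, seed=0):
    """Simulate the running pan error of a chained pose estimate.

    Assumed model (not from the source): each step adds bias_deg plus
    Gaussian noise with standard deviation sigma_deg.
    """
    rng = random.Random(seed)
    error = 0.0
    history = []
    for _ in range(num_frames):
        error += bias_deg + rng.gauss(0.0, sigma_deg)
        history.append(error)
    return history


# Pure bias: 0.1 degree per frame over 10 frames accumulates to about 1 degree.
biased = accumulated_pan_error(num_frames=10, sigma_deg=0.0, bias_deg=0.1)

# Pure noise: RMS of the final error over many runs, compared with
# the sigma * sqrt(n) growth expected for independent zero-mean errors.
n, sigma, trials = 100, 0.05, 2000
finals = [accumulated_pan_error(n, sigma, seed=s)[-1] for s in range(trials)]
rms = math.sqrt(sum(e * e for e in finals) / trials)
expected_rms = sigma * math.sqrt(n)
```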
In addition to the above source of accumulative error, cameras that support feedback suffer from a feedback limitation, which results from various factors including:
The feedback relies upon a mechanical gear, whose accuracy is limited by the number of teeth and by the way the sensors pass on the information about the rotation of the gear to the user/client.
There are inaccuracies in the manufacturing and assembly of the gears/motors.
There might be backlash effects.
The mechanics of the camera and its PT platform are subject to temperature changes.
The feedback has a non-constant latency due to communication constraints; hence by the time the algorithm receives the feedback information it is no longer valid.
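Two of the factors listed above lend themselves to a simple numerical illustration. The sketch below assumes a camera panning at a constant rate and a feedback value reported in fixed angular steps; the function names and all numbers are hypothetical and are not taken from any particular camera.

```python
def stale_feedback_error(pan_rate_deg_s, latency_s):
    """Pan error caused by feedback that is latency_s seconds old,
    assuming the camera pans at a constant rate."""
    return pan_rate_deg_s * latency_s


def quantize(angle_deg, step_deg):
    """Round an angle to the nearest multiple of step_deg, as a stand-in
    for the limited resolution of the gear and its sensors."""
    return round(angle_deg / step_deg) * step_deg


# Panning at 30 deg/s with 0.1 s of latency: the reported pan angle
# lags the true angle by about 3 degrees.
latency_error = stale_feedback_error(pan_rate_deg_s=30.0, latency_s=0.1)

# A true pan angle of 10.26 degrees reported in 0.5 degree steps.
reported = quantize(10.26, step_deg=0.5)
quantization_error = abs(reported - 10.26)
```

Since the latency is not constant, the first error cannot simply be subtracted as a fixed offset, which is one reason the feedback pose still has to be refined.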
It follows from the above that a pose estimation algorithm that is accurate and free of accumulative error must be developed.
It is therefore a purpose of the present invention to provide a method for determining PTZ pose estimations which improves upon prior art methods by eliminating the accumulative error.
Further purposes and advantages of this invention will appear as the description proceeds.