The present invention relates to a system using a cross-correlation technique for automatically tracking a relatively moving scene viewed by a video sensor by storing elements from a frame of a video signal to establish a reference frame and comparing elements from a subsequent frame of the video signal with the stored reference frame to derive signals indicating the direction and angular distance of scene relative movement.
Video tracking systems generally comprise a gimbaled video sensing means, such as a television camera, included within an overall feedback loop and controlled by azimuth and elevation servos such that the television camera remains directed at a relatively moving scene. The azimuth and elevation servos may be controlled either manually, using for example a joystick, or automatically in response to the output of an automatic video tracker. The output signals of automatic video trackers are known in the servomechanism art as error signals, and indicate to the servos the direction and velocity of movement required to aim the camera exactly at a target.
One particular use for such a system is as an airborne video tracker (mounted on a gyro-stabilized platform) for stabilizing a television image of a relatively moving target, for example a ship, such that identifying indicia, including numbers, may be more easily recognized.
One type of video tracker is a cross-correlation tracker. In a typical cross-correlation tracker, a frame of incoming video data is stored as a reference. As subsequent video frame data arrives, the cross-correlation function of the reference frame with the subsequent frame data shifted into a large (for example thirty-two) number of positions is computed. The position, both horizontal and vertical, of greatest correlation is then detected and used as the position error feedback signal. It will be appreciated that the terms video signal or video data as used herein have their usual meaning of a signal which sequentially represents picture elements (pixels) in a field of view taken in a prearranged order which repeats itself, each sequence of pixels being termed a frame.
As is known, the value or magnitude of a cross-correlation function may be determined by relatively shifting two complete frames of scene data and multiplying each picture element (pixel) value in one frame by the corresponding pixel value in the other frame, and then integrating the product over the area of the complete scene. The term "shift" as employed herein means either a positional shift or an equivalent time shift or delay. It will be appreciated that with a conventional television raster scan, pixel data are rapidly generated in a serial stream and appropriate time delays may conveniently be employed to in effect produce equivalent positional displacements. Accordingly, the term "shift" is for convenience herein employed to denote either of these equivalent shift types.
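By way of illustration only, the shift-multiply-integrate computation described above may be sketched as follows. The function names `cross_correlation` and `best_shift`, the representation of a frame as a list of rows of pixel values, and the choice to ignore pixels shifted outside the frame are assumptions made for this sketch and are not taken from any of the cited systems. The correlation is unnormalized, so it is only a reasonable guide to position when the scene brightness is comparable between frames.

```python
def cross_correlation(reference, current, dx, dy):
    """Sum of products of reference pixels with current pixels displaced by (dx, dy).

    Pixels whose displaced position falls outside the frame are ignored.
    """
    rows, cols = len(reference), len(reference[0])
    total = 0
    for y in range(rows):
        for x in range(cols):
            sy, sx = y + dy, x + dx
            if 0 <= sy < rows and 0 <= sx < cols:
                total += reference[y][x] * current[sy][sx]
    return total


def best_shift(reference, current, max_shift):
    """Return the (dx, dy) within +/- max_shift that gives the largest correlation."""
    candidates = [
        (dx, dy)
        for dy in range(-max_shift, max_shift + 1)
        for dx in range(-max_shift, max_shift + 1)
    ]
    return max(candidates, key=lambda s: cross_correlation(reference, current, *s))
```

A search over shifts of up to three pixels in each direction already requires forty-nine full-frame correlations per incoming frame, which gives a sense of the computational burden of this approach.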
In one particular implementation, a two-dimensional reference of the input video signal is stored, and the two-dimensional correlation function is calculated for many points in either the time domain or in the frequency domain using a fast Fourier transform method. Use of a fast Fourier transform scheme to find the relative positions of maximum correlation reduces somewhat the amount of hardware required.
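The frequency domain alternative may be illustrated by the following sketch, which relies on the correlation theorem: the transform of the circular cross-correlation equals the conjugate of the reference transform multiplied by the transform of the current frame. The radix-2 routine `fft`, the helper `fft2`, and the function `correlation_peak` are hypothetical names written for this example. The sketch assumes frame dimensions that are powers of two and treats the frames as periodic, which a practical tracker would have to address by windowing or padding.

```python
import cmath


def fft(values, inverse=False):
    """Recursive radix-2 FFT (unscaled); len(values) must be a power of two."""
    n = len(values)
    if n == 1:
        return list(values)
    even = fft(values[0::2], inverse)
    odd = fft(values[1::2], inverse)
    sign = 1 if inverse else -1
    out = [0j] * n
    for k in range(n // 2):
        twiddle = cmath.exp(sign * 2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + twiddle
        out[k + n // 2] = even[k] - twiddle
    return out


def fft2(frame, inverse=False):
    """Two-dimensional transform: rows first, then columns."""
    rows = [fft(list(row), inverse) for row in frame]
    cols = [fft(list(col), inverse) for col in zip(*rows)]
    return [list(row) for row in zip(*cols)]


def correlation_peak(reference, current):
    """Return the (dx, dy) at which the circular cross-correlation is largest."""
    n_rows, n_cols = len(reference), len(reference[0])
    f_ref = fft2(reference)
    f_cur = fft2(current)
    product = [
        [f_ref[y][x].conjugate() * f_cur[y][x] for x in range(n_cols)]
        for y in range(n_rows)
    ]
    # The missing 1/N scale of the inverse does not move the peak.
    surface = fft2(product, inverse=True)
    dy, dx = max(
        ((y, x) for y in range(n_rows) for x in range(n_cols)),
        key=lambda p: surface[p[0]][p[1]].real,
    )
    # Indices past the midpoint represent negative shifts.
    if dy > n_rows // 2:
        dy -= n_rows
    if dx > n_cols // 2:
        dx -= n_cols
    return dx, dy
```

A single pass of this kind yields the correlation value for every shift at once, rather than one shift at a time.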
Patented prior art examples of the general type of cross-correlation tracker described above are U.S. Pat. Nos. 3,632,865--Haskell et al; 3,700,801--Dougherty (employing a Fourier transform); 3,828,122--McPhee et al; 3,829,614--Ahlbom et al; and 3,955,046--Ingham et al.
One significant disadvantage shared by these typical prior art cross-correlation video trackers is that a fairly large amount of hardware is required to rapidly calculate the values of the plurality of correlation functions, precluding the use of the technique for most tracking requirements. Depending upon the precise arrangement, the number of calculations required may even be so great as to preclude operation in real time. A further disadvantage, and one related to performance, is that the output error signal tends to jump in discrete steps corresponding to the individual shifted positions of the cross-correlation calculations, resulting in a "bang-bang" servo control loop.
Another approach to correlation tracking is disclosed in U.S. Pat. No. 3,943,277--Everly et al wherein positional error signals are derived by taking the analog difference between the integrated outputs of a pair of coincidence circuits which compare video scene elements stored in a recirculating shift register with elements of present video. One set of coincidence circuits compares each present element with shift register stored elements above and below the present element, and another set of coincidence circuits compares each present element with shift register stored elements on both sides.
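The general idea of deriving an error from the difference between neighbor coincidence counts may be sketched as follows. This is an interpretation for illustration only and is not the circuit of the cited patent: the function `coincidence_error`, the use of two-level (0 or 1) pixel values, the restriction to interior pixels, and the sign convention are all assumptions of this sketch.

```python
def coincidence_error(stored, present):
    """Return (horizontal, vertical) differences of neighbor coincidence counts.

    Each interior pixel of the present frame is compared with the stored
    pixels above, below, left, and right of the same position. A positive
    vertical value means the present frame agrees better with the stored
    elements below; a positive horizontal value means it agrees better
    with the stored elements to the right.
    """
    rows, cols = len(stored), len(stored[0])
    above = below = left = right = 0
    for y in range(1, rows - 1):
        for x in range(1, cols - 1):
            p = present[y][x]
            above += p == stored[y - 1][x]
            below += p == stored[y + 1][x]
            left += p == stored[y][x - 1]
            right += p == stored[y][x + 1]
    return right - left, below - above
```

Because each count accumulates over the whole frame, the difference changes gradually as the scene moves, rather than jumping between discrete shift positions.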
In addition to correlation type video trackers, there are a number of other video tracker types which will now briefly be mentioned. Single edge trackers require a signal to noise ratio of better than 6:1, and require the operator to manually place a small tracking window on the edge of a target. Once the track function is initiated, the window tends to wander around the edge of the target, resulting in noise to the servo-gimbal. Since the gate is small (to reduce noise), it is easy for the tracker to lose the edge and to break lock. The edge tracker has a very low probability of holding onto an edge during field of view changes.
Edge boundary trackers solve the edge wander problem of the simple edge tracker. They have a tracking gate which can be designed to expand automatically and enclose a target. This tracking technique requires an enclosed boundary target to operate properly, and therefore has a relatively poor chance of maintaining lock through a field of view change.
Total scene centroid trackers operate at a much lower signal-to-noise ratio compared to the edge trackers. The centroid tracker works well until the detected target fills the field of view of the camera. Once the detected target fills the field of view of the camera, the tracker causes the television camera to move away from the details of the target toward the center of the target mass. For example, where the target details of interest are the numbers on a ship, once the field of view is filled, the television camera tends to move towards the main superstructure.
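The behavior described above may be illustrated with a minimal sketch of a centroid computation. The function `centroid_error`, its `threshold` parameter, and the use of an intensity-weighted centroid measured from the frame center are assumptions made for this example rather than details of any particular tracker.

```python
def centroid_error(frame, threshold):
    """Return the (x, y) offset of the detected-pixel centroid from the frame center.

    Pixels at or above `threshold` count as detected target and are weighted
    by intensity. Returns None when nothing is detected.
    """
    rows, cols = len(frame), len(frame[0])
    mass = sum_x = sum_y = 0
    for y, row in enumerate(frame):
        for x, value in enumerate(row):
            if value >= threshold:
                mass += value
                sum_x += value * x
                sum_y += value * y
    if mass == 0:
        return None
    return sum_x / mass - (cols - 1) / 2, sum_y / mass - (rows - 1) / 2
```

When detected pixels occupy the entire frame, the computed centroid depends only on how the mass is distributed across the field of view, so small details of interest have little influence on where the camera is driven.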
A gated centroid tracker could lock onto an enclosed highlight portion of a target and continue to track it even after the target exceeds the field of view of the camera. However, gated centroid trackers have the same likelihood of breaking lock when the field of view changes as do the edge boundary trackers.
The most commonly used tracker is the "split gate" tracker. It can track a large, slow moving target at signal to noise ratios of 1:1. However, this low noise performance requires a manually controlled gate size. When automatic gate size is added, the noise performance is less dramatic and system complexity is increased. The loop response of these trackers is a function of the signal to noise ratio, and of the target shape. For small targets, the loop response may be insufficient to maintain lock. These trackers are also poor at maintaining lock during field of view changes.
Patented prior art examples of these and other video trackers are U.S. Pat. Nos. 3,733,434--Weinstein; 3,769,456--Woolfson; 3,823,261--Bolsey; 3,890,462--Limb et al; 3,903,361--Alpers; 3,932,703--Bolsey; 3,950,611--Callis et al; 3,953,670--Prince; 3,988,534--Sacks; 4,004,083--Norem; 4,034,208--Vaeth et al; 4,053,929--Collins, III et al; and 4,060,830--Woolfson.
Accordingly, it is a general object of the invention to provide a system to automatically track a complex scene by storing the scene during one video frame and comparing subsequent frames of the scene with the stored reference to detect the direction and magnitude of motion.
It is a further object of the invention to provide a correlation type tracker which is relatively free of undue complexity, which has the capability of operating in real time, and which is effective.
Another object of the invention is to provide a video tracker which is well suited to the task of stabilizing a complex scene viewed from a moving vehicle, such as an aircraft.
It is still another object to provide such a correlation tracker which is particularly well suited to the requirements of avionic equipment, such as small size and weight, operability over a wide temperature range, and high reliability.
It is still another object of the invention to provide such a video tracker which provides an error signal output which varies in a fairly continuous manner, as opposed to the step-wise manner of typical prior art cross-correlation video trackers.