Various applications in robotic guidance, remote sensing or vehicle control require small, fast sensors for the processing of visual motion. Robust measurement of velocity in real time is difficult but necessary if a system has to operate in dynamic environments. Parallel processing at each image location is required for handling the large volume of incoming irradiance data from the optical sensors. Ideally, the processed output should depend neither on irradiance nor on contrast. However, because physical systems are always subject to noise, the output is in practice a function of these parameters.
Previous attempts at implementing one-dimensional and two-dimensional motion sensors can generally be divided into two main categories, namely, the gradient technique and correspondence technique. Gradient schemes extract the velocity of an image feature from the ratio of temporal and spatial derivatives of its brightness or discrete approximations thereof. Correspondence methods measure motion by comparing the positions of a spatial pattern at different times (spatial correspondence), or by comparing the times of occurrence of a temporal pattern at different positions (temporal correspondence). Digital implementations of correspondence techniques typically use the first approach, where the times of sampling are clocked and the pixel displacements of the spatial pattern between these times are variable. Most analog implementations and well-understood biological systems use the second approach, where the pixel displacements are fixed and the times of occurrence of the temporal patterns at those pixels are variable.
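The two categories can be illustrated with a minimal one-dimensional sketch. It is not taken from any of the circuits discussed here: the function names, the finite-difference approximations and the sign convention (positive velocity toward increasing pixel index) are assumptions chosen for illustration only.

```python
def gradient_velocity(frame_prev, frame_next, dt, dx, i):
    """Gradient scheme: velocity at pixel i as -(dI/dt) / (dI/dx).

    frame_prev and frame_next are brightness samples taken dt apart,
    with pixels spaced dx apart. Returns None where the spatial
    derivative vanishes and the ratio is undefined.
    """
    # Forward difference in time at pixel i.
    dI_dt = (frame_next[i] - frame_prev[i]) / dt
    # Central difference in space around pixel i.
    dI_dx = (frame_prev[i + 1] - frame_prev[i - 1]) / (2.0 * dx)
    if dI_dx == 0.0:
        return None
    return -dI_dt / dI_dx


def temporal_correspondence_velocity(t_first, t_second, pixel_spacing):
    """Temporal correspondence: fixed displacement, variable time.

    t_first and t_second are the times at which the same temporal
    pattern (for example an edge) is seen at two pixels that are
    pixel_spacing apart. Returns None for simultaneous events.
    """
    delta_t = t_second - t_first
    if delta_t == 0.0:
        return None
    return pixel_spacing / delta_t
```

In this sketch the gradient estimate becomes undefined wherever the spatial derivative vanishes and is sensitive to errors in a small denominator, while the temporal-correspondence estimate depends only on the timing of a detected feature.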
However, existing circuits based on either of these approaches do not provide a signal that unambiguously encodes velocity, independent of image brightness and contrast over typical ranges encountered in natural scenes. Many of these existing circuits detect edges utilizing a predetermined contrast threshold such that an edge with a contrast level that is below the threshold is disregarded, and an edge with a contrast level that is above the threshold is assumed to be a sharp edge. There are several disadvantages in explicit thresholding. First, by disregarding images with contrast levels that are below a predetermined minimum, valuable information regarding the images is lost. Second, the threshold level has to be changed if the environment or lighting changes, e.g., if one moves outdoors from an indoor environment. Thus, the scheme is not robust, is sensitive to offsets, and requires parameter values to be in exactly the right range.
Many of these existing circuits also have an output-versus-velocity curve that is not monotonic; the curve has a maximum at some optimal velocity and decreases on either side of it. Consequently, any given output that is not at the maximum corresponds to two velocity values and is ambiguous. One such circuit is that described by T. Delbruck in Silicon Retina with Correlation-Based, Velocity-Tuned Pixels, IEEE TRANS. NEURAL NETWORKS, Vol. 4, pp. 529-541, 1993.
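The ambiguity can be made concrete with a short sketch. The response curve used below, r(v) = (v / v_opt) · exp(1 − v / v_opt), is a hypothetical single-peaked function chosen only because it is easy to analyze; it is not the measured response of the cited circuit, and all names are illustrative.

```python
import math


def tuned_response(v, v_opt):
    """Hypothetical velocity-tuned output; equals 1.0 at v == v_opt."""
    r = v / v_opt
    return r * math.exp(1.0 - r)


def invert_on_interval(target, v_opt, lo, hi, iters=60):
    """Bisection for tuned_response(v) == target on [lo, hi].

    Assumes the response is monotonic on the interval and that the
    target lies between the responses at the two end points.
    """
    f_lo = tuned_response(lo, v_opt) - target
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        f_mid = tuned_response(mid, v_opt) - target
        if (f_mid > 0.0) == (f_lo > 0.0):
            lo, f_lo = mid, f_mid  # root lies in the upper half
        else:
            hi = mid               # root lies in the lower half
    return 0.5 * (lo + hi)


def ambiguous_velocities(target, v_opt, v_max_factor=50.0):
    """Return the two velocities that produce the same output level.

    target must lie strictly between 0 and 1 (the peak response).
    """
    slow = invert_on_interval(target, v_opt, 0.0, v_opt)
    fast = invert_on_interval(target, v_opt, v_opt, v_max_factor * v_opt)
    return slow, fast
```

For any output level below the peak, `ambiguous_velocities` returns one velocity below the optimum and one above it, so a single such output cannot by itself identify the speed.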
An early attempt based on the gradient technique is described by J. Tanner and C. Mead, in An Integrated Analog Optical Motion Sensor, VLSI SIGNAL PROCESSING 1, 59-76 (S. Y. Kung, Ed., IEEE Press, 1986). This circuit estimates uniform velocity in two dimensions, corresponding to global translation of a rigid object space relative to the sensor. The outputs of each pixel were made to influence the global estimates of the velocity vector components in proportion to their deviation from them and to their confidence levels. This strategy was employed to reduce offset effects of individual pixels by averaging. However, the circuit only worked with high-contrast edges, and even then showed poor performance. This result was mainly due to the discrepancy between the high-precision requirement of the algorithm and the low precision of the analog circuitry.
A sensor array based on the correspondence technique is described by T. Horiuchi, J. Lazzaro, A. Moore and C. Koch, in A delay line based motion detection chip, ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 3, 406-12 (Morgan Kaufmann 1991). In this approach, within each pixel, a voltage pulse was triggered in response to a quickly increasing irradiance level, identified as a dark-bright edge. Pulses from adjacent pixels were sent through two delay lines from opposite directions. Their meeting point, as a measure of their relative timing, was the cue used to estimate velocity. Thus, each pair of adjacent pixels provided a 1-D velocity vector for each detected edge. The circuit worked robustly down to low irradiance levels and contrasts under D.C. lighting conditions. A.C. incandescent lighting, however, caused spurious edges to be detected at the flicker rate of 120 Hz. This problem could only be alleviated by using additional filtering circuitry. Other drawbacks of the system were the limited detectable velocity range for a given delay setting, and the large area consumption of the delay lines.
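The relationship between meeting point and velocity can be sketched with an idealized model. The model assumes a single line of `n_stages` identical stages, each contributing a delay `stage_delay`, with one pulse entering at each end; these parameters and function names are illustrative assumptions and do not describe the cited chip's actual circuitry.

```python
def meeting_stage(t_left, t_right, n_stages, stage_delay):
    """Stage (counted from the left end) at which the pulses meet.

    The left pulse reaches position x at t_left + x * stage_delay; the
    right pulse reaches it at t_right + (n_stages - x) * stage_delay.
    Equating the two gives the meeting point. Returns None if one pulse
    traverses the whole line before the other is launched.
    """
    x = 0.5 * (n_stages + (t_right - t_left) / stage_delay)
    if x < 0.0 or x > n_stages:
        return None
    return x


def velocity_from_meeting(x, n_stages, stage_delay, pixel_spacing):
    """Recover velocity from the meeting point.

    Positive values mean motion from the left pixel toward the right
    pixel. Returns None when the pulses meet exactly at the center,
    i.e. the edge reached both pixels at the same time.
    """
    delta_t = (2.0 * x - n_stages) * stage_delay
    if delta_t == 0.0:
        return None
    return pixel_spacing / delta_t
```

Under these assumptions the timing difference that can be represented is bounded by `n_stages * stage_delay`, so the slowest measurable speed is `pixel_spacing / (n_stages * stage_delay)`. This illustrates why the detectable range is tied to the delay setting and why extending it requires longer lines.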
A class of chips based on other biologically-inspired correspondence techniques uses elements tuned to have maximum response to a certain velocity. In one dimension, such elements usually determine the direction of non-optimal velocities as well, but they do not unambiguously encode the speed. At least two such cells, tuned to different velocities, must be used to extract the speed. In two dimensions, the direction of non-optimal velocity cannot be determined either; this is because it is interrelated with the magnitude of the velocity components measured along different directions.
For a robust estimate of a single velocity, a population of differently tuned cells is required. Existing systems based on this approach occupy large silicon areas. In addition, such existing systems exhibit a variety of problems. The velocity-response curve of a chip described by R. G. Benson and T. Delbruck, in Direction Selective Silicon Retina That Uses Null Inhibition, ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 4, 756-763 (Morgan Kaufmann 1991), based on a computational model by H. B. Barlow and W. R. Levick, as described in The mechanism of directionally selective units in the rabbit's retina, J. PHYSIOL vol. 178, pp. 447-504 (1965), was not tunable and decreased at low contrasts. A second approach by R. Sarpeshkar, W. Bair and C. Koch, as described in Visual Motion Computation in Analog VLSI using Pulses, ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 5, 781-788 (Morgan Kaufmann 1993), was insensitive to low-contrast edges.
Thus, the sensors in existing systems generally provide velocity outputs which are strongly dependent upon image brightness or contrast; or they are inoperable under A.C. incandescent lighting; or they take too much area to implement; or they are optimized to detect particular velocities and do not unambiguously discriminate between non-optimal velocities; or they are very sensitive to parameter settings and do not operate over large ranges.
Accordingly, there is a need for a velocity sensor that robustly and unambiguously encodes the velocity of a moving stimulus over a large range of brightnesses, contrasts and velocities, and that is compact and insensitive to offsets and variations in circuit parameters.