This invention relates to an integrated sensor and analog signal processor for visual images, and in particular to an integrated sensor and processor which emulates the vertebrate retina in producing space-time derivative signals in response to image pixels.
While models have been proposed for the visual system, it is not possible to simulate enough cases to gain real confidence in a model, even on the most powerful computers. For this reason, one cannot really understand visual processing, especially with respect to motion, until one succeeds in building a system that does visual processing in real time. Until recently there has not been a technology in which such fundamental synthetic investigations could be carried out. With the evolution of high-density VLSI technology, a way has been discovered for these extremely important investigations to be done.
By far the most massive application of large-scale integrated circuits has been in digital systems. While analog integrated circuit techniques have developed along with digital techniques, no methods comparable to digital techniques exist for managing the complexity of extremely large analog systems. This invention not only presents a prototype vision system, but also illustrates an approach to problems of this class.
A large fraction of the processing done in early vision systems of animals is connected with extracting motion events. The value of such processing is evident. Information is sent from the retina up the optic nerve to the brain by neural action potentials. Each nerve impulse corresponds to some significant event in the incoming image. If simple intensity encoding were used, pixels in the image would be sampled at some rate determined by the local intensity. Any change in intensity would be reflected as a change in pulse rate. The time at which such a change occurred could only be determined to within the time between pulses. In signal processing terms, the derivative information would have been "aliased away" by temporally sampling the image. For this reason, optic nerve pulses sent from all but the most central part of the retina encode changes in intensity rather than the intensity itself. In this way, an individual nerve pulse corresponds to an important feature in the image moving over the particular place on the retina. Higher level correlation between events can then be reconstructed without loss of information due to temporal aliasing.
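The timing limit of simple intensity encoding can be illustrated with a minimal sketch. It assumes an idealized integrate-and-fire encoder whose pulse rate is proportional to intensity; this encoder model, the names pulse_times and bracketing_pulses, and all numeric values are illustrative assumptions and are not taken from the invention or from retinal physiology.

```python
def pulse_times(intensity_fn, gain, duration, dt=1e-5):
    """Assumed integrate-and-fire encoder: emit a pulse each time the
    running integral of gain * intensity reaches 1."""
    times = []
    acc = 0.0
    for i in range(int(round(duration / dt))):
        t = i * dt
        acc += gain * intensity_fn(t) * dt
        if acc >= 1.0:
            times.append(t)
            acc -= 1.0
    return times


def bracketing_pulses(times, t_event):
    """Return the last pulse before t_event and the first pulse after it."""
    before = max(t for t in times if t < t_event)
    after = min(t for t in times if t > t_event)
    return before, after


T_CHANGE = 0.103  # hypothetical instant at which the intensity doubles


def step_intensity(t):
    """Intensity 1.0 before T_CHANGE, 2.0 afterwards (illustrative values)."""
    return 1.0 if t < T_CHANGE else 2.0


times = pulse_times(step_intensity, gain=50.0, duration=0.2)
before, after = bracketing_pulses(times, T_CHANGE)

# An observer of the pulse train sees only that one interval came out shorter;
# the change could have happened anywhere inside that interval.
print("change lies somewhere in", (before, after))
print("timing uncertainty (s):", after - before)
```

Under these assumptions the change can be located only to within the inter-pulse interval that contains it, which is the limitation the paragraph above describes.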
In artificial vision systems, a similar problem is encountered. A standard television camera samples any point in the image once every 1/30 second. An object can easily move many pixels between sample times. Information is in this way irreversibly aliased away. The present invention, like the human retina, uses an easy computation (taking a time derivative) to simplify solving the much harder correspondence problem (finding the point in a second image that corresponds to a given point in the first image).
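The contrast between frame sampling and a per-pixel time derivative can be sketched numerically. The following is a minimal illustration only: the edge speed, pixel count, and simulation step are assumed values, the function names are hypothetical, and a finite-difference derivative on a finely sampled signal stands in for the continuous-time analog computation described here.

```python
import numpy as np

FRAME_RATE_HZ = 30.0  # standard television rate cited in the text


def displacement_per_frame(speed_px_per_s, frame_rate_hz=FRAME_RATE_HZ):
    """Pixels an object travels between two consecutive frame samples."""
    return speed_px_per_s / frame_rate_hz


def edge_crossing_times(speed_px_per_s, n_pixels, dt=1e-4):
    """For a step edge sweeping across a row of pixels, return the time at
    which each pixel sees its largest |dI/dt|."""
    duration = n_pixels / speed_px_per_s
    t = np.arange(0.0, duration, dt)
    centres = np.arange(n_pixels) + 0.5      # pixel centres
    edge_pos = speed_px_per_s * t            # edge position over time
    # A pixel's intensity is 1 once the edge has passed its centre, else 0.
    intensity = (edge_pos[:, None] >= centres[None, :]).astype(float)
    dI_dt = np.diff(intensity, axis=0) / dt  # finite-difference derivative
    idx = np.argmax(np.abs(dI_dt), axis=0)
    return t[idx + 1]


SPEED = 600.0  # assumed edge speed in pixels per second
shift = displacement_per_frame(SPEED)
times = edge_crossing_times(SPEED, n_pixels=8)

print("pixels moved between frames:", shift)
print("per-pixel event times (s):", times)
```

With these assumed numbers the edge crosses all eight pixels within a single 1/30 second frame interval, so a frame-sampled camera sees only the start and end positions, while the derivative signal at each pixel marks when the edge passed that pixel.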
The problem of machine vision can be enormously eased by extracting time-space derivative information before it is aliased away. In the present invention, an electronic replica of the vertebrate retina computes time-space derivative information in analog fashion. The overall structure of the retina is described, together with a set of detailed circuits (implemented in standard CMOS technology) that form a reasonably faithful model of certain processing that occurs in a mammalian retina.