The extraction of real-time velocity and noise-free detail from time-blurred frames of video has been inherently inaccurate. The problem is that current image detection technology relies on raster or frame-at-a-time scene capture, or on arrays of independent pixels that integrate light (or x-ray, ultraviolet, infrared or another medium). This process is limited in resolution by the number of pixels and their dimensions, and cannot avoid integrating noise into the detection process. Additionally, events occurring between frame captures are lost, and events occurring during exposures are blurred.
Accordingly, attempts have been made to devise improved image detection and processing mechanisms with attention to mammalian visual systems. Since Mead and Mahowald first modeled human vision in silicon “retinas” in the late 1980s, there have been many implementations that mimic the processing and functionality of retinal neural structures (e.g. Boahen, Delbrück, Koch, and VanRullen, to name a few). Almost all of these implementations have treated the detection process as passive integration with gain control. A few (e.g. Prokopowicz, Landolt) have proposed using detector motion to enhance resolution or contrast, primarily to account for the fact that vision requires the image to move or else the scene fades away. These designs have not considered motion as a closed-loop de-noiser, tracker, and dynamic range extender. Nor have they addressed the real-time continuous calibration of array elements that such a system requires in order to function, or the need to stabilize the image in a scene-memory plane for useful processing. In Landolt's design, there is no communication between pixels in the array (which is required for de-noising and calibration), and his detector elements, each being an independent voltage-controlled oscillator, act as a noise source that masks real scene events.
For instance, J. C. Gillette (“Aliasing Reduction in Staring Infrared Imagers Utilizing SubPixel Techniques”) describes a method of uncontrolled micro-scanning for reducing aliased signal energy in a sequence of temporal image frames obtained by periodically sampling an image with a finite array of image detectors. Gillette takes a series of discrete low resolution samples of an image at a specified undersampling frequency, while “spatially oscillating” (actually just shifting) the detector between samples, thereby providing a sequence of static image frames, each having a subpixel offset relative to the others. The gray-scale values of successive image frames are compared to estimate, for each image frame, the subpixel shift that occurred relative to the preceding frame. Each image frame in the image sequence is then mapped onto a higher resolution grid, based on its estimated interframe displacement. If the estimated shift is the same for multiple frames, then the pixel values at the overlapping positions are averaged to suppress noise.
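The mapping and averaging steps described above can be illustrated with a minimal sketch. It is not Gillette's implementation: the function name `shift_and_add`, the integer `factor`, and the assumption that each estimated shift has already been rounded to a whole cell of the finer grid are all hypothetical, and the shift estimator itself is not shown.

```python
def shift_and_add(frames, shifts, factor):
    """Place low-resolution frames on a grid `factor` times finer.

    frames: list of 2D lists of gray values, all the same size.
    shifts: list of (dy, dx) estimated offsets, one per frame, given in
            cells of the fine grid with 0 <= dy, dx < factor (assumed to
            come from some external shift estimator).
    Returns a 2D list; cells that no frame landed on are None.
    """
    rows, cols = len(frames[0]), len(frames[0][0])
    total = [[0.0] * (cols * factor) for _ in range(rows * factor)]
    count = [[0] * (cols * factor) for _ in range(rows * factor)]

    for frame, (dy, dx) in zip(frames, shifts):
        for y in range(rows):
            for x in range(cols):
                hy, hx = y * factor + dy, x * factor + dx
                total[hy][hx] += frame[y][x]
                count[hy][hx] += 1

    # Frames with the same estimated shift overlap and are averaged.
    return [[total[r][c] / count[r][c] if count[r][c] else None
             for c in range(cols * factor)]
            for r in range(rows * factor)]
```

In this sketch, any error in an estimated shift places a sample in the wrong fine-grid cell, and the averaging step blends detector noise into the result rather than removing it.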
Since Gillette only calculates an estimate of the subpixel shifts, Gillette is unable to determine the portion of the magnitude of the pixel values actually attributable to the subpixel shifts. Estimation errors in subpixel shifts result in blurring and additional scene noise. This problem is compounded by the fact that Gillette averages the values of the pixels in successive frames that have the same estimated subpixel shift, thereby precluding removal of those aspects of the image frames not attributable to the subpixel shift. As such, the high resolution grid would include pixels whose values are not attributable to the subpixel shifts (e.g. values resulting from detector noise).
Further, Gillette must estimate the subpixel shift in each frame, resulting in multiple frame delays for one high-resolution image. Also, the frame basis of the method, and the corresponding finite exposure times, result in motion blur in each frame as objects traverse the scene. Additionally, given the discrete sampling nature of the method, aliasing in time is possible if the sampling frequency is insufficient for the scene motion. Real three-dimensional motion in the scene also precludes true high resolution frame registration, as does reliance on block-matching, which cannot account for dense motion fields.
H. Ogmen (“Neural Network Architectures for Motion Perception and Elementary Motion Detection in the Fly Visual System”) describes a neural network model of motion detection in the fly visual system. Ogmen uses center-surround opponency as the basis for both directional and non-directional motion detection, both in the center field-of-view and the periphery. However, Ogmen only performs statistical neural filtering post-processing of the vision data, thereby integrating noise with the vision data, with the ultimate result of reduced signal detection.
For the foregoing reasons, there is a need for an improved electronic imaging system.
Several current technical papers address the complexities of the dense retinomorphic detector array design required as a component of the present system. To avoid missing pixel-crossing events it is necessary to store events at each pixel (or equivalently at a memory address corresponding to the pixel or to the center of a center/surround). Designs of Address Event Representation (AER) chips exist and can be modified to include a time-stamp on each event. With sufficient memory, each pixel may record several crossing events asynchronously in AER mode, and a column raster data collection architecture can collect the data from each location for processing at a suitably frequent rate, or whenever a local event buffer becomes full. Ideally such processing would be provided by a distributed array of processors performing standard detection and de-noising algorithms, with each processor handling a region of the detector array. It would further be advantageous to permit detector overlap so that motion events in the scene may be seamlessly passed from one processor to the next. AER retinomorphic array design and intelligent sensor design articles describing systems capable of being adapted to our application include “Point-to-Point Connectivity Between Neuromorphic Chips using Address Events”, Kwabena Boahen, IEEE Trans. On Circ. & Sys., Vol. 47, No. 5, 2000; “A Nyquist-Rate Pixel-Level ADC for CMOS Image Sensors”, David X. D. Yang et al, IEEE Jour. of Solid State Circ., Vol. 34, No. 3, March 1999; “A Foveated AER Imager Chip”, M. Azadmehr et al, University of Oslo, Norway, 2000; “Bump Circuits for Computing Similarity and Dissimilarity of Analog Voltages”, T. Delbrück, California Institute of Technology Computation and Neural Systems Program, CNS Memo 26, May 24, 1993; and “A Spike Based Learning Rule and its Implementation in Analog Hardware”, P. Häfliger, Ph.D. Thesis, ETH Zurich, Switzerland, 2000, http://www.ifi.uio.no/~hafliger.
The preceding articles are incorporated into the present application in their entirety by reference.
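The per-pixel event storage and column raster collection described above can be pictured with the following software sketch. It is only an illustration under stated assumptions and does not reproduce any of the cited chip designs: the names `Event` and `PixelEventBuffers`, the `polarity` field, and the fixed `capacity` per pixel are hypothetical.

```python
from collections import namedtuple

# One time-stamped address event; all field names are illustrative.
Event = namedtuple("Event", "row col timestamp polarity")


class PixelEventBuffers:
    """Stores crossing events at each pixel address until collected."""

    def __init__(self, rows, cols, capacity):
        self.rows, self.cols, self.capacity = rows, cols, capacity
        self.buffers = {(r, c): [] for r in range(rows) for c in range(cols)}

    def record(self, row, col, timestamp, polarity):
        """Store an event; return the pixel's events if its buffer fills."""
        buf = self.buffers[(row, col)]
        buf.append(Event(row, col, timestamp, polarity))
        if len(buf) >= self.capacity:
            return self._drain(row, col)
        return []

    def scan_columns(self):
        """Column raster collection: drain every pixel, column by column."""
        collected = []
        for c in range(self.cols):
            for r in range(self.rows):
                collected.extend(self._drain(r, c))
        return collected

    def _drain(self, row, col):
        events = self.buffers[(row, col)]
        self.buffers[(row, col)] = []
        return events
```

In use, `record` would be called asynchronously as events occur and `scan_columns` at a suitably frequent rate. A full local buffer hands its events over early, so no crossing event is dropped between scans.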