Computer vision is conventionally employed to extract meaningful information from video data, for example through optical flow, tracking, face recognition, and object recognition. The video data is acquired using regular frame-based camera sensors, where one image is integrated over an exposure period and then read out completely. However, this principle has various shortcomings. For example, depending on the lighting conditions, long exposure times must be used to avoid noisy images, but long exposures lead to motion blur artifacts that often complicate computer vision or even make it impossible. In another example, the frame-based readout leads to substantial latency between image acquisition and the time at which the image can be processed. This leads to problems in applications where systems must react relatively quickly to events in images (e.g., robotics).
Another sensor used to acquire video data is a silicon retina (SR) sensor developed by the Institute for Neuroinformatics of the University of Zurich. The SR sensor utilizes a fundamentally different principle from traditional cameras. Specifically, the SR sensor is event-based and asynchronous, and it registers relative changes in intensity rather than attempting to determine absolute brightness values. Thus, instead of returning a color value per pixel (as is the case with traditional cameras), the SR sensor emits a signal spike whenever it detects a change in light intensity that exceeds a predetermined threshold. These spikes are forwarded to the SR sensor's output asynchronously, each with a precise timestamp and the signaling pixel's coordinate. Rather than reading out pictures, a client of the SR sensor receives a stream of events indicating at what point in time a certain pixel experiences a significant rise or fall in light intensity. The transmission of local changes is substantially similar to the way biological retinas transmit visual signals to the brain.
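The event-based principle described above can be illustrated with a minimal sketch for a single pixel. This is not the SR sensor's actual circuitry or interface: the names Event and pixel_events, the microsecond timestamps, the default threshold of 0.15, and the use of a log-intensity difference as the measure of "relative change" are all assumptions made for illustration only.

```python
import math
from dataclasses import dataclass
from typing import List, Sequence, Tuple


@dataclass(frozen=True)
class Event:
    """One spike: when it happened, which pixel fired, and in which direction."""
    timestamp_us: int   # hypothetical timestamp unit (microseconds)
    x: int              # pixel column
    y: int              # pixel row
    polarity: int       # +1 for a rise in intensity, -1 for a fall


def pixel_events(
    samples: Sequence[Tuple[int, float]],
    x: int,
    y: int,
    threshold: float = 0.15,  # assumed value, not taken from any sensor datasheet
) -> List[Event]:
    """Turn (timestamp_us, intensity) samples of one pixel into change events.

    An event is emitted only when the log intensity has moved by at least
    `threshold` since the last emitted event (or since the first sample).
    Intensities must be positive.
    """
    events: List[Event] = []
    if not samples:
        return events

    # The first sample only sets the reference level; it produces no event.
    reference = math.log(samples[0][1])

    for timestamp_us, intensity in samples[1:]:
        delta = math.log(intensity) - reference
        if abs(delta) >= threshold:
            polarity = 1 if delta > 0 else -1
            events.append(Event(timestamp_us, x, y, polarity))
            # Reset the reference so later changes are measured from here.
            reference = math.log(intensity)

    return events
```

Under these assumptions, the samples (0, 100.0), (10, 101.0), (20, 130.0), (30, 130.0), (40, 90.0) for a pixel at (3, 5) yield two events: a rise at timestamp 20 and a fall at timestamp 40. The small change at timestamp 10 and the unchanged value at timestamp 30 produce no output, which reflects how static parts of a scene generate no data.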
The client of the SR sensor is not required to wait for a full frame to be exposed before recognizing important changes in the scene. Since only changes are registered, the client is also not required to spend processing power on separating and discarding redundant information about parts of the scene's image that remain static. Because changes are registered with very low latency, the client can also react more quickly. While high-speed video cameras can reach recording frame rates of 2,000 full frames per second and beyond, at considerable bandwidth cost and with demanding lighting requirements, an SR sensor can register tens of thousands of events per pixel per second, even in very low light. However, the SR sensor generates computational data only. That is, the data registered by the SR sensor is of no practical use to a human observer, since the SR sensor does not yield a pictorial representation of the world. The SR sensor has mostly been used directly in computer vision contexts.
Accordingly, there is a need to combine the features of the SR sensor with those of a regular frame-based camera sensor.