A conventional vision sensor captures a scene as a sequence of pictures or frames that are taken at a certain rate (e.g., a frame rate), where every picture element (e.g., pixel) within the boundary of a frame is captured in the frame. Pixel information that does not change from one frame to another frame is redundant information. Storing and processing redundant information wastes storage space, processing time, and battery power.
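As a rough illustration of this redundancy, the following minimal sketch counts the fraction of pixels that are identical in two consecutive frames. The function name `redundant_fraction`, the nested-list frame representation, and the sample values are assumptions made for this example only and do not come from any particular sensor.

```python
def redundant_fraction(prev_frame, next_frame):
    """Return the fraction of pixels with identical values in two frames.

    Both frames are assumed to be equally sized lists of rows of integer
    luminance values (a hypothetical representation for illustration).
    """
    total = 0
    unchanged = 0
    for prev_row, next_row in zip(prev_frame, next_frame):
        for prev_value, next_value in zip(prev_row, next_row):
            total += 1
            if prev_value == next_value:
                unchanged += 1
    # Guard against empty frames to avoid division by zero.
    return unchanged / total if total else 0.0


# Two 3x4 frames that differ in exactly one pixel.
frame_a = [[10, 10, 10, 10],
           [10, 10, 10, 10],
           [10, 10, 10, 10]]
frame_b = [[10, 10, 10, 10],
           [10, 90, 10, 10],
           [10, 10, 10, 10]]

print(redundant_fraction(frame_a, frame_b))
```

In this example, 11 of the 12 pixels are unchanged, yet a frame-based sensor would still capture, store, and process all 12 pixels of the second frame.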
A dynamic vision sensor (DVS) does not capture a scene in frames but functions similarly to a human retina. That is, a DVS transmits only a change in a pixel's luminance (e.g., an event) at a particular location within a scene at the time of the event.
An output of a DVS is a stream of events, where each event is associated with a particular state, i.e., a location of the event within a camera array and a binary state indicating a positive or a negative change in luminance at that location as compared to an immediately preceding state of that location.
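The event stream described above may be sketched in software as follows. This is an illustrative model only, not a description of actual DVS circuitry: the `Event` type, the `events_from_samples` function, the per-pixel reference update, and the fixed `threshold` value are all assumptions introduced for this example. A real DVS does not capture frames; the sampled luminance arrays here merely stand in for the continuous signal seen by each pixel.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Event:
    x: int          # column of the event within the array
    y: int          # row of the event within the array
    polarity: int   # +1 for a positive luminance change, -1 for a negative one
    t: float        # time of the event (hypothetical units of seconds)


def events_from_samples(samples, timestamps, threshold=15) -> List[Event]:
    """Emit an event wherever luminance departs from the pixel's preceding state.

    `samples` is a list of equally sized 2-D lists of luminance values and
    `timestamps` holds one time per sample. The `threshold` is an assumed
    sensitivity parameter chosen for this sketch.
    """
    if not samples:
        return []
    # The preceding state of each location starts as the first sample.
    reference = [row[:] for row in samples[0]]
    events = []
    for sample, t in zip(samples[1:], timestamps[1:]):
        for y, row in enumerate(sample):
            for x, value in enumerate(row):
                delta = value - reference[y][x]
                if abs(delta) >= threshold:
                    events.append(Event(x, y, 1 if delta > 0 else -1, t))
                    # The new value becomes the preceding state for this pixel.
                    reference[y][x] = value
    return events


samples = [
    [[50, 50], [50, 50]],
    [[50, 80], [50, 50]],   # pixel (x=1, y=0) brightens
    [[50, 80], [20, 50]],   # pixel (x=0, y=1) darkens
]
timestamps = [0.0, 0.001, 0.002]

for event in events_from_samples(samples, timestamps):
    print(event)
```

Under these assumptions, only the two pixels whose luminance changed produce output, one event with positive polarity and one with negative polarity, while the unchanged pixels contribute nothing to the stream.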