1. Field of the Invention
The present invention relates generally to visual encoding in a computerized processing system, and more particularly, in one exemplary aspect, to a computer vision apparatus and methods for encoding visual information for vision prosthetic devices.
2. Description of Related Art
In a vertebrate vision system, the retina is a light-sensitive tissue lining the inner surface of the eye. Light falling upon the retina triggers nerve impulses that are sent to visual centers of the brain via fibers of the optic nerve. The optic fibers from the retina to the thalamus represent a visual information bottleneck. The retina, consisting of millions of photoreceptors, is coupled to millions of neurons in the primary visual cortex to process visual information, yet there are only hundreds of thousands of retinal ganglion cells giving rise to the optic fibers that connect the retina to the brain.
When retinal photoreceptors are damaged, a retinal prosthetic device (e.g., a semiconductor chip) is utilized to encode visual information and transmit it to the retinal ganglion cells (RGCs) via stimulating electrodes. In this case, the number of electrodes (typically limited to tens or hundreds per chip) acts as the information bottleneck that limits the transmission bandwidth and hence the resolution of the perceived image. Presently available retinal prosthetic devices use a small number of individual light sensors directly coupled to the RGCs via individual stimulating electrodes. The sensors are typically arranged into a square pattern (e.g., 3×3 pattern in the exemplary retinal prosthetic manufactured by Second Sight Medical Products, Inc.), and are configured to operate independently from one another by generating electrical pulses in the electrodes in response to light stimulation. The electrical pulses evoke firings of the RGCs to mimic their natural firing patterns. Which firing pattern of RGCs is optimal for encoding visual signals in a retinal prosthetic device remains an open question.
Existing models used for retinal prosthetic signal encoding utilize rate encoding; the RGCs are stimulated with pulses of current of various amplitudes or durations (or waveforms) so as to make the RGCs fire at various firing rates. This is in line with the common neuroscience theory that the frequency of random firing of retinal ganglion cells (and not the precise timing of pulses) is used to transmit the information to the brain (see, e.g., Field, G.; Chichilnisky, E. Information Processing in the Primate Retina: Circuitry and Coding. Annual Review of Neuroscience, 2007, 30(1), 1-30). In another existing approach (see Van Rullen, R.; Thorpe, S. Rate Coding versus temporal order coding: What the Retinal ganglion cells tell the visual cortex. Neural Computation, 2001, 13, 1255-1283), a coding scheme is suggested where each retinal ganglion cell encodes information into pulse timing (or latency) that is measured with respect to a reference timing signal (e.g., the onset of the image). Here, the RGC with the strongest signal fires first, the RGC with the second strongest signal fires next, and so on. Each RGC fires only once.
In both cases (i.e., rate coding and latency coding), each RGC encodes the photoreceptor signal (or other features of the image) into a single analog value (rate or latency), and RGCs encode their values independently. That is, the message transmitted along one RGC is the same regardless of the activities of the other RGCs.
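The two independent coding schemes described above can be illustrated with a minimal sketch. The linear mappings, the parameter names (max_rate_hz, max_latency_ms), and the normalization of signal strength to the range 0 to 1 are assumptions made for illustration only; they are not taken from the cited references.

```python
# Illustrative sketch only: both mappings below are hypothetical linear
# functions of a signal strength normalized to [0, 1].

def rate_code(intensity, max_rate_hz=100.0):
    """Rate coding: a stronger signal yields a higher firing rate (Hz)."""
    return max_rate_hz * intensity


def latency_code(intensity, max_latency_ms=50.0):
    """Latency coding: a stronger signal yields an earlier single pulse,
    measured in ms after the reference timing signal (e.g., image onset)."""
    return max_latency_ms * (1.0 - intensity)


# Three RGCs, each encoding its own input independently of the others.
signals = [0.2, 0.9, 0.5]
rates = [rate_code(s) for s in signals]
latencies = [latency_code(s) for s in signals]

# Sorting cells by latency recovers the firing order: strongest first.
firing_order = sorted(range(len(signals)), key=lambda i: latencies[i])
```

In this sketch the value computed for one cell never depends on the inputs of the other cells, which is the independence property noted above; the firing order emerges only when the individual latencies are compared.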
A different approach, described in, e.g., Meister, M. Multineuronal codes in retinal signaling. Proceedings of the National Academy of Sciences, 1996, 93, 609-614; Meister, M.; Berry, M. J. II. The neural code of the retina. Neuron, 1999, 22, 435-450; and Schnitzer, M. J.; Meister, M. Multineuronal Firing Patterns in the Signal from Eye to Brain. Neuron, 2003, 37, 499-511, encodes and multiplexes visual information into patterns of synchronous pulses involving multiple cells. The advantage of such a neuronal code is higher information transmission capacity, as it allows for multiplexing of various photoreceptor signals into the pulsed output of multiple RGCs. For example, 4 photoreceptor signals or features, 1, 2, 3, and 4, can be encoded into synchronous firings of 3 RGCs as 1→(1,0,0), 2→(0,1,0), 3→(0,0,1), and 4→(1,1,0), where, within each triplet, 1 represents the corresponding RGC firing and 0 represents a quiescent state. When photoreceptors 1 and 3 are active, the output is a superposition 1+3→(1,0,1), resulting in multiplexing. However, when too many photoreceptors are active, the output consists of a synchronous barrage of pulses (e.g., (1,1,1)) and the information is lost.
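The multiplexing example above can be sketched as follows. The code assumes that superposition acts as an element-wise logical OR of the individual firing patterns (an RGC fires if any active feature drives it); this reading is consistent with the 1+3→(1,0,1) example but is an assumption, and the names CODEBOOK and multiplex are hypothetical.

```python
# Illustrative sketch only: superposition is modeled as element-wise OR,
# which is an assumption consistent with the example in the text.

# Firing pattern across 3 RGCs for each of the 4 features
# (1 = the RGC fires, 0 = the RGC is quiescent).
CODEBOOK = {
    1: (1, 0, 0),
    2: (0, 1, 0),
    3: (0, 0, 1),
    4: (1, 1, 0),
}


def multiplex(active_features):
    """Superimpose the patterns of all active features onto the 3 RGCs."""
    output = [0, 0, 0]
    for feature in active_features:
        pattern = CODEBOOK[feature]
        output = [o | p for o, p in zip(output, pattern)]
    return tuple(output)


multiplexed = multiplex({1, 3})       # (1, 0, 1)
saturated = multiplex({1, 2, 3})      # (1, 1, 1)
also_saturated = multiplex({3, 4})    # (1, 1, 1)
```

Under this assumption, the active sets {1, 2, 3} and {3, 4} both produce (1,1,1), so a decoder cannot tell them apart; this is one concrete way in which the information is lost when too many photoreceptors are active.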
All existing approaches have limited information transmission capacity, at least in part because they (i) do not fully utilize multichannel patterns of pulses to encode visual signals, and (ii) do not fully take advantage of the brain's ability to learn to decode such patterns. Accordingly, there is a salient need for a more efficient and scalable visual encoding solution that utilizes data compression at the retinal level prior to data transmission to the brain, in order to, among other things, increase the resolution capabilities of retinal prosthetic devices.