The background description provided herein is for the purpose of generally presenting the context of the invention. The subject matter discussed in the background of the invention section should not be assumed to be prior art merely as a result of its mention in the background of the invention section. Similarly, a problem mentioned in the background of the invention section or associated with the subject matter of the background of the invention section should not be assumed to have been previously recognized in the prior art. The subject matter in the background of the invention section merely represents different approaches, which in and of themselves may also be inventions. Work of the presently named inventors, to the extent it is described in the background of the invention section, as well as aspects of the description that may not otherwise qualify as prior art at the time of filing, are neither expressly nor impliedly admitted as prior art against the invention.
Cochlear implants are considered the most successful neural prostheses. They restore some hearing for more than 320,000 severely to profoundly deaf individuals by stimulating segments along the length of the tonotopically organized cochlear spiral ganglion. The devices include three essential components: the speech processor, the transcutaneous transmitter and receiver, and the cochlear implant array. The devices record and process sound signals and convert the acoustic information of the sound signals into electric pulse trains that are used to directly stimulate the auditory nerve. Over the last two decades, the improvement of the coding strategies for acoustic information was the main contributor to the improvement of cochlear implants. Still, room exists to advance the technology.
Communication describes the transfer of information between individuals, such as speech. Acoustic information is processed at a rapid speed by the brain, at about three to seven syllables per second (Edwards and Chang, 2013; Prather, 2013; Scott and McGettigan, 2013; Wang, 2013). Vocal cords are set in vibration by airflow. The vibrations are modulated by the actions of the larynx, pharynx, lips, teeth, tongue, and the upper airway. Changes in frequency and intensity are used to code information in an acoustic signal. The listener uses the resulting acoustic patterns, or acoustic cues, to decode the information. Complex signal processing by the brain allows speech perception under challenging listening conditions. At the other end of the transmission line, the ear decodes the acoustic signal (Clark, 1995; Clark, 2003). The inner ear acts as a frequency analyzer and converts acoustically induced vibrations of the inner ear soft tissue structures into series of action potentials on the auditory nerve (von Békésy, 1960; Dallos, 1973; Davis, 1983; Hudspeth, 1989; Dallos, 1992; Dallos, 2003). The brain can use the information provided for communication. As indicated, acoustic information consists of complex acoustic patterns that are constantly changing. According to the articulation, different classes can be distinguished: vowels, semivowels, diphthongs, nasal consonants, stops, fricatives, and affricates. Several theories have been developed of how the information is processed, including active theories such as the “Motor Theory” by Liberman and the “Analysis-by-Synthesis Theory” by Stevens and Halle, and passive theories that emphasize the passive filtering of the acoustic signal by the listener. Moreover, the “Quantal Theory” and the “Action Theory” have been proposed.
Vocoders are systems that analyze, transmit, and synthesize speech. They provided the basis for the development of cochlear implant speech processors. One of the early systems was presented by Dudley (Dudley, 1939). The system included a set of bandpass filters, and the amplitude in each of the filters was measured continuously, as was the fundamental frequency of the speech. The responses from the bandpass filters were used to control the output of the system. About 2 to 4 filter bands were required for intelligible speech in quiet listening environments (Shannon, Fu, and Galvin, 2004). However, for more challenging listening conditions or for music perception, more independent channels are required to code the acoustic information: 16 or more for speech in noise and 30 or more for music perception (Shannon et al., 2004). The number of electrode contacts in contemporary cochlear implants is related to the number of critical bands for optimal speech transmission, about 14 to 19 critical bands over the speech frequency range.
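By way of illustration only, the band-amplitude analysis performed by a channel vocoder may be sketched as follows. The sketch substitutes a discrete Fourier transform of a single frame for analog bandpass filters; it is not Dudley's circuit. The function name band_amplitudes, the band edges, the frame length, and the 8 kHz sample rate are assumptions chosen for the example.

```python
import cmath
import math


def band_amplitudes(frame, sample_rate, band_edges_hz):
    """Return one amplitude estimate per band for a single frame of samples.

    Assumption: a naive DFT stands in for a bank of bandpass filters.
    Band i covers band_edges_hz[i] <= f < band_edges_hz[i + 1].
    """
    n = len(frame)
    # DFT for the non-negative frequency bins only.
    spectrum = [
        sum(frame[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        for k in range(n // 2 + 1)
    ]
    amplitudes = []
    for lo, hi in zip(band_edges_hz, band_edges_hz[1:]):
        # Sum the power of every bin whose center frequency lies in the band.
        power = sum(
            abs(spectrum[k]) ** 2
            for k in range(len(spectrum))
            if lo <= k * sample_rate / n < hi
        )
        amplitudes.append(math.sqrt(power) / n)
    return amplitudes


# Hypothetical test signal: a 1 kHz tone sampled at 8 kHz.
sample_rate = 8000
frame = [math.sin(2 * math.pi * 1000 * t / sample_rate) for t in range(256)]
edges = [100, 500, 1500, 4000]  # three illustrative bands
amps = band_amplitudes(frame, sample_rate, edges)
strongest_band = amps.index(max(amps))
```

In this example the tone falls in the middle band (500 to 1500 Hz), so that band carries essentially all of the measured amplitude. Repeating the measurement frame by frame yields the continuously measured band amplitudes described above.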
In initial attempts to encode the acoustic information, a single stimulation electrode was used (for a review see Clark, 2003). While this simple coding strategy provided the patients with information about syllables, words, phrases, and sentences, insufficient information was available to discriminate formants and their transitions. Single words could be recognized, but understanding of running speech was not possible for the first implantees. Research following the first implantation of a single-channel device (Djourno and Eyries, 1957) and a multichannel device (Doyle et al., 1963) showed that cochlear implants should be multichannel devices inserted into the scala tympani (Clark, Dowell, et al., 1984; Clark, Tong, et al., 1984; House and Berliner, 1982; House and Edgerton, 1982; Simmons, Dent, and Van Compernolle, 1986; Simmons, Mathews, Walker, and White, 1979). Initial results were improved by emphasizing the mid and high frequency cues (Edgerton and Brimacombe, 1984). A similar system was implemented in Vienna (Hochmair et al., 1979). The system included gain compression, followed by frequency equalization. While single words could be recognized, open-set speech recognition was not possible with any of the single-channel devices.
Multichannel coding strategies were developed to model the tonotopic organization of the cochlea. For the selection of the number of channels, the results from the vocoders were used. However, early attempts to stimulate at all channels simultaneously resulted in unpredictable changes in loudness (Shannon, 1981). To avoid interaction between channels, electrical stimuli were presented as pulses and non-simultaneously at neighboring electrode contacts (Shannon, 1981, 1983, 1985). While early coding strategies suffered from channel interactions caused by stimulating at neighboring electrodes, novel strategies were tested to overcome existing limitations in stimulation strategies and resulted in the introduction of the continuous interleaved sampler (CIS) coding strategy (Lawson, Wilson, and Finley, 1993; Wilson, 1997; Wilson et al., 1991; Wilson, Finley, Lawson, Wolford, and Zerbi, 1993).
The continuous interleaved sampler (CIS) coding strategy is in widespread use amongst current cochlear implants (Wilson and Dorman, 2008). CIS processing utilizes bandpass filters and then compresses the envelope signals extracted from these filters to map the large acoustic dynamic range of up to 100 dB to the smaller range of electrically evoked hearing, which is about 10 dB (Wilson and Dorman, 2008). The resulting trains of electrical pulses are then sent to tonotopically placed electrodes to mimic the frequency mapping of a normal cochlea (Wilson and Dorman, 2008; Flint, 2010). The amplitude of the transmitted pulse is determined by the amplitude of the original pulse from the acoustic signal (Flint, 2010). Cochlear implants seek to independently stimulate neuron sites to allow for the best speech perception, though studies suggest that no more than 4 to 8 independent sites can be stimulated in many electrode designs (Fishman, 1997; Wilson, 1997; Kiefer et al., 2000; Garnham et al., 2002). The CIS strategy attempts to avoid the issue of electrical interference and to stimulate more independent areas by transmitting the pulse trains across electrodes in an interleaved, non-simultaneous manner, such that there is a temporal offset between stimuli (Wilson et al., 1991; Wilson and Dorman, 2008). Additionally, the brief pulses are transmitted at a high rate (typically about 1500 pulses/s), which allows for the preservation of the temporal fine structure of the acoustic signal (Somek, 2006; Wilson and Dorman, 2008).
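The two CIS operations described above, dynamic range compression and interleaved pulse timing, may be sketched as follows, for illustration only. The linear-in-decibel mapping is a simplifying assumption (actual processors may use other compression functions), and the function names, the 100 µA threshold current, and the 8-channel layout are hypothetical values chosen for the example.

```python
def compress_db(acoustic_db, acoustic_range_db=100.0, electric_range_db=10.0):
    """Map an acoustic level (dB above the input floor) onto an electric
    level (dB above stimulation threshold).

    Assumption: a linear map in dB, so 100 dB of input spans 10 dB of output.
    """
    clipped = min(max(acoustic_db, 0.0), acoustic_range_db)
    return clipped * electric_range_db / acoustic_range_db


def current_from_db(electric_db, threshold_current_ua=100.0):
    """Convert a level in dB above threshold to a pulse current in microamps.

    The threshold current is a hypothetical per-electrode value.
    """
    return threshold_current_ua * 10 ** (electric_db / 20.0)


def interleaved_onsets(n_channels, rate_pps=1500.0, n_frames=1):
    """Return (onset_time_s, channel) pairs in which no two channels ever
    share an onset time; each channel fires rate_pps times per second."""
    period = 1.0 / rate_pps      # time between pulses on one channel
    slot = period / n_channels   # temporal offset between channels
    return [
        (frame * period + channel * slot, channel)
        for frame in range(n_frames)
        for channel in range(n_channels)
    ]


# Example: a mid-range envelope value and two stimulation frames.
level_db = compress_db(50.0)          # halfway through the acoustic range
current_ua = current_from_db(level_db)
onsets = interleaved_onsets(n_channels=8, n_frames=2)
```

With these assumptions, an input halfway through the 100 dB acoustic range lands halfway through the 10 dB electric range, and each of the 8 channels receives its own time slot within every 1/1500 s stimulation period.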
While the pattern of delivering the electrical pulses at each contact of the cochlear implant electrode is one part of the coding strategy, the selection of the acoustic information to be presented constitutes the second part of the coding strategy.
The selection algorithms include the Spectral Peak Extraction (SPEAK) coding strategy and the Advanced Combination Encoder (ACE) coding strategy and its variations. SPEAK sends the incoming signal through a bank of bandpass filters, then takes approximately 6 ms to scan the output of those filters and selects for transmission to the cochlea the 6 filters with the most energy, that is, the frequencies with the greatest amplitude or the highest spectral peaks (Somek, 2006). Electrodes are then stimulated in a basal to apical direction. About 6 to 8 electrodes are usually stimulated, but the more electrodes that are stimulated, the slower the rate of transmission of the outgoing signal. The ACE coding strategy is very similar to SPEAK, except that it utilizes higher stimulation rates, as does CIS, than the low-rate SPEAK strategy (Rubinstein, 2004; Wilson and Dorman, 2008). It was designed to combine the benefits of SPEAK with those of a high-rate CIS. ACE provides for the transmission of more information to the auditory nerve compared to SPEAK. Pulse rates of 500 to 3500 pulses per second, and a maxima range from 1 to 20 electrodes stimulated simultaneously, can be achieved (Flint, 2010).
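The peak-picking step common to SPEAK and ACE may be sketched as follows, for illustration only. The function name select_maxima and the example energies are hypothetical, and the sketch assumes that band index 0 corresponds to the most apical (lowest-frequency) electrode, so that a higher index means a more basal electrode.

```python
def select_maxima(band_energies, n_maxima=6):
    """Pick the n_maxima bands with the most energy and return their
    indices in stimulation order (basal to apical).

    Assumption: index 0 is the most apical band, so basal-to-apical
    order means descending index.
    """
    if n_maxima < 0:
        raise ValueError("n_maxima must be non-negative")
    # Rank all band indices by energy, largest first.
    ranked = sorted(
        range(len(band_energies)),
        key=lambda i: band_energies[i],
        reverse=True,
    )
    chosen = ranked[:n_maxima]
    # Stimulate the chosen electrodes from base to apex.
    return sorted(chosen, reverse=True)


# Hypothetical filter-bank output for one scan of 10 bands.
energies = [0.1, 0.9, 0.3, 0.8, 0.05, 0.7, 0.2, 0.6, 0.4, 0.5]
stimulation_order = select_maxima(energies, n_maxima=6)
```

Here the six largest energies belong to bands 1, 3, 5, 7, 8, and 9, so the stimulation order is 9, 8, 7, 5, 3, 1. Raising n_maxima selects more electrodes per scan, which, as noted above, lowers the rate at which each scan can be delivered.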
The auditory sensation that is used to order sound along a scale from quiet to loud is defined as loudness. It is a subjective sensation that correlates with sound intensity and changes with frequency. The neurophysiological correlates of loudness are the rate of action potentials at a nerve fiber and the recruitment, that is, the number of neurons that are excited at the same time. The ears of a normal hearing subject can cover a 120 dB range of sound levels. Hearing impairment decreases this range drastically, and in cochlear implant users the range is typically less than 20 dB.
The decrease of the range over which loudness can be coded can be attributed to two factors: the loss of spontaneously active fibers, and the all-or-nothing recruitment of auditory nerve fibers in the current field during electrical stimulation. Auditory nerve fibers with low spontaneous activity require higher sound levels for stimulation. Combined with the auditory nerve fibers that respond to soft sounds, the entire population of fibers can encode the 120 dB range in sound levels. Loss of a population of nerve fibers, or the synchronous discharge of neurons, limits the dynamic range of artificial stimulation.
Pulse repetition rates that are faster than the recovery of an auditory neuron after an action potential has occurred result in more stochastic activity of the nerve. Similar stochastic neural activity can be seen at the threshold of stimulation. The latter point is important for this patent because the novel coding strategy allows the reduction of the current such that a stochastic firing pattern occurs. The rate increase is not achieved by an increase in the current amplitude, but by the pulse generator.
Biophysical properties of the cochlea and its solutions determine the current spread during electrical stimulation. For monopolar stimulation, in which one of the electrodes is placed in the cochlea and the reference electrode is located outside the cochlea, interaction occurs not only between close neighboring electrodes. The current spreads for about 3 mm along the cochlea (the equivalent of about 3 electrode contacts). Multipolar stimulation paradigms may offer some opportunities to focus the current field on the target structures or to stimulate areas between two electrode contacts. The price for the selectivity is an increase in power consumption and the simultaneous use of multiple electrode contacts. It is not surprising that multipolar stimulation did not result in drastic improvements in patient performance.
To avoid interactions between neighboring electrodes, contemporary coding strategies use interleaved stimulation paradigms. Amplitude-modulated trains of electrical pulses at repetition rates of about, and well above, 300 Hz are used to encode the acoustical information. It has been argued that high repetition rates, which are well above 300 Hz, better reproduce the fine structure of the auditory signal and produce more stochastic activity, which increases the range over which the current level can be changed.
Therefore, a heretofore unaddressed need exists in the art to address the aforementioned deficiencies and inadequacies.