A normal ear transmits sounds as shown in FIG. 1 through the outer ear 101 to the tympanic membrane (eardrum) 102, which moves the bones of the middle ear 103, which in turn vibrate the oval window and round window openings of the cochlea 104. The cochlea 104 is a long narrow duct wound spirally about its axis for approximately two and a half turns. The cochlea 104 includes an upper channel known as the scala vestibuli and a lower channel known as the scala tympani, which are connected by the cochlear duct. The scala tympani forms an upright spiraling cone with a center called the modiolar bone where the spiral ganglion cells of the cochlear nerve 113 reside. In response to received sounds transmitted by the middle ear 103, the fluid-filled cochlea 104 functions as a transducer to generate electric pulses that are transmitted by the cochlear nerve 113 to the brain. Hearing is impaired when there are problems in the ability to transduce external sounds into meaningful action potentials along the neural substrate of the cochlea 104.
In some cases, hearing impairment can be addressed by a cochlear implant (CI) or by a brainstem, midbrain or cortical implant that electrically stimulates auditory nerve tissue with small currents delivered by multiple electrode contacts distributed along an implant electrode. For cochlear implants, the electrode array is inserted into the cochlea. For brainstem, midbrain and cortical implants, the electrode array is located in the auditory brainstem, midbrain or cortex, respectively. FIG. 1 shows some components of a typical cochlear implant system where an external microphone provides an audio signal input to an external signal processor 111 which implements one of various known signal processing schemes. For example, signal processing approaches that are well-known in the field of cochlear implants include continuous interleaved sampling (CIS) digital signal processing, channel specific sampling sequences (CSSS) digital signal processing (as described in U.S. Pat. No. 6,348,070, incorporated herein by reference), spectral peak (SPEAK) digital signal processing, fine structure processing (FSP) and compressed analog (CA) signal processing. The processed signal is converted by the external signal processor 111 into a digital data format, such as a sequence of data frames, for transmission by an external coil 107 into a receiving stimulator processor 108. Besides extracting the audio information, the receiver processor in the stimulator processor 108 may perform additional signal processing such as error correction, pulse formation, etc., and produces a stimulation pattern (based on the extracted audio information) that is sent through electrode lead 109 to an implanted electrode array 110. Typically, the electrode array 110 includes multiple stimulation contacts on its surface that provide selective electrical stimulation of the cochlea 104.
Improving coding strategies for cochlear implants requires speech perception tests with large numbers of patients, which are very time-consuming and depend to a large extent on the individuals tested. If changes involve new hardware features of the implant, these tests are not possible before the new devices are implanted. Performance improvements are therefore difficult to prove, because they require subjective speech tests with large numbers of cochlear implant patients.
The coding strategies for generating the data signals for cochlear implant systems long neglected any fine-grained temporal information, as did automatic speech recognition (ASR) systems. This was because of the notion that the spectral envelopes of the stimulation pulses code all relevant information required for speech understanding. In addition, there simply were no recognized techniques for extracting temporal speech features.
The development of coding strategies for cochlear implants relies mainly on educated guesses as to how to improve the temporal features of those strategies, followed by time-consuming tests on patients. WO 2013/009805 proposed evaluating coding strategies using the rate-place code only. As shown in FIG. 2, a CI signal processor 201 generated stimulation signals for an implanted electrode array. Feature adjustment module 202 adjusted the feature resolution of the CI stimulation signals to produce a corresponding sequence of cochlear stimulation vectors for automatic speech recognition (ASR) processing. ASR vector pre-processor 203 mapped the cochlear stimulation vectors into corresponding vectors for ASR, and ASR engine 204 evaluated those vectors using ASR techniques. But the rate-place code exploited in that publication does not preserve information about the temporal fine structure.
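The loss of temporal fine structure in a rate-place representation can be illustrated with a minimal Python sketch. It is not taken from WO 2013/009805: the function name `rate_place_code`, the 10 ms analysis window and the pulse times are hypothetical values chosen only for illustration. Two pulse trains on the same electrode channel, one regularly spaced and one tightly bunched, produce the same pulse count per window, so a pure rate-place code cannot distinguish them.

```python
def rate_place_code(pulse_times_us_by_channel, window_us, duration_us):
    """Count pulses per channel in consecutive time windows.

    Times are integer microseconds. The result has one row per channel
    (place) and one pulse count per window (rate).
    """
    n_windows = duration_us // window_us
    code = []
    for times in pulse_times_us_by_channel:
        counts = [0] * n_windows
        for t in times:
            idx = t // window_us
            if 0 <= idx < n_windows:
                counts[idx] += 1
        code.append(counts)
    return code


def intervals(times):
    """Intervals between successive pulses (the fine timing information)."""
    return [b - a for a, b in zip(times, times[1:])]


# Hypothetical pulse trains on a single channel, in microseconds.
regular = [1000, 3500, 6000, 8500]   # evenly spaced pulses
bunched = [1000, 1200, 1400, 1600]   # same number of pulses, tightly grouped

code_regular = rate_place_code([regular], window_us=10_000, duration_us=20_000)
code_bunched = rate_place_code([bunched], window_us=10_000, duration_us=20_000)

print(code_regular == code_bunched)                 # same rate-place code
print(intervals(regular) == intervals(bunched))     # different pulse timing
```

In this sketch the timing difference is visible only in the inter-pulse intervals, which the windowed count discards.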
In research, several methods to extract temporal fine-structure information have been described. Oded Ghitza, Auditory Nerve Representation as a Front-End for Speech Recognition in a Noisy Environment, Computer Speech & Language, Volume 1, Issue 2, December 1986, Pages 109-130 (incorporated herein by reference) suggested that an ensemble interval histogram might be used for temporal speech feature extraction. Hugh and Campbell L. Searle, Time-Domain Analysis of Auditory-Nerve Fiber Firing Rates, Journal of the Acoustical Society of America, 85(S1):S534, 1989 (incorporated herein by reference) described extracting temporal features using inter-peak interval histograms. Average localized synchronized rate (ALSR) was used by Young and Sachs, Representation of Steady-State Vowels in the Temporal Aspects of the Discharge Patterns of Populations of Auditory-Nerve Fibers, Journal of the Acoustical Society of America, 66(5):1381-1403, 1979 (incorporated herein by reference).
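The general idea behind interval-based temporal features can be sketched in Python as follows. This is a simplified illustration under stated assumptions, not the method of any of the cited authors: the function names `find_peaks` and `inter_peak_interval_histogram`, the 16 kHz sample rate, the 1 ms bin width and the 200 Hz test tone are all hypothetical choices. The sketch locates local maxima in a single-channel waveform and histograms the intervals between successive peaks, so that a periodic signal concentrates its counts in the bin matching its period.

```python
import math


def find_peaks(samples):
    """Indices of local maxima (the first sample of any flat top)."""
    return [
        i
        for i in range(1, len(samples) - 1)
        if samples[i - 1] < samples[i] >= samples[i + 1]
    ]


def inter_peak_interval_histogram(samples, bin_width_samples, n_bins):
    """Histogram of intervals between successive peaks.

    Intervals are measured in samples; bin k covers intervals in
    [k * bin_width_samples, (k + 1) * bin_width_samples).
    Intervals beyond the last bin are ignored.
    """
    peaks = find_peaks(samples)
    hist = [0] * n_bins
    for a, b in zip(peaks, peaks[1:]):
        idx = (b - a) // bin_width_samples
        if idx < n_bins:
            hist[idx] += 1
    return hist


# Hypothetical test signal: a 200 Hz tone sampled at 16 kHz for 50 ms.
SAMPLE_RATE_HZ = 16_000
TONE_HZ = 200
tone = [
    math.sin(2 * math.pi * TONE_HZ * n / SAMPLE_RATE_HZ)
    for n in range(800)
]

# 16 samples per bin corresponds to 1 ms at 16 kHz.
hist = inter_peak_interval_histogram(tone, bin_width_samples=16, n_bins=10)
dominant_bin = hist.index(max(hist))
print(hist, dominant_bin)
```

For the test tone, the dominant bin corresponds to the 5 ms period of the 200 Hz signal. An ensemble variant would, under the same assumptions, sum such histograms over multiple frequency channels.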