Most sounds are transmitted in a normal ear as shown in FIG. 1 through the outer ear 101 to the tympanic membrane (eardrum) 102, which moves the bones of the middle ear 103 (malleus, incus, and stapes) that vibrate the oval window and round window openings of the cochlea 104. The cochlea 104 is a long narrow duct wound spirally about its axis for approximately two and a half turns. It includes an upper channel known as the scala vestibuli and a lower channel known as the scala tympani, which are connected by the cochlear duct. The cochlea 104 forms an upright spiraling cone with a center called the modiolus where the spiral ganglion cells of the acoustic nerve 113 reside. In response to received sounds transmitted by the middle ear 103, the fluid-filled cochlea 104 functions as a transducer to generate electric pulses which are transmitted to the cochlear nerve 113, and ultimately to the brain.
Hearing is impaired when there are problems in the ability to transduce external sounds into meaningful action potentials along the neural substrate of the cochlea 104. To improve impaired hearing, auditory prostheses have been developed. For example, when the impairment is associated with the cochlea 104, a cochlear implant with an implanted stimulation electrode can electrically stimulate auditory nerve tissue with small currents delivered by multiple electrode contacts distributed along the electrode.
In some cases, hearing impairment can be addressed by a cochlear implant (CI), a brainstem-, midbrain- or cortical implant that electrically stimulates auditory nerve tissue with small currents delivered by multiple electrode contacts distributed along an implant electrode. For cochlear implants, the electrode array is inserted into the cochlea. For brainstem, midbrain and cortical implants, the electrode array is located in the auditory brainstem, midbrain or cortex, respectively.
FIG. 1 shows some components of a typical cochlear implant system where an external microphone provides an audio signal input to an external signal processor 111 which implements one of various known signal processing schemes. For example, signal processing approaches that are well-known in the field of cochlear implants include continuous interleaved sampling (CIS) digital signal processing, channel specific sampling sequences (CSSS) digital signal processing (as described in U.S. Pat. No. 6,348,070, incorporated herein by reference), spectral peak (SPEAK) digital signal processing, fine structure processing (FSP) and compressed analog (CA) signal processing.
The processed signal is converted by the external signal processor 111 into a digital data format, such as a sequence of data frames, for transmission by an external coil 107 into a receiving stimulator processor 108. Besides extracting the audio information, the receiver processor in the stimulator processor 108 may perform additional signal processing such as error correction, pulse formation, etc., and produces a stimulation pattern (based on the extracted audio information) that is sent through electrode lead 109 to an implanted electrode array 110. Typically, the electrode array 110 includes multiple stimulation contacts 112 on its surface that provide selective electrical stimulation of the cochlea 104.
Generally, there is a need to obtain data from the implanted components of a cochlear implant. Such data collection enables detection and confirmation of the normal operation of the device, and allows stimulation parameters to be optimized to suit the needs of individual recipients. This includes data relating to the response of the auditory nerve to stimulation, which is of particular relevance to the present invention. Thus, regardless of the particular configuration, cochlear implants generally have the capability to communicate with an external device such as for program upgrades and/or implant interrogation, and to read and/or alter the operating parameters of the device.
Typically, following the surgical implantation of a cochlear implant, the implant is fitted or customized to conform to the specific recipient's needs. This involves the collection and determination of patient-specific parameters such as threshold levels (T levels) and maximum comfort levels (C levels) for each stimulation channel. Essentially, the procedure is performed manually by applying stimulation pulses for each channel and receiving an indication from the implant recipient as to the level and comfort of the resulting sound. For implants with a large number of stimulation channels, this process is quite time-consuming and rather subjective, as it relies heavily on the recipient's subjective impression of the stimulation rather than on any objective measurement.
This approach is further limited in the case of children and prelingually or congenitally deaf patients who are unable to supply an accurate impression of the resultant hearing sensation, and hence fitting of the implant may be suboptimal. In such cases an incorrectly-fitted cochlear implant may result in the recipient not receiving optimum benefit from the implant, and in the cases of children, may directly hamper the speech and hearing development of the child. Therefore, there is a need to obtain objective measurements of patient-specific data, especially in cases where an accurate subjective measurement is not possible.
One proposed method of interrogating the performance of an implanted cochlear implant and making objective measurements of patient-specific data such as T and C levels is to directly measure the response of the auditory nerve to an electrical stimulus. To collect information about the electrode-nerve interface, a commonly used objective measurement is based on the measurement of Neural Action Potentials (NAPs) such as the electrically-evoked Compound Action Potential (eCAP), as described by Gantz et al., Intraoperative Measures of Electrically Evoked Auditory Nerve Compound Action Potentials, American Journal of Otology 15 (2):137-144 (1994), which is incorporated herein by reference. In this approach, the recording electrode is usually placed at the scala tympani of the inner ear. The overall response of the auditory nerve to an electrical stimulus is typically measured very close to the position of the nerve excitation. This neural response is caused by the superposition of single neural responses at the outside of the auditory nerve membranes. The response is characterized by the amplitude between the minimum voltage (a peak typically called N1) and the maximum voltage (a peak typically called P2). The amplitude of the eCAP at the measurement position is between 10 μV and 1800 μV. One eCAP recording paradigm is a so-called “amplitude growth function,” as described by Brown et al., Electrically Evoked Whole Nerve Action Potentials In Ineraid Cochlear Implant Users: Responses To Different Stimulating Electrode Configurations And Comparison To Psychophysical Responses, Journal of Speech and Hearing Research, vol. 39:453-467 (June 1996), which is incorporated herein by reference. This function is the relation between the amplitude of the stimulation pulse and the peak-to-peak voltage of the eCAP. Another clinically used recording paradigm is the so-called “recovery function,” in which stimulation is achieved with two pulses with varying interpulse intervals.
The recovery function, defined as the relation between the amplitude of the second eCAP and the interpulse interval, allows conclusions to be drawn about the refractory properties of the auditory nerve and about particular properties concerning its time resolution.
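As an illustration of the N1-P2 amplitude and the amplitude growth function described above, the following minimal Python sketch computes a peak-to-peak value per stimulation level. The function names, the toy recordings, and the simplification that N1 is the global minimum and P2 the largest value after it are assumptions made for illustration only; a practical system would restrict both peaks to latency windows.

```python
def ecap_amplitude(recording_uv):
    """Return the N1-P2 peak-to-peak amplitude of a recording (microvolts).

    Assumption: N1 is the global minimum and P2 is the largest sample
    at or after N1. Real systems search within latency windows instead.
    """
    n1_index = min(range(len(recording_uv)), key=recording_uv.__getitem__)
    n1 = recording_uv[n1_index]
    p2 = max(recording_uv[n1_index:])
    return p2 - n1


def amplitude_growth_function(recordings_by_level):
    """Map each stimulation amplitude to the eCAP peak-to-peak amplitude."""
    return {level: ecap_amplitude(rec)
            for level, rec in sorted(recordings_by_level.items())}


# Hypothetical toy recordings (microvolts), keyed by stimulation amplitude
# in arbitrary current units.
recordings = {
    100: [0.0, -5.0, -20.0, -5.0, 10.0, 5.0, 0.0],
    200: [0.0, -20.0, -80.0, -10.0, 40.0, 15.0, 0.0],
    300: [0.0, -50.0, -200.0, -30.0, 100.0, 40.0, 0.0],
}
agf = amplitude_growth_function(recordings)
```

A recovery function could be tabulated in the same way by keying the recordings on the interpulse interval rather than on the stimulation amplitude.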
Detecting NAPs such as eCAPs is based on an analysis of an obtained measurement recording (R), which can be understood as a signal mixture containing the desired NAPs (A), artifacts due to the stimulation (B), artifacts due to other sources (C), and noise (D). A linear model of this signal mixture is:
R = A + B + C + D
State-of-the-art NAP measurement systems apply special recording sequences to reduce the unwanted artifacts and the noise present during the measurement. The stimulation artifact (B) is partially removed from the recording (R) by different measurement paradigms such as “alternating stimulation” (Eisen M D, Franck K H: “Electrically Evoked Compound Action Potential Amplitude Growth Functions and HiResolution Programming Levels in Pediatric CII Implant Subjects.” Ear & Hearing 2004, 25 (6):528-538; which is incorporated herein by reference in its entirety), “masker probe” (Brown C, Abbas P, Gantz B: “Electrically evoked whole-nerve action potentials: data from human cochlear implant users.” The Journal of the Acoustical Society of America 1990, 88 (3):1385-1391; Miller C A, Abbas P J, Brown C J: An improved method of reducing stimulus artifact in the electrically evoked whole-nerve potential. Ear & Hearing 2000, 21 (4):280-290; both of which are incorporated herein by reference in their entireties), “tri-phasic stimulation” (Zimmerling M: “Messung des elektrisch evozierten Summenaktionspotentials des Hörnervs bei Patienten mit einem Cochlea-Implantat.” In PhD thesis Universität Innsbruck, Institut für Angewandte Physik; 1999; Schoesser H, Zierhofer C, Hochmair E S. “Measuring electrically evoked compound action potentials using triphasic pulses for the reduction of the residual stimulation artefact,” In: Conference on implantable auditory prostheses; 2001; both of which are incorporated herein by reference in their entireties), and “scaled template” (Miller C A, Abbas P J, Rubinstein J T, Robinson B, Matsuoka A, Woodworth G: Electrically evoked compound action potentials of guinea pig and cat: responses to monopolar, monophasic stimulation. Hearing Research 1998, 119 (1-2):142-154; which is incorporated herein by reference in its entirety). Artifacts due to other sources (C) are partially removed by a “zero amplitude template” (Brown et al. 2000). 
The noise (D) is reduced by repeated measurements: averaging over N repeated recordings reduces the noise level by a factor of √N.
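The effect of alternating stimulation and of averaging on the linear signal mixture can be sketched numerically. The Python example below is a toy simulation under stated assumptions, not a description of any clinical system: the waveform shapes, amplitudes, and function names are hypothetical, the stimulation artifact (B) is assumed to invert exactly with stimulus polarity while the neural response (A) does not, the other artifacts (C) are omitted, and the noise (D) is assumed to be independent Gaussian noise.

```python
import math
import random
import statistics

random.seed(0)
N_SAMPLES = 200


def neural_response(t):
    # Hypothetical eCAP-like shape (A): a negative dip followed by a positive peak.
    return (-100.0 * math.exp(-((t - 30) / 8.0) ** 2)
            + 60.0 * math.exp(-((t - 60) / 12.0) ** 2))


def stimulation_artifact(t, polarity):
    # Hypothetical decaying artifact (B); its sign follows the stimulus polarity.
    return polarity * 500.0 * math.exp(-t / 20.0)


def record(polarity, noise_sd=20.0):
    # R = A + B + D (other artifacts C are left out of this toy model).
    return [neural_response(t) + stimulation_artifact(t, polarity)
            + random.gauss(0.0, noise_sd) for t in range(N_SAMPLES)]


def average(recordings):
    # Sample-wise mean over all recordings.
    return [sum(samples) / len(samples) for samples in zip(*recordings)]


def residual_noise_sd(n_pairs):
    # Average n_pairs recordings of each polarity so that B cancels, then
    # subtract the known A to isolate what is left of the noise D.
    recs = ([record(+1) for _ in range(n_pairs)]
            + [record(-1) for _ in range(n_pairs)])
    avg = average(recs)
    residual = [avg[t] - neural_response(t) for t in range(N_SAMPLES)]
    return statistics.pstdev(residual)


sd_single_pair = residual_noise_sd(1)      # 2 recordings averaged
sd_hundred_pairs = residual_noise_sd(100)  # 200 recordings averaged
# With 100 times more recordings, the residual noise should shrink by
# roughly sqrt(100) = 10, up to sampling error.
noise_reduction = sd_single_pair / sd_hundred_pairs
```

In practice the cancellation of B is only partial and A may differ between polarities, which is consistent with the artifact being described as partially removed.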
These special recording sequences result in a processed recording (R′) with a reduced noise floor (D′) and remaining artifacts (B′ and C′) which in most cases are reduced in amplitude. Some recording sequences also result in an altered NAP response (A′), for example the “masker probe” paradigm (Westen, A. A.; Dekker, D. M. T.; Briaire, J. J. & Frijns, J. H. M. “Stimulus level effects on neural excitation and eCAP amplitude.” Hear Res, 2011, 280, 166-176; which is incorporated herein by reference in its entirety).
To automatically detect a NAP response in the resulting recording (R′), one commonly used technique is known as “template matching” (SmartNRI as used by Advanced Bionics; Arnold, L. & Boyle, P. “SmartNRI: algorithm and mathematical basis.” Proceedings of 8th EFAS Congress/10th Congress of the German Society of Audiology, 2007; which is incorporated herein by reference in its entirety). First, an additional de-noising of the recording (R′) is performed by calculating correlations with basis functions predefined by a principal component analysis and performing a weighted summation, resulting in a recording (R″) with reduced noise (see U.S. Pat. No. 7,447,549; which is incorporated herein by reference in its entirety). Then an artifact model (BModel+CModel) representing the sum of two decaying exponentials is fitted to this post-processed recording (R″), and a threshold on a strength of response metric (SOR=(R″−BModel−CModel)/noise) is used to detect a possible NAP (A) (U.S. Pat. No. 7,818,052; which is incorporated herein by reference in its entirety).
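The artifact-model step can be illustrated with a short Python sketch. This is not the cited implementation: the time constants of the two decaying exponentials are assumed to be known and fixed so that the fit reduces to linear least squares, the residual is summarized by its root-mean-square value (the source formula does not specify how the residual is reduced to a single number), and all names and numeric values are hypothetical.

```python
import math


def fit_two_exponentials(recording, tau_fast, tau_slow):
    """Least-squares fit of a*exp(-t/tau_fast) + b*exp(-t/tau_slow).

    Assumption: both time constants are fixed in advance, so only the
    amplitudes a and b are estimated via the 2x2 normal equations.
    """
    f = [math.exp(-t / tau_fast) for t in range(len(recording))]
    s = [math.exp(-t / tau_slow) for t in range(len(recording))]
    ff = sum(x * x for x in f)
    ss = sum(x * x for x in s)
    fs = sum(x * y for x, y in zip(f, s))
    fr = sum(x * r for x, r in zip(f, recording))
    sr = sum(x * r for x, r in zip(s, recording))
    det = ff * ss - fs * fs
    a = (fr * ss - fs * sr) / det
    b = (ff * sr - fs * fr) / det
    return [a * x + b * y for x, y in zip(f, s)]


def strength_of_response(recording, noise_level, tau_fast=5.0, tau_slow=40.0):
    # SOR = (recording - artifact model) / noise, using the RMS of the residual.
    model = fit_two_exponentials(recording, tau_fast, tau_slow)
    residual = [r - m for r, m in zip(recording, model)]
    rms = math.sqrt(sum(x * x for x in residual) / len(residual))
    return rms / noise_level


# Hypothetical test signals: a pure artifact, and the same artifact plus an
# eCAP-like response.
artifact = [300.0 * math.exp(-t / 5.0) + 50.0 * math.exp(-t / 40.0)
            for t in range(100)]
response = [-100.0 * math.exp(-((t - 30) / 8.0) ** 2)
            + 60.0 * math.exp(-((t - 60) / 12.0) ** 2) for t in range(100)]
with_response = [a + r for a, r in zip(artifact, response)]

sor_artifact_only = strength_of_response(artifact, noise_level=5.0)
sor_with_response = strength_of_response(with_response, noise_level=5.0)
```

A detection decision would then compare the metric against a threshold; any particular threshold value would have to be chosen empirically and is not given in the text above.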
Another approach to automatically detect a NAP response in the resulting recording (R′) is known as an expert system (AutoNRT™ as used by Cochlear Ltd.; Botros, A.; van Dijk, B. & Killian, M. “AutoNRT™: An automated system that measures ECAP thresholds with the Nucleus® Freedom™ cochlear implant via machine intelligence” Artificial Intelligence in Medicine, 2007, 40, 15-28; which is incorporated herein by reference in its entirety). The expert system used is a combination of a template matching classifier and a decision tree classifier (U.S. Patent Publication US 20080319508 A1; which is incorporated herein by reference in its entirety). The template matching classifier computes the correlation with a NAP (A) template and a NAP plus stimulation artifact (A+B) template. The decision tree uses the following six parameters:
(1) N1-P1 amplitude at typical NAP latencies
(2) noise level
(3) ratio of N1-P1 amplitude to noise level
(4) correlation with NAP (A) template
(5) correlation with NAP plus stimulation artifact (A+B) template
(6) correlation between this measurement (R) and a previous measurement at a lower stimulation amplitude.
Two different decision tree classifiers were learned with a C5.0 decision tree algorithm. For the case where no NAP (A) was detected at lower stimulation levels, the stimulation level was increased and a decision tree with a low false positive rate was used to determine the presence of a NAP (A). For the case where a NAP (A) was detected, the stimulation level was reduced and a decision tree with a low overall error rate was used to evaluate the presence of a NAP (A).
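The six decision-tree inputs listed above can be sketched as a feature-extraction step. The Python example below is illustrative only: the function and key names, the templates, and the simplified peak picking (global minimum followed by the largest later sample) are assumptions, and the decision trees themselves are not reproduced.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    mx = sum(x) / len(x)
    my = sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def extract_features(recording, previous_recording, nap_template,
                     nap_plus_artifact_template, noise_level):
    # Simplified peak picking: the negative peak is the global minimum and
    # the positive peak is the largest sample at or after it.
    n1_index = min(range(len(recording)), key=recording.__getitem__)
    amplitude = max(recording[n1_index:]) - recording[n1_index]
    return {
        "n1_p1_amplitude": amplitude,
        "noise_level": noise_level,
        "amplitude_to_noise": amplitude / noise_level,
        "corr_nap": pearson(recording, nap_template),
        "corr_nap_plus_artifact": pearson(recording, nap_plus_artifact_template),
        "corr_previous": pearson(recording, previous_recording),
    }


# Hypothetical templates and recordings (arbitrary units).
nap_template = [0.0, -1.0, -4.0, -1.0, 2.0, 1.0, 0.0]
artifact = [8.0, 4.0, 2.0, 1.0, 0.5, 0.25, 0.125]
nap_plus_artifact_template = [n + a for n, a in zip(nap_template, artifact)]
previous_recording = list(nap_template)          # lower stimulation level
recording = [2.0 * v for v in nap_template]      # higher stimulation level

features = extract_features(recording, previous_recording, nap_template,
                            nap_plus_artifact_template, noise_level=2.0)
```

The resulting dictionary would be the input to whichever classifier is used; a tree tuned for a low false positive rate and one tuned for a low overall error rate could share the same feature vector.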
An established working hypothesis is that neurosensory systems are performing a highly optimized signal analysis using a sparse representation (see for example B. Olshausen and D. Field, “Sparse coding of sensory inputs,” Curr Opin Neurobiol, vol. 14, no. 4, pp. 481-487, 2004, incorporated herein by reference in its entirety). Such a signal model is important in the context of analysis, estimation and automatic detection of a signal. The earliest theoretical signal analysis model, proposed by Fourier (J. B. J. Fourier, Théorie analytique de la chaleur (The Analytical Theory of Heat). Paris: F. Didot, 1822, incorporated herein by reference in its entirety), analyzes the frequency content of a signal using the expansion of functions into a weighted sum of sinusoids. Gabor (D. Gabor, “Theory of communications,” Journal of Institute of Electrical Engineers, vol. 93, no. III-26, pp. 429-457, 1946, incorporated herein by reference in its entirety) extended this signal model by using shifted and modulated time-frequency atoms which analyze the signal in the frequency as well as in the time dimension. The wavelet signal model, a further improvement presented by Morlet et al. (J. Morlet, G. Arens, I. Fourgeau, and D. Giard, “Wave propagation and sampling theory,” Geophysics, vol. 47, no. 2, pp. 203-236, 1982, incorporated herein by reference in its entirety), uses time-frequency atoms that are scaled dependent on their center frequency. This yields an analysis of the time-frequency plane with a non-uniform tiling. However, the time-frequency atoms used in these signal models normally do not assume an underlying signal structure. As the performance of subsequent detection algorithms depends strongly on how well the fundamental features of a signal are captured, it is favorable to use time-frequency atoms that are specialized to the applied signal class and inherently exhibit the property of a sparse representation. 
To derive such a data-dependent sparse signal model, several algorithms have been proposed, including, but not limited to, MOD (K. Engan, S. O. Aase, and J. H. Husøy, “Method of optimal directions for frame design”, Proc. ICASSP, Vol. 5, pp. 2443-2446, 1999, incorporated herein by reference in its entirety) and K-SVD (U.S. Pat. No. 8,165,215, incorporated herein by reference in its entirety).
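For orientation, the dictionary-learning problem that such algorithms address can be written compactly. The notation below is an assumption introduced here and does not appear in the cited references in this form: Y holds the training signals as columns, D is the dictionary whose columns are the learned atoms, X holds the sparse coefficients, and T_0 bounds the number of non-zero coefficients per signal.

```latex
\min_{D,\,X} \; \lVert Y - D X \rVert_F^2
\quad \text{subject to} \quad \lVert x_i \rVert_0 \le T_0 \ \text{for every column } x_i \text{ of } X .

% MOD alternates a sparse-coding step (X estimated with D fixed) with a
% closed-form least-squares dictionary update (D estimated with X fixed):
D \leftarrow Y X^{\mathsf{T}} \left( X X^{\mathsf{T}} \right)^{-1}
```

The update assumes that X X^T is invertible. K-SVD addresses the same objective but updates the atoms one at a time, together with the coefficients that use them.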