Sounds can be described as the sum of band-limited signals, each of which corresponds to the product of an envelope (the slow amplitude fluctuations) and a fine structure (the rapid fluctuations in amplitude close to the center frequency of the signal). In everyday life, our acoustic environment is generally composed of more than one sound, each produced by an independent source. Processing the information corresponding to a particular source often requires isolating one sound from the mixture of sounds. Further, full analysis of the auditory scene involves monitoring and awareness of the multitude of sound sources in the environment. The auditory system of a human with normal hearing function is reasonably effective in extracting a sound from a mixture. For instance, when several persons are talking simultaneously, the auditory system is able to “tune in” to a single voice and “tune out” all others. The auditory system of a human with normal hearing function is also reasonably effective at maintaining an awareness of multiple sound sources and switching attention between these sources, should that become necessary. Studies suggest that temporal fine structure (TFS) cues play an important role in extracting the desired audio signal from a mixture of sounds, especially when the background is fluctuating in frequency and/or time.
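By way of illustration only, the envelope/fine-structure decomposition described above can be sketched with an analytic-signal (Hilbert) approach, which is one common way to perform it and is not necessarily the method used by any particular device. The function names, the naive DFT, and the test signal parameters below are assumptions chosen for this sketch.

```python
import cmath
import math

def analytic_signal(x):
    """Analytic signal of a real sequence, using a naive O(N^2) DFT."""
    n = len(x)
    spectrum = [sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
                for k in range(n)]
    for k in range(1, n):
        if n % 2 == 0 and k == n // 2:
            continue                      # leave the Nyquist bin unchanged
        if k < (n + 1) // 2:
            spectrum[k] *= 2              # double positive frequencies
        else:
            spectrum[k] = 0               # discard negative frequencies
    return [sum(spectrum[k] * cmath.exp(2j * math.pi * k * t / n) for k in range(n)) / n
            for t in range(n)]

def envelope_and_tfs(band):
    """Split one band-limited signal into envelope and temporal fine structure."""
    z = analytic_signal(band)
    envelope = [abs(v) for v in z]                  # slow amplitude fluctuations
    tfs = [math.cos(cmath.phase(v)) for v in z]     # unit-amplitude carrier
    return envelope, tfs

# Hypothetical test band: a slow envelope (2 cycles) on a fast carrier (32 cycles).
N = 256
true_env = [1.0 + 0.5 * math.cos(2 * math.pi * 2 * t / N) for t in range(N)]
band = [true_env[t] * math.cos(2 * math.pi * 32 * t / N) for t in range(N)]

envelope, tfs = envelope_and_tfs(band)
# The band is recovered as the product of its envelope and fine structure.
rebuilt = [e * f for e, f in zip(envelope, tfs)]
```

In this sketch the envelope carries the slow modulation and the fine structure carries the rapid oscillation near the band's center frequency; multiplying the two returns the original band.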
There are currently a number of prosthetic devices, such as cochlear implants, that seek to restore hearing in the profoundly deaf by stimulating the auditory nervous system via electrodes inserted into the auditory system. Most cochlear-implant users have great difficulty understanding speech in noise. Complicating the issue is the fact that cochlear-implant processors replace the temporal fine structure (or carrier) of the incoming sounds with a single pulse train, which limits the temporal fine structure cues available to the auditory system for segregating sound sources.
To circumvent this limitation, conventional cochlear-implant processors may attempt to suppress all but one sound source (the desired or “target” speech signal), thereby allowing users to process at least one signal effectively. There are several drawbacks to this approach. First, this approach assumes that the noise reduction system knows which signal is the target signal. If the user wishes to listen to an audio signal other than the one that the system selected as the target, the user would be unable to do so. Furthermore, this approach may have limited effectiveness in situations where the acoustic environment is less than ideal (i.e., situations in which the target signal is not easy to identify).
An alternative approach currently employed by cochlear implants is to convey all sounds in the environment, but to convey all these sounds on a single carrier. This carrier often consists of a pulse train having a single pulse rate. Such an approach generally results in poor speech intelligibility.
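As a rough illustration of the single-carrier approach described above, the following sketch modulates one fixed-rate pulse train with the combined envelope of two sources. It is a simplified toy model, not an actual cochlear-implant processing chain: the pulse period, the two envelopes, and the assumption that the envelopes simply add are all hypothetical choices made for this example.

```python
import math

def pulse_train(n_samples, period):
    """Unit pulses every `period` samples, zeros elsewhere."""
    return [1.0 if t % period == 0 else 0.0 for t in range(n_samples)]

def modulate(envelope, carrier):
    """Amplitude-modulate a carrier with an envelope, sample by sample."""
    return [e * c for e, c in zip(envelope, carrier)]

n = 200
period = 10  # hypothetical pulse period, in samples

# Two hypothetical source envelopes with different modulation rates.
env_a = [0.5 * (1 + math.sin(2 * math.pi * t / 100)) for t in range(n)]
env_b = [0.5 * (1 + math.cos(2 * math.pi * t / 40)) for t in range(n)]

# Simplifying assumption: the mixture envelope is the sum of the two.
mixed_env = [a + b for a, b in zip(env_a, env_b)]

carrier = pulse_train(n, period)
stimulus = modulate(mixed_env, carrier)

# Pulse timing is fixed by the carrier alone, whatever the sources are.
pulse_times = [t for t, c in enumerate(carrier) if c == 1.0]
intervals = [b - a for a, b in zip(pulse_times, pulse_times[1:])]
```

Because every pulse falls on the same fixed grid, only the pulse amplitudes reflect the sources; the carrier timing itself offers no cue for telling one source from the other.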
Another limitation of conventional cochlear implants is that most are designed primarily to extract and transmit temporal envelope information to the user, discarding TFS information. As noted above, however, TFS information has been shown to play a significant role in extracting an audio signal from among other signals.
There have been a few attempts to provide the original TFS from the target speech to cochlear-implant users. Providing the original fine structure to cochlear-implant users, however, is technically challenging, and as a result, most approaches transmit fine-structure-related cues in only a limited fashion. Although these approaches may provide some benefit, the improvement in speech recognition remains limited.
The presently disclosed systems and methods for multi-carrier processing for auditory prosthetic devices are directed toward overcoming one or more of the problems set forth above and/or other problems in the art.