The present invention is directed to identification of perceptual features. More particularly, the invention provides a system and method, for such identification, using one or more events related to coincidence between various frequency channels. Merely by way of example, the invention has been applied to phone detection. But it would be recognized that the invention has a much broader range of applicability.
After many years of work, a basic understanding of speech robustness to masking noise often remains a mystery. Specifically, it is usually unclear how to correlate the confusion patterns with the audible speech information in order to explain normal hearing listeners confusions and identify the spectro-temporal nature of the perceptual features. For example, the confusion patterns are speech sounds (such as Consonant-Vowel, CV) confusions vs. signal-to-noise ratio (SNR). Certain conventional technology can characterize invariant cues by reducing the amount of information available to the ear by synthesizing simplified CVs based only on a short noise burst followed by artificial formant transitions. However, often, no information can be provided about the robustness of the speech samples to masking noise, nor the importance of the synthesized features relative to other cues present in natural speech. But a reliable theory of speech perception is important in order to identify perceptual features. Such identification can be used for developing new hearing aids and cochlear implants and new techniques of speech recognition.
Hence it is highly desirable to improve techniques for identifying perceptual features.