The ability to correctly identify voiced and unvoiced speech is critical to many speech applications, including speech recognition, speaker verification, and noise suppression. In a typical acoustic application, speech from a human speaker is captured and transmitted to a receiver in a different location. The speaker's environment may contain one or more noise sources that pollute the speech signal, or signal of interest, with unwanted acoustic noise. This makes it difficult or impossible for the receiver, whether human or machine, to understand the speaker's speech.
Typical methods for classifying voiced and unvoiced speech have relied mainly on the acoustic content of microphone data, which is plagued by problems with noise and the corresponding uncertainties in signal content. This is especially problematic given the proliferation of portable communication devices such as cellular telephones and personal digital assistants because, in many cases, the quality of service provided by the device depends on the quality of the voice services it offers. There are methods known in the art for suppressing the noise present in speech signals, but these methods demonstrate performance shortcomings that include unusually long computing time, requirements for cumbersome hardware to perform the signal processing, and distortion of the signals of interest.
In the figures, the same reference numbers identify identical or substantially similar elements or acts.
Any headings provided herein are for convenience only and do not necessarily affect the scope or meaning of the claimed invention.