1. Field of the Invention
The present invention relates to an apparatus, method, and medium for detecting a voiced sound and an unvoiced sound, and more particularly, to an apparatus, method, and medium for detecting a voiced sound zone and an unvoiced sound zone using a spectral flatness measure (SFM) and a slope of a mel-scaled filter bank spectrum obtained from a voice signal in a predetermined zone.
2. Description of the Related Art
Various encoding methods have been suggested that compress a voice signal using its statistical attributes and human auditory characteristics in the time domain or the frequency domain. To encode a voice signal, information indicating whether the input voice signal is a voiced sound or an unvoiced sound is typically used. Methods of detecting a voiced sound and an unvoiced sound from an input voice signal can be divided into methods performed in the time domain and methods performed in the frequency domain. A time-domain method uses at least one of a frame average energy of the voice signal and a zero-crossing rate, alone or in combination, whereas a frequency-domain method uses information on low-frequency and high-frequency components of the voice signal or pitch harmonic information. In a clean environment, these conventional methods can achieve satisfactory detection performance. In a white noise environment, however, their detection performance deteriorates considerably.
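For illustration only, the conventional time-domain approach described above may be sketched as follows. This is a minimal sketch and not the method of the present invention. The function names, the threshold values, and the decision rule (high frame energy together with a low zero-crossing rate indicating a voiced sound) are assumptions chosen for the example and do not come from any particular prior-art implementation.

```python
import math

def frame_energy(frame):
    """Frame average energy: mean of the squared samples."""
    return sum(x * x for x in frame) / len(frame)

def zero_crossing_rate(frame):
    """Fraction of adjacent sample pairs whose signs differ."""
    crossings = sum(
        1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0)
    )
    return crossings / (len(frame) - 1)

def classify_frame(frame, energy_threshold=0.01, zcr_threshold=0.25):
    """Hypothetical rule: a voiced frame has high energy and a low
    zero-crossing rate; any other frame is treated as unvoiced.
    Both thresholds are illustrative assumptions."""
    is_voiced = (
        frame_energy(frame) >= energy_threshold
        and zero_crossing_rate(frame) < zcr_threshold
    )
    return "voiced" if is_voiced else "unvoiced"

# A 20 ms frame at an assumed 8 kHz sampling rate (160 samples).
SAMPLE_RATE = 8000
FRAME_LEN = 160

# Low-frequency periodic signal (100 Hz), standing in for a voiced sound.
voiced_like = [
    0.5 * math.sin(2 * math.pi * 100 * n / SAMPLE_RATE)
    for n in range(FRAME_LEN)
]

# Weak signal whose sign alternates on every sample, standing in for an
# unvoiced sound.
unvoiced_like = [0.05 if n % 2 == 0 else -0.05 for n in range(FRAME_LEN)]

print(classify_frame(voiced_like))
print(classify_frame(unvoiced_like))
```

Additive white noise raises both the frame energy and the zero-crossing rate of every frame, so fixed thresholds of this kind separate the two classes less reliably, which is consistent with the deterioration noted above.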