Technical Field
The present invention relates to estimation of sound identification based on periodic indications in the frequency spectrum of an audio signal.
Description of the Related Art
A number of conventional speech recognition systems use log-Mel features or Mel-Frequency Cepstral Coefficients (MFCC) as input features. Both log-Mel and MFCC apply a Mel-filter bank to a frequency spectrum of the audio signal data. However, a Mel-filter bank does not preserve higher-resolution information in the audio signal data. Typically, harmonic structures in human speech are lost through the Mel-filtering process. The harmonic structure provides information that may be used to discriminate vowels from other phonemes.
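The loss of resolution described above may be illustrated with a minimal sketch. The sketch assumes a triangular Mel-filter bank built with one common Mel-scale formula (2595 and 700 constants), a 16 kHz sample rate, a 512-point FFT, 40 Mel bands, and a synthetic signal with a 200 Hz fundamental. None of these values or function names come from the present description; they are chosen only for illustration.

```python
import numpy as np

def hz_to_mel(f):
    # One common Mel-scale convention (assumed here, not taken from the source).
    return 2595.0 * np.log10(1.0 + np.asarray(f, dtype=float) / 700.0)

def mel_to_hz(m):
    return 700.0 * (10.0 ** (np.asarray(m, dtype=float) / 2595.0) - 1.0)

def mel_filter_bank(n_mels, n_fft, sample_rate):
    """Return (bank, edges_hz): triangular filters of shape (n_mels, n_fft//2 + 1)."""
    n_bins = n_fft // 2 + 1
    bin_freqs = np.linspace(0.0, sample_rate / 2.0, n_bins)
    # Filter edges are equally spaced on the Mel scale, so they widen in Hz.
    edges_hz = mel_to_hz(np.linspace(0.0, hz_to_mel(sample_rate / 2.0), n_mels + 2))
    bank = np.zeros((n_mels, n_bins))
    for i in range(n_mels):
        lo, mid, hi = edges_hz[i], edges_hz[i + 1], edges_hz[i + 2]
        rising = (bin_freqs - lo) / (mid - lo)
        falling = (hi - bin_freqs) / (hi - mid)
        bank[i] = np.clip(np.minimum(rising, falling), 0.0, None)
    return bank, edges_hz

# Illustrative parameters (assumptions).
sample_rate = 16000
n_fft = 512
n_mels = 40
f0 = 200.0

# Synthetic "voiced" frame: harmonics of f0 up to 7800 Hz.
t = np.arange(n_fft) / sample_rate
signal = sum(np.sin(2 * np.pi * k * f0 * t) for k in range(1, 40))
power = np.abs(np.fft.rfft(signal * np.hanning(n_fft))) ** 2

bank, edges_hz = mel_filter_bank(n_mels, n_fft, sample_rate)
log_mel = np.log(bank @ power + 1e-10)

# Width in Hz of the highest filter; it covers several harmonics at once.
top_bandwidth = edges_hz[-1] - edges_hz[-3]
print(power.shape, log_mel.shape, top_bandwidth > 2 * f0)
```

In this sketch the 257 spectrum bins are reduced to 40 band energies, and the highest filter spans more than twice the spacing between adjacent harmonics, so individual harmonic peaks in that region are merged into a single value.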
Meanwhile, current speech recognition systems are computationally expensive, and thus require substantial time or computational resources. There is a need for integrating the harmonic structure into a speech recognition system in a way that may improve performance of the system.