The present invention generally relates to speech recognition apparatuses, and more particularly to a speech recognition apparatus which performs pitch extraction using a filter bank.
There is a proposed binary time spectrum pattern (BTSP) speech recognition system which carries out a linear matching between dictionary patterns and an input pattern, the input pattern being obtained by subjecting speech made in units of words to a binarization process. This proposed BTSP speech recognition system only requires a simple process because no dynamic programming (DP) matching is required. In addition, the frequency deviation on the time spectrum pattern (TSP) can be absorbed satisfactorily, and the system is applicable to unspecified speakers.
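By way of illustration only, the binarization and linear matching described above may be sketched as follows. The sketch is not taken from the proposed system: the per-frame mean threshold, the linear index resampling, the overlap-count similarity, and all function names (binarize, linear_stretch, similarity, recognize) are assumptions chosen to show why no DP matching is needed.

```python
def binarize(tsp):
    """Binarize a time spectrum pattern (list of frames, each a list of channel powers).

    Assumed rule: a cell becomes 1 when it exceeds the mean power of its own frame.
    """
    out = []
    for frame in tsp:
        mean = sum(frame) / len(frame)
        out.append([1 if value > mean else 0 for value in frame])
    return out


def linear_stretch(pattern, n_frames):
    """Resample a pattern to n_frames by linear index mapping (no DP time warping)."""
    src = len(pattern)
    return [pattern[i * src // n_frames] for i in range(n_frames)]


def similarity(a, b):
    """Count the cells in which both binary patterns are 1 (assumed similarity measure)."""
    return sum(x & y for fa, fb in zip(a, b) for x, y in zip(fa, fb))


def recognize(tsp, dictionary, n_frames=4):
    """Return the dictionary word whose binary pattern best overlaps the input pattern."""
    query = linear_stretch(binarize(tsp), n_frames)
    return max(
        dictionary,
        key=lambda word: similarity(query, linear_stretch(dictionary[word], n_frames)),
    )
```

Because both patterns are merely stretched to a common number of frames and compared cell by cell, the cost of one comparison is proportional to the pattern size, which is the simplification the linear matching provides over DP matching.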
On the other hand, a speech recognition system which uses a speech recognition dictionary and a speech synthesis dictionary in common is proposed in Japanese Laid-Open Patent Application No. 63-502146, for example. However, this speech recognition system has a problem in that the synthesized speech has no intonation or accent and sounds unnatural, because the speech is generated with a constant pitch. Furthermore, when the BTSP is used for the speech recognition dictionary, there is another problem in that the volume (power) of the synthesized speech lacks smoothness.