Speech recognition equipment that acquires speech as data, processes the data, and recognizes the meaning of the speech has been developed for practical use. For example, speech recognition engines are incorporated in computer systems for entering spoken syllables into a word processor and in vehicle-mounted navigation systems that receive a series of spoken utterances.
Generally, a speech recognition system extracts from an input speech signal a small number of parameters that characterize the speech for recognition purposes (referred to as feature quantities of the speech), and then compares those parameters with typical feature quantities registered in advance, so that the most similar syllable is selected from a set of typical syllables as the recognition result. In such a method the feature quantities play an important role, and much research continues into feature quantities that improve the recognition rate while using a smaller number of parameters.
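The comparison step described above can be sketched as a nearest-template search. This is only a minimal illustration under stated assumptions: the Euclidean distance, the function names, and the three-element feature vectors are hypothetical choices made for the example and are not taken from any particular recognition system.

```python
import math

def euclidean_distance(a, b):
    """Distance between two feature vectors of equal length."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def recognize(features, templates):
    """Return the label of the registered template closest to `features`.

    `templates` maps a syllable label to its typical feature vector.
    """
    return min(templates,
               key=lambda label: euclidean_distance(features, templates[label]))

# Hypothetical registered templates (illustrative values only).
templates = {
    "a": [1.0, 0.2, 0.1],
    "i": [0.1, 1.0, 0.3],
    "u": [0.2, 0.1, 1.0],
}

# Feature vector assumed to have been extracted from an input utterance.
result = recognize([0.9, 0.3, 0.0], templates)
```

Here `result` is the label whose template lies nearest to the input vector. The fewer parameters each vector holds, the cheaper each comparison becomes.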
Well-known representative feature vectors used for speech recognition include the power spectrum, which can be obtained through a band-pass filter bank or the Fourier transform, and cepstrum coefficients, which can be obtained by the inverse Fourier transform and LPC (linear prediction coefficient) analysis. A time sequence of such feature vectors, extracted from the speech, is then used in a pattern matching algorithm in the subsequent recognition process (see Patent Document 1, for example).
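As a rough sketch of the two features named above, the following computes a power spectrum with a directly evaluated discrete Fourier transform and a real cepstrum as the inverse transform of the log magnitude spectrum. The function names, the 32-sample test frame, and the `floor` guard against taking the logarithm of zero are assumptions made for illustration; the LPC-based route to cepstrum coefficients is not shown.

```python
import cmath
import math

def dft(x):
    """Discrete Fourier transform evaluated directly from its definition."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
            for k in range(n)]

def idft(spectrum):
    """Inverse discrete Fourier transform, also evaluated directly."""
    n = len(spectrum)
    return [sum(spectrum[k] * cmath.exp(2j * math.pi * k * t / n)
                for k in range(n)) / n
            for t in range(n)]

def power_spectrum(frame):
    """Squared magnitude of each frequency bin."""
    return [abs(c) ** 2 for c in dft(frame)]

def real_cepstrum(frame, floor=1e-12):
    """Inverse transform of the log magnitude spectrum.

    `floor` is an assumed guard that keeps log() away from zero.
    """
    log_magnitude = [math.log(max(abs(c), floor)) for c in dft(frame)]
    return [c.real for c in idft(log_magnitude)]

# Illustrative frame: a cosine that completes 4 cycles in 32 samples.
frame = [math.cos(2 * math.pi * 4 * t / 32) for t in range(32)]
spectrum = power_spectrum(frame)
cepstrum = real_cepstrum(frame)
```

For this test frame the power spectrum concentrates in bin 4 and its mirror bin 28. In practice such vectors would be computed frame by frame to form the time sequence used for pattern matching.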
In such a method, however, the arithmetic operations needed to extract the feature quantities for the pattern-matching algorithm are extremely complex and therefore require considerable computation time. As a result, a real-time speech recognition system is difficult to develop, because feature extraction consumes so much time.
Meanwhile, as cellular phones have come into widespread use and mobile digital assistants have become smaller, speech recognition technology is attracting more attention as a man-machine interface for such equipment. In recent years, research and development have been actively pursued on natural speech recognition that places no constraint on the speaker's mode of speech, as seen in the keyword-based retrieval engines and hidden Markov models (HMMs) employed in continuous speech recognition systems.
In addition, in hearing aids for the deaf and in loudspeakers for reproducing high-quality voice and/or music, attention is being paid to techniques that process a speech signal so as to provide a clearly audible sound. When applied to such mobile equipment, speech recognition technology is restricted to simple algorithms that still achieve high accuracy. However, no hearing aid has yet been developed in which the arithmetic operations for speech recognition are simplified to a level suitable for practical use.
Not only in speech recognition systems, the Fourier transform is generally used to analyze signal waveforms, often to obtain frequency spectra and the like. However, applying the Fourier transform or the inverse Fourier transform involves such a complex arithmetic algorithm that the total computation time becomes too large and a large processing capacity is required. Consequently, signal-processing hardware that uses the Fourier transform is complicated and expensive.
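To give a sense of the computational load, the sketch below counts the complex multiplications performed when a discrete Fourier transform is evaluated directly from its definition. The function name and the frame lengths are assumptions chosen for illustration, and the count applies only to this direct evaluation, not to every way of computing the transform.

```python
import cmath
import math

def dft_with_count(samples):
    """Directly evaluated DFT that also counts complex multiplications."""
    n = len(samples)
    multiplications = 0
    spectrum = []
    for k in range(n):
        acc = 0j
        for t in range(n):
            # One complex multiplication per (frequency bin, sample) pair.
            acc += samples[t] * cmath.exp(-2j * math.pi * k * t / n)
            multiplications += 1
        spectrum.append(acc)
    return spectrum, multiplications

# Multiplication counts for a few assumed frame lengths.
counts = {n: dft_with_count([0.0] * n)[1] for n in (64, 128, 256)}
```

Doubling the frame length quadruples the count here, since direct evaluation needs one multiplication for every pairing of a frequency bin with an input sample.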
Patent Document 1: Japanese Patent Laid-Open Publication No. 2003-271190