This invention relates to a pre-processing system for speech recognition tasks that receive a speech waveform as an input, such as phoneme recognition, speaker verification and speaker identification.
Speech recognition includes phoneme recognition, which recognizes monosyllables; speaker verification, which verifies whether the speaker is the person he claims to be; and speaker identification, which determines who the speaker is. A known system employing such speech recognition is, for example, an information service system in which a computer system is coupled to a telephone line network.
In the information service system, a push-button signal or a speech signal is received as an input, and speech from a speech response unit is delivered as an output. The input speech is transmitted via the telephone line network to the computer system, which serves as a central information service station.
Because the speech input is sent through the telephone line network in this manner, it is distorted by the transmission characteristic of the network. Moreover, the transmission characteristic of the network is not uniform; it differs from one transmission path to another. The received speech signals are therefore subjected to distortions that vary with the path taken.
Accordingly, when speech recognition is carried out on a received signal that has been distorted in this way, the probability of erroneous recognition is high. This can be a serious problem.
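The path-dependent distortion described above can be illustrated with a minimal sketch. It is not part of the disclosed system. The two filters `PATH_A` and `PATH_B`, the 8000 Hz sampling rate and the three test frequencies are hypothetical values chosen only for illustration; an actual telephone line would have a far more complex transmission characteristic. The sketch passes the same test tones through two simple two-tap filters and measures how strongly each path attenuates each tone.

```python
import math

SAMPLE_RATE = 8000  # assumed sampling rate in Hz (illustrative only)

def sine(freq_hz, n=400):
    """Generate n samples of a unit-amplitude test tone."""
    return [math.sin(2 * math.pi * freq_hz * i / SAMPLE_RATE) for i in range(n)]

def apply_channel(signal, taps):
    """Model a transmission path as an FIR filter: y[n] = sum_k taps[k] * x[n-k]."""
    out = []
    for i in range(len(signal)):
        acc = 0.0
        for k, h in enumerate(taps):
            if i - k >= 0:
                acc += h * signal[i - k]
        out.append(acc)
    return out

def rms(signal):
    """Root-mean-square level of a signal."""
    return math.sqrt(sum(s * s for s in signal) / len(signal))

# Two hypothetical transmission paths with different characteristics.
PATH_A = [0.5, 0.5]    # attenuates high frequencies
PATH_B = [0.5, -0.5]   # attenuates low frequencies

def band_gains(taps, freqs=(300, 1000, 3000)):
    """Ratio of output level to input level for each test tone."""
    return [rms(apply_channel(sine(f), taps)) / rms(sine(f)) for f in freqs]

if __name__ == "__main__":
    for name, taps in (("path A", PATH_A), ("path B", PATH_B)):
        gains = band_gains(taps)
        print(name, [round(g, 3) for g in gains])
```

Under these assumptions, the same input arrives with a different spectral balance depending on the path, which is the kind of non-uniform distortion that a recognizer would have to cope with if the received signal were used without pre-processing.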