1. Field of the Invention
The present invention relates to a method and a system for accurately recognizing speech, and more particularly to a speech recognition method and system having excellent recognition accuracy even in noisy environments.
2. Description of the Related Art
Much research on speech recognition has been conducted to improve system functions and to allow signals to be entered efficiently into various types of information equipment and communication equipment. Pattern matching is known in the art as a common method of speech recognition.
A prior method of speech recognition will be described below with reference to FIG. 1.
An input speech signal (S1) is converted into a time series pattern of vectors (hereinafter referred to as a speech pattern) indicative of features obtained by frequency analysis (S2) of the signal. The speech pattern is obtained by sampling intra-band frequency components at every time interval T (8 milliseconds, for example; hereinafter referred to as a speech frame), the intra-band frequency components being extracted through a group of P band-pass filters having different center frequencies. In addition, the speech power of the speech pattern in each speech frame interval is also evaluated in the frequency analysis (S2). The speech pattern extracted through the frequency analysis (S2) is stored successively in an input memory (S5) in the subsequent speech pattern storage processing (S3). Meanwhile, a voiced interval, i.e., a start point and an end point of the speech, is determined in the speech interval detection processing (S4) on the basis of the speech power evaluated through the frequency analysis (S2). One known algorithm for determining a voiced interval from the speech power is, for example, a simple one that takes as the start point of the speech the time point at which the speech power rises above a certain threshold, and as the end point of the speech the time point at which the speech power falls below the threshold. Other general algorithms are also known. The speech pattern within the voiced interval determined through the speech interval detection processing (S4) is read from the input memory (S5), while a reference pattern is read from a reference pattern memory (S6). Then, in the similarity evaluation processing (S7), the similarity between the speech pattern and the reference pattern is evaluated by, for example, a dynamic programming matching method or a linear matching method.
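The speech interval detection processing (S4) and the similarity evaluation processing (S7) described above may be illustrated by the following minimal sketch. It is given for explanation only and is not taken from any particular prior implementation. The function names, the Euclidean frame distance, and the particular dynamic programming recurrence are assumptions, and the sketch returns a distance, so a smaller value corresponds to a higher similarity.

```python
import math
from typing import List, Optional, Tuple

Frame = List[float]  # assumed: one vector of P band-pass filter outputs per speech frame


def detect_voiced_interval(power: List[float],
                           threshold: float) -> Optional[Tuple[int, int]]:
    """Simple threshold algorithm for a voiced interval.

    The start point is the first frame whose power exceeds the threshold;
    the end point is the first later frame whose power falls below it
    (or the end of the input). Returns (start, end) with end exclusive,
    or None when no frame exceeds the threshold.
    """
    start = next((i for i, p in enumerate(power) if p > threshold), None)
    if start is None:
        return None
    end = next((i for i in range(start + 1, len(power)) if power[i] < threshold),
               len(power))
    return start, end


def frame_distance(a: Frame, b: Frame) -> float:
    """Euclidean distance between two frames (an assumed local measure)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def dp_matching_distance(speech: List[Frame], reference: List[Frame]) -> float:
    """Accumulated distance along the best time-warped alignment.

    This is a basic dynamic programming (DTW-style) recurrence with
    unconstrained steps; practical systems usually add slope constraints.
    """
    n, m = len(speech), len(reference)
    inf = float("inf")
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = frame_distance(speech[i - 1], reference[j - 1])
            cost[i][j] = d + min(cost[i - 1][j],      # speech frame repeated
                                 cost[i][j - 1],      # reference frame repeated
                                 cost[i - 1][j - 1])  # both advance
    return cost[n][m]
```

In use, the frames of the input memory that fall inside the interval returned by `detect_voiced_interval` would be passed to `dp_matching_distance` together with each reference pattern in turn.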
The reference pattern described here is a time series pattern of vectors obtained by subjecting a word to be recognized to the same speech analysis as the speech pattern, and is stored in advance in the reference pattern memory (S6). In the subsequent judgement processing (S8), the similarities evaluated for the respective reference patterns in the similarity evaluation processing (S7) are compared, and the name given to the reference pattern showing the maximum similarity is determined as the recognition result (S9). The prior speech recognition method described above thus estimates, by means of the similarity described above, the difference between the speech pattern indicative of the spectrum of the speech signal and the reference pattern previously obtained by the same spectral analysis, and adopts the name of the reference pattern showing the maximum similarity as the recognition result. Accordingly, when the input speech and the reference pattern represent the same word, the similarity between them is high, but when they represent different words the similarity is low. If, however, the spectrum of a speech pattern is distorted by factors other than the speech, for example external noise, the similarity between the speech pattern and a reference pattern is reduced even if both belong to the same category, and hence a correct recognition result cannot be obtained. Furthermore, such a prior recognition method requires much time for arithmetic operations and much memory storage, and is thus likely to require a large-scale device for implementation.
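The judgement processing (S8) and the effect of spectral distortion described above may be illustrated by the following minimal sketch. It is an explanatory example under stated assumptions, not a description of any specific prior device. The similarity is assumed here to be the negative mean squared difference after linear matching of the time axes, and the names `linear_match_similarity` and `judge` are hypothetical.

```python
from typing import Dict, List

Pattern = List[List[float]]  # time series of feature vectors, one per speech frame


def linear_match_similarity(speech: Pattern, reference: Pattern) -> float:
    """Similarity by linear matching (an assumed formulation).

    Each reference frame j is paired with the speech frame at the linearly
    scaled position, and the negative mean squared difference is returned,
    so a larger value means a closer match. Both patterns must be non-empty.
    """
    n, m = len(speech), len(reference)
    total = 0.0
    for j in range(m):
        i = j * (n - 1) // (m - 1) if m > 1 else 0  # linear time-axis mapping
        total += sum((x - y) ** 2 for x, y in zip(speech[i], reference[j]))
    return -total / m


def judge(speech: Pattern, references: Dict[str, Pattern]) -> str:
    """Return the name of the reference pattern showing the maximum similarity."""
    return max(references,
               key=lambda name: linear_match_similarity(speech, references[name]))


# Hypothetical two-band reference patterns for two words.
references = {
    "yes": [[1.0, 0.0], [1.0, 0.0], [0.0, 1.0]],
    "no": [[0.0, 1.0], [0.0, 1.0], [1.0, 0.0]],
}
clean = [[0.9, 0.1], [1.0, 0.0], [0.8, 0.2], [0.1, 0.9]]
# The same utterance with a constant offset added to every band,
# standing in for external noise that distorts the spectrum.
noisy = [[x + 0.5 for x in frame] for frame in clean]

result = judge(clean, references)
clean_similarity = linear_match_similarity(clean, references["yes"])
noisy_similarity = linear_match_similarity(noisy, references["yes"])
```

With these illustrative values, the distorted pattern yields a lower similarity to the reference pattern of its own word than the clean pattern does, which is the behavior that leads to misrecognition in noisy environments.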