The present invention relates to electronic speech recognition and, more particularly, to a method and apparatus for efficiently performing dynamic time warping and pattern matching to recognize, in sample speech, any one of a plurality of token words.
The problem of electronically recognizing human speech has been intensively investigated, and various promising approaches have been identified. However, very few practical or cost-effective systems have evolved. The systems that have actually been implemented tend to fall into two groups. One class of system is relatively inexpensive but also relatively inaccurate when expected to identify a vocabulary of more than a few token words. The other class is reasonably accurate when properly programmed and trained but almost prohibitively expensive for practical applications. In general, the high cost of the accurate systems is a consequence of the computationally intensive nature of the algorithms employed and the relatively high cost of the computers needed to perform those computations. While various dedicated array-type processors have been proposed, none has yet proven particularly cost-effective.