1. Field of the Invention
The present invention relates to verifying that a sequence of operations has been carried out by a specific entity. Usually the entity is a person and the sequence of operations may, for example, be speaking a digit or a letter, or writing a letter or a word. Thus the invention relates particularly to the verification that an utterance was made by a predetermined person, but it is believed that the invention can be applied to other actions carried out by persons such as recognising written words.
2. Description of the Related Art
Speech recognition using Hidden Markov Models (HMMs) is a well-known technique, and HMMs have also been applied to signature verification. Speaker verification is important in many applications, particularly where financial transactions are to be carried out automatically by telephone and where access to premises is to be controlled. False acceptances are likely to cause serious problems, since they allow unauthorised transactions or access. Almost as important are false rejections, where a person who should be verified is not. False rejections cause annoyance, especially when they occur frequently.
Speech verification over telephone links raises its own problems due to the limited bandwidth of such links and distortion which often occurs.
Three previous examples of speaker verification systems are described in U.S. Pat. Nos. 4,363,102, 4,694,493 and 4,910,782.
According to a first aspect of the present invention there is provided a method of verifying that a sequence of operations originates from a specific entity, comprising the steps of
extracting a test sequence of sets of features of the results of the operations, one set corresponding to each operation,
matching the said test sequence of sets of features against a first stored probabilistic finite state machine model derived from sets of features of the results of the same sequence of operations when originated by a plurality of entities,
matching the said test sequence of sets of features against a second stored finite state machine model derived from sets of features of the results of the same sequence of operations when originated by the specific entity, and
comparing the results of the matching steps to indicate whether the test sequence of operations originated from the specific entity.
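The matching and comparing steps above can be sketched as follows. This is a minimal illustration only, assuming discrete-observation HMMs scored with the forward algorithm and a comparison by per-observation log-likelihood ratio; the function names, the model layout (start, transition and emission probabilities) and the threshold are hypothetical and are not taken from the specification.

```python
import math

def safe_log(p):
    """Natural log that maps a zero probability to -inf instead of raising."""
    return math.log(p) if p > 0.0 else -math.inf

def logsumexp(values):
    """Numerically stable log(sum(exp(v))) for a list of log values."""
    m = max(values)
    if m == -math.inf:
        return -math.inf
    return m + math.log(sum(math.exp(v - m) for v in values))

def log_likelihood(obs, start, trans, emit):
    """Forward algorithm: log P(obs | model) for a discrete HMM.

    obs   : list of symbol indices (one per feature set, assumed quantised)
    start : start[i]    = P(first state is i)
    trans : trans[i][j] = P(next state is j | current state is i)
    emit  : emit[i][k]  = P(symbol k | state i)
    """
    n = len(start)
    alpha = [safe_log(start[i]) + safe_log(emit[i][obs[0]]) for i in range(n)]
    for symbol in obs[1:]:
        alpha = [
            logsumexp([alpha[i] + safe_log(trans[i][j]) for i in range(n)])
            + safe_log(emit[j][symbol])
            for j in range(n)
        ]
    return logsumexp(alpha)

def verify(obs, general_model, specific_model, threshold=0.0):
    """Compare the two matching results (assumed rule: log-likelihood ratio).

    general_model  : model derived from a plurality of entities
    specific_model : model derived from the specific entity
    Returns (accepted, score); the threshold value is an assumption.
    """
    score = (log_likelihood(obs, *specific_model)
             - log_likelihood(obs, *general_model)) / len(obs)
    return score > threshold, score
```

Scoring against both models means that a test sequence is accepted only when the model of the specific entity explains it better than the model of the general population does, rather than when it merely scores well in absolute terms.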
According to a second aspect of the invention there is provided apparatus for verifying that a sequence of operations originated from a specific entity, comprising
means for storing data specifying first and second finite state machines, the data for the first machine having been derived from sets of features of the results of a sequence of operations originated by a plurality of entities, the data for the second machine having been derived from sets of features of the results of the same sequence of operations originated by a specific entity,
means for extracting a test sequence of sets of features from the results of a sequence of operations which are alleged to have been originated by the said specific entity,
means for matching the said test sequence against the first and second said machines, respectively, and
means for comparing results from the matching means to indicate whether the test sequence was originated by the said specific entity.
The specific entity is usually a person, although the entity may be an object, for example an object undergoing non-destructive testing, in which case the sequence of operations may be signals originated by the object under test. As has been mentioned, where the entity is a person the sequence of operations may, for example, be the utterance of a sound or the signing of a signature. The sounds may be alpha-numeric characters or words, and the characters or words may be uttered as isolated items or as connected items, as in continuous speech.
The invention has the advantage that it tends to reduce false acceptances and false rejections in speaker verification.
Signals resulting from incoming speech may be digitized at relatively short intervals and processed over relatively long intervals to provide sets or "frames" of digital signals derived from spectral components. By rejecting some of these components before or after further processing, the effects of telephone link limitations and distortion can be reduced so that speaker verification over telephone systems is possible.
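The division of digitized speech into frames can be illustrated with a short sketch. It assumes overlapping frames shaped by a Hamming window; the frame length, the hop size and the choice of window are illustrative assumptions, not values given in the specification.

```python
import math

def frame_signal(samples, frame_len, hop):
    """Split a list of samples into overlapping, Hamming-windowed frames.

    samples   : digitized speech samples (taken at short intervals)
    frame_len : samples per frame (the longer processing interval)
    hop       : samples between the starts of successive frames
    Trailing samples that do not fill a whole frame are dropped.
    """
    window = [0.54 - 0.46 * math.cos(2 * math.pi * n / (frame_len - 1))
              for n in range(frame_len)]
    frames = []
    for start in range(0, len(samples) - frame_len + 1, hop):
        frames.append([samples[start + n] * window[n] for n in range(frame_len)])
    return frames
```

Each returned frame would then be passed to spectral analysis to produce one set of features.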
According to a third aspect of the invention, therefore, there is provided a method of speech verification or recognition including
obtaining digital signals representative of speech,
carrying out cepstral processing of the digital signals, and
carrying out speech verification or recognition based on cepstral coefficients resulting from the processing but omitting the zero and/or first of the coefficients.
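The effect of omitting the low-order coefficients can be seen in a small sketch. It assumes the real cepstrum (the inverse transform of the log magnitude spectrum) computed with a naive discrete Fourier transform; the specification does not state here which form of cepstral processing is used, and the function names and the number of coefficients kept are hypothetical.

```python
import cmath
import math

def real_cepstrum(frame, floor=1e-10):
    """Real cepstrum of one frame: inverse DFT of the log magnitude spectrum.

    A naive O(n^2) DFT keeps the sketch dependency-free. Magnitudes are
    floored to avoid taking the log of zero.
    """
    n = len(frame)
    log_mag = []
    for k in range(n):
        x_k = sum(frame[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        log_mag.append(math.log(max(abs(x_k), floor)))
    # The log magnitude spectrum of a real signal is real and symmetric,
    # so its inverse DFT is real; only the real part is kept.
    return [
        sum(log_mag[k] * cmath.exp(2j * math.pi * k * q / n) for k in range(n)).real / n
        for q in range(n)
    ]

def verification_features(frame, num_coeffs=12, skip=2):
    """Keep num_coeffs cepstral coefficients after omitting the first `skip`.

    skip=2 omits both the zero and the first coefficient; skip=1 omits only
    the zero coefficient. Both defaults are illustrative assumptions.
    """
    return real_cepstrum(frame)[skip:skip + num_coeffs]
```

In this formulation a change in overall gain adds the same constant to every log magnitude value and so alters only the zero coefficient, while the first coefficient is commonly associated with broad spectral tilt. Omitting them therefore makes the features less sensitive to level and slope variations of the kind a telephone link may introduce.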
The finite state machine models employed by the invention, usually HMMs, may be refined by a gradient algorithm, provided that a method of finding a suitable partial derivative is known. Such a method is described below.
Thus, according to a fourth aspect of the present invention there is provided a method of modifying Hidden Markov Models using a gradient-based algorithm. Preferably a number of iterations are carried out, and after each iteration the modified models are tested against stored data to determine whether improvements have taken place, the process finishing when improvements become insignificant. The invention also includes apparatus for carrying out the third and fourth aspects of the invention.
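The iterate, test and stop pattern described above can be sketched on a deliberately small case. The example below is not the gradient method of the invention: it assumes a single discrete emission distribution parameterised by a softmax, refined by gradient ascent on the average log-likelihood of training data, with the stored test data taken to be a held-out list of symbols. The step size, tolerance and all names are hypothetical.

```python
import math

def softmax(theta):
    """Map unconstrained parameters to a probability distribution."""
    m = max(theta)
    exps = [math.exp(t - m) for t in theta]
    total = sum(exps)
    return [e / total for e in exps]

def avg_log_likelihood(theta, data):
    """Average log-probability of the observed symbol indices in data."""
    probs = softmax(theta)
    return sum(math.log(probs[s]) for s in data) / len(data)

def gradient(theta, data):
    """Partial derivatives of the average log-likelihood w.r.t. theta.

    For a softmax parameterisation, the derivative for symbol k is the
    empirical frequency of k minus its current model probability.
    """
    probs = softmax(theta)
    freq = [0.0] * len(theta)
    for s in data:
        freq[s] += 1.0 / len(data)
    return [f - p for f, p in zip(freq, probs)]

def refine(theta, train, stored, step=0.5, tol=1e-4, max_iters=200):
    """Gradient ascent that stops when the gain on stored data is insignificant."""
    best = avg_log_likelihood(theta, stored)
    for _ in range(max_iters):
        grad = gradient(theta, train)
        candidate = [t + step * g for t, g in zip(theta, grad)]
        score = avg_log_likelihood(candidate, stored)
        if score - best < tol:
            break  # improvement too small (or negative): keep previous model
        theta, best = candidate, score
    return theta, best
```

Testing each modified model against stored data, rather than against the data used to compute the gradient, guards against continuing to iterate once the changes no longer generalise.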