The present invention relates to speaker recognition techniques using glottal pulse shapes.
It is known to use glottal pulse shapes extracted from a talker's voice as a unique key to identify the voice pattern of individuals. With a conventional speaker recognition system for multiple designated speakers, the beginning and end points of an utterance are detected by an endpoint detection unit. This is usually done by detecting the power level of the voice input and identifying it as a voice signal if it exceeds a prescribed threshold. Once the beginning and end of the utterance have been found, a series of measurements is made to provide feature parameters such as cepstrum coefficients, linear prediction coefficients and/or auto-correlation coefficients. While these parameters contain articulation information that is relevant to speaker identity, they further contain speaker-independent information such as phonemes. The conventional speaker recognition system enhances its reliability by additionally employing speaker-identifying keywords, or digitized spoken words or phrases, stored in a memory. In response to a voice input, the memory is searched to detect a keyword that is combined with a stored voice pattern, and a match with the extracted feature is detected using a dynamic programming technique. With this technique, the time scale of a reference utterance is dynamically warped so that significant events of the input utterance line up with the corresponding significant events in the reference utterance.
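By way of illustration only, the two conventional steps described above, power-threshold endpoint detection and dynamic time warping, may be sketched as follows. The function names (`frame_power`, `detect_endpoints`, `dtw_distance`), the non-overlapping framing, and the use of one scalar feature per frame with an absolute-difference local cost are assumptions made for brevity. A practical system would compare vectors of cepstrum or linear prediction coefficients, and nothing here is taken from any particular prior-art implementation.

```python
import math


def frame_power(samples, frame_len):
    """Mean squared amplitude of each non-overlapping frame (assumed framing)."""
    return [
        sum(x * x for x in samples[i:i + frame_len]) / frame_len
        for i in range(0, len(samples) - frame_len + 1, frame_len)
    ]


def detect_endpoints(samples, frame_len, threshold):
    """Return (start, end) sample indices spanning the first through last
    frame whose power exceeds the threshold, or None if no frame does."""
    powers = frame_power(samples, frame_len)
    voiced = [i for i, p in enumerate(powers) if p > threshold]
    if not voiced:
        return None
    return voiced[0] * frame_len, (voiced[-1] + 1) * frame_len


def dtw_distance(reference, observed):
    """Accumulated cost of the best warping path between two sequences.

    Uses the basic step pattern (insertion, deletion, match) with an
    absolute-difference local cost on scalar features (an assumption).
    """
    n, m = len(reference), len(observed)
    cost = [[math.inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(reference[i - 1] - observed[j - 1])
            cost[i][j] = local + min(
                cost[i - 1][j],      # reference advances, input held
                cost[i][j - 1],      # input advances, reference held
                cost[i - 1][j - 1],  # both advance together
            )
    return cost[n][m]


if __name__ == "__main__":
    signal = [0.0] * 8 + [1.0] * 8 + [0.0] * 8
    print(detect_endpoints(signal, frame_len=4, threshold=0.5))
    print(dtw_distance([1, 2, 3], [1, 1, 2, 2, 3]))
```

In this sketch an input that is merely a time-stretched copy of the reference yields a zero accumulated cost, which is the sense in which warping lines up corresponding events of the two utterances.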