1. Field of the Invention
The present invention relates generally to computing, biometrics, cryptography and digital networking. More particularly, the invention relates to the use of a digital network as a medium for performing vocal-based biometric identification on behalf of another party requesting identity verification, utilizing cryptographic techniques to protect information as it is transferred over the digital network.
2. Discussion of Related Art
With the explosion of the Internet in recent years, more and more companies host web sites that allow clients to log into their accounts by typing a password at a computer terminal. Many of these accounts are banking and financial accounts. Because a text password can easily be compromised, alternate means of identification are needed to prevent theft and fraud. In general, biometric verification is a useful means of proving a claimed identity. However, it can be argued that certain types of biometrics, such as retinal scans and thumbprint scans, are useful only if the person to be verified is physically present at the location of the challenging party, that is, the entity requesting authentication. Vocal tract based biometric verification, however, is distinctly different from a static measurement such as a retinal scan or thumbprint scan, in that it is a dynamically produced measurement: the information used to verify identity can be different each time a proof of identity is required. For example, the challenging party formulates a new random challenge phrase each time authentication is needed. An individual desiring to prove his or her identity must then recite this challenge phrase to the challenging party in exact order.
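The challenge-response exchange described above can be sketched in a few lines of Python. This is only an illustrative sketch, not part of the invention; the word list, function names, and exact-order comparison are assumptions chosen for clarity.

```python
import random

# Illustrative vocabulary from which challenge phrases are drawn.
WORDS = ["one", "two", "three", "four", "five",
         "six", "seven", "eight", "nine", "zero"]

def make_challenge(num_words=4, rng=None):
    """Formulate a fresh random challenge phrase for each authentication attempt."""
    rng = rng or random.Random()
    return " ".join(rng.choice(WORDS) for _ in range(num_words))

def matches_challenge(challenge, transcript):
    """The spoken words (as recognized) must repeat the challenge in exact order."""
    return challenge.split() == transcript.lower().split()
```

Because a new phrase is generated per attempt, a replayed recording of an earlier session will, with high probability, fail the exact-order check.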
Vocal tract based biometric verification generally includes two phases: an enrollment phase and a verification phase. During enrollment, a speech processor running on a computer segments spoken phrases, captured in audio form, into feature vectors. These feature vectors are then fed into a data classification engine, which produces a unique voiceprint, or model, of an individual's voice. During the verification phase, an enrolled individual's voiceprint is loaded into the data classification engine. The individual who desires to be verified is prompted, via a Text-to-Speech (TTS) component, to speak one or more randomly chosen phrases. These phrases are digitally captured by a microphone attached to a computer and are first checked for correctness by providing them as input to an Automatic Speech Recognizer (ASR). The ASR determines whether the spoken phrase matches the challenge phrase in terms of human understandability. For example, if the challenge phrase is “one two three four” and the individual speaks “four three two one”, this first test fails. Second, the phrases are fed through a speech processor, which produces feature vectors. These feature vectors are then fed into the data classification engine, which compares the resulting data model to the previously stored voiceprint. Based on criteria specific to the verification algorithm, the claimed identity is either accepted or rejected.
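The enrollment and verification phases above can be outlined as a minimal Python sketch. The feature extraction, voiceprint construction, distance metric, and threshold below are simplified placeholders chosen for illustration; a real system would use spectral features (e.g. MFCCs) and a trained data classification engine.

```python
import math

def extract_features(samples, frame_size=80):
    """Placeholder speech processor: segment the audio samples into frames
    and summarize each frame as a (mean, energy) feature vector."""
    features = []
    for i in range(0, len(samples) - frame_size + 1, frame_size):
        frame = samples[i:i + frame_size]
        mean = sum(frame) / frame_size
        energy = math.sqrt(sum((s - mean) ** 2 for s in frame) / frame_size)
        features.append((mean, energy))
    return features

def enroll(samples):
    """Enrollment phase: collapse the feature vectors into a 'voiceprint'.
    A real data classification engine would train a statistical model here."""
    feats = extract_features(samples)
    n = len(feats)
    return tuple(sum(f[k] for f in feats) / n for k in range(2))

def verify(samples, voiceprint, threshold=0.5):
    """Verification phase: compare the new utterance's features against the
    stored voiceprint; accept when the distance falls below a threshold."""
    candidate = enroll(samples)
    return math.dist(candidate, voiceprint) < threshold
```

In practice the accept/reject threshold is tuned to trade off false acceptances against false rejections, per the criteria of the specific verification algorithm.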