In self-service or interactive voice response (IVR) channels, it would be valuable to improve security and reduce fraud by using voice biometrics to help authenticate users. This is typically done through “active authentication”: prompting the user for a passphrase and comparing the spoken passphrase to a voice biometric model of the person. An authentication is typically performed based on an identified user (or the alleged identification of a user). For example, a user may be authenticated by determining or accepting an alleged identification for a user, determining stored authentication information corresponding to the user (e.g., password, biometric information), and determining whether or not a match exists between the input and the stored authentication information. A passphrase, as described herein, may be, for example, a sequence of words or other text used for access control which is similar to a password in usage, but is generally longer for added security.
One example way to perform this with high accuracy is to enroll a customer during one interaction by prompting the customer to say or vocalize a specific passphrase one or more times, recording it, and creating a model of the user saying this phrase. During a future interaction, the user is prompted to say this same passphrase, and the new utterance is compared to the enrollment model to produce a score indicating whether or not the user is the same as the user who provided the passphrase during enrollment.
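The enrollment and verification flow described above can be illustrated with a minimal sketch. It is not an actual implementation: it assumes each recorded utterance has already been reduced to a fixed-length feature vector by some front end that is not shown, and the names `enroll`, `authenticate`, the averaging model, the cosine score, and the `threshold` value of 0.8 are all hypothetical choices made only for illustration.

```python
import math
from typing import Dict, List, Sequence, Tuple

Vector = Sequence[float]


def cosine_similarity(a: Vector, b: Vector) -> float:
    """Cosine of the angle between two feature vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def enroll(utterance_features: List[Vector]) -> List[float]:
    """Build a toy text-dependent model by averaging one or more
    recordings of the same passphrase (assumption: equal-length vectors)."""
    if not utterance_features:
        raise ValueError("at least one enrollment utterance is required")
    dim = len(utterance_features[0])
    n = len(utterance_features)
    return [sum(u[i] for u in utterance_features) / n for i in range(dim)]


def authenticate(
    claimed_id: str,
    features: Vector,
    models: Dict[str, List[float]],
    threshold: float = 0.8,  # hypothetical operating point
) -> Tuple[float, bool]:
    """Look up the model for the alleged identity, score the new
    utterance against it, and compare the score to a threshold."""
    model = models.get(claimed_id)
    if model is None:
        # No enrollment exists for this alleged identity.
        return 0.0, False
    score = cosine_similarity(model, features)
    return score, score >= threshold


# Enrollment interaction: the customer repeats the passphrase twice.
models = {"customer-1": enroll([[1.0, 0.0], [0.8, 0.2]])}

# Later interaction: the caller claims to be customer-1 and repeats the passphrase.
score, accepted = authenticate("customer-1", [0.9, 0.1], models)
```

In this sketch the alleged identification selects which stored model is used, and the score alone decides the match; where the threshold is set trades false accepts against false rejects.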
There are two problems with the above scenario. First, customers are not motivated to go out of their way to complete the enrollment process. Second, passphrases cannot change without a new enrollment phase.
An alternate process is to use a text-independent voice biometric model that is created passively from observed speech in a historical call, so that the enrollment stage is skipped. As understood herein, text-independent enrollment refers to situations in which no defined script or predefined choice of text or words is required for a voice biometric model to be generated, e.g., generation of the voice biometric model is not dependent on the speaker actively articulating specific predefined text, as would be the case for a text-dependent voice biometric model. Both problems are solved: explicit enrollment is no longer necessary, and the model can be used against any phrase. Using this process, passphrases can be changed at any time, which in itself is a large gain to security: pre-recording a passphrase is more difficult, since a fraudster does not know until the time of the call what the passphrase will be.
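As a rough sketch of this alternate process, the example below accumulates a speaker model from whatever speech was observed in earlier calls and then scores a passphrase chosen only at call time. It is an illustration under stated assumptions rather than a real biometric engine: frame-level feature vectors are assumed to come from a front end that is not shown, and the class `PassiveSpeakerModel`, the function `pick_passphrase`, the running-mean model, and the cosine score are hypothetical names and simplifications.

```python
import math
import random
from typing import List, Optional, Sequence


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity between two vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0


class PassiveSpeakerModel:
    """Toy text-independent model: a running mean of feature frames
    taken from historical calls, regardless of what was said."""

    def __init__(self, dim: int) -> None:
        self.dim = dim
        self.total = [0.0] * dim
        self.count = 0

    def observe(self, frames: List[Sequence[float]]) -> None:
        """Fold frames from an earlier call into the model (no script required)."""
        for frame in frames:
            if len(frame) != self.dim:
                raise ValueError("unexpected feature dimension")
            for i, value in enumerate(frame):
                self.total[i] += value
            self.count += 1

    def mean(self) -> List[float]:
        if self.count == 0:
            raise ValueError("no speech has been observed yet")
        return [t / self.count for t in self.total]

    def score(self, frames: List[Sequence[float]]) -> float:
        """Score new speech against the model; the words spoken do not matter."""
        probe = PassiveSpeakerModel(self.dim)
        probe.observe(frames)
        return cosine(self.mean(), probe.mean())


def pick_passphrase(candidates: List[str], rng: Optional[random.Random] = None) -> str:
    """Choose the passphrase at call time so it cannot be known in advance."""
    chooser = rng if rng is not None else random.SystemRandom()
    return chooser.choice(candidates)


# Passive enrollment from a historical call, with no prompt to the customer.
model = PassiveSpeakerModel(dim=2)
model.observe([[1.0, 0.0], [1.0, 0.2]])

# Active authentication later: prompt for a phrase chosen now, then score it.
prompt = pick_passphrase(["blue river seven", "quiet morning train"])
same_speaker_score = model.score([[2.0, 0.2]])
```

Because the model is not tied to any particular text, the list of candidate passphrases can be replaced at any time without touching the stored model.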
This alternate process, however, has a significant drawback of its own: the accuracy of text-independent systems on very short segments of speech is lower than that of text-dependent systems in which specific predefined text is verbalized, spoken, articulated, etc.
What is needed, therefore, is a system and method that addresses the loss of accuracy when using a text-independent voice biometric system in an active authentication scenario.