1. Technical Field
The present invention relates generally to a system and method for providing user authentication and, in particular, to a system and method for providing confidence-based authentication in an incremental access authentication system, wherein a confidence score is periodically computed during a dialog session between user and machine to check the confidence level in the validity of an original identity claim.
2. Description of Related Art
The computing world is evolving towards an era where billions of interconnected pervasive clients will communicate with each other and with powerful information servers. Indeed, this millennium will be characterized by the availability of multiple information devices that make ubiquitous information access an accepted fact of life. Due to the increase in human-machine interaction that will result from the pervasive use of such information devices, users will demand that such interaction be natural and simple as if they were having a conversation with another individual.
One factor in making the human-machine interaction more natural and effective is the ability of the machine to accurately and efficiently verify an identity claim of the user based on speech interactions. Conventional techniques well known to those skilled in the art for authenticating an individual based on his/her speech properties are typically based on a numerical score, derived from comparing a given test speech sample to previously constructed speaker models. The authentication framework of such conventional techniques is based on a binary hypothesis test, where the result of an authentication is a yes/no answer.
By way of example, assume $s_n$ denotes a discrete time speech sample sequence provided by a system user seeking access to a conversational system. This speech data, along with the user's speaker model $M_i$ (which is selected based on an identity claim $i$ provided by the user), is processed to verify the identity claim. The identity claim itself must belong to an authorized user. More specifically, a score for speaker $i$ may be computed using a real ($\mathbb{R}$) valued function $\rho$ taking as input $s_n$, $M_i$, and possibly computed with respect to the background model(s) (as is understood by those skilled in the art) as follows:

$$\rho(s_n, M_i) \in \mathbb{R}. \tag{1}$$

A verification (authentication) process is then performed via a hypothesis test. For example, given an identity claim $i$ in the above example, the competing hypotheses are:

    $H_0$: The speech sample $s_n$ was produced by speaker $i$.
    $H_1$: The speech sample $s_n$ was produced by a speaker other than $i$.
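By way of illustration only, one possible form of the score in equation (1) is an average log-likelihood ratio between the claimed speaker's model and a background model. The following Python sketch assumes, purely for simplicity, that each model is a one-dimensional Gaussian given as a (mean, standard deviation) pair. The function names `gaussian_log_pdf` and `score` and the numeric values are hypothetical and are not taken from any particular conventional system.

```python
import math


def gaussian_log_pdf(x, mean, std):
    """Log density of a one-dimensional Gaussian evaluated at x."""
    return -0.5 * math.log(2.0 * math.pi * std ** 2) - (x - mean) ** 2 / (2.0 * std ** 2)


def score(samples, speaker_model, background_model):
    """Real-valued score rho(s_n, M_i), computed against a background model.

    samples          -- the speech sample sequence s_n (here: plain floats)
    speaker_model    -- assumed (mean, std) pair standing in for M_i
    background_model -- assumed (mean, std) pair for the background model
    """
    total = 0.0
    for x in samples:
        total += gaussian_log_pdf(x, *speaker_model)
        total -= gaussian_log_pdf(x, *background_model)
    # Average over the sequence so the score does not grow with its length.
    return total / len(samples)


# Hypothetical models: claimed speaker centered at 1.0, background at 0.0.
claimed = (1.0, 0.5)
background = (0.0, 1.0)
matching_score = score([1.0, 1.1, 0.9], claimed, background)
mismatched_score = score([0.0, 0.1, -0.1], claimed, background)
```

Under these assumptions, samples that resemble the claimed speaker's model yield a positive score and samples that resemble the background yield a negative one. Practical speaker models are considerably richer, but the role of the score as a single real number is the same.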
Next, by computing the distribution of scores under the conditions of each hypothesis, the resulting (distribution) functions can be used to determine a decision criterion and predicted error rates. For example, a decision criterion may involve selecting a threshold $t$ in the space of scores and then making the following determination:

    If $\rho(s_n, M_i) > t$ then accept $H_0$, else accept $H_1$.
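The threshold test described above may be sketched in Python as follows. The function name `verify` and the example values are hypothetical. The sketch assumes only that a real-valued score has already been computed for the claimed identity.

```python
def verify(score_value, threshold):
    """Binary hypothesis test on a precomputed score.

    Returns True to accept H0 (the sample was produced by the claimed
    speaker) and False to accept H1 (it was produced by someone else).
    """
    return score_value > threshold


# Hypothetical scores and threshold, for illustration only.
accepted = verify(1.2, 0.0)
rejected = verify(-0.3, 0.0)
```

Note that the comparison is strict, so a score exactly equal to the threshold is resolved in favor of H1. The result is a single yes/no answer, which is the defining property of the conventional framework.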
In addition, the predicted error rates may be determined as follows. Assuming $d(\rho \mid H_0)$ and $d(\rho \mid H_1)$ are the probability densities associated with each of the hypotheses, given a threshold $t$, the probability of false rejection is:

$$\int_{-\infty}^{t} d(\rho \mid H_0)\, d\rho \tag{2}$$

and the probability of false acceptance is:

$$\int_{t}^{+\infty} d(\rho \mid H_1)\, d\rho. \tag{3}$$
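As a worked illustration of equations (2) and (3), the following Python sketch evaluates both error probabilities in closed form under the assumption that the score densities under each hypothesis are Gaussian. That distributional assumption, the function names, and the numeric parameters are all hypothetical. In practice the densities would be estimated from observed scores.

```python
import math


def normal_cdf(x, mean, std):
    """Cumulative distribution function of a Gaussian, via math.erf."""
    return 0.5 * (1.0 + math.erf((x - mean) / (std * math.sqrt(2.0))))


def false_rejection(t, mean_h0, std_h0):
    """Equation (2): mass of the H0 score density at or below threshold t."""
    return normal_cdf(t, mean_h0, std_h0)


def false_acceptance(t, mean_h1, std_h1):
    """Equation (3): mass of the H1 score density above threshold t."""
    return 1.0 - normal_cdf(t, mean_h1, std_h1)


# Hypothetical densities: true-speaker scores centered at +2, impostor
# scores centered at -2, both with unit standard deviation.
threshold = 0.0
fr = false_rejection(threshold, 2.0, 1.0)
fa = false_acceptance(threshold, -2.0, 1.0)
```

Raising the threshold increases the false rejection probability and decreases the false acceptance probability, so the choice of $t$ is a trade-off between the two error types.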
Authentication techniques that implement the above binary hypothesis test are useful in applications where human-machine interaction is typically short (e.g., a request for specific information such as a bank balance, simple action commands such as starting a voice activated car, etc.) because the authentication process is typically performed once at the beginning of the short dialog session. Indeed, with simple action commands, no further conversation is required. In addition, because of the minimal conversational dialog in these instances, the system state (or context) does not need to be collected and maintained over the course of an extended interaction.
On the other hand, more sophisticated dialogs, which are typically long in duration, are characterized by the need to store and manage the context and perform actions based on this context. Systems that afford sophisticated conversational dialog should also afford continual and unobtrusive authentication. By way of example, if the system is being used by a speaker who was initially authenticated, and then suddenly the speaker changes, the system should prevent the new speaker from being able to access the same privileges as the prior speaker. This is particularly important in complex conversational systems that afford access to data with a wide range of security classifications. Indeed, the user's identity should be maintained as part of the system state (context), whereby a change in identity of the speaker is a state change that is detected.
Accordingly, a new authentication process is needed for implementation with a conversational system having sophisticated dialogs so as to provide continuous and unobtrusive authentication of the user during the course of the user interaction with the conversational system.