The present invention relates to user authentication. More specifically, the present invention relates to a voice-based biometric authentication method and apparatus.
With the development of technology, users conduct a large amount of communication and internet-based activity during work and everyday life. These internet-based activities generally require user authentication to ensure the security of the user's activities.
Passwords, as a traditional authentication approach, have the defect of being easily cracked, lost, and/or forgotten. To improve security in password authentication, a UKey can also be used with a desktop or notebook computer. However, this approach is difficult to apply to a hand-held mobile terminal or a call center. More importantly, the foregoing approaches have low personal relevance: a person who illegally acquires the password or the UKey can easily be taken for the user himself/herself. Currently, there are many scenarios in which it must be confirmed that an operation is performed by the user himself/herself. Therefore, to enhance personal relevance in user authentication, authentication approaches that utilize biometric features, such as fingerprint recognition, iris detection, face recognition, and voice recognition, have been widely used and developed.
Speaker verification is a mainstream approach among current biometric authentication approaches. Voiceprint recognition generally falls into two types, text dependent and text independent, and generally includes two steps: enrollment and verification. In text dependent voiceprint recognition, the same voice segment that is spoken in enrollment must also be spoken in verification. This approach has a high accuracy rate (i.e., above 99%), and the voice segment employed in enrollment needs to be only several seconds long. Text dependent voiceprint recognition is easy to apply and is thus a widely used voice authentication approach. However, because what is spoken is always one of the few sentences in the enrollment set, and the user's voice is available to the public, the voice is prone to being stolen via recording, and the recording can then be played back to cheat the authentication system.
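By way of illustration only, the enrollment and verification steps described above can be sketched as follows. This is a minimal sketch under stated assumptions and does not represent any particular prior art system: each utterance is assumed to have already been reduced to a fixed-length feature vector, the voiceprint is assumed to be the average of the enrollment vectors, and the comparison is assumed to be a cosine similarity against a threshold. The names enroll, verify, and cosine_similarity, the threshold value of 0.9, and the sample vectors are all hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def enroll(feature_vectors):
    """Enrollment: average several utterances of the same phrase
    into a single stored voiceprint (an assumed, simplified model)."""
    if not feature_vectors:
        raise ValueError("at least one enrollment utterance is required")
    count = len(feature_vectors)
    dim = len(feature_vectors[0])
    return [sum(v[i] for v in feature_vectors) / count for i in range(dim)]


def verify(voiceprint, features, threshold=0.9):
    """Verification: accept when the new utterance is close enough
    to the stored voiceprint. The threshold is a hypothetical value."""
    return cosine_similarity(voiceprint, features) >= threshold


# Hypothetical feature vectors for two enrollment utterances of one phrase.
enrolled = enroll([[0.9, 0.1, 0.4], [1.0, 0.2, 0.5]])

genuine = [0.95, 0.15, 0.45]   # the same user speaking the same phrase
impostor = [0.1, 0.9, 0.2]     # a different speaker
replayed = list(genuine)       # a played-back recording of the user

print(verify(enrolled, genuine))
print(verify(enrolled, impostor))
print(verify(enrolled, replayed))
```

In this simplified model, a played-back recording yields the same features as the genuine utterance and is therefore accepted, which corresponds to the recording weakness noted above.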
In text independent voiceprint recognition, what is spoken in verification can differ from what was spoken in enrollment. This approach can solve the problem of cheating via recording by posing a dynamic question during authentication. However, its accuracy rate is low (i.e., generally about 70%). Thus, this approach cannot be fully put into practical use, especially in user authentication fields that require a high accuracy rate (such as banking). Furthermore, in enrollment the user is required to speak content that is at least several tens of seconds in length, which is inconvenient. Consequently, in practical use, text independent voiceprint verification is difficult to employ as a standalone approach and is generally combined with other biometric authentication approaches, which limits its application scope.
In summary, there are still deficiencies in the prior art. What is urgently needed is a voice verification solution with a high accuracy rate that can prevent cheating via recording.