Voice authentication technology is widely used in systems related to information security. Such systems usually restrict access, requiring users to be authenticated before they can access the system. With the emergence of voice authentication technology, users who make a purchase with a credit card, access a protected computer system, or retrieve transaction information from a bank may have their identities authenticated by voice: their voice is input through a microphone or telephone and identified by a voice authentication system to verify whether they are who they claim to be. Moreover, a voice-authentication-based system is easy to use, even for users with little computer knowledge.
To carry out voice authentication, a voice authentication system usually needs to capture the speaker's voice, digitize it, and compare it with stored voice characteristics. Generally, a voice authentication system mainly comprises: a voice input device, such as a microphone or telephone; an analog-to-digital converter to digitize the input voice; a high-performance computer to perform the voice authentication process; and a voice database to store data relating to the voice characteristics of authorized users.
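As an illustration of the digitizing step performed by the analog-to-digital converter, the following is a minimal sketch only. The function names `digitize` and `tone`, the 8 kHz sample rate, and the 16-bit depth are assumptions chosen for the example and are not taken from any particular system.

```python
import math


def digitize(signal, duration_s, sample_rate_hz=8000, bits=16):
    """Sample a continuous signal and quantize each sample to a signed integer.

    `signal` is a hypothetical stand-in for the analog input: a function that
    maps time in seconds to an amplitude in the range [-1.0, 1.0].
    """
    max_level = 2 ** (bits - 1) - 1  # 32767 for 16-bit samples
    n_samples = round(duration_s * sample_rate_hz)
    samples = []
    for n in range(n_samples):
        t = n / sample_rate_hz
        amplitude = max(-1.0, min(1.0, signal(t)))  # clip out-of-range input
        samples.append(round(amplitude * max_level))
    return samples


def tone(t):
    """A 440 Hz test tone standing in for a voice signal (assumption)."""
    return math.sin(2 * math.pi * 440.0 * t)


pcm = digitize(tone, duration_s=0.01)
```

The resulting list of integers is the kind of digitized data that the later comparison step would operate on, typically after further feature extraction.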
While carrying out voice authentication, a voice authentication system usually needs to match the harmonic and resonant frequencies of the speaker's voice, as well as the way the speaker pronounces phonemes (a language's smallest distinctive sounds), against the digital voiceprint of an authorized user. The voiceprint is created when the authorized user enrolls in the voice authentication system and is subsequently stored as a digital file in a voice database of the system. The voice authentication system calculates a score that indicates how closely the speaker's voice matches the stored voiceprint of the person the speaker claims to be, and thereby determines whether the speaker is who he or she claims to be.
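The scoring step can be sketched as follows. The text above does not specify how the score is computed, so this example assumes, purely for illustration, that the live voice and the stored voiceprint are each represented as a numeric feature vector and that cosine similarity is used as the score. The names `cosine_score` and `is_claimed_speaker` and the threshold of 0.85 are hypothetical.

```python
import math


def cosine_score(live, enrolled):
    """Return a similarity score in [-1, 1] between two feature vectors."""
    if len(live) != len(enrolled):
        raise ValueError("feature vectors must have the same length")
    dot = sum(a * b for a, b in zip(live, enrolled))
    norm = math.sqrt(sum(a * a for a in live)) * math.sqrt(
        sum(b * b for b in enrolled)
    )
    return dot / norm if norm else 0.0


def is_claimed_speaker(live, enrolled, threshold=0.85):
    """Accept the identity claim only if the score reaches the threshold.

    The threshold is an assumed value; a real system would tune it to
    balance false acceptances against false rejections.
    """
    return cosine_score(live, enrolled) >= threshold
```

For example, `is_claimed_speaker([1.0, 2.0, 3.0], [1.1, 1.9, 3.2])` accepts the claim because the two vectors point in nearly the same direction, while two orthogonal vectors are rejected.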
In implementing voice authentication technology, although chips can quickly process the large amount of information involved in voice authentication, the common approach at present is to use a portable software system to implement the voice authentication functionality.
Conventional voice authentication systems are generally based on a client-server architecture, which requires huge storage and powerful processors on the server side to store data and to run the pattern-matching techniques that compare live speech with the stored voiceprints of authorized users. Furthermore, voice templates are usually much larger than other kinds of biometric information. This makes fast servers and quick filtering software a must. It also makes the time required to authenticate a user very long. Hence there arises the need to implement voice authentication functionality on the client side, with limited resources. If a voice authentication system performs voice authentication on the client side, voice data, such as a voiceprint, may be stored in a removable storage medium called a voice ID card, for example one based on a smart card. When a user is required to be authenticated, the user provides the voice ID card to the authentication system, and the client matches the user's voice against the voice data stored in the voice ID card, thereby implementing the voice authentication. To inspire confidence and encourage more widespread adoption, however, such a voice authentication system using removable storage media must overcome the following obstacles.
Firstly, there is the security problem of the voice ID card. The biggest problem with storing voice data, such as the voiceprint of an authorized user, in a removable storage medium is the security of the removable storage medium itself, as it is prone to being lost, stolen, or abused.
Secondly, there is the problem of data-hacking prevention. As systems adopting voice authentication technology all relate to confidential information, and technologies that allow access to confidential systems have now emerged, there are concerns about whether hackers could compromise voice authentication systems. For example, it is possible to deceive an ordinary voice authentication system by playing a recording of someone's voice. Nowadays, many sophisticated systems create detailed voiceprint information that would not readily match a recorded voice. Voices generated by some high-precision voice imitators, though, could still fool a pure voice authentication system in many cases.
Thirdly, there is the problem of consistent accuracy. Voice authentication is the least accurate biometric-security system. In real-world use, behavioral and environmental factors, such as background noise or changes in users' voices due to health, emotional state, fatigue, age, or other causes, might reduce the accuracy of voice authentication systems. This makes it problematic for a system to rely on voice authentication alone as a security measure. To solve this problem, researchers are taking several approaches to improve the accuracy of voice authentication. Outside a lab environment, however, for example in a home with a low-end microphone and limited system resources, it is difficult to apply a sophisticated voice authentication system.
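The client-side arrangement with a voice ID card described above can be sketched as follows. This is a minimal illustration only: the `VoiceIDCard` structure, the function names, the feature-vector representation of a voiceprint, the cosine-similarity score, and the threshold value are all assumptions made for the example, and the sketch does not address the security, anti-spoofing, or accuracy obstacles just listed.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class VoiceIDCard:
    """Hypothetical contents of a removable voice ID card."""
    user_id: str
    voiceprint: tuple  # enrolled feature vector (assumed representation)


def similarity(a, b):
    """Cosine similarity between two equal-length feature vectors."""
    if len(a) != len(b):
        raise ValueError("feature vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def authenticate_on_client(card, claimed_user_id, live_features, threshold=0.85):
    """Match the live voice against the card's voiceprint on the client.

    No server or central voice database is consulted; the only reference
    data is what the user presents on the card.
    """
    if card.user_id != claimed_user_id:
        return False
    return similarity(live_features, card.voiceprint) >= threshold


card = VoiceIDCard(user_id="alice", voiceprint=(0.2, 0.7, 0.1))
accepted = authenticate_on_client(card, "alice", (0.21, 0.69, 0.12))
rejected = authenticate_on_client(card, "alice", (0.9, 0.05, 0.05))
```

Because the reference voiceprint travels with the user, anyone holding the card holds the reference data as well, which is why the security of the card is the first obstacle listed above.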