1. Field of the Invention
The present invention relates to a speaker verification system using voice, and more particularly, to a speaker verification system using the voice of a user uttering a continuous, random length digit string.
2. Description of the Related Art
Security verification of users requesting access to a particular service is required in various application fields. For example, it is necessary to confirm whether users requesting access to services such as banking, credit card inquiry, and electronic commerce have the authority to use them. A representative method of user verification uses passwords or personal identification numbers (PINs). This method has the disadvantage that users must memorize their passwords accurately. Users therefore tend to choose passwords that are easy to remember, and such passwords can be easily deduced and misappropriated.
To overcome the above problems, interest in biometric security techniques, that is, techniques of identifying individual persons based on personal biometric features, has increased. A representative method identifies users based on their voices. Identifying users based on personal voice features is more advantageous in terms of price or implementation than other methods, for example, a method of identifying users based on fingerprints. When voice-based identification is used for security verification in electronic commerce through a telephone network or the internet, users can be identified immediately based on their voice without disadvantages such as having to memorize a password. In addition, voice verification technology can make it very easy to control access to sites employing security systems. Voice-based identification performs verification by comparing various voice parameters of particular voices. The parameters include pitch cycle, voice intensity and other acoustic features.
However, conventional voice verification systems employ a text-dependent method, that is, a method in which a user is required to speak a previously memorized reference script. Since the text to be spoken by the user is fixed, the user's voice can easily be misappropriated by recording the user speaking the text. Moreover, in a text-independent method intended to overcome the problems of the text-dependent method, managing the huge database needed to store various voice features slows down processing.
To solve the above problems, it is a first objective of the present invention to provide a speaker verification system and method using the voice of a user uttering a continuous, random length digit string, wherein a simple digit string is used so as to improve processing speed, randomness is increased so as to remove the possibility of voice misappropriation, and a mechanical sound determination mechanism is provided, together with a computer-readable medium storing a program for the system and method.
It is a second objective of the present invention to provide a speaker registration method which is implemented in the above speaker verification system and a computer-readable medium storing a program for the speaker registration method.
Accordingly, to achieve one aspect of the first objective, there is provided a speaker verification system using the voice of a user uttering a continuous, random length digit string. The speaker verification system includes a random digit generator for generating a continuous, random length digit string; a user interface for providing the continuous, random length digit string, which is generated by the random digit generator, to the user and receiving the voice of the user uttering the provided continuous, random length digit string; a feature extractor for extracting voice features from the user's voice which is received via the user interface; a digit voice verification unit for comparing the voice features, which are extracted by the feature extractor, with items in a speaker-independent continuous digit voice model to derive a digit string corresponding to items in the speaker-independent continuous digit voice model, which match the voice features, and for determining whether the derived digit string is identical to the digit string, which has been provided to the user via the user interface; and a speaker verification unit for comparing the voice features, which are extracted by the feature extractor, with a speaker-dependent model of the user to measure the similarity between them, the speaker-dependent model of the user including previously determined features of the user's voice, and for determining whether to approve or reject the user based on the similarity, when it is determined that the derived digit string is identical to the digit string which has been provided to the user.
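The random digit generator described above can be illustrated with a minimal Python sketch. The function name `generate_digit_string` and the length bounds `min_len` and `max_len` are assumptions introduced only for illustration; the text does not specify a length range or a particular source of randomness.

```python
import secrets


def generate_digit_string(min_len: int = 4, max_len: int = 8) -> str:
    """Return a digit string whose length and digits are both random.

    The bounds are hypothetical; the source only says "random length".
    """
    if not 1 <= min_len <= max_len:
        raise ValueError("require 1 <= min_len <= max_len")
    # Pick the length uniformly from the inclusive range [min_len, max_len].
    length = min_len + secrets.randbelow(max_len - min_len + 1)
    # Pick each digit uniformly from 0-9.
    return "".join(str(secrets.randbelow(10)) for _ in range(length))


if __name__ == "__main__":
    print(generate_digit_string())
```

Because both the length and the digits change on every request, a recording of an earlier session is unlikely to match the next prompt, which is the property the summary relies on.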
In another aspect of the present invention, there is provided a speaker verification method using the voice of a user uttering a continuous, random length digit string. The speaker verification method includes the steps of (a) randomly generating a continuous, random length digit string; (b) providing the continuous, random length digit string to the user; (c) receiving the voice of the user uttering the continuous, random length digit string; (d) extracting voice features from the received user's voice; (e) comparing the extracted voice features with items in a speaker-independent continuous digit voice model to derive a digit string corresponding to items in the speaker-independent continuous digit voice model, which match the voice features, and determining whether the derived digit string is identical to the digit string provided to the user in the step (b); and (f) comparing the voice features extracted in the step (d), with a speaker-dependent model of the user and determining whether to approve or reject the user.
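Steps (e) and (f) form a two-stage check: the spoken content is verified first, and the speaker only afterwards. The following Python sketch shows that control flow under stated assumptions. The callables `recognize_digits` and `speaker_distance` are hypothetical stand-ins for the speaker-independent continuous digit voice model and the speaker-dependent model; the score is assumed to be a distance (lower means closer to the enrolled speaker), and the handling of a digit mismatch is an assumption, since the text does not say what happens in that case.

```python
from typing import Callable, Sequence

Features = Sequence[float]


def verify_once(
    prompt: str,
    features: Features,
    recognize_digits: Callable[[Features], str],
    speaker_distance: Callable[[Features], float],
    threshold: float,
) -> str:
    """Run one verification attempt on already-extracted voice features."""
    # Stage 1: did the user actually say the digit string that was prompted?
    if recognize_digits(features) != prompt:
        return "digit_mismatch"
    # Stage 2: only now compare the voice against the speaker's own model.
    distance = speaker_distance(features)
    return "approved" if distance < threshold else "rejected"
```

Running the digit check first means a replayed recording of a different digit string is stopped before any speaker scoring takes place.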
In yet another aspect of the present invention, there is provided a speaker verification method using the voice of a user uttering a continuous, random length digit string. The speaker verification method includes the steps of (a) providing a continuous, random length digit string, which is randomly generated, to a user; (b) receiving the voice of the user uttering the continuous, random length digit string and extracting voice features from the received user's voice; (c) comparing the extracted voice features with items in a speaker-independent continuous digit voice model to derive a digit string corresponding to items in the speaker-independent continuous digit voice model, which are matched with the voice features, and determining whether the derived digit string is identical to the digit string, which has been provided to the user in the step (a); (d) comparing the voice features, which have been extracted in the step (b), with a speaker model of the user and measuring the similarity between the voice features and the speaker model of the user, the similarity indicating the difference between phonetic values; (e) increasing a first speaker rejection count when the similarity, which is measured in the step (d), is greater than or equal to a predetermined lower similarity threshold and increasing a second speaker rejection count when the similarity is greater than or equal to a predetermined upper similarity threshold; and (f) after repeating the steps (a) through (e) a plurality of times, (f1) approving the user when the first speaker rejection count is 0, and rejecting the user when the second speaker rejection count is at least 1 or the first speaker rejection count exceeds a predetermined rejection count threshold; and (f2) determining whether to approve or reject the user based on the similarity measured in the step (d), when the second speaker rejection count is 0, and the first speaker rejection count is at least 1 and less than or equal to the rejection count threshold.
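The counting rule in steps (e) and (f) can be sketched in Python as follows. Because the similarity is described as indicating a difference, a larger value is treated here as a worse match. The function name `decide` and its parameter names are hypothetical. The rule used for the borderline case (f2), comparing the mean score with `borderline_limit`, is purely an assumption, since the text only says that the decision is based on the measured similarity.

```python
from statistics import mean
from typing import Sequence


def decide(
    scores: Sequence[float],
    lower: float,
    upper: float,
    max_rejections: int,
    borderline_limit: float,
) -> bool:
    """Return True to approve the user, False to reject.

    `scores` holds one difference score per repetition of steps (a)-(e).
    """
    # Step (e): count the attempts that reach each threshold.
    first_count = sum(1 for s in scores if s >= lower)
    second_count = sum(1 for s in scores if s >= upper)

    # Step (f1): every attempt stayed below the lower threshold.
    if first_count == 0:
        return True
    # Step (f1): any attempt reached the upper threshold, or too many
    # attempts reached the lower one.
    if second_count >= 1 or first_count > max_rejections:
        return False
    # Step (f2): borderline case. The mean-score rule is an assumption.
    return mean(scores) < borderline_limit
```

For example, with `lower=0.5`, `upper=0.9` and `max_rejections=1`, the scores `[0.1, 0.2, 0.1]` are approved outright, `[0.1, 0.95, 0.1]` are rejected outright, and `[0.6, 0.1, 0.1]` fall into the borderline case.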
In still yet another aspect of the present invention, there is provided a computer-readable recording medium for recording a program which is executed in a computer for speaker verification for verifying a user by their voice when uttering a continuous, random length digit string. The program includes the steps of (a) randomly generating a continuous, random length digit string; (b) providing the continuous, random length digit string to the user; (c) receiving the voice of the user uttering the continuous, random length digit string; (d) extracting voice features from the received user's voice; (e) comparing the extracted voice features with items in a speaker-independent continuous digit voice model to derive a digit string corresponding to items in the speaker-independent continuous digit voice model, which match the voice features, and determining whether the derived digit string is identical to the digit string provided to the user in the step (b); and (f) comparing the voice features extracted in the step (d), with a speaker-dependent model of the user and determining whether to approve or reject the user.
In a further aspect of the present invention, there is provided a computer-readable recording medium for recording a program which is executed in a computer for speaker verification for verifying a user by the voice of the user uttering a continuous, random length digit string. The program includes the steps of (a) providing a continuous, random length digit string, which is randomly generated, to a user; (b) receiving the voice of the user uttering the continuous, random length digit string and extracting voice features from the received user's voice; (c) comparing the extracted voice features with items in a speaker-independent continuous digit voice model to derive a digit string corresponding to items in the speaker-independent continuous digit voice model, which are matched with the voice features, and determining whether the derived digit string is identical to the digit string, which has been provided to the user in the step (a); (d) comparing the voice features, which have been extracted in the step (b), with a speaker model of the user and measuring the similarity between the voice features and the speaker model of the user, the similarity indicating the difference between phonetic values; (e) increasing a first speaker rejection count when the similarity, which is measured in the step (d), is greater than or equal to a predetermined lower similarity threshold and increasing a second speaker rejection count when the similarity is greater than or equal to a predetermined upper similarity threshold; and (f) after repeating the steps (a) through (e) a plurality of times, (f1) approving the user when the first speaker rejection count is 0, and rejecting the user when the second speaker rejection count is at least 1 or the first speaker rejection count exceeds a predetermined rejection count threshold; and (f2) determining whether to approve or reject the user based on the similarity measured in the step (d), when the second speaker rejection count is 0, and the first speaker rejection count is at least 1 and less than or equal to the rejection count threshold.
To achieve the second objective, there is also provided a speaker registration method in the speaker verification system according to the embodiment of the present invention. The speaker registration method includes the steps of (a) providing a continuous digit string having various phonetic values to a user; (b) receiving the voice of the user uttering the continuous digit string; (c) extracting voice features from the received user's voice; (d) comparing the extracted voice features with items in a speaker-independent continuous digit voice model to derive a digit string corresponding to items in the speaker-independent continuous digit voice model, which match the voice features, and determining whether the derived digit string is identical to the digit string provided to the user in the step (a); (e) comparing the voice features extracted in the step (c), with a speaker-dependent model of the user and measuring the similarity between the voice features and the speaker-dependent model of the user; and (f) determining whether to register the user based on the measured similarity.
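Step (a) of the registration method calls for a digit string "having various phonetic values". One possible reading, assumed here and not stated in the text, is that the registration prompt should contain every digit at least once so that the speaker-dependent model sees each digit sound. A minimal Python sketch under that assumption follows; the function name `registration_prompt` and the `extra` parameter are hypothetical.

```python
import secrets

DIGITS = "0123456789"


def registration_prompt(extra: int = 2) -> str:
    """Return a shuffled digit string that contains every digit at least once.

    `extra` adds that many additional randomly chosen digits (an assumption).
    """
    if extra < 0:
        raise ValueError("extra must be non-negative")
    rng = secrets.SystemRandom()
    # Start from one copy of each digit, then add the extra random digits.
    digits = list(DIGITS) + [rng.choice(DIGITS) for _ in range(extra)]
    # Shuffle so that the order differs between registration sessions.
    rng.shuffle(digits)
    return "".join(digits)
```

Other readings of "various phonetic values", such as covering particular digit-to-digit transitions, would need a different generator.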
In another aspect of the present invention, there is provided a computer-readable recording medium for recording a program which is executed in a computer for speaker registration in a speaker verification system. The program includes the steps of (a) providing a continuous digit string having various phonetic values to a user; (b) receiving the voice of the user uttering the continuous digit string; (c) extracting voice features from the received user's voice; (d) comparing the extracted voice features with items in a speaker-independent continuous digit voice model to derive a digit string corresponding to items in the speaker-independent continuous digit voice model, which match the voice features, and determining whether the derived digit string is identical to the digit string provided to the user in the step (a); (e) comparing the voice features extracted in the step (c), with a speaker-dependent model of the user and measuring the similarity between the voice features and the speaker-dependent model of the user; and (f) determining whether to register the user based on the measured similarity.