Conventionally, voice authentication has been put into practical use as one means for personal identification of a system user. Recently, fingerprint authentication, which is one form of biometric authentication, has often been used for personal authentication in electronic commerce and the like in a mobile environment. However, fingerprint authentication requires a separate, dedicated sensor. In contrast, in the case of performing personal authentication using a voice, for example, in a mobile telephone, the microphone that serves as the sensor is already provided, so that voice authentication is expected to be applied to mobile terminals in a mobile environment.
Voice authentication is roughly classified into two kinds: a text-dependent type and a text-free type. According to the text-dependent type, a user is prompted in advance to read out a keyword (password) or a phrase so that the voice thereof is registered, and at the time of authentication, the user is prompted to utter the same keyword or phrase as at the time of registration, whereby authentication is performed. According to the text-free type, authentication is performed using only the characteristics of the speaker, irrespective of the speech content. Thus, in the case of the text-free type, it is not necessary to determine a keyword or the like, so that the user can perform registration and authentication with arbitrary speech content. The present invention relates to the former, i.e., text-dependent voice authentication.
In text-dependent voice authentication, authentication processing is performed based on both the characteristics of the speaker and the confidential information on the speech content (keyword, etc.), so that relatively high authentication accuracy is obtained. However, in an environment in which another person is present nearby at the time of authentication, there is a possibility that this person may hear the secret keyword. Therefore, because users are reluctant to utter a keyword in such circumstances, it is difficult to adopt text-dependent voice authentication for applications in which authentication is performed in an environment with no privacy (e.g., personal identification at the time of payment with a mobile telephone containing a payment function at a cash desk of a shop, an automatic vending machine, etc.).
Furthermore, once a keyword has been revealed, the confidentiality of the speech content can no longer contribute to authentication, so that authentication accuracy is degraded. There is also a possibility that another person may, without proper authorization, record a secret keyword uttered by a user with a tape recorder or an IC recorder, and reproduce the keyword at the time of authentication, thereby succeeding in fraud (recording and reproduction fraud).
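The two-factor nature of text-dependent authentication, and the effect of a revealed keyword, can be illustrated with a minimal sketch. The score range, the threshold values, and the names `Scores`, `text_dependent_accept`, and `text_free_accept` are assumptions made for illustration only; the documents discussed here do not specify how the scores are computed or compared.

```python
from dataclasses import dataclass


@dataclass
class Scores:
    """Hypothetical similarity scores in [0, 1] for one authentication attempt."""
    speaker: float  # how closely the voice matches the registered speaker
    content: float  # how closely the utterance matches the registered keyword


def text_dependent_accept(scores: Scores,
                          speaker_min: float = 0.7,
                          content_min: float = 0.7) -> bool:
    # Both factors must pass: who is speaking and what is being said.
    # The 0.7 thresholds are illustrative assumptions.
    return scores.speaker >= speaker_min and scores.content >= content_min


def text_free_accept(scores: Scores, speaker_min: float = 0.7) -> bool:
    # Only the speaker characteristics are checked; speech content is ignored.
    return scores.speaker >= speaker_min
```

Under these assumed thresholds, an impostor whose voice happens to resemble the registered speaker (`speaker=0.75`) but who does not know the keyword (`content=0.1`) is rejected by `text_dependent_accept` and accepted by `text_free_accept`. Once the keyword has been overheard, the same impostor can raise the content score (for example to `0.95`), and the text-dependent check then rests on the speaker score alone.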
In order to solve these problems, the following have been proposed for the purpose of preventing recording and reproduction fraud: a method for detecting, based on phase difference information of a voice, that the voice has been reproduced from a loudspeaker (see Patent Document 1); a method for detecting recording and reproduction by comparing transmission characteristics and by superimposing a DTMF signal on a voice to insert a voice watermark (see Patent Document 2); and the like. There is also a system for preventing recording and reproduction fraud by prompting a user to utter a text whose content varies for each authentication (see Patent Document 3).
Furthermore, there is proposed a method for preventing fraud even if a password is revealed, by registering a plurality of kinds of passwords in association with indexes, and prompting a user to input, together with a password, the index corresponding to that password at the time of authentication (Patent Document 4). As means for preventing an identification number used for voice authentication from being revealed to people nearby, there is also proposed a method for preventing the leakage of the identification number by displaying a screen on which a color is allocated to each number, and prompting a user to utter the names of the colors at the time of authentication (Patent Document 5).
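The color-allocation idea attributed to Patent Document 5 can be sketched as follows. This is only an illustration under stated assumptions: the ten color names, the reshuffling of the allocation by a random generator, and the function names `allocate_colors`, `to_colors`, and `to_digits` are hypothetical and are not taken from Patent Document 5.

```python
import random

# Hypothetical palette: one color name per digit 0-9.
COLORS = ["red", "blue", "green", "yellow", "white",
          "black", "pink", "orange", "purple", "brown"]


def allocate_colors(rng: random.Random) -> dict[str, str]:
    """Return a digit -> color allocation to be shown on the screen.

    Shuffling the allocation is an assumption of this sketch.
    """
    colors = COLORS[:]
    rng.shuffle(colors)
    return {str(digit): color for digit, color in enumerate(colors)}


def to_colors(number: str, allocation: dict[str, str]) -> list[str]:
    """What the user utters: the color shown for each digit of the number."""
    return [allocation[digit] for digit in number]


def to_digits(spoken: list[str], allocation: dict[str, str]) -> str:
    """What the system recovers from the recognized color names."""
    inverse = {color: digit for digit, color in allocation.items()}
    return "".join(inverse[color] for color in spoken)
```

A bystander who hears only the color names cannot recover the identification number without also seeing the allocation on the screen, whereas the system, which knows the allocation, can map the recognized color names back to digits with `to_digits`.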
There is also a method, in a system for authenticating an operator by voice input of a plurality of numerical digits, in which a number of kinds of input orders of the numerical digits are prepared in advance and a user is prompted to designate one kind among them and perform the input accordingly at the time of authentication (Patent Document 6). Furthermore, there is also a system for preventing the leakage of a secret password by instructing a user to utter a secret symbol string in an altered form (Patent Document 7). A voice response recognition apparatus is also known, which prevents the leakage of an identification number by instructing a user to insert dummy numbers at random into the identification number to be input by voice (Patent Document 8).
Patent Document 1: JP 2001-109494 A
Patent Document 2: JP 2002-514318 A
Patent Document 3: JP 2000-148187 A
Patent Document 4: JP 2000-181490 A
Patent Document 5: JP 2002-311992 A
Patent Document 6: JP 59(1984)-191645 A
Patent Document 7: JP 63(1988)-231496 A
Patent Document 8: JP 63(1988)-207262 A
However, even if the recording and reproduction fraud countermeasures described in the above-mentioned Patent Documents 1 to 3 are taken, the password has already been revealed by the time it is recorded, so that authentication accuracy is degraded. Furthermore, in order to prevent the leakage of a password or to conceal the password, the methods described in the above-mentioned Patent Documents 4 to 8 require an alteration of the speech content and a special operation, which makes the methods difficult for a user to use. Moreover, in the case where the speech content is designated for each authentication, the confidential information on the speech content (what has been uttered) cannot be used for authentication, so that there is a problem that high accuracy cannot be obtained.