The present invention generally relates to apparatus and methods for providing speaker recognition.
Voice-based speaker recognition (or verification) is an important component of personal authentication systems that are employed in controlling access to devices and services. For example, in telephone banking, an individual may provide a claim (e.g., his or her name) either by using the telephone keypad or by saying it. Subsequently, an automated system may prompt the user to issue an utterance (password, answer to a question, etc.). The utterance can be analyzed and compared to the voiceprint of the claimed person previously stored in a database. As a result of this comparison, the speaker could be either accepted or rejected. Other possible applications of voice-based speaker verification include, for example: computer access; database access via computer, cellphone or regular telephone; ATM access; and credit card authorization via telephone.
Typically, in voice-based speaker verification, a sample of the voice properties of a target speaker is taken and a corresponding model (i.e., a voiceprint) is built. In order to improve the system robustness against impostors, it is also usually the case that a large number of non-target speakers ("background speakers") are analyzed, pre-stored as voiceprint models, and then used to normalize the voiceprint likelihood scores of a target speaker. The discriminative power of the voiceprint models is crucial to the performance of the overall verification system. An example of a conventional arrangement may be found in D. A. Reynolds, "Speaker identification and verification using Gaussian mixture speaker models," Speech Communication 17 (1995), pp. 91-108.
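By way of illustration only, the normalization of a target speaker's likelihood score against background speakers might be sketched as follows. This is a minimal sketch under stated assumptions, not the arrangement of the cited reference: each voiceprint is simplified to a single diagonal-covariance Gaussian given as a (mean, variance) pair, the background likelihood is taken as the average likelihood over the background models, and the names `avg_log_likelihood`, `normalized_score`, `verify` and the threshold value are hypothetical.

```python
import math


def avg_log_likelihood(frames, mean, var):
    """Average per-frame log-likelihood under one diagonal Gaussian.

    Assumption: a single Gaussian stands in for a full voiceprint model.
    """
    total = 0.0
    for frame in frames:
        for x, m, v in zip(frame, mean, var):
            total += -0.5 * (math.log(2.0 * math.pi * v) + (x - m) ** 2 / v)
    return total / len(frames)


def log_mean_exp(values):
    """Numerically stable log of the arithmetic mean of exp(values)."""
    peak = max(values)
    return peak + math.log(sum(math.exp(v - peak) for v in values) / len(values))


def normalized_score(frames, target, background):
    """Target log-likelihood minus the log of the mean background likelihood."""
    target_ll = avg_log_likelihood(frames, *target)
    background_ll = [avg_log_likelihood(frames, m, v) for m, v in background]
    return target_ll - log_mean_exp(background_ll)


def verify(frames, target, background, threshold=0.0):
    """Accept the claim when the normalized score exceeds the threshold."""
    return normalized_score(frames, target, background) > threshold


if __name__ == "__main__":
    # Hypothetical two-dimensional feature vectors and models.
    target = ([0.0, 0.0], [1.0, 1.0])
    background = [([3.0, 3.0], [1.0, 1.0]), ([-3.0, 3.0], [1.0, 1.0])]
    genuine = [[0.1, -0.2], [0.0, 0.3]]
    impostor = [[3.0, 3.0]]
    print(verify(genuine, target, background))
    print(verify(impostor, target, background))
```

In this sketch, an utterance that fits the target model better than it fits the background models yields a positive score, while an utterance that fits some background model better yields a negative one; in practice the threshold would be tuned on held-out data.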
A need has been recognized, however, in connection with providing voice-based speaker verification that displays even greater system robustness in the face of impostors than has hitherto been the norm.
In accordance with at least one presently preferred embodiment of the present invention, a "cohort selection" technique is employed in a different manner. Conventionally, cohort selection techniques involve comparing the target speaker's data to the voiceprints of its closest background neighbors (cohorts) and using this information for normalization purposes. In accordance with at least one preferred embodiment of the present invention, however, once the closest voiceprints are selected into a cohort set, the dissimilarity of the cohort models is increased using linear feature transforms. The transforms may be derived either from data relating to the target speaker only or from data relating to all speakers in the cohort, including the target speaker. A combination of these two alternatives is also contemplated herein.
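Purely as an illustrative sketch, the selection of the closest background voiceprints into a cohort set, followed by the application of a linear transform to a feature vector, might look as follows. Several assumptions are made that are not taken from the present description: each voiceprint is reduced to a mean vector, closeness is measured by Euclidean distance between mean vectors, and the transform matrix is a hypothetical placeholder, since the manner in which the transform is derived from target and/or cohort data is not set out in this passage. The names `model_distance`, `select_cohort` and `apply_transform` are likewise hypothetical.

```python
import math


def model_distance(mean_a, mean_b):
    """Euclidean distance between two model mean vectors (assumed measure)."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(mean_a, mean_b)))


def select_cohort(target_mean, background_means, k):
    """Return the names of the k background models closest to the target.

    background_means maps a speaker name to that speaker's mean vector.
    """
    ranked = sorted(
        background_means,
        key=lambda name: model_distance(target_mean, background_means[name]),
    )
    return ranked[:k]


def apply_transform(matrix, vector):
    """Apply a linear feature transform (matrix-vector product)."""
    return [sum(a * x for a, x in zip(row, vector)) for row in matrix]


if __name__ == "__main__":
    # Hypothetical background population and target speaker.
    background_means = {
        "spk_a": [0.5, 0.1],
        "spk_b": [4.0, 4.0],
        "spk_c": [-0.3, 0.4],
        "spk_d": [9.0, -2.0],
    }
    cohort = select_cohort([0.0, 0.0], background_means, k=2)
    print(cohort)

    # Placeholder transform; a real one would be estimated from speaker data.
    transform = [[2.0, 0.0], [0.0, 0.5]]
    print(apply_transform(transform, [1.0, 4.0]))
```

The same transform would be applied to the feature vectors before the target and cohort models are rebuilt, so that the resulting models are compared in the transformed feature space.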
The processes contemplated herein are believed to contribute to improving the discriminative power of the above-mentioned models by employing linear feature transforms derived from specific target speaker data and/or from the specific target's cohort speakers. The inventive processes may be used in a wide range of applications supporting voiceprint-based authentication (e.g., as described in U.S. Pat. No. 5,897,616 to Kanevsky et al., entitled "Apparatus and Methods for Speaker Verification/Identification/Classification Employing Non-Acoustic and/or Acoustic Models and Databases").
In one aspect, the present invention provides a method of facilitating speaker verification, the method comprising the steps of: providing target data relating to a target speaker; providing background data relating to at least one background speaker; selecting from the background data a set of cohort data having at least one proximate characteristic with respect to the target data; and combining the target data and the cohort data to produce at least one new cohort model for use in subsequent speaker verification.
In another aspect, the present invention provides an apparatus for facilitating speaker verification, the apparatus comprising: a target data store which supplies data relating to a target speaker; a background data store which supplies data relating to at least one background speaker; a selector which selects from the background data a set of cohort data having at least one proximate characteristic with respect to the target data; and a modeller which combines the target data and the cohort data to produce at least one new cohort model for use in subsequent speaker verification.
In an additional aspect, the present invention provides a program storage device readable by machine, tangibly embodying a program of instructions executable by the machine to perform method steps for facilitating speaker verification, the method comprising the steps of: providing target data relating to a target speaker; providing background data relating to at least one background speaker; selecting from the background data a set of cohort data having at least one proximate characteristic with respect to the target data; and combining the target data and the cohort data to produce at least one new cohort model for use in subsequent speaker verification.
In a further aspect, the present invention provides a method of facilitating verification, the method comprising the steps of: providing target data relating to a target individual; providing background data relating to at least one background individual; selecting from the background data a set of cohort data having at least one proximate characteristic with respect to the target data; and combining the target data and the cohort data to produce at least one new cohort model for use in subsequent verification.
In another aspect, the present invention provides an apparatus for facilitating verification, the apparatus comprising: a target data store which supplies data relating to a target individual; a background data store which supplies data relating to at least one background individual; a selector which selects from the background data a set of cohort data having at least one proximate characteristic with respect to the target data; and a modeller which combines the target data and the cohort data to produce at least one new cohort model for use in subsequent verification.
Furthermore, the present invention provides in another aspect a program storage device readable by machine, tangibly embodying a program of instructions executable by the machine to perform method steps for facilitating verification, the method comprising the steps of: providing target data relating to a target individual; providing background data relating to at least one background individual; selecting from the background data a set of cohort data having at least one proximate characteristic with respect to the target data; and combining the target data and the cohort data to produce at least one new cohort model for use in subsequent verification.