In the field of computer-based speech recognition, speaker verification refers to the task of determining whether a speech sample of an unknown voice corresponds to the voice of a particular enrolled user. One of the challenges in implementing speaker verification pertains to the user enrollment/training process. For text-dependent speaker verification systems (i.e., systems that depend on a predefined passphrase), a user can train the system, and thus enroll his/her voice, by uttering the passphrase a small number of times. However, such systems are limited because they can recognize an enrolled user only from his/her utterance of the passphrase (rather than from general speech). Further, although the training process is relatively short, problems can still arise if extraneous words or sounds are captured during this process, which can frustrate the user.
Text-independent speaker verification systems are more flexible than text-dependent systems because they can recognize an enrolled user without requiring the user to say a particular passphrase. This enables additional verification methods, such as “continuous verification” while the user is speaking. However, the enrollment/training process for text-independent systems is typically much longer and more intensive. For example, such systems typically require a user to enroll his/her voice by uttering a large number of phonetically balanced sentences so that the system can model all possible phrases that the user might speak at the time of verification. As a result, the training process for a text-independent speaker verification system can be extremely burdensome for its users.