Some computer systems employ text-dependent speaker recognition, in which a speaker (person) utters a predefined or known phrase, such as “hello computer,” that the system has been trained to recognize as spoken by that speaker. Text-independent speaker recognition, in which the speaker's utterances are unconstrained, presents a more challenging problem. Existing text-independent speaker recognition systems generally require a lengthy speaker enrollment procedure, which can take five or more minutes of a user's time to generate a model of the user's voice and can therefore be burdensome. Additionally, such techniques produce a model that is tailored to the specific context or environment in which the enrollment procedure is carried out, and the model typically does not perform well in other contexts.
Although the following Detailed Description will proceed with reference being made to illustrative embodiments, many alternatives, modifications, and variations thereof will be apparent in light of this disclosure.