Speaker identification authenticates a user from a biometric characteristic. For speaker identification, the biometric sample is compared to all records within the database and a closest match score is returned. The closest match that falls within an allowed threshold is deemed to be the individual, who is then authenticated. Thus, speaker identification is the task of determining an unknown speaker's identity; it is a 1:N match in which the voice is compared against N templates.
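By way of illustration only, the 1:N comparison described above may be sketched as follows. The sketch assumes that each voice sample and each stored template has already been reduced to a fixed-length numeric feature vector, that cosine similarity serves as the match score, and that a threshold of 0.8 is acceptable. None of these choices is drawn from any particular known solution, and the names `cosine_similarity`, `identify_speaker`, `templates`, and `threshold` are hypothetical.

```python
import math
from typing import Dict, List, Optional, Tuple


def cosine_similarity(a: List[float], b: List[float]) -> float:
    """Return the cosine similarity of two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # a zero vector carries no usable voice information
    return dot / (norm_a * norm_b)


def identify_speaker(
    sample: List[float],
    templates: Dict[str, List[float]],
    threshold: float = 0.8,  # assumed value, chosen only for illustration
) -> Tuple[Optional[str], float]:
    """Compare one sample against all N templates (a 1:N match).

    Returns the identifier of the closest template and its score, or
    (None, score) when the closest match falls below the threshold.
    """
    best_id: Optional[str] = None
    best_score = -1.0  # lowest possible cosine similarity
    for speaker_id, template in templates.items():
        score = cosine_similarity(sample, template)
        if score > best_score:
            best_id, best_score = speaker_id, score
    if best_score >= threshold:
        return best_id, best_score
    return None, best_score


# Hypothetical database of N = 2 enrolled templates.
templates = {
    "alice": [1.0, 0.0, 0.0],
    "bob": [0.0, 1.0, 0.0],
}
match, score = identify_speaker([0.9, 0.1, 0.0], templates)
unknown, _ = identify_speaker([1.0, 1.0, 1.0], templates)
```

In this sketch, the first sample lies closest to the template labeled "alice" and exceeds the threshold, so that identifier is returned; the second sample is not sufficiently close to any template, so no identity is returned. The sketch also makes plain that such a comparison presupposes a database of previously enrolled templates.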
Known solutions focus on providing speech-to-text solutions that identify what is being said, or require custom hardware to indicate when a pre-designated speaker is vocalizing. For example, known solutions monitor a “one microphone per speaker” circuit and provide visual notifications when the circuit is active for a particular speaker. Additionally, known solutions provide for speaker identification after a pre-enrollment step designed to establish a baseline voiceprint. Furthermore, known solutions translate speech to printed text.
For example, a conference call between a number of participants may be transcribed. However, the transcription will not indicate which participant spoke which portion of the dialogue. Thus, for a user, e.g., a hearing-impaired user, the transcription may be useless, as the user cannot determine who said what in an ongoing dialogue. As a further example, a television program may contain closed-captioning. However, the closed-captioning likewise will not indicate which speaker spoke which portion of the dialogue. Rather, the closed-captioning contains the transcribed text without attribution to the speaker.
Furthermore, known solutions do not provide a visualization interface for augmenting speaker identification of an unknown number of users without pre-enrollment of voiceprints. Moreover, known solutions may require a library of known speakers, may require a separate microphone for each speaker, and/or may require segmented speech.
Accordingly, there exists a need in the art to overcome the deficiencies and limitations described hereinabove.