Many professions require verbal interaction with clients, for example salespeople, bank clerks, and help desk support personnel. Typically, the ability to communicate with a client depends not only on speaking the same language but also on being able to understand the client's accent. In some cases people speaking the same language cannot understand each other because of differences between the accents to which they are accustomed. Some accents are considered clearer than others and may be preferred or required for people in certain professions, for example television and radio broadcasters.
Nowadays many companies provide telephonic support services, wherein human employees are trained to speak with a clear accent. In some cases these services are outsourced to foreign countries in which a different language is spoken. The employees performing the service are trained to speak the required language with a desired accent.
Typically, a person can quickly identify whether another person speaks with the accent that is accepted in the listener's geographical location or with a different accent. Some people can identify a person's geographical origin based on their accent.
Accent training and the monitoring of a trainee's progress are generally expensive and require individual attention.
U.S. application Ser. No. 10/996,811 filed Nov. 23, 2004, the disclosure of which is incorporated herein by reference, describes a statistical method for speaker spotting in order to split a conversation into separate parts containing the speech of each speaker.
In an article by Yeshwant K. Muthusamy et al. titled “Reviewing Automatic Language Identification”, IEEE Signal Processing Magazine, October 1994, there are described automated methods of language identification.
In an article by Marc A. Zissman et al. titled “Comparison of Four Approaches to Automatic Language Identification of Telephone Speech”, IEEE Transactions on Speech and Audio Processing, Vol. 4, No. 1, January 1996, there are also described automated methods of language identification.
The above articles describe preparing a statistical model to represent a speech segment or collection of speech segments.
In an article by Frederic Bimbot et al. titled “A Tutorial on Text-Independent Speaker Verification”, EURASIP Journal on Applied Signal Processing 2004: 4, 430-451, there is described the use of statistical methods for identifying a user.
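By way of illustration only, the general idea of representing a collection of speech segments by a statistical model, and of scoring a new segment against such models, may be sketched as follows. The sketch is not taken from the cited articles. It assumes that each speech frame has already been reduced to a small numeric feature vector, it uses a single diagonal-covariance Gaussian per class as a deliberately simplified stand-in for the richer models used in practice, and the function names, class labels and toy data are all hypothetical.

```python
import math

def fit_diag_gaussian(frames):
    """Estimate a per-dimension mean and variance from feature vectors."""
    n = len(frames)
    dim = len(frames[0])
    means = [sum(f[d] for f in frames) / n for d in range(dim)]
    # A small variance floor avoids division by zero on degenerate data.
    variances = [
        max(sum((f[d] - means[d]) ** 2 for f in frames) / n, 1e-6)
        for d in range(dim)
    ]
    return means, variances

def avg_log_likelihood(frames, model):
    """Average per-frame log-likelihood of the frames under the model."""
    means, variances = model
    total = 0.0
    for f in frames:
        for x, m, v in zip(f, means, variances):
            total += -0.5 * (math.log(2 * math.pi * v) + (x - m) ** 2 / v)
    return total / len(frames)

def classify(frames, models):
    """Return the label whose model gives the segment the highest score."""
    return max(models, key=lambda label: avg_log_likelihood(frames, models[label]))

# Hypothetical toy features; real systems would use acoustic features
# extracted from recorded speech.
train = {
    "class_a": [[0.0, 1.0], [0.2, 0.8], [-0.1, 1.1], [0.1, 0.9]],
    "class_b": [[3.0, -1.0], [2.8, -1.2], [3.1, -0.9], [3.2, -1.1]],
}
models = {label: fit_diag_gaussian(frames) for label, frames in train.items()}

segment = [[0.05, 0.95], [0.15, 1.05]]
print(classify(segment, models))
```

In this sketch one model is trained per class (for example per language or per speaker), and an unseen segment is assigned to the class whose model scores it highest. The segment need not contain any particular words, since only the statistics of its features are examined.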
Prior art machines dealing with accents are typically used to train a person's accent by requiring the person to repeat a specific word or phrase and comparing the response to a known digitized pattern. These machines are limited to specific phrases and are not applicable to speech that has not been pre-selected.
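By way of illustration only, a comparison of a spoken response against a stored pattern may be sketched as follows. The source does not state how the prior art machines perform the comparison. Dynamic time warping is assumed here merely as one well-known way of comparing two sequences that differ in speaking rate, and the function name and the one-dimensional toy sequences are hypothetical.

```python
def dtw_distance(a, b):
    """Dynamic time warping distance between two numeric sequences."""
    inf = float("inf")
    n, m = len(a), len(b)
    # cost[i][j] holds the best alignment cost of a[:i] against b[:j].
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])
            cost[i][j] = d + min(
                cost[i - 1][j],      # a advances, b repeats
                cost[i][j - 1],      # b advances, a repeats
                cost[i - 1][j - 1],  # both advance
            )
    return cost[n][m]

# Hypothetical stored pattern for one pre-selected phrase.
reference = [0.0, 1.0, 2.0, 1.0, 0.0]
# The same contour spoken more slowly.
slow_attempt = [0.0, 0.0, 1.0, 1.0, 2.0, 2.0, 1.0, 0.0]
# A different phrase, for which no stored pattern exists.
other_phrase = [2.0, 2.0, 0.0, 2.0, 2.0]

print(dtw_distance(reference, slow_attempt))
print(dtw_distance(reference, other_phrase))
```

The sketch makes the limitation apparent: a score is meaningful only when the response is an attempt at the phrase for which a pattern has been stored, so arbitrary speech cannot be evaluated in this manner.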