Technical Field
This invention relates to an n-gram language model and, more specifically, to building an n-gram language model for an automatic speech recognition (ASR) system.
Background Art
In recent years, the amount of spoken data has been increasing rapidly. Spoken data is generated in a variety of applications, such as call monitoring, market intelligence gathering, customer analytics, and on-line media search.
Spoken term detection is a key information retrieval technology which aims to conduct an open vocabulary search over large collections of the spoken data.
Call monitoring using the spoken term detection can be performed for all calls. In a call monitoring operation at a call center, for example, a large number of recorded calls are searched for a specific word or an inappropriate statement in order to improve the quality of the call center, to ensure compliance, or to evaluate communicators (e.g., customer service representatives, telephone sales representatives, etc.).
In general, the spoken term detection is accomplished by converting speech data to text data using an automatic speech recognition (hereinafter also referred to as “ASR”) system and then searching the text data.
Depending on their demands or business operations, some customers of the call monitoring place a high value on precision, while others place a high value on recall.
Basically, the precision and the recall are in a trade-off relationship. Therefore, when either the precision or the recall is increased, the other tends to decrease.
Accordingly, either the precision or the recall for keywords is to be improved, according to a customer's demand or business operation.
Many methods are known for calculating confidence scores for words after the ASR and selecting the words whose scores are higher than a threshold, such as the method described in Jonathan Mamou et al., “Vocabulary independent spoken term detection”, in Proc. of ACM SIGIR, 2007 (available at http://researchweb.watson.ibm.com/haifa/projects/imt/sir/papers/sigir07.pdf). These methods can improve the precision of the spoken term detection.
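By way of illustration only, the following minimal sketch shows how selecting words by a confidence threshold affects the precision and the recall. It is not the method of the cited reference; the names `select_hits` and `precision_recall`, the sample scores, and the correctness labels are hypothetical and chosen solely for this example.

```python
from typing import List, Tuple

# Hypothetical ASR detection: (word, confidence score, whether the detection is correct).
Hit = Tuple[str, float, bool]


def select_hits(hits: List[Hit], threshold: float) -> List[Hit]:
    """Keep only detections whose confidence score is higher than the threshold."""
    return [h for h in hits if h[1] > threshold]


def precision_recall(selected: List[Hit], total_true: int) -> Tuple[float, float]:
    """Return (precision, recall) for the selected detections.

    precision = correct selected detections / all selected detections
    recall    = correct selected detections / all true occurrences
    By assumption, precision is reported as 1.0 when nothing is selected.
    """
    correct = sum(1 for h in selected if h[2])
    precision = correct / len(selected) if selected else 1.0
    recall = correct / total_true if total_true else 0.0
    return precision, recall


# Made-up detections of a single keyword, for illustration only.
hits: List[Hit] = [
    ("refund", 0.95, True),
    ("refund", 0.80, True),
    ("refund", 0.60, False),
    ("refund", 0.40, True),
    ("refund", 0.20, False),
]
total_true = 3  # assumes every true occurrence appears among the detections

for threshold in (0.1, 0.5, 0.9):
    p, r = precision_recall(select_hits(hits, threshold), total_true)
    print(f"threshold={threshold:.1f} precision={p:.2f} recall={r:.2f}")
```

With this sample data, raising the threshold increases the precision while the recall decreases, which is the trade-off relationship described above.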