Voice analytics represent the computerised processing of a (digitised) speech audio signal to extract information. Known techniques for voice analytics include:
speech recognition—converting the received speech into corresponding text. This is available in various formats, such as discrete word/continuous speech, finite vocabulary, and so on. Another form of speech recognition is phonetic analysis (searching for specific phoneme patterns in audio recordings), for example as provided by Aurix (www.aurix.com).
speaker identification—recognising the identity of the speaker from a set of possible speakers.
speaker authentication—confirming whether a speaker is who they claim to be (this can be considered as analogous to speaker identification from a set of one possible speaker).
lie detection—trying to confirm whether or not the speaker is telling the truth. This is generally performed by looking at underlying biophysical signals, such as heart-rate, imprinted onto the speech audio signal.
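The relationship between speaker identification and speaker authentication noted above can be illustrated with a minimal sketch. It assumes each speaker is represented by a fixed-length numeric "voiceprint" vector and that similarity is measured by cosine similarity. The vectors, the names `identify_speaker` and `authenticate_speaker`, and the threshold value are hypothetical choices made for illustration and are not taken from any particular system.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def identify_speaker(sample, enrolled):
    """Identification: pick the best match from a set of possible speakers."""
    return max(enrolled, key=lambda name: cosine_similarity(sample, enrolled[name]))


def authenticate_speaker(sample, claimed_voiceprint, threshold=0.9):
    """Authentication: compare against the single claimed speaker only.

    The threshold is an assumed value; a real system would tune it.
    """
    return cosine_similarity(sample, claimed_voiceprint) >= threshold


# Hypothetical enrolled voiceprints and an incoming sample.
ENROLLED = {
    "alice": [0.9, 0.1, 0.3],
    "bob": [0.2, 0.8, 0.5],
}
SAMPLE = [0.85, 0.15, 0.35]

if __name__ == "__main__":
    print(identify_speaker(SAMPLE, ENROLLED))
    print(authenticate_speaker(SAMPLE, ENROLLED["bob"]))
```

Identification always returns the closest enrolled speaker, whereas authentication may reject the sample outright, which is why the latter needs an acceptance threshold.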
Speech recognition has been used for many years in handling telephone calls. This is generally done to provide an input mechanism for a caller, especially if the information to be acquired from the caller is non-numeric or where the caller cannot produce DTMF tones (or equivalent). For example, a telephone information service for theatres may first prompt a caller to state which town they are enquiring about. Speech recognition is then performed on this spoken input to identify the relevant town. The caller can then be provided with listings information for the theatre(s) in the identified town.
Another (generally more recent) application of voice analytics is to investigate previous telephone calls. For example, a call centre may record all telephone calls involving the call centre. Speech recognition can then be performed on the relevant audio signals, and the resulting text stored as a record of the call. The stored text can then be searched, for example to analyse all calls where a caller asked about a particular product or service.
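The record-and-search workflow just described can be sketched as follows. The sketch assumes that speech recognition has already produced a text transcript for each recorded call. The `CallRecord` type, the `search_calls` function, and the sample transcripts are hypothetical and serve only to illustrate searching the stored text.

```python
from dataclasses import dataclass


@dataclass
class CallRecord:
    """Stored record of one call: an identifier plus the recognised text."""
    call_id: str
    transcript: str


def search_calls(records, phrase):
    """Return the ids of calls whose transcript mentions the phrase.

    Matching is a simple case-insensitive substring test.
    """
    needle = phrase.lower()
    return [r.call_id for r in records if needle in r.transcript.lower()]


# Hypothetical transcripts standing in for speech recognition output.
RECORDS = [
    CallRecord("call-001", "I would like to ask about your Home Insurance product"),
    CallRecord("call-002", "Can you tell me my account balance"),
    CallRecord("call-003", "Is home insurance available for rented flats"),
]

if __name__ == "__main__":
    print(search_calls(RECORDS, "home insurance"))
```

A substring test is the simplest possible choice. A deployed system would more likely use a full-text index, and would need to allow for recognition errors in the stored text.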
It is also known (although less common) to use voice analytics to support the real-time processing of telephone calls. Examples of such systems are described in the following documents:
GB 2405553 describes guiding a conversation taking place between a client and an agent at a call centre by detecting the information content of the conversation using voice recognition, determining a goal of the client from the detected information content, and suggesting a conversation topic to the agent to guide the conversation.
U.S. Pat. No. 7,191,133 describes using automatic speech recognition in a call centre to determine agent compliance with a script. The results may be available either after the call has completed (via call recording) or in real-time. Actions based on the compliance determination include sending the voice interaction to a quality assurance monitor for review, sending a voice or text message to the agent, updating an incentive program, etc.
US 2005/0238475 describes monitoring speech at a call centre and detecting keywords in the speech. Information can then be retrieved based on the keywords and provided automatically to the agent handling the call.
GB 2393605 detects the emotion of the speaker, for example based on speaking rate, and uses this information to provide call centre agents with appropriate scripts. US 2004/0062363 provides similar functionality, based on the assessed stress level of the caller, as measured for example by voice analysis.
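The keyword-driven retrieval described in the documents above can be sketched in outline as follows. This is not the method of any of the cited documents. It assumes that a fragment of recognised text is available while the call is in progress, and the keyword list, the `KNOWLEDGE_BASE` contents, and the function names are hypothetical.

```python
import re

# Hypothetical mapping from keyword to the information shown to the agent.
KNOWLEDGE_BASE = {
    "refund": "Refunds are processed within 5 working days.",
    "delivery": "Standard delivery takes 3 to 5 working days.",
}


def detect_keywords(fragment, keywords):
    """Return the keywords that occur as whole words in the recognised text."""
    words = set(re.findall(r"[a-z']+", fragment.lower()))
    return [k for k in keywords if k in words]


def prompts_for_agent(fragment, knowledge_base=KNOWLEDGE_BASE):
    """Look up the stored information for each keyword detected in the fragment."""
    return [knowledge_base[k] for k in detect_keywords(fragment, knowledge_base)]


if __name__ == "__main__":
    for line in prompts_for_agent("I want a refund, the delivery was late"):
        print(line)
```

Whole-word matching avoids triggering on substrings such as "refunded", at the cost of missing inflected forms. Which behaviour is preferable depends on the application.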
Nevertheless, existing systems generally do not exploit the full power and potential of voice analytics in a telephone environment.