Most financial transactions, such as banking and e-commerce transactions, can be carried out by telephone. When a transaction is performed by voice over a transmission medium such as a telephone, the user's identity can be protected by password control, one-time password (OTP) entry, or biometric verification. Identity verification can be performed either by a machine or by a human operator, and various security modes may be used to verify the identity of a caller. However, there is always a possibility that an individual who provides a positive identification is doing so under duress, in which case the caller is acting against his or her will. This presents a serious threat. A similar situation may arise during a cash transaction.
Current systems that detect a caller's emotion use the caller's voice to detect the content of the speech or a change in the caller's emotion during the speech. These systems rely on common speech models generated from a common database. In this type of application, the training algorithms for the mood models use general data, and therefore the emotional features common to all people in the database are extracted. For example, an angry model can be generated from the analysis of an angry conversation. However, training the model on a general database increases the emotion detection error rate, because tone and manner of speech vary from one individual to another: one individual's angry tone of voice may be normal speech for another individual. These differences affect the operation of the model and therefore result in errors in identifying the mood in a speech. There is no known method in which an individual's speech pattern is analyzed using a model trained on that individual's own speech.
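The limitation of a general model can be illustrated with a toy sketch. It is not a method taken from any existing system. It assumes, purely for illustration, that each utterance is reduced to one scalar "intensity" score. The scores, the function names, and the threshold rules (pooled mean plus one standard deviation for the general model, a z-score limit of 2 for the personal model) are all hypothetical.

```python
from statistics import mean, stdev

# Hypothetical per-utterance intensity scores (arbitrary units),
# invented for illustration; real systems use richer acoustic features.
calm_speaker_history = [50, 52, 49, 51, 50, 48]   # habitually soft-spoken
loud_speaker_history = [70, 72, 69, 71, 73, 68]   # habitually loud

def general_threshold(all_histories):
    """One threshold from pooled data (assumed rule: pooled mean + 1 std dev)."""
    pooled = [score for history in all_histories for score in history]
    return mean(pooled) + stdev(pooled)

def flagged_by_general_model(score, threshold):
    """Flag an utterance as agitated if it exceeds the pooled threshold."""
    return score > threshold

def flagged_by_personal_model(score, history, z_limit=2.0):
    """Flag an utterance if it deviates strongly from the speaker's own baseline."""
    z_score = (score - mean(history)) / stdev(history)
    return z_score > z_limit

threshold = general_threshold([calm_speaker_history, loud_speaker_history])

# An ordinary utterance from the loud speaker, and an unusually
# intense utterance from the calm speaker.
loud_normal, calm_agitated = 72, 58

print("general :", flagged_by_general_model(loud_normal, threshold),
      flagged_by_general_model(calm_agitated, threshold))
print("personal:", flagged_by_personal_model(loud_normal, loud_speaker_history),
      flagged_by_personal_model(calm_agitated, calm_speaker_history))
```

With these invented numbers, the pooled threshold flags the loud speaker's ordinary utterance and misses the calm speaker's unusual one, whereas the per-speaker baseline does the opposite. This is the kind of error the paragraph above attributes to models trained on a general database.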