Some technologies analyze the emotions of a speaker by analyzing the sounds uttered by the speaker. Technology related to analyzing emotions is disclosed in, for example, Japanese Laid-open Patent Publication Nos. 2004-317822, 2008-170820, 2009-3162, 08-30290, and 05-119792. Such technology analyzes emotions by using quantities such as the average power of an utterance and its deviation, the average fundamental frequency of an utterance and its deviation, and the timing of silent intervals.
For example, there exists technology that takes prosodic components such as the volume and fundamental frequency as feature parameters of an utterance, and analyzes emotions of a speaker by comparing how much the feature parameters deviate from statistical quantities computed over approximately the last second with how much they deviate from statistical quantities computed over approximately the last five seconds.
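The comparison of short-window and long-window statistics described above can be illustrated with a minimal sketch. It is not taken from any of the cited publications. The function name `window_deviation`, the use of a z-score as the deviation measure, and the assumption that one feature value (for example, fundamental frequency) arrives at a fixed frame rate are all hypothetical choices made for illustration.

```python
from statistics import mean, pstdev


def window_deviation(values, frame_rate_hz, short_s=1.0, long_s=5.0):
    """Return (short_z, long_z) for the newest feature value.

    Each z-score says how far the newest value lies from the mean of the
    trailing window, in units of that window's standard deviation.
    The 1 s / 5 s window lengths follow the text; everything else is an
    illustrative assumption.
    """
    if not values:
        raise ValueError("values must contain at least one frame")

    def z_score(window):
        spread = pstdev(window)
        if spread == 0:
            return 0.0  # flat window: no measurable deviation
        return (values[-1] - mean(window)) / spread

    n_short = max(1, int(short_s * frame_rate_hz))
    n_long = max(1, int(long_s * frame_rate_hz))
    return z_score(values[-n_short:]), z_score(values[-n_long:])


# Hypothetical pitch track at 10 frames per second: steady, then a jump.
pitch_hz = [100.0] * 49 + [200.0]
short_z, long_z = window_deviation(pitch_hz, frame_rate_hz=10)
```

In this sketch a sudden change stands out more strongly against the longer window, because the longer window is dominated by the earlier, steadier frames.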
There also exists technology that determines whether or not a speaker is in a strained state by determining whether or not periodic fluctuations are observed in the amplitude envelope. Additionally, there exists technology that recognizes the validity of the last speech recognition process by detecting a unique utterance in which the fundamental frequency and power are equal to or greater than given threshold values.
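One conceivable way to check for periodic fluctuations in an amplitude envelope is autocorrelation, sketched below. The cited technology does not necessarily work this way. The function name `envelope_periodicity`, the 0.5 threshold, and the assumption that the envelope has already been extracted as a list of samples are hypothetical.

```python
def envelope_periodicity(envelope, threshold=0.5):
    """Return (is_periodic, period_in_samples) for an amplitude envelope.

    Assumed approach: normalized autocorrelation of the mean-removed
    envelope. A strong peak after the first negative lag is treated as
    evidence of a periodic fluctuation.
    """
    n = len(envelope)
    if n < 4:
        return False, None
    avg = sum(envelope) / n
    x = [v - avg for v in envelope]
    energy = sum(v * v for v in x)
    if energy == 0.0:
        return False, None  # perfectly flat envelope

    # r[j] holds the normalized autocorrelation at lag j + 1.
    r = [
        sum(x[i] * x[i + lag] for i in range(n - lag)) / energy
        for lag in range(1, n // 2 + 1)
    ]
    # Skip the initial main lobe by waiting for the first negative value.
    first_neg = next((j for j, v in enumerate(r) if v < 0.0), None)
    if first_neg is None:
        return False, None
    best = max(range(first_neg, len(r)), key=lambda j: r[j])
    if r[best] >= threshold:
        return True, best + 1
    return False, None
```

Skipping ahead to the first negative lag keeps a slow drift in the envelope, which correlates strongly with itself at small lags, from being mistaken for a periodic fluctuation.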
There also exists technology that determines an emergency and conducts a speech recognition process adapted to an emergency upon detecting a case where the fundamental frequency of an utterance is higher than normal, a case where the power of an utterance is larger than normal, or a case where the speed of an utterance is faster than normal.
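The three cues named above (fundamental frequency, power, and speed relative to normal) can be sketched as a simple threshold check. This is an illustration only: the `Utterance` type, the `emergency_cues` function, and the 1.2 margin that defines "higher than normal" are assumptions, not details of the cited technology.

```python
from dataclasses import dataclass


@dataclass
class Utterance:
    f0_hz: float   # average fundamental frequency
    power: float   # average power on a linear scale
    rate: float    # speaking rate, e.g. syllables per second


def emergency_cues(utt, normal, margin=1.2):
    """Return the names of the cues that exceed the normal level.

    `normal` holds the speaker's usual values; `margin` is a hypothetical
    factor deciding how far above normal counts as an emergency cue.
    """
    cues = []
    if utt.f0_hz > normal.f0_hz * margin:
        cues.append("pitch")
    if utt.power > normal.power * margin:
        cues.append("power")
    if utt.rate > normal.rate * margin:
        cues.append("rate")
    return cues


normal = Utterance(f0_hz=120.0, power=1.0, rate=4.0)
cues = emergency_cues(Utterance(f0_hz=180.0, power=1.0, rate=4.0), normal)
is_emergency = bool(cues)  # any single cue is enough, as in the text
```

Returning the list of triggered cues rather than a single flag makes it possible to see which of the three conditions caused the switch to an emergency-adapted recognition process.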