1. Field
The following description relates to speech emotion recognition, and to an apparatus and a method for emotion recognition from speech that involve analyzing changes in voice data, detecting frames that contain relevant information, and recognizing emotions using the detected frames.
2. Description of Related Art
Emotion recognition improves the accuracy of personalized services and plays an important role in the development of user-friendly devices. Research on emotion recognition is being conducted with a focus on facial expressions, speech, postures, biometric signals, and the like. A frame-based speech emotion recognition technology has been developed, which analyzes changes in voice data and detects frames that contain relevant information. The speech emotion recognition technology targets the speaker's entire speech data. However, an emotion of the speaker is generally exhibited only momentarily during a speech, and not constantly throughout the entire duration of the speech. Thus, for speech data collected for most purposes, the speaker's voice is neutral and unrelated to any emotion for a large proportion of the speech duration. Such neutral voice data is irrelevant to the emotion recognition apparatus or method, and may be regarded as mere noise that interferes with recognition of the speaker's emotion. Due to the presence of the neutral voice data, existing speech emotion recognition apparatuses and methods have difficulty accurately detecting an emotion of a speaker that appears only momentarily during the entire speech.