WO 2004/084176 (PTL 1) discloses a method for objectively evaluating attributes of sound, such as tone color, timbre, subjective diffuseness, and apparent source width, using factors extracted from autocorrelation functions (hereinafter “ACF factors”) and factors extracted from interaural cross-correlation functions (hereinafter “IACF factors”).
A conventional method in speech recognition technologies is to obtain a feature vector of a speech signal by analyzing the input speech signal in overlapping short-period analysis segments (frames) taken at a fixed time interval, and to perform speech matching based on the time-domain signal of the feature vector.
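As an illustration only, the division of an input signal into overlapping frames at a fixed interval may be sketched as follows. The function name `split_into_frames` and the parameters `frame_len` and `hop` are hypothetical names chosen for this sketch and are not taken from PTL 1; no window function is applied, and trailing samples that do not fill a whole frame are simply discarded.

```python
def split_into_frames(signal, frame_len, hop):
    """Split a sample sequence into overlapping frames.

    frame_len: number of samples per frame (assumed parameter name).
    hop: number of samples between the starts of consecutive frames;
         frames overlap whenever hop < frame_len.
    """
    if frame_len <= 0 or hop <= 0:
        raise ValueError("frame_len and hop must be positive")
    frames = []
    start = 0
    # Keep only frames that fit entirely inside the signal.
    while start + frame_len <= len(signal):
        frames.append(list(signal[start:start + frame_len]))
        start += hop
    return frames


# Ten samples, frames of four samples, advancing two samples each time.
samples = list(range(10))
frames = split_into_frames(samples, frame_len=4, hop=2)
for f in frames:
    print(f)
```

In this sketch each frame shares half of its samples with its neighbor. A feature vector would then be computed for each frame, so that the sequence of feature vectors follows the signal over time.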
Many methods have been proposed for analyzing these feature vectors, typical examples being spectrum analysis and cepstrum analysis.
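For reference, spectrum analysis and cepstrum analysis of a single frame may be sketched as below. This is a minimal sketch under stated assumptions, not the method of any cited document: the function names `dft`, `power_spectrum`, and `real_cepstrum` are hypothetical, a direct (non-fast) discrete Fourier transform is used for brevity, and the cepstrum is taken here as the inverse transform of the log magnitude spectrum, with a small `floor` value assumed to avoid taking the logarithm of zero.

```python
import cmath
import math


def dft(x):
    """Direct discrete Fourier transform (O(n^2), for illustration only)."""
    n = len(x)
    return [
        sum(x[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]


def power_spectrum(frame):
    """Squared magnitude of each DFT bin of the frame."""
    return [abs(c) ** 2 for c in dft(frame)]


def real_cepstrum(frame, floor=1e-12):
    """Inverse DFT of the log magnitude spectrum of the frame.

    floor is an assumed lower bound that keeps log() finite.
    """
    log_mag = [math.log(max(abs(c), floor)) for c in dft(frame)]
    n = len(log_mag)
    return [
        sum(log_mag[k] * cmath.exp(2j * cmath.pi * k * t / n)
            for k in range(n)).real / n
        for t in range(n)
    ]


# A 32-sample frame containing a cosine that completes 4 cycles.
n = 32
frame = [math.cos(2 * math.pi * 4 * t / n) for t in range(n)]
ps = power_spectrum(frame)
# Search only the first half of the bins; the second half mirrors it.
peak_bin = max(range(n // 2), key=lambda k: ps[k])
print("peak bin:", peak_bin)
```

The power spectrum of the example frame peaks at the bin matching the number of cycles in the frame. In practice a fast Fourier transform and a window function would normally replace the direct transform used here.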