Speaker recognition is a technology that extracts information unique to a user from an audio signal of the user and verifies whether an utterance made under a claimed identity actually corresponds to the claimed speaker.
For speaker recognition, a feature vector, which represents a unique property of an audio signal input by a user, has to be extracted from the input audio signal. Since the feature vector has a high dimension and thus requires many calculations during speaker authentication, a device for extracting the feature vector can reduce its dimension by transforming it using linear discriminant analysis (LDA).
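As an illustration only, the dimension reduction described above can be sketched in Python with NumPy. This is a minimal sketch of textbook LDA, not the method of any particular device: the function name `lda_projection`, the synthetic data, the 20-dimensional input, and the 2-dimensional output are all assumptions chosen for the example, and each class is taken to be one speaker.

```python
import numpy as np

def lda_projection(X, y, n_components):
    """Return a (d, n_components) projection matrix computed by classical LDA.

    X: (n_samples, d) feature vectors; y: (n_samples,) class (speaker) labels.
    """
    classes = np.unique(y)
    d = X.shape[1]
    mean_all = X.mean(axis=0)
    S_w = np.zeros((d, d))  # within-class scatter
    S_b = np.zeros((d, d))  # between-class scatter
    for c in classes:
        Xc = X[y == c]
        mean_c = Xc.mean(axis=0)
        centered = Xc - mean_c
        S_w += centered.T @ centered
        diff = (mean_c - mean_all)[:, None]
        S_b += len(Xc) * (diff @ diff.T)
    # Directions that maximize between-class relative to within-class scatter
    eigvals, eigvecs = np.linalg.eig(np.linalg.pinv(S_w) @ S_b)
    order = np.argsort(eigvals.real)[::-1][:n_components]
    return eigvecs[:, order].real

# Hypothetical data: 3 speakers, 100 feature vectors each, 20 dimensions
rng = np.random.default_rng(0)
means = rng.normal(scale=3.0, size=(3, 20))
X = np.vstack([means[k] + rng.normal(size=(100, 20)) for k in range(3)])
y = np.repeat(np.arange(3), 100)

W = lda_projection(X, y, n_components=2)
Z = X @ W  # reduced feature vectors, one 2-dimensional row per input vector
```

With K classes, the between-class scatter has rank at most K - 1, so LDA can yield at most K - 1 useful dimensions; this is why the example with three speakers reduces to two dimensions.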
LDA assumes that the feature vectors of each class are homoscedastic, that is, that all classes share the same covariance. However, the classes of actual feature vectors may be heteroscedastic, with covariances that differ from class to class. Accordingly, when the dimension of a feature vector is reduced according to LDA, the performance of a speaker recognition system may be degraded because the assumption does not match the actual data.
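One way to see the mismatch is to compare each class covariance with the pooled covariance that LDA implicitly uses for every class. The following is a rough sketch under stated assumptions: the helper `covariance_deviation`, the relative Frobenius-norm measure, and the synthetic two-class data are hypothetical and serve only to illustrate the homoscedastic and heteroscedastic cases.

```python
import numpy as np

def covariance_deviation(X, y):
    """Relative Frobenius distance of each class covariance from the pooled one."""
    classes = np.unique(y)
    covs = {c: np.cov(X[y == c], rowvar=False) for c in classes}
    counts = {c: int(np.sum(y == c)) for c in classes}
    dof = sum(counts.values()) - len(classes)
    # Pooled covariance: the single covariance shared by all classes under LDA
    pooled = sum((counts[c] - 1) * covs[c] for c in classes) / dof
    return {c: np.linalg.norm(covs[c] - pooled) / np.linalg.norm(pooled)
            for c in classes}

# Hypothetical data: two classes, 500 four-dimensional vectors each
rng = np.random.default_rng(0)
y = np.repeat([0, 1], 500)
X_homo = rng.normal(size=(1000, 4))   # both classes share one covariance
X_hetero = X_homo.copy()
X_hetero[y == 1] *= 3.0               # class 1 now has a much larger spread

dev_homo = covariance_deviation(X_homo, y)
dev_hetero = covariance_deviation(X_hetero, y)
```

In the homoscedastic case the deviations stay small, reflecting only sampling noise, while in the heteroscedastic case every class covariance lies far from the pooled estimate, so the single shared covariance describes none of the classes well.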