Voiceprint information generally refers to information that is capable of characterizing a speaker, and is a type of voice information that reflects the physiological and behavioral characteristics of the speaker through the voice waveform. Voiceprint information may be broadly applied to tasks such as speaker recognition, speaker verification, and speaker adaptation in speech recognition. Rapid and effective extraction of voiceprint information is therefore important for improving performance on the foregoing tasks.
The i-vector is a mainstream technique for speaker recognition. In the i-vector approach, each speaker is represented by a vector that distinguishes that speaker from other speakers.
Generally, the i-vector approach needs to model a speaker space and a channel space separately and to substitute a variability factor into the computation, so that a vector representing voiceprint information can be extracted from an input voice. Both the training process and the voiceprint information extraction process are relatively complicated.
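To give a sense of the computation involved, the following is a minimal sketch of the extraction step only. It assumes the commonly used total-variability formulation, in which a single low-rank matrix T per Gaussian component captures speaker and channel variation together, and a diagonal covariance per component. The function names, argument names, and data layout are hypothetical and are not taken from the text above. The sketch presumes that the background model and T have already been trained and that the zero-order and centered first-order statistics of the utterance have already been collected; those steps account for much of the complexity noted above.

```python
def solve_linear_system(a, b):
    """Solve a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    aug = [list(row) + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        # Move the row with the largest entry in this column into pivot position.
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        pivot_value = aug[col][col]
        aug[col] = [v / pivot_value for v in aug[col]]
        # Eliminate this column from every other row.
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [v - factor * p for v, p in zip(aug[r], aug[col])]
    return [row[n] for row in aug]


def extract_ivector(counts, centered_first_order, t_matrices, variances):
    """Posterior mean of the latent factor w (the i-vector) for one utterance.

    Hypothetical data layout, one entry per Gaussian component c:
      counts[c]               N_c, zero-order statistic (float)
      centered_first_order[c] F_c, list of D floats (frame sum minus N_c * mean_c)
      t_matrices[c]           T_c, D x R nested list
      variances[c]            diagonal of Sigma_c, list of D floats

    Computes w = (I + sum_c N_c T_c' Sigma_c^-1 T_c)^-1 sum_c T_c' Sigma_c^-1 F_c.
    """
    rank = len(t_matrices[0][0])
    # Start from the identity: the prior on w is standard normal.
    precision = [[1.0 if i == j else 0.0 for j in range(rank)] for i in range(rank)]
    linear = [0.0] * rank
    for n_c, f_c, t_c, var_c in zip(counts, centered_first_order, t_matrices, variances):
        for d, t_row in enumerate(t_c):
            inv_var = 1.0 / var_c[d]
            for i in range(rank):
                linear[i] += t_row[i] * inv_var * f_c[d]
                for j in range(rank):
                    precision[i][j] += n_c * t_row[i] * inv_var * t_row[j]
    return solve_linear_system(precision, linear)


if __name__ == "__main__":
    # Toy input: one component, two feature dimensions, a rank-2 factor.
    w = extract_ivector(
        counts=[1.0],
        centered_first_order=[[1.0, 3.0]],
        t_matrices=[[[1.0, 0.0], [0.0, 1.0]]],
        variances=[[1.0, 1.0]],
    )
    print(w)
```

The sketch is dependency-free so that the arithmetic is visible; a practical system would use a numerical library and far larger dimensions, with hundreds or thousands of components and factor ranks in the hundreds.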