The present invention generally relates to a speaker verification system, and more particularly, to a speaker verification system in which segmentation and feature extraction are performed by extracting speaker-specific information from an input speech signal.
In speaker verification, it is not effective to compare an input speech waveform directly with a registered speech waveform. Normally, the input speech signal is first converted into acoustic parameters such as spectral data and linear prediction coefficients. Other examples of acoustic parameters are pitch frequency, voice energy (power), formant frequency, PARCOR coefficients, logarithmic cross-sectional area ratio (log area ratio), and zero-crossing count.
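As an illustration of two of the simpler parameters named above, voice energy (power) and zero-crossing count, the following is a minimal sketch that computes them frame by frame. The function names, the frame length, the hop size, and the synthetic 100 Hz test tone are assumptions chosen for illustration and do not come from any cited system.

```python
import math

def frame_signal(samples, frame_len, hop):
    """Split a sample sequence into overlapping frames of frame_len samples."""
    return [samples[i:i + frame_len]
            for i in range(0, len(samples) - frame_len + 1, hop)]

def frame_power(frame):
    """Mean squared amplitude of one frame."""
    return sum(x * x for x in frame) / len(frame)

def zero_crossings(frame):
    """Number of sign changes between consecutive samples of one frame."""
    return sum(1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0))

# Assumed test input: a 100 Hz sine sampled at 8 kHz for 0.1 s.
rate = 8000
signal = [math.sin(2 * math.pi * 100 * n / rate) for n in range(800)]

# Assumed framing: 20 ms frames (160 samples) with a 10 ms hop (80 samples).
frames = frame_signal(signal, frame_len=160, hop=80)
powers = [frame_power(f) for f in frames]
crossings = [zero_crossings(f) for f in frames]
```

Each entry of `powers` and `crossings` describes one short interval of the signal, which is the form in which such parameters are usually compared rather than the raw waveform.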
These acoustic parameters primarily carry phonemic information and only secondarily carry individuality information. From this viewpoint, it is desirable to extract speaker-specific features from the acoustic parameters.
Some speaker verification systems designed to extract features having individuality information have been proposed. For example, Japanese Laid-Open Patent Application No. 61-278896 proposes the use of a filter designed to extract individuality information from an input speech signal. The extracted features are compared with individuality information stored in a dictionary. The proposed system must employ a filter designed specifically for extracting speaker-specific features, separately from the filter for extracting phonetic information. However, this system has the disadvantage that it is difficult to obtain a sufficient amount of information on speaker-specific features. In addition, the dedicated filter requires additional hardware.
Another proposal is disclosed in the following document: S. K. Das and W. S. Mohn, "A SCHEME FOR SPEECH PROCESSING IN AUTOMATIC SPEAKER VERIFICATION", IEEE Transactions on Audio and Electroacoustics, Vol. AU-19, No. 1, March 1971, pp. 32-43. The document discloses that input speech is segmented based on its power. However, it is to be noted that the segmentation positions are determined manually, by visually locating changes in power.
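For illustration only, segmentation based on power can be sketched as a simple threshold test applied to a sequence of frame powers. This is not the procedure of the cited document, in which the positions are found by visual inspection. The function name `segment_by_power`, the fixed threshold, and the sample power values are all assumptions.

```python
def segment_by_power(powers, threshold):
    """Return (start, end) frame-index pairs where power stays >= threshold.

    The end index is exclusive. The fixed threshold is an assumed,
    illustrative criterion, not one taken from the cited document.
    """
    segments = []
    start = None
    for i, p in enumerate(powers):
        if p >= threshold and start is None:
            start = i                      # power rises: a segment begins
        elif p < threshold and start is not None:
            segments.append((start, i))    # power falls: the segment ends
            start = None
    if start is not None:                  # segment still open at the end
        segments.append((start, len(powers)))
    return segments

# Hypothetical frame powers containing two high-power stretches.
frame_powers = [0.01, 0.02, 0.60, 0.75, 0.70, 0.03, 0.01, 0.55, 0.65, 0.02]
segments = segment_by_power(frame_powers, threshold=0.1)
```

A single fixed threshold is the simplest possible criterion; it is used here only to show what a segmentation position derived from power changes looks like.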