The present invention generally relates to systems for converting a voice signal to a pitch signal (hereinafter simply referred to as voice signal to pitch signal conversion systems), and more particularly to a voice signal to pitch signal conversion system which converts an input voice signal into a pitch signal in accordance with the pitch of the input voice signal.
Recently, voice signal processing such as voice recognition and speech synthesis has come into use in various fields. In one kind of voice signal processing, the pitch of the voice is detected and the voice is converted into a sound of a musical scale. This kind of voice signal processing can be used in systems such as the following: a system wherein a song sung by a student is automatically marked by converting a voice signal of the song sung by the student into a pitch signal and comparing the pitch signal with a reference pitch signal which is obtained by converting a voice signal of the song sung by a teacher; a system wherein a voice signal of a song or the like is converted into a sound of a predetermined instrument by converting the voice signal into a pitch signal; a system wherein the music of a song or the like is automatically displayed or printed by converting a voice signal of the song into a pitch signal; and a system wherein an apparatus is controlled responsive to a pitch signal which is obtained by converting a command voice signal.
Conventional methods of carrying out the voice signal to pitch signal conversion include methods which employ waveform processing, methods which employ correlation, and methods which employ spectrum processing.
As an example of the methods which employ waveform processing, there is the zero detection method, which carries out the voice signal to pitch signal conversion by use of the repetition pattern of the number of zero crossings of the voice signal waveform. However, according to the zero detection method, it is impossible to carry out an accurate voice signal to pitch signal conversion, because a noise component crossing the zero level is also detected. In addition, when the signal processing according to the zero detection method is carried out by use of a microcomputer, there is a problem in that the program of the microcomputer becomes complex.
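The sensitivity to noise described above can be illustrated with a minimal Python sketch. It is a simplified variant that only counts upward zero crossings per second and does not analyze their repetition pattern. The function name `zero_crossing_pitch`, the 8000 Hz sampling rate, the 200 Hz test tone, and the alternating-sign disturbance used as a deterministic stand-in for high-frequency noise are all assumptions made for illustration.

```python
import math

def zero_crossing_pitch(samples, sample_rate):
    """Estimate pitch in Hz as the number of upward zero crossings per second."""
    crossings = sum(
        1 for prev, cur in zip(samples, samples[1:]) if prev < 0 <= cur
    )
    return crossings * sample_rate / len(samples)

SAMPLE_RATE = 8000  # Hz; assumed value for illustration

# One second of a clean 200 Hz tone (hypothetical test signal).
tone = [
    math.sin(2 * math.pi * 200 * n / SAMPLE_RATE + 0.1)
    for n in range(SAMPLE_RATE)
]

# Deterministic stand-in for high-frequency noise: +/-0.3 alternating
# on every sample, which flips the sign of samples that lie near zero.
noisy = [s + 0.3 * (-1) ** n for n, s in enumerate(tone)]

clean_estimate = zero_crossing_pitch(tone, SAMPLE_RATE)
noisy_estimate = zero_crossing_pitch(noisy, SAMPLE_RATE)
print(clean_estimate, noisy_estimate)
```

In this synthetic case the clean tone is estimated close to 200 Hz, while the disturbed copy reads several times higher, because every extra sign change near a true crossing is counted as a crossing of the voice signal.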
As examples of the methods which employ correlation, there are the autocorrelation method, which carries out the voice signal to pitch signal conversion by detecting a peak of an autocorrelation function of the voice signal waveform, and the modified autocorrelation method, which carries out the voice signal to pitch signal conversion by use of an autocorrelation function of a residual signal obtained by a linear predictive coding (LPC) analysis. However, in order to carry out the signal processing according to the autocorrelation method or the modified autocorrelation method, it is necessary to use a memory having a large memory capacity, an analog-to-digital converter, an operation circuit having a complex construction, and the like.
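As a rough illustration of the plain autocorrelation method (not the modified, LPC-residual variant), the following Python sketch searches for the lag at which a digitized frame best matches a delayed copy of itself. The function name `autocorrelation_pitch`, the 80 Hz to 500 Hz search range, the sampling rate, and the test frame are assumptions chosen for the example.

```python
import math

def autocorrelation_pitch(samples, sample_rate, min_hz=80.0, max_hz=500.0):
    """Estimate pitch in Hz from the lag of the largest autocorrelation value."""
    min_lag = int(sample_rate / max_hz)
    max_lag = int(sample_rate / min_hz)
    best_lag, best_value = min_lag, float("-inf")
    for lag in range(min_lag, max_lag + 1):
        # Correlate the frame with a copy of itself delayed by `lag` samples.
        value = sum(
            samples[n] * samples[n + lag] for n in range(len(samples) - lag)
        )
        if value > best_value:
            best_lag, best_value = lag, value
    return sample_rate / best_lag

SAMPLE_RATE = 8000  # Hz; assumed value for illustration

# 0.1 s frame: 200 Hz fundamental plus a weaker second harmonic
# (hypothetical stand-in for a voiced sound).
frame = [
    math.sin(2 * math.pi * 200 * n / SAMPLE_RATE)
    + 0.5 * math.sin(2 * math.pi * 400 * n / SAMPLE_RATE)
    for n in range(800)
]

estimate = autocorrelation_pitch(frame, SAMPLE_RATE)
print(estimate)
```

Even this small example must hold the whole digitized frame in memory and perform tens of thousands of multiply-accumulate operations per frame, which is consistent with the hardware demands noted above.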
As examples of the methods which employ spectrum processing, there are the cepstrum analysis method, which carries out the voice signal to pitch signal conversion by separating the fine structure and the envelope of the spectrum by a Fourier transform of the logarithm of the power spectrum, and the period histogram method, which carries out the voice signal to pitch signal conversion by obtaining a histogram of the harmonic components of a fundamental frequency on the spectrum and determining the pitch from a common measure of the harmonics. However, it takes time to detect the pitch when the cepstrum analysis method or the period histogram method is used, and these methods are disadvantageous in that it is impossible to carry out the signal processing in real time.
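The cepstrum analysis method can be sketched in Python as follows, under stated assumptions: a naive discrete Fourier transform written out in full to keep the example self-contained, a small floor added before the logarithm, a pulse train standing in for a voiced sound, and a search band of 120 Hz to 400 Hz chosen narrow enough that only one cepstral peak of the test signal falls inside it. The names `dft` and `cepstrum_pitch` are hypothetical.

```python
import cmath
import math

def dft(values):
    """Naive O(N^2) discrete Fourier transform, kept simple for clarity."""
    n = len(values)
    return [
        sum(values[k] * cmath.exp(-2j * math.pi * m * k / n) for k in range(n))
        for m in range(n)
    ]

def cepstrum_pitch(samples, sample_rate, min_hz=120.0, max_hz=400.0):
    """Estimate pitch in Hz from the strongest cepstral peak in the band."""
    spectrum = dft(samples)
    # Logarithm of the power spectrum; the floor avoids log(0).
    log_power = [math.log(abs(c) ** 2 + 1e-12) for c in spectrum]
    # A second transform separates the fine (harmonic) structure,
    # which appears as a peak at the pitch period, from the envelope.
    cepstrum = [abs(c) for c in dft(log_power)]
    min_q = int(sample_rate / max_hz)
    max_q = int(sample_rate / min_hz)
    best_q = max(range(min_q, max_q + 1), key=lambda q: cepstrum[q])
    return sample_rate / best_q

SAMPLE_RATE = 8000  # Hz; assumed value for illustration

# Pulse train with a period of 40 samples (200 Hz), a crude
# hypothetical stand-in for a voiced sound.
pulses = [1.0 if n % 40 == 0 else 0.0 for n in range(400)]

cepstral_estimate = cepstrum_pitch(pulses, SAMPLE_RATE)
print(cepstral_estimate)
```

Two full transforms and a logarithm per spectral bin are required for every frame, and a whole frame must be collected before processing can begin, which gives a sense of why the text above regards this class of methods as slow.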