The invention relates to a method of determining a pitch in a signal by deriving from the signal a probability density function of the pitch as a function of frequency and subsequently determining the pitch from the probability density function, and to an arrangement for implementing the method.
Such a method and arrangement are known from the publication "An optimum processor theory for the central formation of the pitch of complex tones" by J. L. Goldstein, J.A.S.A., Vol. 54, No. 6 (1973), pp. 1496-1516.
It is a known fact that persons are able to recognize different pitches in a complex tone. Experiments have shown that pitch is a non-deterministic, subjective quantity which is to be modelled stochastically. For a sine tone the probability density function of the experienced pitch is unimodal, which means that the curve has no more than one maximum. This probability density function can be modelled as a Gaussian curve having a mean value corresponding with the frequency of the sine tone and a specific standard deviation σ.
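By way of illustration only, the Gaussian model for a sine tone may be sketched as follows. The tone frequency of 1000 Hz, the standard deviation of 10 Hz, the grid spacing and the names `sine_pitch_pdf`, `tone_hz` and `sigma_hz` are assumptions chosen for this example and are not taken from the cited publication.

```python
import math

def sine_pitch_pdf(f, tone_hz, sigma_hz):
    """Gaussian density of the experienced pitch f for a sine tone at tone_hz."""
    z = (f - tone_hz) / sigma_hz
    return math.exp(-0.5 * z * z) / (sigma_hz * math.sqrt(2.0 * math.pi))

tone_hz = 1000.0   # assumed example frequency
sigma_hz = 10.0    # assumed standard deviation, for illustration only
step_hz = 0.5      # assumed grid spacing

# Evaluate the density on a grid spanning five standard deviations either side.
grid = [tone_hz + step_hz * k for k in range(-100, 101)]
densities = [sine_pitch_pdf(f, tone_hz, sigma_hz) for f in grid]

# The single maximum lies at the tone frequency (unimodal curve).
peak_hz = grid[densities.index(max(densities))]
n_maxima = sum(
    1 for i in range(1, len(densities) - 1)
    if densities[i - 1] < densities[i] > densities[i + 1]
)

# A Riemann sum over the grid should be close to 1 for a valid density.
area = sum(densities) * step_hz
```

The single interior maximum counted in `n_maxima` corresponds to the unimodal character of the curve described above.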
For a complex tone the situation is more complicated. Persons are able to perceive two kinds of pitches in a complex tone, depending on whether they perceive the sound as a whole (synthetic listening) or listen to the individual partial tones (analytic listening). In the case of synthetic listening one may hear pitches that correspond with frequencies that do not occur in the signal. These virtual pitches are described by a multimodal probability density function. If one takes, for example, a complex tone constituted by two sines having frequencies of 1200 Hz and 1400 Hz, one will not only perceive a pitch of 200 Hz (the basic tone) but also pitches at 173 Hz and 236 Hz. In this case the probability density function is trimodal, that is, it has three maxima. This perceptual behavior is described, among other things, by the model for pitch perception proposed by Goldstein in his aforementioned article.
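The three pitches of this example can be reproduced with a short calculation, given here as a sketch under stated assumptions. It is assumed that the two sines are interpreted as successive harmonics n and n+1 of an unknown basic tone, and that the basic tone is obtained by an equally weighted least-squares fit. The source does not state how the values 173 Hz and 236 Hz were obtained, and the function name `least_squares_fundamental` is hypothetical.

```python
def least_squares_fundamental(f_low, f_high, n):
    """Basic tone f0 minimising (f_low - n*f0)**2 + (f_high - (n+1)*f0)**2."""
    return (n * f_low + (n + 1) * f_high) / (n * n + (n + 1) * (n + 1))

# Interpret 1200 Hz and 1400 Hz as harmonics (5, 6), (6, 7) and (7, 8).
candidates = {
    n: least_squares_fundamental(1200.0, 1400.0, n) for n in (5, 6, 7)
}

for n, f0 in candidates.items():
    print(f"harmonics {n} and {n + 1}: f0 = {f0:.1f} Hz")
# harmonics 5 and 6: f0 = 236.1 Hz
# harmonics 6 and 7: f0 = 200.0 Hz
# harmonics 7 and 8: f0 = 173.5 Hz
```

Under these assumptions the exact interpretation (harmonics 6 and 7) gives the basic tone of 200 Hz, while the two neighbouring interpretations give values close to the 236 Hz and 173 Hz mentioned above.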
Goldstein's model is based on a stochastic formulation predicting the multimodal probability density function of the perceived virtual pitch. In his model each spectral component in the perceived signal is represented by a stochastic variable which has a Gaussian probability density function having a mean value corresponding with the frequency of the spectral component. The standard deviation of this probability density function is a free parameter of the model and depends only on the spectral frequency. In Goldstein's model, when a complex tone is presented, a sample is drawn from each Gaussian probability density function. With these samples a pattern recognizing means makes an estimate of the (missing) basic tone. This process then results in a multimodal probability density function of the virtual pitch. Although the model can be employed reasonably well for describing the virtual pitch in signals, it does have several serious disadvantages. For example, the probability density function can be computed directly only for signals constituted by no more than two sines. For signals constituted by more than two sines the probability density function can be determined only by means of a Monte Carlo simulation. Furthermore, the model can be used for determining the probability density function only if one knows in what octave the pitch is located.
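A Monte Carlo simulation of the kind referred to above may be sketched as follows. This is a simplified illustration and not Goldstein's processor itself. It is assumed that the components are successive harmonics, that the pattern recognizer is an equally weighted least-squares fit, that the standard deviation is 1% of the component frequency, and that the candidate lowest harmonic numbers are restricted to the range 3 to 11. All function and parameter names are hypothetical.

```python
import random
from collections import Counter

def fit_fundamental(samples, n_first):
    """Least-squares basic tone, taking samples as harmonics n_first, n_first+1, ..."""
    ns = [n_first + i for i in range(len(samples))]
    f0 = sum(n * x for n, x in zip(ns, samples)) / sum(n * n for n in ns)
    err = sum((x - n * f0) ** 2 for n, x in zip(ns, samples))
    return f0, err

def simulate_virtual_pitch(freqs, rel_sigma, n_candidates, trials, seed=0):
    """Draw one Gaussian sample per component and estimate the basic tone."""
    rng = random.Random(seed)  # fixed seed so that the run is repeatable
    estimates = []
    for _ in range(trials):
        # Assumed: standard deviation proportional to the component frequency.
        samples = [rng.gauss(f, rel_sigma * f) for f in freqs]
        # Keep the harmonic interpretation with the smallest squared error.
        f0, _ = min(
            (fit_fundamental(samples, n) for n in n_candidates),
            key=lambda result: result[1],
        )
        estimates.append(f0)
    return estimates

estimates = simulate_virtual_pitch(
    freqs=[1200.0, 1400.0],
    rel_sigma=0.01,
    n_candidates=range(3, 12),
    trials=20000,
)

# Histogram in 5 Hz bins as a crude estimate of the probability density function.
histogram = Counter(int(round(e / 5.0)) * 5 for e in estimates)
main_mode_hz = histogram.most_common(1)[0][0]
```

With these assumed parameters the estimates cluster around the basic tone of 200 Hz, with smaller clusters at neighbouring harmonic interpretations. The list `freqs` may hold more than two components, which is the case for which a simulation of this kind is needed. The fixed range of candidate harmonic numbers stands in, loosely, for prior knowledge of the region in which the pitch lies.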