Voice signals are quasi-periodic. In other words, when viewed over a short time interval a voice signal appears to be composed of a substantially repeating segment. The time period of the repetition of the segment is referred to as a pitch period.
The periodicity (also known as harmonicity) of a signal is a measure of the degree to which the signal exhibits periodic characteristics; in other words, it is a measure of how regularly the signal recurs. Some signals are periodic even when viewed over long time intervals, for example pure tones. Such signals have a very high degree of periodicity. Other signals are not periodic, for example noise signals. Such signals have a very low degree of periodicity. Voice signals, being quasi-periodic, exhibit a high degree of periodicity only if the periodicity is measured over short time intervals.
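As an illustration of the contrast between a pure tone and noise, the following minimal sketch scores periodicity as the peak normalised correlation between a signal and a lagged copy of itself. The function name `periodicity`, the lag range and the test signals are assumptions chosen for illustration; they are not taken from any particular system.

```python
import math
import random

def periodicity(x, min_lag, max_lag):
    """Peak normalised correlation between x and a lagged copy of itself."""
    best = 0.0
    for lag in range(min_lag, max_lag + 1):
        a = x[:len(x) - lag]  # leading segment
        b = x[lag:]           # the same-length segment, delayed by `lag` samples
        num = sum(p * q for p, q in zip(a, b))
        den = math.sqrt(sum(p * p for p in a) * sum(q * q for q in b))
        if den > 0.0:
            best = max(best, num / den)
    return best

# A pure tone with a 40-sample period, and Gaussian noise from a fixed seed.
tone = [math.sin(2 * math.pi * n / 40) for n in range(400)]
rng = random.Random(0)
noise = [rng.gauss(0.0, 1.0) for _ in range(400)]

tone_score = periodicity(tone, 20, 100)    # close to 1 for the tone
noise_score = periodicity(noise, 20, 100)  # much lower for the noise
```

A score near 1 indicates that some lag in the searched range aligns the signal almost exactly with itself; a score near 0 indicates that no such lag exists.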
Pitch period and/or periodicity estimates of a signal are used in many applications in speech processing systems. For example, such estimates are often used in speech noise reduction processes, speech recognition processes, speech compression processes and packet loss concealment processes.
An estimate of the periodicity of a signal is often used to distinguish the voicing status of the signal. If the periodicity is low, the signal is considered to be unvoiced speech or noise. If the periodicity is high, the signal is considered to be voiced.
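The voicing decision described above amounts to a threshold test on the periodicity estimate. A minimal sketch follows; the function name `classify_voicing` and the threshold value of 0.6 are hypothetical, and a practical system would tune the threshold to its own periodicity measure.

```python
def classify_voicing(periodicity, threshold=0.6):
    """Label a frame from its periodicity estimate (assumed to lie in [0, 1])."""
    # High periodicity is treated as voiced speech; low periodicity
    # is treated as unvoiced speech or noise.
    return "voiced" if periodicity >= threshold else "unvoiced or noise"

labels = [classify_voicing(p) for p in (0.92, 0.15)]
```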
The estimate of the pitch period of a signal may, for example, be used to aid in selecting a replacement packet of data in a packet loss concealment process.
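One simple way in which a pitch period estimate can drive the choice of replacement data is to repeat the most recent pitch period of correctly received signal across the gap. The sketch below shows only that core idea; the function name `conceal_lost_frame` and its parameters are assumptions, and a practical scheme would typically also smooth the joins between received and substituted samples.

```python
def conceal_lost_frame(history, pitch_period, frame_len):
    """Fill a lost frame by cyclically repeating the last pitch period of history."""
    if not 0 < pitch_period <= len(history):
        raise ValueError("pitch_period must be between 1 and len(history)")
    last_cycle = history[-pitch_period:]  # most recent full pitch period
    return [last_cycle[n % pitch_period] for n in range(frame_len)]

# A toy signal with a 5-sample period; the replacement continues the pattern.
history = [0, 1, 2, 3, 4] * 4
replacement = conceal_lost_frame(history, pitch_period=5, frame_len=8)
```

If the pitch period estimate is wrong, the repeated segment no longer lines up with the preceding signal, which is why the accuracy of the estimate matters in this application.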
Many methods are used to estimate the pitch period and periodicity of a voice signal. Generally, these methods use an autocorrelation-type algorithm. Suitable algorithms include the average magnitude difference function (AMDF), the average squared difference function (ASDF), and the normalised cross-correlation function (NCC). For a typical one of these methods, the calculations involved in estimating the pitch period or periodicity account for over 90% of the algorithmic complexity of the overall technique, for example a pitch-based waveform substitution technique. Although the complexity of the calculation is low in absolute terms, it is significant for low-power platforms such as Bluetooth.
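The three functions named above can be sketched as follows, using their usual textbook forms evaluated over a fixed-length window. The function signatures, the window convention and the test signal are assumptions made for illustration. AMDF and ASDF are minimised at the pitch period, whereas NCC is maximised.

```python
import math

def amdf(x, lag, window):
    """Average magnitude difference between x and x delayed by `lag`."""
    return sum(abs(x[n] - x[n + lag]) for n in range(window)) / window

def asdf(x, lag, window):
    """Average squared difference between x and x delayed by `lag`."""
    return sum((x[n] - x[n + lag]) ** 2 for n in range(window)) / window

def ncc(x, lag, window):
    """Normalised cross-correlation between x and x delayed by `lag`."""
    num = sum(x[n] * x[n + lag] for n in range(window))
    e0 = sum(x[n] ** 2 for n in range(window))
    e1 = sum(x[n + lag] ** 2 for n in range(window))
    den = math.sqrt(e0 * e1)
    return num / den if den > 0.0 else 0.0

# Test signal with a 50-sample period (fundamental plus second harmonic).
x = [math.sin(2 * math.pi * n / 50) + 0.5 * math.sin(4 * math.pi * n / 50)
     for n in range(240)]
lags = range(20, 81)
window = len(x) - max(lags)  # keeps n + lag inside the signal for every lag

pitch_amdf = min(lags, key=lambda lag: amdf(x, lag, window))
pitch_asdf = min(lags, key=lambda lag: asdf(x, lag, window))
pitch_ncc = max(lags, key=lambda lag: ncc(x, lag, window))
```

Each candidate lag costs one pass over the window, so the total cost grows with the product of the window length and the number of lags searched. This is the part of the computation that dominates the overall complexity.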
To efficiently compute an autocorrelation sequence, the inverse Fourier transform of the power spectrum of the signal is commonly used. However, this frequency domain approach is more memory intensive than direct calculation in the time domain, and is more efficient only for longer input signals and when the full autocorrelation sequence is needed. For determining an estimated periodicity or pitch period of a signal on a resource-constrained embedded platform, a direct time domain calculation is normally preferred.
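The equivalence between the two routes can be checked on a short signal. In the sketch below a naive discrete Fourier transform stands in for a fast Fourier transform purely to keep the example self-contained; the function names are assumptions. The frequency domain route needs zero-padded complex buffers and yields every lag at once, whereas the time domain route computes only the lags requested.

```python
import cmath

def dft(x, inverse=False):
    """Naive O(N^2) discrete Fourier transform (stand-in for an FFT)."""
    n = len(x)
    sign = 1 if inverse else -1
    out = [sum(x[k] * cmath.exp(sign * 2j * cmath.pi * k * m / n)
               for k in range(n))
           for m in range(n)]
    return [v / n for v in out] if inverse else out

def autocorr_direct(x, max_lag):
    """Time domain autocorrelation for lags 0..max_lag only."""
    return [sum(x[n] * x[n + lag] for n in range(len(x) - lag))
            for lag in range(max_lag + 1)]

def autocorr_spectral(x, max_lag):
    """Autocorrelation as the inverse transform of the power spectrum."""
    padded = list(x) + [0.0] * len(x)        # zero-pad to avoid circular wrap-around
    power = [abs(v) ** 2 for v in dft(padded)]
    acf = dft(power, inverse=True)           # all lags are produced at once
    return [acf[lag].real for lag in range(max_lag + 1)]

x = [1.0, 2.0, 0.5, -1.0, 0.0, 3.0, -2.0, 1.5]
direct = autocorr_direct(x, 4)
spectral = autocorr_spectral(x, 4)
```

When only a limited range of lags is of interest, as in pitch estimation, much of the output of the frequency domain route is discarded.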
To reduce the computational load of a time domain approach, one commonly adopted approach is to perform pitch period estimation in two phases. ITU-T Recommendation G.711 Appendix I, “A high quality low-complexity algorithm for packet loss concealment with G.711”, proposes such a system. In the first phase, a coarse search is performed over the entire predefined range of pitch periods to determine a rough estimate of the pitch period. In the second phase, a fine search is performed over a refined range of pitch periods encompassing the rough estimate of the pitch period. A more accurate, refined estimate of the pitch period can therefore be determined. The number of calculations that the algorithm performs is reduced compared to an algorithm that performs a fine search over the entire predefined range of pitch periods.
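The two-phase idea can be sketched as follows. This is a generic illustration rather than the procedure of the cited Recommendation: the coarse step of 4 lags, the use of NCC as the score, the function names and the test signal are all assumptions.

```python
import math

def ncc(x, lag, window):
    """Normalised cross-correlation between x and x delayed by `lag`."""
    num = sum(x[n] * x[n + lag] for n in range(window))
    e0 = sum(x[n] ** 2 for n in range(window))
    e1 = sum(x[n + lag] ** 2 for n in range(window))
    den = math.sqrt(e0 * e1)
    return num / den if den > 0.0 else 0.0

def two_phase_pitch(x, min_lag, max_lag, coarse_step=4):
    """Return (refined pitch estimate, number of lags evaluated)."""
    window = len(x) - max_lag
    # Phase 1: coarse search over the whole range, skipping lags.
    coarse_lags = list(range(min_lag, max_lag + 1, coarse_step))
    rough = max(coarse_lags, key=lambda lag: ncc(x, lag, window))
    # Phase 2: fine search over every lag near the rough estimate.
    lo = max(min_lag, rough - coarse_step + 1)
    hi = min(max_lag, rough + coarse_step - 1)
    fine_lags = list(range(lo, hi + 1))
    refined = max(fine_lags, key=lambda lag: ncc(x, lag, window))
    return refined, len(coarse_lags) + len(fine_lags)

# Test signal with a 57-sample period, which the coarse grid does not contain.
x = [math.sin(2 * math.pi * n / 57) + 0.5 * math.sin(4 * math.pi * n / 57)
     for n in range(300)]
pitch, evaluations = two_phase_pitch(x, 40, 100)
full_search_evaluations = 100 - 40 + 1  # lags a single-phase fine search would test
```

The saving comes from evaluating only a fraction of the candidate lags; each evaluation still requires a full pass over the window, so the cost per lag is unchanged.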
Although this approach reduces the number of calculations that the algorithm performs, the computational complexity associated with estimating the pitch period remains a problem, particularly with low-power platforms such as Bluetooth.
There is thus a need for an improved method of estimating the pitch period or periodicity of a signal that reduces the computational complexity associated with the estimation.