In low-bit-rate voice coders, degradation of voice quality is often due to inaccurate voicing decisions. The difficulty in making these voicing decisions correctly lies in the fact that no single speech classifier can reliably distinguish voiced speech from unvoiced speech. The use of multiple voiced detectors, with one of these detectors selected to determine whether the speech is voiced or unvoiced, is disclosed in J. P. Campbell et al., "Voiced/Unvoiced Classification of Speech with Applications to the U.S. Government LPC-10E Algorithm", IEEE International Conference on Acoustics, Speech, and Signal Processing, 1986, Tokyo, Vol. 9.11.4, pp. 473-476. That paper discloses the use of multiple linear discriminant voiced detectors, each applying different weights and threshold values to the same speech classifiers for each frame of speech. The weights and thresholds for each detector are determined from training data, with a different level of white noise added to the training data for each detector. During the processing of actual speech, the detector used to make the voicing decision is selected by examining the signal-to-noise ratio, SNR: the range of possible SNR values is subdivided into subranges, each subrange being assigned to one of the detectors. For each frame, the SNR is calculated, the corresponding subrange is determined, and the detector associated with that subrange is selected to make the voicing decision.
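The per-frame selection procedure described above can be sketched as follows, assuming a hypothetical bank of three linear discriminant detectors. The weights, thresholds, and training SNR values shown are illustrative placeholders, not Campbell's trained parameters, and the subrange lookup is simplified to nearest-training-SNR selection:

```python
import numpy as np

# Hypothetical detector bank: each detector holds discriminant weights "w",
# a decision threshold "t", and the SNR (in dB) of the white noise level
# notionally added to its training data. All values are illustrative only.
DETECTORS = [
    {"snr_db": 30.0, "w": np.array([0.9, 0.5, -0.3]), "t": 0.2},
    {"snr_db": 20.0, "w": np.array([0.8, 0.6, -0.2]), "t": 0.1},
    {"snr_db": 10.0, "w": np.array([0.6, 0.7, -0.1]), "t": 0.0},
]

def select_detector(snr_db, detectors=DETECTORS):
    """Map the frame's SNR to its subrange by choosing the detector
    whose training SNR is nearest to the measured SNR."""
    return min(detectors, key=lambda d: abs(d["snr_db"] - snr_db))

def voicing_decision(classifiers, snr_db):
    """Linear discriminant decision: the frame is declared voiced when
    the weighted sum of the speech classifiers exceeds the threshold
    of the detector selected for this frame's SNR subrange."""
    d = select_detector(snr_db)
    return float(np.dot(d["w"], classifiers)) > d["t"]
```

For example, a frame with a measured SNR of 28 dB would be handled by the detector trained at 30 dB, and its classifier vector would be scored against that detector's weights and threshold.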
A problem with this prior art approach is that it does not perform well when the characteristics of the speech itself have been altered. In addition, the method used by Campbell is adapted only to white noise and cannot adjust for colored noise. Therefore, there exists a need for a method of selecting among a plurality of voiced detectors that allows reliable detection in a varying speech environment.