Transmission of voice by digital techniques has become widespread, particularly in long-distance and digital radio telephone applications. This, in turn, has created interest in determining the least amount of information that can be sent over a channel while maintaining the perceived quality of the reconstructed speech. If speech is transmitted by simply sampling and digitizing, a data rate on the order of 64 kilobits per second (kbps) is required to achieve the speech quality of a conventional analog telephone. However, through the use of speech analysis, followed by the appropriate coding, transmission, and re-synthesis at the receiver, a significant reduction in the data rate can be achieved. The more accurately speech analysis can be performed, the more appropriately the data can be encoded, thus reducing the data rate.
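The 64 kbps figure can be checked with simple arithmetic. The sketch below assumes the values commonly used for telephone-quality pulse code modulation, namely 8000 samples per second and 8 bits per sample. These two values and the function name `pcm_bit_rate` are assumptions made for illustration and are not stated in the text above.

```python
def pcm_bit_rate(sample_rate_hz: int, bits_per_sample: int) -> int:
    """Return the bit rate, in bits per second, of uncompressed single-channel PCM."""
    return sample_rate_hz * bits_per_sample


# Assumed telephony values: 8000 samples/s and 8 bits/sample.
rate_bps = pcm_bit_rate(8000, 8)
print(rate_bps // 1000, "kbps")
```

Any coder that transmits fewer bits per second than this product, while preserving perceived quality, achieves the data rate reduction described above.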
Devices that employ techniques to compress speech by extracting parameters that relate to a model of human speech generation are called speech coders. A speech coder divides the incoming speech signal into blocks of time, or analysis frames. Speech coders typically comprise an encoder and a decoder, together referred to as a codec. The encoder analyzes the incoming speech frame to extract certain relevant parameters, and then quantizes the parameters into a binary representation, i.e., into a set of bits or a binary data packet. The data packets are transmitted over the communication channel to a receiver and a decoder. The decoder processes the data packets, de-quantizes them to recover the parameters, and then re-synthesizes the speech frames using the de-quantized parameters.
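The encoder and decoder steps described above can be illustrated with a deliberately simplified sketch. It is a toy pipeline, not a real speech model: the only parameter extracted per frame is the root-mean-square (RMS) gain, and the decoder re-synthesizes noise scaled to that gain. The frame length of 160 samples, the 6-bit uniform quantizer, the input range of [-1, 1], and all function names are assumptions chosen for illustration.

```python
import math
import random

FRAME_LEN = 160                 # assumed frame length (20 ms at 8 kHz)
GAIN_BITS = 6                   # assumed quantizer width in bits
MAX_GAIN = 1.0                  # assumed input range is [-1, 1]
LEVELS = (1 << GAIN_BITS) - 1   # largest quantizer index


def rms(samples):
    """Root-mean-square value of a list of samples."""
    return math.sqrt(sum(x * x for x in samples) / len(samples))


def split_frames(signal, frame_len=FRAME_LEN):
    """Divide the signal into consecutive full-length analysis frames."""
    return [signal[i:i + frame_len]
            for i in range(0, len(signal) - frame_len + 1, frame_len)]


def encode_frame(frame):
    """Analyze one frame (RMS gain) and quantize it to an integer index."""
    gain = min(rms(frame), MAX_GAIN)
    return round(gain / MAX_GAIN * LEVELS)


def decode_frame(index, rng):
    """De-quantize the gain index and re-synthesize a frame of scaled noise."""
    gain = index / LEVELS * MAX_GAIN
    noise = [rng.gauss(0.0, 1.0) for _ in range(FRAME_LEN)]
    noise_rms = rms(noise)
    return [gain * n / noise_rms for n in noise]


# Usage: two frames of a 200 Hz tone sampled at an assumed 8 kHz.
signal = [0.5 * math.sin(2 * math.pi * 200 * n / 8000)
          for n in range(2 * FRAME_LEN)]
packets = [encode_frame(f) for f in split_frames(signal)]   # one index per frame
rng = random.Random(0)
decoded = [decode_frame(p, rng) for p in packets]
```

In this sketch each frame is represented by a single 6-bit index instead of 160 samples. A practical coder would extract and quantize a richer parameter set so that the re-synthesized frame resembles the original speech, not merely its loudness.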
Modern speech coders may use a multi-mode coding approach that classifies input frames into different types, according to various features of the input speech. Multi-mode variable bit rate encoders use speech classification to accurately capture and encode a high percentage of speech segments using a minimal number of bits per frame. More accurate speech classification produces a lower average encoded bit rate and higher-quality decoded speech. Previously, speech classification techniques considered a minimal number of parameters for isolated frames of speech only, producing few and inaccurate speech mode classifications. Thus, there is a need for a high-performance speech classifier that correctly classifies numerous modes of speech under varying environmental conditions, in order to enable maximum performance of multi-mode variable bit rate encoding techniques.
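The relationship between frame classification and average bit rate can be sketched as follows. The example classifies each frame in isolation from two features, frame energy and zero-crossing rate, which is the kind of minimal, single-frame approach the paragraph above describes as limited. The three mode names, the thresholds, the bits allotted per mode, and the 20 ms frame duration are all hypothetical values chosen for illustration.

```python
import math

FRAME_LEN = 160     # assumed frame length in samples
FRAME_SEC = 0.020   # assumed frame duration in seconds

# Hypothetical number of bits spent on a frame of each mode.
BITS_PER_MODE = {"silence": 16, "unvoiced": 40, "voiced": 80}


def classify_frame(frame, energy_floor=1e-4, zcr_split=0.25):
    """Assign a mode to one isolated frame from its energy and zero-crossing rate."""
    energy = sum(x * x for x in frame) / len(frame)
    if energy < energy_floor:
        return "silence"
    crossings = sum(1 for a, b in zip(frame, frame[1:]) if (a < 0) != (b < 0))
    zcr = crossings / (len(frame) - 1)
    return "unvoiced" if zcr > zcr_split else "voiced"


def average_bit_rate(modes):
    """Average encoded bit rate, in bits per second, over a sequence of frame modes."""
    total_bits = sum(BITS_PER_MODE[m] for m in modes)
    return total_bits / (len(modes) * FRAME_SEC)


# Usage: one silent frame, one low-frequency tone, one rapidly alternating frame.
silent = [0.0] * FRAME_LEN
tone = [0.5 * math.sin(2 * math.pi * 100 * n / 8000) for n in range(FRAME_LEN)]
noisy = [0.1 if n % 2 == 0 else -0.1 for n in range(FRAME_LEN)]
modes = [classify_frame(f) for f in (silent, tone, noisy)]
print(modes, average_bit_rate(modes))
```

Under these assumed bit allocations, every frame correctly recognized as silence or unvoiced lowers the average rate relative to coding all frames at the highest rate, while a frame wrongly assigned to a low-rate mode would degrade the decoded speech. This is why classification accuracy drives both the bit rate and the quality.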