This invention relates to data compression and data decompression of acoustic features associated with sampled speech data in a speech recognition system.
Typically, an initial step in a computerized speech recognition system involves the computation of a set of acoustic features from sampled speech. The speech may be provided by a user of the system via an audio-to-electrical transducer, such as a microphone, and converted from an analog representation to a digital representation by sampling. An example of how these acoustic features may be computed is described in the article entitled "Speech Recognition with Continuous Parameter Hidden Markov Models," by Bahl et al., Proceedings of the IEEE ICASSP, pp. 40-43 (May 1988). These acoustic features are then submitted to a speech recognition engine, where the utterances are recognized. In a speech recognition system employing a client-server model, the acoustic features are computed on the client system and must then be transmitted to the server system for recognition. It is therefore necessary to compress the acoustic features in order to minimize the bandwidth required for the transmission. Compression is also necessary in more general speech recognition systems where storage of the acoustic features is desired.
The topic of speech compression has been well researched over the years (e.g., "Speech Coding and Synthesis," edited by Kleijn et al., Elsevier (1995)), but all of the proposed solutions address only the problem of compressing speech so that the reproduced speech sounds acceptable to the human ear. The problem addressed by the present invention, on the other hand, is to compress (and decompress) the acoustic features computed (i.e., extracted) from spoken utterances for the purpose of subsequent machine recognition of speech.