Compression of digital data is essential to improving the capacity of digital transmission systems. Voice data presents particular challenges. When a speaker pauses, the silence between words is often encoded in the same way as active speech. This produces repetitive output that wastes available transmission bandwidth. The problem is especially acute during multi-party teleconferences, when only one party is speaking while the others remain silent.
A commonly used audio compression algorithm is the G.723.1 standard promulgated by the International Telecommunication Union, which is geared particularly toward digital multimedia applications. The standard specifies the coding of audio to reduce the amount of digital information required to reproduce the original audio input, and operates at transmission rates of 5.3 kbit/s and 6.3 kbit/s. Audio is broken into 30 msec time frames, with a look-ahead of 7.5 msec, resulting in a total algorithmic delay of 37.5 msec. The coder is designed to operate on a digital signal obtained by first performing telephone-bandwidth filtering of the analog input, then sampling at 8000 Hz, and then converting to 16-bit linear PCM as the input to the encoder. The output of the decoder should be converted back to analog by similar means. The encoder operates on 240 samples per frame, with each frame divided into four subframes of 60 samples each. For each frame containing speech, a twenty- to twenty-four-byte output is generated. Each frame containing the spectral characteristics of silence is represented by a four-byte output. In other words, a three-second pause produces 100 four-byte outputs, or 400 bytes in total. A need therefore exists for a method of further compressing audio input, particularly silence. Such a method should improve upon the G.723.1 standard.
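The arithmetic behind these figures can be sketched briefly. The following is a minimal illustration, not part of the G.723.1 specification itself; the constant and function names are illustrative, using only the frame parameters stated above (8000 Hz sampling, 240 samples per frame, 24-byte speech frames at the 6.3 kbit/s rate, and 4-byte silence frames).

```python
# Frame parameters taken from the G.723.1 description above.
SAMPLE_RATE_HZ = 8000
FRAME_SAMPLES = 240
FRAME_MS = FRAME_SAMPLES * 1000 // SAMPLE_RATE_HZ  # 240 / 8000 s = 30 msec
SPEECH_FRAME_BYTES = 24    # high-rate (6.3 kbit/s) speech frame
SILENCE_FRAME_BYTES = 4    # frame carrying the spectral characteristics of silence

def pause_output_bytes(pause_seconds: float) -> int:
    """Bytes emitted for a pause, one four-byte frame per 30 msec frame."""
    frames = int(pause_seconds * 1000 / FRAME_MS)
    return frames * SILENCE_FRAME_BYTES

def frame_bitrate(frame_bytes: int) -> float:
    """Effective bit rate in bit/s for a given per-frame payload size."""
    return frame_bytes * 8 * 1000 / FRAME_MS

# A three-second pause spans 100 frames and still produces 400 bytes of output.
print(pause_output_bytes(3.0))            # 400
print(frame_bitrate(SPEECH_FRAME_BYTES))  # 6400.0, i.e. the nominal 6.3 kbit/s rate
```

The example makes the motivation concrete: even with four-byte silence frames, a pause generates output proportional to its length, which is the redundancy a further compression method would remove.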