The invention relates to electronic devices, and more particularly to speech encoding, transmission, storage, and decoding/synthesis methods and circuitry.
Commercial digital speech systems and telephony, including wireless and packetized networks, continually demand increased speech coding quality and compression. This has led to ITU-standardized methods such as G.729 and G.729 Annex A for encoding/decoding speech using a conjugate structure algebraic code-excited linear-prediction (CS-ACELP) method. Further, standard G.729 Annex B provides additional compression for silence frames and is to be used with G.729 and G.729 Annex A. In particular, Annex B provides a voice activity detector (VAD), discontinuous transmission, and a comfort noise generator to reduce the transmission bit rate during silence periods, such as pauses during speaking.
G.729 and G.729 Annex A use 10 ms frames, and the Annex B VAD makes a voice activity decision every frame to determine the type of frame encoding; see FIG. 3, which illustrates the high-level functionality of G.729 Annex B. When voice activity is detected, the frame is encoded with G.729 or G.729 Annex A. When no voice activity is detected, either a silence insertion descriptor (SID) frame is transmitted or nothing is transmitted.
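The per-frame decision described above may be sketched as follows. This is an illustrative outline only, not the Annex B reference code. The names `FrameType`, `select_frame_type`, and `classify_frames` are hypothetical, and the `sid_update_needed` flag is an assumption standing in for whatever criterion selects between sending a SID frame and not transmitting, which the text above does not specify.

```python
from enum import Enum


class FrameType(Enum):
    """Possible outcomes for one 10 ms frame (names are illustrative)."""
    SPEECH = "speech"  # encode with G.729 or G.729 Annex A
    SID = "sid"        # transmit a silence insertion descriptor frame
    NO_TX = "no_tx"    # transmit nothing for this frame


def select_frame_type(voice_active: bool, sid_update_needed: bool) -> FrameType:
    """Map the VAD decision for one frame to a frame type.

    `sid_update_needed` is an assumed input: it stands in for the rule,
    not given in the text, that chooses between SID and no transmission.
    """
    if voice_active:
        return FrameType.SPEECH
    return FrameType.SID if sid_update_needed else FrameType.NO_TX


def classify_frames(vad_flags, sid_flags):
    """Apply the per-frame decision to parallel sequences of flags."""
    return [select_frame_type(v, s) for v, s in zip(vad_flags, sid_flags)]
```

For example, `classify_frames([True, False, False], [False, True, False])` yields a speech frame, then a SID frame, then an untransmitted frame. Passing the SID criterion in as a flag keeps the sketch neutral about how that choice is actually made.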