In voice communication over a digital network, the input audio signal is typically encoded with a speech codec such as the well-known Adaptive Multi-Rate (AMR) codec. In such applications, it is useful to detect which frames in the digital bitstream contain speech and which contain non-speech audio, a task referred to as Voice Activity Detection (VAD). Conventionally, however, VAD is a non-trivial processing task: it involves decoding the AMR signal back to uncompressed audio in linear PCM format, extracting features from that audio, and running complex algorithms on those features. The AMR codec does have its own built-in VAD module, which is used to enable discontinuous transmission (DTX), but that module is designed to be very conservative, so it is not robust to high noise, and it is not configurable.
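To illustrate how the codec's built-in VAD/DTX decision shows up in the bitstream without any decoding, the following minimal sketch reads only the frame headers of an AMR narrowband stream. It assumes the single-channel AMR-NB file storage format of RFC 4867 (a `#!AMR\n` magic string followed by frames that each start with a one-byte header carrying a 4-bit frame type). The function name `dtx_labels` and the label strings are hypothetical, chosen for this example only. The result reflects nothing more than the encoder's own conservative, non-configurable decision.

```python
# Sketch under stated assumptions: AMR-NB, RFC 4867 file storage format.
AMR_NB_MAGIC = b"#!AMR\n"

# Payload size in bytes (excluding the 1-byte frame header) per frame type.
# Types 0-7 are the eight speech modes (4.75 to 12.2 kbit/s), 8 is the
# comfort-noise (SID) frame sent during DTX, 15 means no data was sent.
AMR_NB_PAYLOAD_BYTES = {
    0: 12, 1: 13, 2: 15, 3: 17, 4: 19, 5: 20, 6: 26, 7: 31,
    8: 5,
    15: 0,
}
SID_FRAME_TYPE = 8
NO_DATA_FRAME_TYPE = 15


def dtx_labels(data):
    """Return one label per frame: 'speech', 'sid', or 'no_data' (hypothetical labels)."""
    if not data.startswith(AMR_NB_MAGIC):
        raise ValueError("not an AMR-NB storage-format stream")

    labels = []
    pos = len(AMR_NB_MAGIC)
    while pos < len(data):
        # Header layout: 1 padding bit, 4-bit frame type, 1 quality bit, 2 padding bits.
        frame_type = (data[pos] >> 3) & 0x0F
        if frame_type not in AMR_NB_PAYLOAD_BYTES:
            raise ValueError("unsupported frame type %d at offset %d" % (frame_type, pos))

        if frame_type == SID_FRAME_TYPE:
            labels.append("sid")
        elif frame_type == NO_DATA_FRAME_TYPE:
            labels.append("no_data")
        else:
            labels.append("speech")

        # Skip the header byte plus the payload to reach the next frame.
        pos += 1 + AMR_NB_PAYLOAD_BYTES[frame_type]
    return labels
```

Frames labeled `speech` here are simply those the encoder chose to transmit in full. In noisy conditions the encoder may mark most frames that way, which is the limitation described above.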