Generally, speech processing systems deal with a variety of signals of varying intensity levels. Exemplary speech processing systems include mobile phones, audio recorders, and Voice over Internet Protocol (VoIP) systems. A person using a speech processing system may speak at different audible levels at different instants in time. Variation in the audio/speech signal may occur when the person changes position with respect to the microphone of the speech processing system, or when there is a sudden, transient increase in the audio level. Such a transient increase may exceed the dynamic range of the audio processing system, thereby producing distorted audio output.
“Peak limiting”, a technique commonly used in signal processing, handles such signal bursts or transients in audio signals by maintaining the signal level below a predefined threshold, particularly during the transients. It has been common practice in audio signal processing for audio content production and listening.
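By way of illustration only, the simplest way to keep a signal below a predefined threshold is to clamp each sample. The following minimal sketch assumes floating-point samples and a hypothetical helper name, hard_limit; it is not taken from any particular system. It adds no delay, but it flattens the waveform during the transient, which is the kind of distortion peak limiting methods try to reduce.

```python
def hard_limit(samples, threshold):
    """Clamp every sample to the range [-threshold, +threshold].

    Illustrative only: `hard_limit` and its parameters are assumed names.
    """
    if threshold <= 0:
        raise ValueError("threshold must be positive")
    return [max(-threshold, min(threshold, s)) for s in samples]


# A short burst in which two samples exceed the threshold of 1.0.
burst = [0.1, 0.2, 1.5, -1.8, 0.3]
limited = hard_limit(burst, threshold=1.0)
```

Samples already inside the threshold pass through unchanged; only the two samples of the burst are altered.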
Existing methods have focused on reducing the distortion introduced into the audio by the peak limiting process. One generic approach to handling transients is to delay the signal sufficiently that upcoming transients can be anticipated and attenuated in time. For audio signals used in entertainment, there has been less focus on reducing the processing delay. However, in a voice communication system, both to preserve interactivity and to reduce the impact of acoustic echo feedback, it is desired that the signal processing delay be minimal or, preferably, that no delay be introduced.
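The delay-based approach described above can be sketched as follows. This is a simplified illustration under stated assumptions, not a reproduction of any existing method: the function name lookahead_limit and the lookahead parameter (a delay expressed in samples) are hypothetical, and the gain smoothing that a practical limiter would apply is omitted. The sketch shows why the approach costs latency: every output sample is emitted lookahead samples late so that the gain can already be lowered when a peak arrives.

```python
def lookahead_limit(samples, threshold, lookahead):
    """Delay the signal by `lookahead` samples and attenuate ahead of peaks.

    Each delayed sample is scaled by a gain computed from the largest
    magnitude found between that sample and `lookahead` samples after it.
    Illustrative only: names and parameters are assumptions.
    """
    if threshold <= 0 or lookahead < 0:
        raise ValueError("threshold must be positive, lookahead non-negative")
    n = len(samples)
    out = []
    for i in range(n + lookahead):
        j = i - lookahead  # index of the delayed sample being emitted
        if j < 0:
            out.append(0.0)  # nothing to emit yet: this is the added delay
            continue
        window = samples[j:min(n, j + lookahead + 1)]
        peak = max(abs(s) for s in window)
        gain = min(1.0, threshold / peak) if peak > 0 else 1.0
        out.append(samples[j] * gain)
    return out


# A transient of magnitude 2.0 arrives at index 3; threshold is 1.0.
signal = [0.1, 0.2, 0.2, 2.0, 0.2]
delayed = lookahead_limit(signal, threshold=1.0, lookahead=2)
```

With a lookahead of two samples, the output is two samples longer than the input and begins with two zero samples; that leading silence is the processing delay that a voice communication system seeks to avoid.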
Further, a major portion of voice communication systems are packet-based, such as Voice over IP (VoIP) systems. In packet-based communication, the speech signal is processed at the block level or frame level. Hence, there is a need for a method that can handle speech signal transients without introducing any delay and with minimal distortion of signal quality, while processing at the frame level as required by the existing signal flow in voice communication systems.
The systems and methods disclosed herein may be implemented by any means for achieving the various aspects. Other features will be apparent from the accompanying drawings and from the detailed description that follows.