Detection of acoustic shock is a well-known problem in signal processing. Acoustic shock signals are also referred to as impulse signals or transient signals. The amplitude of an impulse signal changes suddenly within a very short duration. There are two typical types of transient signals: aperiodic and periodic. An aperiodic impulse is generated, for example, by an explosion, gunfire or a firecracker. Aperiodic shocks last for very short timeframes, such as 250 μs or less. A periodic impulse, on the other hand, is usually generated by an impact between two mechanically and acoustically undamped objects, such as two glass bottles hitting each other. Periodic shocks usually have much longer durations, on the order of 5 to 200 ms, and usually have a lower peak level. They consist of multiple peaks that follow closely one after the other, with peak levels that attenuate over the duration of the shock.
Many different approaches have been developed to address the detrimental effects of such acoustic shocks. General input-output compression strategies such as WDRC (Wide Dynamic Range Compression) react too slowly to very fast acoustic shock impulses. MPO (Maximum Power Output) limiting in the frequency domain can be applied to prevent overshooting, but it is also too slow to be effective. Peak-clipping in the time-domain, such as time-domain MPO, is effective and fast, but it usually causes serious distortion of sound quality. For shock detection, high-pass filters are often used, since transient noise has most of its energy at high frequencies (sudden signal changes imply rich high-frequency components). Low-pass filters are also often used to attenuate the transient noise without simultaneously affecting the speech content of the signal.
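The two time-domain techniques mentioned above, high-pass-based detection and peak-clipping, can be illustrated with a minimal sketch. The function names, the first-difference filter and the numeric thresholds are assumptions chosen purely for illustration and are not taken from any of the cited documents.

```python
def first_difference(samples):
    """Crude high-pass filter: y[n] = x[n] - x[n-1], with x[-1] taken as 0."""
    out = []
    prev = 0.0
    for s in samples:
        out.append(s - prev)
        prev = s
    return out


def detect_transients(samples, threshold):
    """Indices where the high-passed magnitude exceeds the threshold."""
    high_passed = first_difference(samples)
    return [i for i, v in enumerate(high_passed) if abs(v) > threshold]


def hard_clip(samples, limit):
    """Time-domain peak-clipping: clamp every sample to [-limit, +limit]."""
    return [max(-limit, min(limit, s)) for s in samples]


# Hypothetical input: a slowly varying signal with a sudden two-sample shock.
signal = [0.0, 0.1, 0.2, 0.1, 0.0, 0.9, -0.7, 0.1, 0.0]

shock_indices = detect_transients(signal, 0.5)  # assumed detection threshold
clipped = hard_clip(signal, 0.5)                # assumed clipping limit
```

The clipped output flattens the shock samples to the limit value. This is fast, but it changes the waveform shape, which is the source of the distortion noted above.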
A sub-band-based acoustic shock algorithm has been presented by Todd Schneider et al. in EP 1,471,767. A pattern analysis-based approach is applied to an input signal to perform feature extraction. A parameter space is identified which corresponds to the signal space of the input signal. A rule-based decision approach is applied to the parameter space to detect an acoustic shock event, and the shock is then removed from the input signal to generate a processed output signal. The sudden sound increase is detected in the time-domain using a block frame of a number of samples, and a similar detection takes place within each sub-band. An additional delay of a further number of samples is used so that the time-domain measurement has enough time to notify the shock detection module in the frequency domain about the presence of a high-level input before the shock reaches the filter bank, thus providing extra time for the shock detection module to react. The shock detection module in the frequency domain detects whether a shock has occurred in a particular sub-band, and determines the sub-band energy measurement to be used for the gain calculation. A state machine is used for sub-band shock determination by examining the sub-band energy together with the shock flag. The acoustic shock phenomenon is eliminated by applying the appropriate gain reduction to the signal in each sub-band. The technology described in said document works well for some kinds of slow shocks, since it samples the input signal energy in blocks of a number of samples spanning about 0.5 ms. However, many fast shocks can be much shorter than 0.5 ms and may not be detected at this block level. The additional time delay is required by the system, but can cause new problems for hearing aid devices, since an overall signal delay of more than 10 ms can be perceived as noticeable acoustic latency, which is not desired.
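The limitation of block-level detection for very short shocks can be illustrated with a minimal sketch. It is not the algorithm of EP 1,471,767. The 16 kHz sampling rate (so that 8 samples span 0.5 ms), the energy-ratio rule and all function names are assumptions made purely for illustration.

```python
def block_energies(samples, block_len):
    """Mean-square energy of each consecutive, non-overlapping block."""
    energies = []
    for start in range(0, len(samples) - block_len + 1, block_len):
        block = samples[start:start + block_len]
        energies.append(sum(s * s for s in block) / block_len)
    return energies


def shock_flags(energies, ratio):
    """Indices of blocks whose energy exceeds `ratio` times the previous block."""
    floor = 1e-12  # avoids comparing against an all-zero previous block
    return [k for k in range(1, len(energies))
            if energies[k] > ratio * max(energies[k - 1], floor)]


BLOCK_LEN = 8  # assumed: 8 samples at an assumed 16 kHz rate = 0.5 ms

# Background at amplitude 0.1 with a single-sample shock of amplitude 1.0
# (about 62.5 microseconds at the assumed rate) in the middle block.
background = [0.1] * BLOCK_LEN
shock_block = [0.1, 0.1, 0.1, 1.0, 0.1, 0.1, 0.1, 0.1]
energies = block_energies(background + shock_block + background, BLOCK_LEN)

flags_at_10x = shock_flags(energies, 10.0)
flags_at_20x = shock_flags(energies, 20.0)
```

In this example the shock sample carries 100 times the power of the background samples, yet the block average rises only by a factor of about 13. A rule requiring a 20-fold energy increase therefore misses the shock, while a 10-fold rule detects it.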
Shock detection in the sub-bands is very expensive in terms of cycle budget and is also difficult to synchronize with the time-domain parameter extraction. The sub-band strategy cannot detect very fast shocks reliably and accurately, since the filter bank can smear the actual change in sound level. It is also not desirable to eliminate the shock in individual sub-bands, because this may cause the user to lose environmental awareness and, hence, to not correctly perceive the nature of the shock, which might be very important information for the user. It is also expensive to implement this approach and to optimize this algorithm with existing hearing device technology. More complex hearing device systems may also suffer from excessive input-output latency or require very expensive computing power to process this strategy.
U.S. Pat. No. 5,579,404 describes a digital audio limiter. This signal processing system comprises components such as split-band perceptual coders that receive a peak-amplitude-limited input audio signal and can process the signal in such a manner that the processed signal preserves the apparent loudness of the input signal but is no longer peak-amplitude limited. In one embodiment, up-sampling is used in estimating the resultant peak amplitude, and gain factors established in response to the estimated peak amplitude are applied to one or more frequency sub-bands of the processed signal.
A long signal delay may thus be required, and the duration of the delay is usually set substantially equal to the length of time required for the control system to respond to a PLI (Peak Level Increase) requiring correction. Furthermore, this technology is focused on broadcasting or audio recording, and therefore no acoustic environment factors are considered. The signal processing required for the system, particularly for the peak estimator (which preferably up-samples the input signal), exceeds what is available in low-power digital systems such as digital hearing devices. This system is therefore not suitable for miniaturized, low-power digital devices.
A method for processing an input signal to generate an output signal, and application of said method in hearing aids and listening devices is described in US 20030031335 A1.
Said document describes a method and system for defining a threshold value that limits the output signal of a processing unit which is fed with an input signal. According to said document, an input-signal level is determined and the threshold value is set as a function of that input-signal level to prevent the maximum output level in the device from exceeding a predefined threshold value, which protects the user of the device from excessive noise exposure. By virtue of the fact that the threshold value is set as a function of the input-signal level, i.e. in adaptive fashion, it is also possible to limit transient noise whose level is well below the maximum value of the threshold value. As a result, when this method or system is applied in a hearing device, the hearing comfort of the wearer of the hearing device can be significantly enhanced.
This method detects acoustic transient signals well in average-level or higher-level acoustic environments, since the threshold value can be set by determining a momentary mean level of the input signal. A time-based mean value of the magnitude of the input signal is calculated, with the averaging performed over a relatively long time interval, for instance a time span of 5 seconds.
This method uses the known fact that human speech occupies a dynamic range of about −15 to +18 dB (decibels) around the respective mean level; in quiet surroundings with little ambient noise, this mean level is about 60 to 65 dB. A minimum threshold level, such as 80 dB, is required in order not to affect any first spoken syllable before the mean level has returned to 60 dB.
This method can effectively detect stronger transient signals in average acoustic environments, such as 60 dB. If the minimum threshold level is dropped below 80 dB, the speech signal can be affected. On the other hand, a transient signal below 80 dB in a quiet acoustic environment, such as 40 dB, can result in a large perceived shock, since the gain at this soft input level is usually very high.
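The behaviour of such a level-dependent threshold can be illustrated with a minimal sketch. The exact threshold function of US 20030031335 A1 is not reproduced here. The rule used below (mean level plus the 18 dB upper speech range, bounded below by the 80 dB minimum) is an assumption derived only from the figures quoted above, and the function names are hypothetical.

```python
SPEECH_HEADROOM_DB = 18.0  # upper end of the speech range quoted above
MIN_THRESHOLD_DB = 80.0    # minimum threshold level quoted above


def adaptive_threshold_db(mean_level_db):
    """Assumed rule: threshold follows the mean level but never drops below 80 dB."""
    return max(mean_level_db + SPEECH_HEADROOM_DB, MIN_THRESHOLD_DB)


def is_limited(transient_db, mean_level_db):
    """True if a transient at `transient_db` exceeds the adaptive threshold."""
    return transient_db > adaptive_threshold_db(mean_level_db)


threshold_average = adaptive_threshold_db(60.0)  # average environment
threshold_loud = adaptive_threshold_db(70.0)     # louder environment
threshold_quiet = adaptive_threshold_db(40.0)    # quiet environment

# A 75 dB transient in a 40 dB environment is 35 dB above the ambient level,
# yet it stays below the 80 dB minimum threshold and is therefore not limited.
quiet_transient_limited = is_limited(75.0, 40.0)
loud_transient_limited = is_limited(90.0, 60.0)
```

Under these assumptions the threshold is 80 dB in both the 40 dB and the 60 dB environment, which shows why transients in quiet surroundings pass unattenuated despite the high gain applied at soft input levels.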
In known digital systems and devices, the known solutions described above are implemented in the form of shock reduction algorithms. Many of these are based on peak-clipping in order to minimize delay but, as previously stated, usually introduce artifacts or distortion into the signal. Some more advanced peak-clipping technologies use an adaptive clipping threshold to handle different levels of shock. However, the problems of distortion or uncomfortable artificial effects still cannot be avoided with adaptive peak-clipping. Other, more sophisticated shock reduction algorithms detect transient noises in both the time-domain and individual frequency bands, and then apply shock reduction in specific frequency bands while leaving the normal signal in other frequency bands untouched, as presented in the above-mentioned EP 1,471,767. Although many of these algorithms are quite successful in telecommunication applications, typically experienced by a user through headphones or a headset, they usually add more delay and require intensive computational power. Since hearing devices possess limited computational power, this restricts the application of such techniques.
Furthermore, the requirement in this regard is different in hearing devices compared to other audio devices, in that acoustic shock should never be cancelled out completely in hearing devices, even if it is technically possible to do so. Acoustic shock is a type of acoustic event, which belongs to the acoustic environment. For the user's safety, it is important that no acoustic event is taken away from the user. In extreme examples such as a gunshot or car crash, common sense dictates that the user should be able to sense such an event so that he or she can react accordingly, but this is also true in more moderate cases of acoustic shock such as dishes breaking or a door slamming.