1. Field of the Invention
The present invention is in the field of voice coding. More specifically, the invention relates to a system and method for signal enhancement in voice coding that uses active signal processing to preserve speech-like signals and suppress incoherent noise signals.
2. Description of the Related Art
The emergence of wireless telephony and data terminal products has enabled users to communicate with anyone from almost anywhere. Unfortunately, current products do not perform equally well in many of these environments, and a major source of performance degradation is ambient noise. Further, for safe operation, many of these hand-held products need to offer hands-free operation, and here in particular, ambient noise poses a serious obstacle to the development of acceptable solutions.
Today's wireless products typically use digital modulation techniques to provide reliable transmission across a communication network. The conversion from analog speech to a compressed digital data stream is, however, very error prone when the input signal contains moderate to high ambient noise levels. This is largely due to the fact that the conversion/compression algorithm (the vocoder) assumes the input signal contains only speech. Further, to achieve the high compression rates required in current networks, vocoders must employ parametric models of noise-free speech. The characteristics of ambient noise are poorly captured by these models. Thus, when ambient noise is present, the parameters estimated by the vocoder algorithm may contain significant errors and the reconstructed signal often sounds unlike the original. For the listener, the reconstructed speech is typically fragmented, unintelligible, and contains voice-like modulation of the ambient noise during silent periods. If vocoder performance under these conditions is to be improved, noise suppression techniques tailored to the voice coding problem are needed.
Current telephony and wireless data products are generally designed to be hand held, and it is desirable that these products be capable of hands-free operation. Hands-free operation here means an interface that supports voice commands for controlling the product and that permits voice communication while the user is in the vicinity of the product. To develop these hands-free products, current designs must be supplemented with a suitably trained voice recognition unit. Like vocoders, most voice recognition methods rely on parametric models of speech and human conversation and do not take into account the effect of ambient noise.
An adaptive noise suppression system (ANSS) is provided that includes an input A/D converter, an analyzer, a filter, and an output D/A converter. The analyzer includes both feed-forward and feedback signal paths that allow it to compute a filtering coefficient, which is then input to the filter. In these signal paths, feed-forward signals are processed by a signal-to-noise ratio (SNR) estimator, a normalized coherence estimator, and a coherence mask. The feedback signals are processed by an auditory mask estimator. These two signal paths are coupled together via a noise suppression filter estimator. A method according to the present invention includes active signal processing to preserve speech-like signals and suppress incoherent noise signals. After a signal is processed in the feed-forward and feedback paths, the noise suppression filter estimator outputs a filtering coefficient signal to the filter for filtering the noise from the speech-and-noise digital signal.
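By way of illustration only, the coherence idea underlying the feed-forward path can be sketched in Python with NumPy. The sketch assumes two digitized microphone channels and computes a recursively averaged magnitude-squared coherence per frequency bin, which it then clips into a gain between a floor and 1. The function names (frame_spectra, coherence_gain) and the parameters (frame_len, hop, alpha, floor) are hypothetical choices made for this example and are not taken from the described system. The sketch omits the SNR estimator, the coherence mask, the auditory mask estimator, and the feedback path, so it should not be read as the claimed method.

```python
import numpy as np


def frame_spectra(x, frame_len=256, hop=128):
    """Split x into Hann-windowed frames and return their one-sided FFTs."""
    window = np.hanning(frame_len)
    n_frames = 1 + (len(x) - frame_len) // hop
    frames = np.stack(
        [x[i * hop:i * hop + frame_len] * window for i in range(n_frames)]
    )
    return np.fft.rfft(frames, axis=1)


def coherence_gain(x1, x2, frame_len=256, hop=128, alpha=0.8, floor=0.1):
    """Per-frame, per-bin gain derived from magnitude-squared coherence.

    alpha is the smoothing factor of the recursive spectral averages and
    floor is the minimum gain; both are assumed values for illustration.
    """
    X1 = frame_spectra(x1, frame_len, hop)
    X2 = frame_spectra(x2, frame_len, hop)
    n_bins = X1.shape[1]
    p11 = np.full(n_bins, 1e-12)           # smoothed auto-spectrum, channel 1
    p22 = np.full(n_bins, 1e-12)           # smoothed auto-spectrum, channel 2
    p12 = np.zeros(n_bins, dtype=complex)  # smoothed cross-spectrum
    gains = np.empty(X1.shape)
    for t in range(X1.shape[0]):
        p11 = alpha * p11 + (1 - alpha) * np.abs(X1[t]) ** 2
        p22 = alpha * p22 + (1 - alpha) * np.abs(X2[t]) ** 2
        p12 = alpha * p12 + (1 - alpha) * X1[t] * np.conj(X2[t])
        msc = np.abs(p12) ** 2 / (p11 * p22)  # lies in [0, 1]
        gains[t] = np.clip(msc, floor, 1.0)
    return gains


# Synthetic demonstration: a broadband source seen identically by both
# channels versus independent noise on each channel.
rng = np.random.default_rng(0)
source = rng.standard_normal(8000)
g_coh = coherence_gain(source, source)
g_inc = coherence_gain(rng.standard_normal(8000), rng.standard_normal(8000))
print("mean gain, coherent input:  ", g_coh[10:].mean())
print("mean gain, incoherent input:", g_inc[10:].mean())
```

In this sketch the first few frames are skipped when averaging because a coherence estimate built from a single frame is always 1; the recursive averages need several frames before independent noise on the two channels is recognized as incoherent. Multiplying one channel's frame spectra by such gains and resynthesizing would attenuate the incoherent component, although a practical system would need the additional estimators described above.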
The present invention provides many advantages over presently known systems and methods, such as: (1) noise suppression that preserves speech components in the 100-600 Hz frequency band; (2) exploitation of time and frequency differences between the speech and noise sources to produce noise suppression; (3) effective noise suppression using only two microphones, which may be placed in an arbitrary geometry; (4) microphones that require no calibration procedures; (5) enhanced performance in diffuse noise environments, since the system uses a speech component; (6) a normalized coherence estimator that offers improved accuracy over shorter observation periods; (7) an inverse filter length that depends on the local signal-to-noise ratio (SNR); (8) spectral continuity ensured by post filtering and feedback; and (9) a reconstructed signal that exhibits significant noise suppression without loss of intelligibility or fidelity, so that the recovered signal is easier for vocoders and voice recognition programs to process. These are just some of the many advantages of the invention, which will become apparent to one of ordinary skill upon reading the description of the preferred embodiment, set forth below.
As will be appreciated, the invention is capable of other and different embodiments, and its several details are capable of modifications in various respects, all without departing from the invention. Accordingly, the drawings and description of the preferred embodiments are illustrative in nature and not restrictive.