The invention concerns a method of transforming a speech signal which is separated into two signal parts a, b, where a represents the quasistationary part of the signal with information on the formant frequencies, and b represents a residual signal, the transient part of the signal, containing information on pitch frequency and stop consonants, the signal b being produced by inverse filtration of the speech signal.
Such a method is known from U.S. Pat. No. 5,060,258 and from articles by U. Hartmann, K. Hermansen and F. K. Fink: "Feature extraction for profoundly deaf people", D.S.P. Group, Institute for Electronic Systems, Aalborg University, September 1993, and by K. Hermansen, P. Rubak, U. Hartmann and F. K. Fink: "Spectral sharpening of speech signals using the partran tool", Aalborg University.
As described in the above articles, a speech signal is divided into two signal parts, one of which is described by a spectrum, while the other is a time signal. The spectrum may be calculated by LPC (linear predictive coding), by an FFT, or in another manner. The spectrum produced by the analysis is divided into a plurality of second-order parallel sections, and, as disclosed in the articles, each section is characterized by three parameters: the resonance frequency f_o, the Q value ##EQU1##, and the power of the part of the spectrum around the frequency f_o. With these three parameters it is possible to transform (i.e. manipulate) the LPC or FFT spectrum. Further, the spectral part is typically composed of so-called formants, which are resonance frequencies of the vocal tract; put differently, it carries a considerable part of the information content of a speech signal.
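The three-parameter description of a section can be illustrated with a minimal sketch of a digital second-order resonator. It assumes the conventional definition of the Q value as the resonance frequency divided by the 3 dB bandwidth; the source's own equation is not reproduced here, so that definition is an assumption. The function names, the sampling rate fs, and the use of the square root of the power as a gain factor are likewise hypothetical choices made for illustration, not details taken from the cited articles.

```python
import cmath
import math


def section_coefficients(f_o, q, fs):
    """Denominator coefficients (a1, a2) of 1 / (1 + a1 z^-1 + a2 z^-2).

    Assumes Q = f_o / bandwidth (conventional definition, not from the source).
    """
    bandwidth = f_o / q                          # 3 dB bandwidth in Hz
    radius = math.exp(-math.pi * bandwidth / fs)  # pole radius
    theta = 2.0 * math.pi * f_o / fs              # pole angle
    return -2.0 * radius * math.cos(theta), radius * radius


def section_gain(f, f_o, q, power, fs):
    """Magnitude response of one section at frequency f (Hz).

    Scaling by sqrt(power) is an illustrative assumption.
    """
    a1, a2 = section_coefficients(f_o, q, fs)
    z_inv = cmath.exp(-1j * 2.0 * math.pi * f / fs)
    return math.sqrt(power) / abs(1.0 + a1 * z_inv + a2 * z_inv * z_inv)


# Manipulating one parameter at a time: the same resonance, then sharpened.
fs = 8000.0
original = section_gain(1000.0, f_o=1000.0, q=10.0, power=1.0, fs=fs)
sharpened = section_gain(1000.0, f_o=1000.0, q=20.0, power=1.0, fs=fs)
print(original, sharpened)
```

Raising the Q value narrows the section and raises its peak, lowering f_o moves the resonance down in frequency, and changing the power rescales the section without moving it. Each parameter can therefore be altered independently of the other two.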
The second signal, produced via an LPC analysis (inverse filtration), is a residual signal which, for voiced sounds, indicates the tone or pitch of the speech signal, typically in the range from 100 to 300 Hz. For example, a male voice has a low pitch frequency, while a female voice has a somewhat higher one. The tone or pitch frequency is defined as the number of pulses per second generated by the vocal cords.
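The relationship between inverse filtration and the pitch pulses can be sketched as follows. This is an illustration under stated assumptions, not the procedure of the cited articles: the LPC coefficients are obtained by the autocorrelation method with the Levinson-Durbin recursion, the pitch period is read off the autocorrelation peak of the residual, and the test signal is a hypothetical synthetic vowel (one resonance at 500 Hz excited by one pulse every 64 samples at 8 kHz, i.e. a pitch of 125 Hz). All function names are invented for the example.

```python
import math


def autocorrelation(x, max_lag):
    """Autocorrelation of x for lags 0..max_lag."""
    return [sum(x[n] * x[n - k] for n in range(k, len(x)))
            for k in range(max_lag + 1)]


def lpc(x, order):
    """Levinson-Durbin recursion; returns [1, a1, ..., ap] of A(z)."""
    r = autocorrelation(x, order)
    a = [1.0] + [0.0] * order
    err = r[0]
    for i in range(1, order + 1):
        acc = r[i] + sum(a[j] * r[i - j] for j in range(1, i))
        k = -acc / err                      # reflection coefficient
        new_a = a[:]
        for j in range(1, i):
            new_a[j] = a[j] + k * a[i - j]
        new_a[i] = k
        a = new_a
        err *= 1.0 - k * k
    return a


def inverse_filter(x, a):
    """Residual e[n] = sum_k a[k] * x[n - k] (FIR filtering with A(z))."""
    return [sum(a[k] * x[n - k] for k in range(len(a)) if n - k >= 0)
            for n in range(len(x))]


def pitch_period(residual, fs, f_min=100.0, f_max=300.0):
    """Lag (in samples) of the strongest residual periodicity in [f_min, f_max]."""
    shortest, longest = int(fs / f_max), int(fs / f_min)
    r = autocorrelation(residual, longest)
    return max(range(shortest, longest + 1), key=lambda lag: r[lag])


def synthetic_vowel(fs=8000, period=64, n=640, f_o=500.0, radius=0.9):
    """Hypothetical test signal: a pulse train through one resonance."""
    a1 = -2.0 * radius * math.cos(2.0 * math.pi * f_o / fs)
    a2 = radius * radius
    out = []
    for i in range(n):
        pulse = 1.0 if i % period == 0 else 0.0
        prev1 = out[i - 1] if i >= 1 else 0.0
        prev2 = out[i - 2] if i >= 2 else 0.0
        out.append(pulse - a1 * prev1 - a2 * prev2)
    return out


speech = synthetic_vowel()
coefficients = lpc(speech, order=2)
residual = inverse_filter(speech, coefficients)
print(8000 / pitch_period(residual, fs=8000))  # estimated pitch in Hz
```

The inverse filter removes the resonance described by the spectrum, so what remains in the residual is essentially the train of excitation pulses, whose spacing gives the pitch.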
Now, by means of the two subsignals it is possible to manipulate speech signals in several ways for use in many applications, as will appear from the following.
For example, transformation of speech signals of the above-mentioned type may be used for:
a) Changing the sound picture with a view to improving the speech intelligibility in noisy environments for persons having normal as well as impaired hearing ability.
b) Changing the sound picture with a view to improving the speech intelligibility and comfort of persons with severely impaired hearing.
c) Simulating hearing losses, e.g. for use in the testing of hearing aids.
As mentioned, according to the above-mentioned articles, the great advantage of this transformation of speech signals is that it is possible to manipulate the formant frequencies and the residual signal independently of each other. If a complete speech signal is compressed or expanded by more than 10% (for persons with normal hearing), the speech quality will be partially destroyed. This restriction does not apply to the same extent if the pitch signal is maintained and only the formant frequencies are reduced.
However, it has been found that the signal processing according to the above-mentioned articles may be improved. If, for example, a door slams, a hearing-impaired person carrying a hearing aid of any type can easily get an unpleasant surprise, because the circuit of the hearing aid is not sufficiently fast to attenuate this sudden signal.
In the circuit mentioned in the articles above, a so-called sound transient, such as the slam of a door, will substantially not be modeled by the LPC analysis, but will occur in the residual signal as a rather strong pulse.
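Why a transient ends up in the residual can be illustrated with a small sketch. It is not the method of the invention or of the cited articles: the vocal-tract model A, the click amplitude, and the detection rule (flagging residual samples above four times the RMS value) are all hypothetical choices made only to show the effect.

```python
import math

# Hypothetical all-pole model: one resonance at 500 Hz, fs = 8 kHz.
A = [1.0, -2.0 * 0.9 * math.cos(2.0 * math.pi * 500.0 / 8000.0), 0.81]


def synthesize(excitation, a):
    """All-pole filtering 1 / A(z): turns excitation pulses into 'speech'."""
    out = []
    for n, x in enumerate(excitation):
        feedback = sum(a[k] * out[n - k] for k in range(1, len(a)) if n - k >= 0)
        out.append(x - feedback)
    return out


def inverse_filter(signal, a):
    """FIR filtering with A(z): recovers the residual."""
    return [sum(a[k] * signal[n - k] for k in range(len(a)) if n - k >= 0)
            for n in range(len(signal))]


def strong_pulses(residual, factor=4.0):
    """Indices where |residual| exceeds factor times its RMS (illustrative rule)."""
    rms = math.sqrt(sum(e * e for e in residual) / len(residual))
    return [n for n, e in enumerate(residual) if abs(e) > factor * rms]


# Pitch pulses every 64 samples, plus a click added directly to the speech signal.
excitation = [1.0 if n % 64 == 0 else 0.0 for n in range(640)]
speech = synthesize(excitation, A)
speech[300] += 20.0                      # the "door slam", one sample long
residual = inverse_filter(speech, A)
print(strong_pulses(residual))
```

Because the click was not produced by the modeled resonance, the inverse filter cannot cancel it. It passes into the residual, spread over as many samples as the inverse filter has coefficients, with an amplitude far above that of the ordinary pitch pulses.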
Accordingly, it is the object of the invention to eliminate this noise signal in the residual channel.