The replay speed of audio signals can be changed while maintaining the pitch, for example with the help of a phase vocoder (see, for example, J. L. Flanagan and R. M. Golden, “Phase Vocoder”, The Bell System Technical Journal, November 1966, pages 1493 to 1509; U.S. Pat. No. 6,549,884, Laroche, J. & Dolson, M.: “Phase-vocoder pitch-shifting”; Jean Laroche and Mark Dolson, “New Phase-Vocoder Techniques for Pitch-Shifting, Harmonizing And Other Exotic Effects”, Proc. 1999 IEEE Workshop on Applications of Signal Processing to Audio and Acoustics, New Paltz, N.Y., Oct. 17-20, 1999). Such methods can likewise be used to transpose the signal while maintaining the original replay duration. Transposition is obtained by replaying the stretched signal accelerated by the time-stretching factor. In a time-discrete signal representation, this corresponds to down-sampling the stretched signal by the stretching factor while maintaining the sampling frequency. Conventionally, this time stretching takes place in the time domain. Alternatively, it can take place within a filter bank, such as a pseudo-quadrature mirror filterbank (pQMF), which is sometimes also called a QMF filterbank.
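The relation between time stretching and transposition described above can be illustrated with a minimal sketch. It is not a phase vocoder: the "stretched" signal is simply synthesized as a longer sinusoid of the same pitch, standing in for the output of an ideal time stretcher, and the down-sampling uses plain linear interpolation. The names `resample_linear` and `count_zero_crossings` and all parameter values are assumptions chosen for illustration.

```python
import math

def resample_linear(x, factor):
    """Read x `factor` times faster using linear interpolation.

    The sampling frequency is unchanged, so the output has about
    len(x) / factor samples and every frequency is scaled by `factor`.
    """
    n_out = int(len(x) / factor)
    out = []
    for n in range(n_out):
        pos = n * factor
        i = int(pos)
        frac = pos - i
        nxt = x[i + 1] if i + 1 < len(x) else x[i]
        out.append((1.0 - frac) * x[i] + frac * nxt)
    return out

def count_zero_crossings(x):
    """Count sign changes between neighbouring samples."""
    return sum(1 for a, b in zip(x, x[1:]) if (a < 0.0) != (b < 0.0))

fs = 8000          # sampling frequency in Hz (illustrative value)
f0 = 100.0         # pitch of the original tone in Hz (illustrative value)
n_original = 8000  # one second of audio
stretch = 1.5      # time-stretching factor

# Stand-in for an ideal time stretcher: same pitch, 1.5 times the duration.
stretched = [math.sin(2.0 * math.pi * f0 * k / fs + 0.3)
             for k in range(int(n_original * stretch))]

# Down-sampling by the stretching factor restores the original duration
# and raises the pitch by the same factor (here from 100 Hz to about 150 Hz).
transposed = resample_linear(stretched, stretch)
crossings = count_zero_crossings(transposed)
```

In this sketch the transposed signal has the original number of samples, and its zero-crossing count per second is roughly twice the transposed frequency, as expected for a sinusoid.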
A specific challenge in time stretching is posed by transient events, which are “blurred” in time by the stretching process. This occurs because methods such as the phase vocoder impair the so-called vertical coherence properties of the signal (with regard to a time-frequency spectrogram representation).
Some current methods apply stronger time stretching around the transients, so that little or no time stretching has to be performed during the transient itself. This has been described, for example, in:
Laroche, J., Dolson, M.: “Improved phase vocoder timescale modification of audio”, IEEE Trans. Speech and Audio Processing, vol. 7, no. 3, pp. 323-332
Emmanuel Ravelli, Mark Sandler and Juan P. Bello: “Fast implementation for non-linear time-scaling of stereo audio”, Proc. of the 8th Int. Conference on Digital Audio Effects (DAFx'05), Madrid, Spain, Sep. 20-22, 2005
Duxbury, C., M. Davies, and M. Sandler (2001, December): “Separation of transient information in musical audio using multi resolution analysis techniques”, Proceedings of the COST G-6 Conference on Digital Audio Effects (DAFX-01), Limerick, Ireland.
Another paper on the topic is Röbel, A.: “A new approach to transient processing in the phase vocoder”, Proc. of the 6th Int. Conference on Digital Audio Effects (DAFx-03), London, UK, Sep. 8-11, 2003.
In time stretching of audio signals by phase vocoders, transient signal portions are “blurred” by dispersion, since the so-called vertical coherence of the signal in the spectrogram view is impaired. Methods based on overlap-add can generate spurious pre-echoes and post-echoes of transient sound events. These problems can be addressed by varying the time stretching in the vicinity of transients: no stretching is applied during the actual transient, and stronger stretching is applied in its surroundings. If, however, transposition is to take place, the transposition factor will then no longer be constant in the vicinity of the transients, i.e. the pitch of superimposed (possibly tonal) signal portions changes in an audible and disturbing manner. Similar problems occur when time stretching takes place within a filter bank, such as the pQMF.
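The idea of leaving the transient unstretched and compensating in its surroundings can be sketched as a per-frame schedule of stretching factors. This is a hypothetical illustration, not a method taken from the cited papers: the function name `stretch_schedule`, the frame-based view, and the rule of spreading the compensation evenly over a fixed number of neighbouring frames are all assumptions.

```python
def stretch_schedule(n_frames, transient, neighborhood, overall):
    """Return one stretching factor per frame.

    Frames inside `transient` (start, end; end exclusive) get factor 1.0.
    Up to `neighborhood` frames on each side get a raised factor chosen so
    that the total duration equals overall * n_frames. All other frames
    get the factor `overall`.
    """
    t_start, t_end = transient
    lo = max(0, t_start - neighborhood)
    hi = min(n_frames, t_end + neighborhood)
    n_transient = t_end - t_start
    n_neighbors = (hi - lo) - n_transient
    if n_neighbors <= 0:
        raise ValueError("no neighbouring frames available for compensation")
    # Duration missing from the transient frames is made up by the neighbours.
    boosted = (overall * (hi - lo) - n_transient) / n_neighbors
    factors = []
    for i in range(n_frames):
        if t_start <= i < t_end:
            factors.append(1.0)        # actual transient: no stretching
        elif lo <= i < hi:
            factors.append(boosted)    # surroundings: stronger stretching
        else:
            factors.append(overall)    # stationary parts: nominal stretching
    return factors

# Illustrative values: 20 frames, transient in frames 8 and 9,
# three compensating frames on each side, overall stretching factor 2.
factors = stretch_schedule(20, (8, 10), 3, 2.0)
```

The schedule preserves the overall duration, but the local stretching factor varies from frame to frame around the transient, which is the property the paragraph above identifies as problematic once transposition is required.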
The field of this application relates to a method for perceptually motivated handling of transient sound events within such a process. In particular, transient sound events may be removed from the signal before it is manipulated by time stretching. Subsequently, the unprocessed transient signal portion may be added back to the modified (stretched) signal at a precisely fitting position that takes the stretching into account.
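The sequence of removing the transient, stretching the remainder, and adding the unprocessed transient back can be sketched as follows. This is a minimal sketch under stated assumptions, not the claimed method: the stretcher is a naive linear-interpolation stand-in rather than a phase vocoder, the transient is removed by zeroing its samples, and the insertion position is taken as the transient's start index scaled by the stretching factor. The names `stretch_linear` and `stretch_with_transient` are hypothetical.

```python
import math

def stretch_linear(x, factor):
    """Naive stand-in stretcher (linear interpolation, NOT a phase vocoder)."""
    n_out = int(len(x) * factor)
    out = []
    for n in range(n_out):
        pos = n / factor
        i = int(pos)
        frac = pos - i
        nxt = x[i + 1] if i + 1 < len(x) else x[i]
        out.append((1.0 - frac) * x[i] + frac * nxt)
    return out

def stretch_with_transient(x, start, end, factor):
    """Stretch x by `factor` while keeping x[start:end] unprocessed.

    Returns the stretched signal and the index at which the transient
    was re-inserted.
    """
    transient = x[start:end]            # unprocessed copy of the transient
    cleaned = list(x)
    for k in range(start, end):
        cleaned[k] = 0.0                # assumption: removal by zeroing
    stretched = stretch_linear(cleaned, factor)
    pos = math.ceil(start * factor)     # assumption: scaled start position
    for k, sample in enumerate(transient):
        stretched[pos + k] += sample    # add the transient back, unstretched
    return stretched, pos

# Illustrative input: a quiet tone with a short click at samples 100 to 104.
signal = [0.1 * math.sin(2.0 * math.pi * 5.0 * k / 400 + 0.3) for k in range(400)]
signal[100:105] = [1.0, -0.8, 0.6, -0.4, 0.2]

result, position = stretch_with_transient(signal, 100, 105, 2.0)
```

With a stretching factor of 2, the click keeps its original length and shape in the output. The zeroed gap left behind it is an artefact of this simplified removal step; a practical system would have to fill or conceal that gap.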