Phase vocoders [1-3] and other time or pitch modification techniques such as Synchronized Overlap-Add (SOLA) can, for example, change the playback rate of an audio signal while preserving its original pitch. These methods can also be used to transpose a signal while maintaining its original playback duration. The latter is accomplished by stretching the audio signal by an integer factor and then adjusting the playback rate of the stretched signal by the same factor. For a time-discrete signal, this corresponds to downsampling the time-stretched audio signal by the stretching factor, provided that the sampling rate remains unchanged.
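The stretch-then-downsample relationship can be checked numerically on a pure tone. The following is a minimal sketch under stated assumptions, not an implementation of any of the cited methods: `ideal_stretch_tone` is a hypothetical stand-in for a phase vocoder that stretches a sinusoid perfectly (same frequency, more samples), and the decimation simply keeps every `factor`-th sample without the anti-aliasing filter a practical system would need.

```python
import math


def ideal_stretch_tone(freq_hz, n_samples, factor, sample_rate):
    """Hypothetical ideal time stretch of a pure tone.

    Returns factor * n_samples samples at the unchanged frequency,
    which is what a perfect time stretcher would produce for a sinusoid.
    """
    return [
        math.sin(2.0 * math.pi * freq_hz * k / sample_rate)
        for k in range(n_samples * factor)
    ]


def transpose_tone(freq_hz, n_samples, factor, sample_rate):
    """Transpose by stretching with an integer factor, then downsampling."""
    stretched = ideal_stretch_tone(freq_hz, n_samples, factor, sample_rate)
    # Keep every factor-th sample; the sampling rate itself is unchanged.
    # No anti-aliasing filter is applied here (assumption: the transposed
    # tone stays below half the sampling rate).
    return stretched[::factor]


SAMPLE_RATE = 8000      # illustrative value
ORIGINAL_LENGTH = 800   # illustrative value
FACTOR = 2

transposed = transpose_tone(440.0, ORIGINAL_LENGTH, FACTOR, SAMPLE_RATE)

# A tone at FACTOR times the original frequency, for comparison.
reference = [
    math.sin(2.0 * math.pi * 440.0 * FACTOR * k / SAMPLE_RATE)
    for k in range(ORIGINAL_LENGTH)
]
max_error = max(abs(a - b) for a, b in zip(transposed, reference))
```

In this sketch the result has the original number of samples, and it matches a tone at twice the original frequency, so the duration is kept while the pitch is raised by the stretching factor.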
Phase vocoder based bandwidth extension methods such as [4-5] generate, depending on the required overall bandwidth, a variable number of band-limited subbands (patches), which are summed to form a signal that exhibits the required overall bandwidth.
The temporal alignment of the individual patches produced by the phase vocoder is a particular challenge. In general, these patches have time delays of different durations. This is because the synthesis windows of the phase vocoders are arranged with fixed hop sizes that depend on the stretching factor, so each patch has a delay of a predefined duration. The result is a frequency-selective time delay in the bandwidth-extended sum signal. Because this frequency-selective delay affects the vertical coherence properties of the overall signal, it has a negative impact on the transient response of the bandwidth extension method.
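The effect of unequal patch delays on a transient can be illustrated with a toy example. This is only a sketch: each patch is reduced to a single impulse, and the delay values in `PATCH_DELAYS` are made up for illustration and are not derived from any actual hop size or stretching factor.

```python
def delay(signal, n_samples):
    """Delay a signal by prepending n_samples zeros."""
    return [0.0] * n_samples + list(signal)


def sum_patches(patches):
    """Sum patches sample by sample, zero-padding shorter ones."""
    length = max(len(p) for p in patches)
    out = [0.0] * length
    for patch in patches:
        for i, value in enumerate(patch):
            out[i] += value
    return out


# A transient, modelled as a single impulse present in every patch.
impulse = [1.0, 0.0, 0.0, 0.0]

# Hypothetical per-patch delays in samples, keyed by stretching factor.
PATCH_DELAYS = {2: 1, 3: 2, 4: 4}

# Each patch arrives with its own delay, so the impulse is smeared in time.
smeared = sum_patches([delay(impulse, d) for d in PATCH_DELAYS.values()])

# For comparison: the same patches if they all shared one common delay.
common = max(PATCH_DELAYS.values())
aligned = sum_patches([delay(impulse, common) for _ in PATCH_DELAYS])
```

With these illustrative values, `smeared` spreads the energy of the impulse over three separate samples, whereas `aligned` concentrates it in one. That spreading is the frequency-selective delay that degrades the transient response.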
A further challenge arises within the individual patches, where a lack of cross-frequency coherence has a negative impact on the magnitude response of the phase vocoder.