The present invention relates to audio signal processing and more particularly to time and/or pitch shifting of an audio signal.
It is desirable to modify the duration of an audio signal while retaining a natural sound, or to modify the pitch of an audio signal without changing its duration. One application is video synchronization, where the duration of a recording often must be adjusted to fit exactly the duration of a video clip without modifying the pitch. Acceptable duration discrepancies are less than 20%. Pitch scaling, on the other hand, is often used to slightly adjust the pitch of a recording before mixing it with other recordings.
For professional audio applications, time/pitch scaling techniques must meet high quality standards. It is also desirable to perform the necessary computations in real time.
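Time-scaling and pitch-scaling are in some respects the same problem. In order to increase the pitch of a signal by 1%, one can extend the signal's duration by 1% without altering its pitch and then resample the extended signal at a rate 1% higher than the original rate. The resampling shortens the extended signal back to its original duration and, in doing so, raises every frequency in it by 1%.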
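The resampling half of this equivalence can be illustrated with a minimal sketch. The function name `resample_linear`, the use of linear interpolation, and the test tone are assumptions made for illustration only and are not taken from the text; a professional implementation would use a higher-quality interpolation filter. A 440 Hz tone lasting 1.01 seconds stands in for a signal that has already been time-extended by 1%.

```python
import math

def resample_linear(x, factor):
    """Read x `factor` times faster using linear interpolation.

    The output has about len(x) / factor samples. Played back at the
    original sample rate, every frequency is multiplied by `factor`.
    """
    n_out = int(round(len(x) / factor))
    out = []
    for i in range(n_out):
        pos = i * factor                      # fractional read position
        k = min(int(pos), len(x) - 1)         # sample to the left
        frac = pos - k
        nxt = x[k + 1] if k + 1 < len(x) else x[k]
        out.append((1.0 - frac) * x[k] + frac * nxt)
    return out

def upward_zero_crossings(x):
    """Count negative-to-nonnegative transitions (a crude pitch estimate)."""
    return sum(1 for i in range(1, len(x)) if x[i - 1] < 0.0 <= x[i])

sr = 48000            # sample rate in Hz (illustrative value)
ratio = 1.01          # desired pitch increase of 1%

# Stand-in for the time-extended signal: 440 Hz, 1% longer than 1 second.
n_ext = int(round(sr * ratio))
extended = [math.sin(2.0 * math.pi * 440.0 * n / sr) for n in range(n_ext)]

# Resampling restores the 1-second duration and raises the pitch by 1%.
shifted = resample_linear(extended, ratio)
estimated_hz = upward_zero_crossings(shifted) * sr / len(shifted)
```

In this sketch the hard part, extending the duration without changing the pitch, is sidestepped by synthesizing a longer tone directly. For a real recording that step is the time-scaling problem itself.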
Perhaps the simplest method of time-scaling is the splice method. Modifying the duration of a signal without altering its pitch requires that some samples be created (for time-expansion) or discarded (for time-compression). The splice method generally consists of regularly duplicating or discarding small pieces of the original signal, and using cross-fading to conceal the discontinuity caused by the duplicating or discarding operation.
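A fixed-parameter splice of this kind might be sketched as follows. The function name `splice_time_scale`, its parameters `period`, `jump` and `fade`, and the linear cross-fade shape are hypothetical choices made for illustration; the text does not prescribe them.

```python
import math

def splice_time_scale(x, period, jump, fade):
    """Fixed-parameter splice time-scaling (illustrative sketch).

    After every `period` output samples the read position jumps by `jump`
    input samples: a positive jump discards samples (time-compression),
    a negative jump re-reads samples (time-expansion). The last `fade`
    samples before each jump are linearly cross-faded into the samples
    that precede the new read position, to conceal the discontinuity.
    """
    if fade < 1 or fade > period or period + jump < fade:
        raise ValueError("need 1 <= fade <= period and period + jump >= fade")
    out = []
    pos = 0
    while pos + period + max(jump, 0) <= len(x):
        nxt = pos + period + jump             # where reading resumes
        out.extend(x[pos:pos + period - fade])
        for i in range(fade):
            w = (i + 1) / (fade + 1)          # fade-in weight, rising toward 1
            a = x[pos + period - fade + i]    # outgoing segment
            b = x[nxt - fade + i]             # incoming segment
            out.append((1.0 - w) * a + w * b)
        pos = nxt
    out.extend(x[pos:])                       # copy the tail unmodified
    return out

sr = 8000
tone = [math.sin(2.0 * math.pi * 220.0 * n / sr) for n in range(sr)]

# Discard 40 samples after every 400 output samples (shorter output).
shorter = splice_time_scale(tone, period=400, jump=40, fade=32)
# Re-read 40 samples after every 400 output samples (longer output).
longer = splice_time_scale(tone, period=400, jump=-40, fade=32)
```

Each cycle emits `period` samples while consuming `period + jump` input samples, so the duration is scaled by roughly `period / (period + jump)`. The splice points and jump size are the same on every cycle regardless of the signal content.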
Unfortunately, the splice method tends to generate conspicuous artifacts, mainly because the splice points and the duration of the discarded/duplicated segments are fixed parameters, and no optimization is permitted.