Time-scale modification (TSM) is an emerging topic in audio digital signal processing, owing to advances in low-cost, high-speed hardware that enable real-time processing on portable devices. Possible applications include intelligible sound in fast-forward playback, real-time music manipulation, and foreign-language training. Most time-scale modification algorithms can be classified as either frequency-domain or time-domain. Frequency-domain time-scale modification provides higher quality for polyphonic sounds, while time-domain time-scale modification is more suitable for narrow-band signals such as voice. Because of its lower computational cost, time-domain time-scale modification is the natural choice in resource-limited applications.
A primitive time-domain time-scale modification method, known as overlap-and-add (OLA), divides the signal into equidistant, equal-sized frames and then overlaps and adds them with a changed overlap factor to extend or reduce the duration of the signal. A more sophisticated method, known as synchronous overlap-and-add (SOLA), achieves a considerable quality improvement by evaluating a normalized cross-correlation function between the overlapping signals at each candidate overlap position to determine the exact overlap point. This process is called the overlap adjustment loop. The synchronous overlap-and-add method is computationally expensive because the cross-correlation and its normalization must be computed for every candidate position. Several methods have been proposed to reduce the computational cost of the overlap adjustment loop. These include global-and-local search time-scale modification (GLS-TSM), which limits the search to just a few candidates, and envelope-matching time-scale modification (EM-TSM), which calculates the cross-correlation using only the sign of the signals.
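To make the overlap adjustment loop concrete, the following is a minimal Python sketch of synchronous overlap-and-add. It is an illustration under stated assumptions, not a reference implementation of any of the methods named above. The function names (normalized_xcorr, sign_xcorr, sola), the parameters (frame_len, analysis_hop, synthesis_hop, search), the constraints placed on them, and the linear cross-fade are all choices made for this example. In particular, sign_xcorr only illustrates the general idea of correlating signs instead of sample values; it should not be read as the published EM-TSM algorithm.

```python
import math


def normalized_xcorr(a, b):
    """Normalized cross-correlation of two equal-length sequences."""
    num = sum(x * y for x, y in zip(a, b))
    den = math.sqrt(sum(x * x for x in a) * sum(y * y for y in b))
    return num / den if den > 0.0 else 0.0


def sign_xcorr(a, b):
    """Cheaper similarity score that uses only the sign of each sample.

    Assumption: an illustrative stand-in for sign-only correlation.
    """
    def sign(v):
        return (v > 0) - (v < 0)
    return sum(sign(x) * sign(y) for x, y in zip(a, b)) / len(a)


def sola(x, frame_len, analysis_hop, synthesis_hop, search,
         score=normalized_xcorr):
    """Time-scale x by roughly synthesis_hop / analysis_hop.

    Frames are read every analysis_hop samples and written near every
    synthesis_hop samples; each frame may shift by up to +/- search
    samples to the position where it best matches the existing output.
    """
    # Assumed constraints that keep every candidate overlap non-empty
    # and no longer than one frame.
    if not (0 <= 2 * search <= synthesis_hop
            and synthesis_hop + 2 * search < frame_len):
        raise ValueError("need 2*search <= synthesis_hop and "
                         "synthesis_hop + 2*search < frame_len")

    out = list(x[:frame_len])
    m = 1
    while m * analysis_hop + frame_len <= len(x):
        start = m * analysis_hop
        frame = x[start:start + frame_len]
        nominal = m * synthesis_hop

        # Overlap adjustment loop: score every candidate position.
        best_k, best_score = 0, -math.inf
        for k in range(-search, search + 1):
            pos = nominal + k
            overlap = len(out) - pos
            s = score(out[pos:], frame[:overlap])
            if s > best_score:
                best_k, best_score = k, s

        # Linear cross-fade over the overlap, then append the rest.
        pos = nominal + best_k
        overlap = len(out) - pos
        for i in range(overlap):
            w = (i + 1) / (overlap + 1)
            out[pos + i] = (1.0 - w) * out[pos + i] + w * frame[i]
        out.extend(frame[overlap:])
        m += 1
    return out


# Usage: stretch a sine wave to about twice its duration.
signal = [math.sin(2 * math.pi * i / 50) for i in range(2000)]
stretched = sola(signal, frame_len=200, analysis_hop=50,
                 synthesis_hop=100, search=20)
cheaper = sola(signal, 200, 50, 100, 20, score=sign_xcorr)
```

Setting search to 0 reduces the sketch to plain overlap-and-add, since each frame is then placed at its nominal position without any matching. The nested loop over k also shows where the cost arises: every frame requires a full correlation for each of the 2 * search + 1 candidate positions, which is the work that a restricted candidate search or a sign-only score aims to reduce.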