The phase-vocoder has long been a popular tool for high-quality audio effects such as time-scaling and pitch-shifting, and more generally for any analysis/modification/synthesis processing of audio signals.
The phase-vocoder works by calculating Fast Fourier Transforms of overlapping windowed portions of an incoming signal, processing the resulting frequency-domain representation, and re-synthesizing an output signal by overlap-adding windowed inverse Fourier transforms. In practice, the bulk of the computation cost lies in these Fourier transforms, which are usually large (for a 48 kHz audio signal, 4096-point Fourier transforms are typical). The Fourier transforms yield a convenient decomposition of the signal into frequency channels that span the entire frequency range from 0 Hz to half the sampling rate. This is usually more than is needed. Audio signals typically have most of their energy in the low-frequency region (between 0 and 12 kHz, for example), while the high frequencies usually contain incoherent components such as noise and transients. Unfortunately, the standard phase-vocoder operates on the entire frequency range, which means that a significant fraction of the computation cost is spent to no benefit.
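To make the analysis/processing/resynthesis chain described above concrete, the following is a minimal sketch in plain Python, not the implementation of any particular phase-vocoder. The function names (`fft`, `phase_vocoder_identity`, `channels_up_to`), the periodic Hann window, the 75% overlap and the resulting normalisation constant of 1.5 are all assumptions made for illustration. The `process` hook is left as the identity, so the sketch only shows that the chain reconstructs its input; a real effect would modify the spectrum at that point. The last helper counts how many frequency channels of a given transform size lie at or below a chosen cutoff.

```python
import cmath
import math


def fft(x, inverse=False):
    """Recursive radix-2 FFT (unnormalised); len(x) must be a power of two."""
    n = len(x)
    if n == 1:
        return list(x)
    even, odd = fft(x[0::2], inverse), fft(x[1::2], inverse)
    sign = 1.0 if inverse else -1.0
    out = [0j] * n
    for k in range(n // 2):
        twiddled = cmath.exp(sign * 2j * math.pi * k / n) * odd[k]
        out[k], out[k + n // 2] = even[k] + twiddled, even[k] - twiddled
    return out


def phase_vocoder_identity(signal, n_fft=64, process=lambda spectrum: spectrum):
    """Windowed FFT analysis, optional processing, overlap-add resynthesis.

    Assumes a periodic Hann window applied at both analysis and synthesis
    with a hop of n_fft / 4; the squared windows then sum to 1.5.
    """
    hop = n_fft // 4
    window = [0.5 - 0.5 * math.cos(2 * math.pi * i / n_fft) for i in range(n_fft)]
    out = [0.0] * len(signal)
    for start in range(0, len(signal) - n_fft + 1, hop):
        frame = [signal[start + i] * window[i] for i in range(n_fft)]
        spectrum = process(fft(frame))            # frequency-domain processing
        resynth = fft(spectrum, inverse=True)     # unnormalised inverse FFT
        for i in range(n_fft):
            out[start + i] += (resynth[i].real / n_fft) * window[i] / 1.5
    return out


def channels_up_to(cutoff_hz, sample_rate=48000, n_fft=4096):
    """Return (channels at or below cutoff_hz, total channels from 0 Hz to Nyquist)."""
    total = n_fft // 2 + 1
    kept = min(total, int(cutoff_hz * n_fft // sample_rate) + 1)
    return kept, total
```

With the figures quoted in the text (48 kHz sampling rate, 4096-point transforms), `channels_up_to(12000)` returns `(1025, 2049)`: roughly half of the frequency channels cover the region above 12 kHz. Only samples covered by a full set of four overlapping frames are reconstructed exactly; the first and last `n_fft` samples of the output are attenuated by the window.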