Until approximately 1995, the 44.1 kHz sampling rate of the Compact Disc (CD) was regarded by most people as entirely adequate. Since 1995, the ‘hi-res’ movement has adopted sampling frequencies of 96 kHz, 192 kHz or higher, potentially allowing audio bandwidths of 40 kHz, 80 kHz or more. It has always been something of a puzzle as to why there should be any audible advantage in the bandwidth extension, since the CD's sampling rate of 44.1 kHz allows near-perfect reproduction of audio frequencies up to 20 kHz, the generally-accepted upper frequency limit of human hearing.
Superior time resolution has been advanced as a possible explanation of this apparent paradox. A recent paper by J. R. Stuart and P. G. Craven, “A Hierarchical Approach to Archiving and Distribution”, presented at the Audio Engineering Society Convention, Los Angeles, 11 Oct. 2014 [AES preprint no. 9178], explains the concept and cites several neuroscience references that support this view.
According to this view, the impulse response of a recording and reproduction chain should be as compact in time as possible. Experience indicates that audible pre-responses are particularly undesirable and the above-cited reference presents an argument as to why this might be the case.
The many existing recordings stored at 44.1 kHz have generally either been made using an oversampling analogue-to-digital converter providing a 44.1 kHz output, or they have been explicitly downsampled from a recording made at a higher sampling rate. Filtering is required in both cases and until recently it was generally considered better to use linear phase filtering. Unfortunately, linear phase filtering always introduces pre-responses.
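The link between linear phase and pre-response can be illustrated with a minimal sketch. The filter below is a Hann-windowed sinc low-pass; the tap count, the cutoff of 21.025 kHz and the 4x oversampled rate of 176.4 kHz are assumptions chosen purely for illustration and are not taken from any particular converter. Because a linear-phase FIR filter has a symmetric impulse response, exactly as much energy arrives before the main peak as after it.

```python
import math

def windowed_sinc_lowpass(num_taps, cutoff_hz, sample_rate_hz):
    """Symmetric (linear-phase) low-pass FIR: ideal sinc shaped by a Hann window."""
    centre = (num_taps - 1) / 2
    fc = cutoff_hz / sample_rate_hz  # cutoff in cycles per sample
    taps = []
    for n in range(num_taps):
        t = n - centre
        # Ideal low-pass impulse response (the limit at t == 0 is 2 * fc).
        ideal = 2 * fc if t == 0 else math.sin(2 * math.pi * fc * t) / (math.pi * t)
        window = 0.5 - 0.5 * math.cos(2 * math.pi * n / (num_taps - 1))
        taps.append(ideal * window)
    gain = sum(taps)  # normalise to unity gain at DC
    return [h / gain for h in taps]

# Assumed illustrative parameters (not from the text).
taps = windowed_sinc_lowpass(num_taps=101, cutoff_hz=21025.0, sample_rate_hz=176400.0)

# Locate the main peak and split the response energy around it.
peak = max(range(len(taps)), key=lambda i: abs(taps[i]))
pre_energy = sum(h * h for h in taps[:peak])        # arrives before the peak
post_energy = sum(h * h for h in taps[peak + 1:])   # arrives after the peak

print(f"peak at tap {peak}, pre-energy {pre_energy:.6f}, post-energy {post_energy:.6f}")
```

The equal split is a property of the symmetry alone, so a different window or cutoff changes the amount of ringing but not the fact that half of it precedes the peak.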
In the case of recordings made at sample rates of 88.2 kHz or higher, the pre-response can be reduced by “apodising”, as described in Craven, P. G., “Antialias Filters and System Transient Response at High Sample Rates”, J. Audio Eng. Soc., Volume 52, Issue 3, pp. 216-242, March 2004.
Typically an 88.2 kHz sampled system will have an antialias filter that cuts steeply at 40 kHz or some slightly higher frequency. The solution proposed in the paper is to ‘apodise’, that is, to filter more gently, starting at 20 kHz or a slightly higher frequency and tapering down to zero by about 40 kHz. The sharp band-edge above 40 kHz is thereby rendered innocuous, since the apodising filter has removed the signal energy at frequencies that would provoke ringing or pre-responses. There remains some pre- and/or post-response from the apodising filter itself, but this can be much shorter in time since its transition band, from 20 kHz to 40 kHz, is much wider.
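The way an apodiser renders the sharp band-edge innocuous can be sketched in terms of target magnitude responses. Both shapes below are assumptions made for illustration: the raised-cosine taper is a hypothetical choice of gentle roll-off, and the steep antialias filter is idealised as falling from unity at 40 kHz to zero at the 44.1 kHz Nyquist frequency. The function names are likewise illustrative only.

```python
import math

def cosine_taper(freq_hz, start_hz, end_hz):
    """Magnitude that is 1 up to start_hz, 0 from end_hz, raised-cosine between."""
    if freq_hz <= start_hz:
        return 1.0
    if freq_hz >= end_hz:
        return 0.0
    x = (freq_hz - start_hz) / (end_hz - start_hz)
    return 0.5 * (1.0 + math.cos(math.pi * x))

def sharp_filter_gain(freq_hz):
    # Assumed idealisation of the steep antialias filter in an 88.2 kHz system.
    return cosine_taper(freq_hz, 40000.0, 44100.0)

def apodiser_gain(freq_hz):
    # Hypothetical apodiser: gentle taper from 20 kHz down to zero at 40 kHz.
    return cosine_taper(freq_hz, 20000.0, 40000.0)

def cascade_gain(freq_hz):
    """Overall magnitude when the apodiser is cascaded with the sharp filter."""
    return sharp_filter_gain(freq_hz) * apodiser_gain(freq_hz)

# Sample the region occupied by the sharp band-edge, 40 kHz to 44.1 kHz.
band_edge = [40000.0 + 100.0 * k for k in range(42)]
max_at_edge = max(cascade_gain(f) for f in band_edge)

print("largest cascade gain across the sharp band-edge:", max_at_edge)
print("sharp filter alone at 41 kHz:", round(sharp_filter_gain(41000.0), 3))
```

In this idealised model the cascade has no gain anywhere in the region where the sharp filter is changing, so the only transition that shapes the signal is the wide one belonging to the apodiser.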
The situation is much less favourable for 44.1 kHz recordings. For these recordings it has generally been considered ideal to use a downsampling or antialias filter with a response flat to 20 kHz and then cutting sharply to be essentially zero by the Nyquist frequency of 22.05 kHz. It is thus not possible for an apodising filter to taper the response gently to zero by the frequency of the sharp-cut filter unless the apodising filter starts to taper at a lower frequency such as 15 kHz, which is not generally considered acceptable. Sometimes it is possible to improve the sound by a filter that begins to roll off at 20 kHz but in general there is a danger that an apodiser constrained thus will simply replace one band-edge by another nearly as sharp and at a slightly lower frequency.
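The difficulty can be put in rough numerical terms using Kaiser's empirical estimate of FIR filter length, N ≈ (A − 7.95) / (2.285 Δω), where A is the stopband attenuation in dB and Δω the transition width in radians per sample. This estimate is not taken from the cited paper, and the 100 dB attenuation figure is an assumption; the sketch is intended only to show how impulse-response duration scales inversely with transition width.

```python
import math

def kaiser_duration_ms(transition_width_hz, stopband_atten_db=100.0):
    """Approximate FIR impulse-response duration in milliseconds.

    Kaiser's estimate gives N ~ (A - 7.95) / (2.285 * dw) taps, with
    dw = 2 * pi * transition_width / sample_rate. Dividing N by the sample
    rate gives a duration that does not depend on the sample rate.
    """
    seconds = (stopband_atten_db - 7.95) / (
        2.285 * 2.0 * math.pi * transition_width_hz
    )
    return 1000.0 * seconds

# Transition bands discussed in the text (widths in Hz).
cases = {
    "88.2 kHz apodiser, 20 kHz to 40 kHz": 40000.0 - 20000.0,
    "44.1 kHz filter, 20 kHz to 22.05 kHz": 22050.0 - 20000.0,
    "44.1 kHz apodiser from 15 kHz": 22050.0 - 15000.0,
}
durations = {name: kaiser_duration_ms(width) for name, width in cases.items()}

for name, ms in durations.items():
    print(f"{name}: about {ms:.2f} ms")
```

Under these assumptions the 20 kHz to 22.05 kHz transition yields a response nearly ten times longer than the 20 kHz to 40 kHz apodiser, the ratio being simply that of the two transition widths. For a linear-phase design, half of that duration precedes the main peak.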
What is needed therefore is an improved or alternative technique to minimise the undesirable audible effects of pre-responses, especially for signals that have been stored or will be transmitted at a relatively low sampling rate such as 44.1 kHz.