The portrayal of audio ‘scenes’ by means of a plurality of audio channels is well known. The use of two channels (stereo sound) is commonplace; the use of six or more channels is now expected in many applications, including television broadcasts. Such multi-channel audio systems require synchronisation between the constituent audio signals in order to provide the intended sound field in a listening environment. Typical systems for the production and distribution of multi-channel audio content make use of digital representations of audio samples. A delay difference of even one digital sample period (approximately 23 μs at a sample rate of 44.1 kHz) between two constituent audio signals is unacceptable.
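By way of illustration only, the sample period quoted above can be checked with a short calculation. The sketch below assumes a sample rate of 44.1 kHz, with 48 kHz shown for comparison; the function name `sample_period_us` is hypothetical and is introduced purely for this example.

```python
def sample_period_us(sample_rate_hz: float) -> float:
    """Return the duration of one sample period in microseconds."""
    return 1e6 / sample_rate_hz


# Assumed example rates; neither is mandated by the description.
period_44k1 = sample_period_us(44_100)  # roughly 22.7 microseconds
period_48k = sample_period_us(48_000)   # roughly 20.8 microseconds

print(f"44.1 kHz: {period_44k1:.1f} us, 48 kHz: {period_48k:.1f} us")
```

A relative delay of a single sample between two channels therefore corresponds to a timing error of the order of 20 μs.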
Many digital audio processes make use of asynchronous transport and processing systems where audio samples are processed at a rate different from that at which the samples are intended to be presented to a listener, and/or the rate at which the samples were acquired in a recording environment.
An important example is television production. The outputs from a number of microphones may be digitised as 16-bit words at a sample rate of, say, 44.1 kHz; these audio words may be combined in a multiplex with 10-bit video samples at a total rate of, say, 27 MWord/s; the audio words may then be de-multiplexed to an intermediate multiplex at, say, 50 kWord/s; re-multiplexed into another 27 MWord/s video stream; and, finally, de-multiplexed to a nominal 44.1 kHz rate and output. Typically, buffer stores are used to manage the handover between the parts of the system operating at different sample rates; if a sample is lost, or unintentionally duplicated, the buffer delay for the affected audio channel will differ from the buffer delay of the other channels, and a lack of audio synchronism will result. The complex audio routing in modern audio-visual production requires the combination of audio and video signals from different sources, and changes to these combinations increase the possibility of loss or duplication of samples, with the consequent introduction of relative delay differences between audio channels.
Typical buffer stores are controlled by input and output clock signals which control the ‘writing’ and ‘reading’ of data at the input and output respectively. The ‘fullness’, or occupancy, of the buffer is the cumulative difference between the number of write clocks and the number of read clocks. It is usually arranged that buffers have some average occupancy that is appropriate to the expected frequency variations in the read and write clocks. This average occupancy represents the propagation delay of the buffer. Audio processes in which different audio channels are separately buffered therefore require a mechanism to ensure equal average buffer occupancy for the constituent audio channels of an audio scene.
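The relationship between occupancy and delay described above may be illustrated by a simplified model. The following is a minimal sketch only, not a description of any particular buffer implementation: the class `BufferStore`, the function `run_channel`, the preload of four samples and the position of the lost sample are all hypothetical choices made for the example. Two channels carry the same samples through separate buffers, and one input sample is lost on the second channel.

```python
from collections import deque


class BufferStore:
    """Toy FIFO whose occupancy is (write clocks) minus (read clocks)."""

    def __init__(self):
        self._fifo = deque()
        self.writes = 0  # cumulative count of write clocks
        self.reads = 0   # cumulative count of read clocks

    def write(self, sample):
        self._fifo.append(sample)
        self.writes += 1

    def read(self):
        self.reads += 1
        return self._fifo.popleft()

    @property
    def occupancy(self):
        return self.writes - self.reads


def run_channel(samples, preload, drop_index=None):
    """Pass samples through a buffer, optionally losing one input sample.

    The buffer is primed with `preload` zero-valued samples, which sets its
    nominal occupancy and hence its propagation delay in sample periods.
    Returns the output sequence and the final occupancy.
    """
    buf = BufferStore()
    for _ in range(preload):
        buf.write(0)
    out = []
    for i, s in enumerate(samples):
        if i != drop_index:  # a skipped write models a lost sample
            buf.write(s)
        out.append(buf.read())
    return out, buf.occupancy


samples = list(range(1, 11))
left, occ_left = run_channel(samples, preload=4)
right, occ_right = run_channel(samples, preload=4, drop_index=2)

print(occ_left, occ_right)   # 4 3
print(left[-1], right[-1])   # 6 7
```

In this model the unaffected channel retains an occupancy of four samples, whereas the channel that lost a sample settles at three; its output thereafter leads the other channel by one sample period, which is the relative delay difference that a mechanism for equalising average occupancy would need to correct.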