Wireless communication has proliferated over the past decade. One of the more recent areas into which wireless communication has expanded is multi-channel audio distribution. Multi-channel audio generally refers to audio of a sound scene that was captured from multiple different directions. The captured audio in each direction represents one audio channel in the multi-channel audio. During rendering, each audio channel is sent to a separate speaker positioned within a room to ideally reproduce the audio in a more realistic manner than single-channel audio or multi-channel audio of a lesser degree.
Some of the more common multi-channel audio formats are described using two digits separated by a decimal point (e.g., 2.0, 2.1, 5.1, 6.1, 7.1, etc.). The first digit represents the number of primary audio channels, each of which is to be reproduced on a separate speaker. The second digit represents the presence of a low frequency effect (LFE) audio channel, which is to be reproduced on a subwoofer. To provide some specific examples, a 2.0 multi-channel audio format refers to two primary audio channels (or stereo sound) and no LFE audio channel, whereas a 5.1 multi-channel audio format refers to five primary audio channels and an LFE audio channel.
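As an illustration of the naming convention just described, the following minimal sketch splits a format label into its primary-channel count and its LFE indicator. The names `ChannelLayout`, `parse_layout`, and `speaker_count` are hypothetical and are introduced here only for illustration.

```python
from typing import NamedTuple


class ChannelLayout(NamedTuple):
    """Hypothetical container for a parsed multi-channel format label."""
    primary: int  # number of primary audio channels (first digit)
    lfe: int      # LFE indicator (second digit): 0 = absent, 1 = present


def parse_layout(label: str) -> ChannelLayout:
    """Parse a label such as "5.1" into its two digits.

    Raises ValueError if the label is not of the form "<digits>.<digits>".
    """
    primary_text, lfe_text = label.split(".")
    return ChannelLayout(primary=int(primary_text), lfe=int(lfe_text))


def speaker_count(layout: ChannelLayout) -> int:
    """Total reproduction devices: one speaker per primary channel plus a
    subwoofer for the LFE channel, if present."""
    return layout.primary + layout.lfe


if __name__ == "__main__":
    for label in ("2.0", "5.1", "7.1"):
        layout = parse_layout(label)
        print(label, layout, speaker_count(layout))
```

Under this reading, a 5.1 format maps to five speakers plus one subwoofer, and a 2.0 format maps to two speakers and no subwoofer.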
The clear benefit of wireless multi-channel audio distribution is that it eliminates the need for wires between an audio source and speakers. One existing technology that can be leveraged to wirelessly deliver multi-channel audio is the Institute of Electrical and Electronics Engineers (IEEE) 802.11 family of packet-based wireless networks. These “WiFi” networks are ubiquitous, standardized, and can provide a large throughput, making them a good choice for wireless distribution of multi-channel audio. However, wireless distribution of multi-channel audio over such packet-based networks still presents challenges. For such a solution to compete with traditional wired systems, the solution should deliver and play back the multi-channel audio with near-equal performance or better. In general, this means the solution should reproduce the multi-channel audio at the speakers with high fidelity, low delay, and perceptually tight synchronization.
Achieving high fidelity generally means zero or near-zero packet loss across the inherently lossy wireless channel. To combat packet loss, application-layer forward error correction combined with some packet interleaving can be used. However, these traditional solutions typically fall short of the zero or near-zero packet loss requirement.
Low delay is usually important when the multi-channel audio is to be synchronized with video. In such an instance, the rendering time of the multi-channel audio with respect to the video generally should be no more than about 100 milliseconds (ms) late or no more than about 25 ms early. The asymmetric nature of this range is a result of the human audio-visual system being accustomed to audio arriving after video due to the speed of sound being slower than the speed of light. This range puts constraints on the amount of packet interleaving that can be applied to combat the packet loss mentioned above.
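The asymmetric tolerance described above can be expressed as a simple range check. The sketch below is illustrative only: the constant and function names are hypothetical, the thresholds are the approximate figures given above, and render times are assumed to be expressed in milliseconds on a common clock.

```python
# Approximate tolerances taken from the description above.
MAX_LATE_MS = 100.0   # audio may lag the video by up to about 100 ms
MAX_EARLY_MS = 25.0   # audio may lead the video by up to about 25 ms


def audio_offset_acceptable(audio_render_ms: float, video_render_ms: float) -> bool:
    """Return True if the audio render time falls inside the asymmetric window.

    A positive offset means the audio is rendered after the video (late);
    a negative offset means the audio is rendered before the video (early).
    """
    offset_ms = audio_render_ms - video_render_ms
    return -MAX_EARLY_MS <= offset_ms <= MAX_LATE_MS


if __name__ == "__main__":
    print(audio_offset_acceptable(1080.0, 1000.0))  # 80 ms late
    print(audio_offset_acceptable(970.0, 1000.0))   # 30 ms early
```

Any buffering added for interleaving increases the audio offset, so the late bound of the window effectively caps the interleaving depth that can be used.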
Finally, synchronization across the speakers used to render the multi-channel audio is important because human perception of audio signals is sensitive to delays and phase shifts caused by out-of-sync playback. In general, humans can detect around 10-20 microseconds (μs) of delay and 1-2 degrees of phase difference between audio signals. At these sensitivities, multi-channel audio sampled at 48 kHz (which corresponds to a sample separation of about 20.8 μs) would require synchronization across speakers to within one sample period. Thus, it is important to limit the difference in rendering time between speakers, referred to as “cross-jitter”. The listener should ideally perceive the combination of audio signals from the different channels as if they were being reproduced by a normal wired system. Too much cross-jitter results in echo and spatialization issues.
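The arithmetic behind the one-sample requirement can be sketched as follows. The function names are hypothetical, and measuring cross-jitter as the spread between the earliest and latest speaker render times is an assumption made for this illustration; the description above defines it only as the difference in rendering time between speakers.

```python
from typing import Sequence


def sample_period_us(sample_rate_hz: float) -> float:
    """Time between consecutive samples, in microseconds."""
    return 1e6 / sample_rate_hz


def cross_jitter_us(render_times_us: Sequence[float]) -> float:
    """Assumed metric: spread between the earliest and latest render time
    of the same audio instant across all speakers, in microseconds."""
    return max(render_times_us) - min(render_times_us)


def within_one_sample(render_times_us: Sequence[float],
                      sample_rate_hz: float = 48_000.0) -> bool:
    """True if all speakers render within one sample period of each other."""
    return cross_jitter_us(render_times_us) <= sample_period_us(sample_rate_hz)


if __name__ == "__main__":
    print(round(sample_period_us(48_000.0), 1))   # sample period at 48 kHz
    print(within_one_sample([0.0, 5.0, 12.0]))    # 12 us spread
    print(within_one_sample([0.0, 25.0]))         # 25 us spread
```

At 48 kHz the sample period is 1/48000 s, or roughly 20.8 μs, which sits at the upper end of the 10-20 μs delay that listeners can detect.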
The present disclosure will be described with reference to the accompanying drawings. The drawing in which an element first appears is typically indicated by the leftmost digit(s) in the corresponding reference number.