It is desirable to be able to evaluate the perceived spatial quality of audio processing, coding-decoding (codec) and reproduction systems without needing to involve human listeners. This is because listening tests involving human listeners are time consuming and expensive to run. It is important to be able to gather data about perceived spatial audio quality in order to assist in product development, system setup, quality control or alignment, for example. This is becoming increasingly important as manufacturers and service providers attempt to deliver enhanced user experiences of spatial immersion and directionality in audio-visual applications. Examples are virtual reality, telepresence, home entertainment, automotive audio, games and communications products. Mobile and telecommunications companies are increasingly interested in the spatial aspect of product sound quality. Here simple stereophony over two loudspeakers, or headphones connected to a PDA/mobile phone/MP3 player, is increasingly typical. Binaural spatial audio is to become a common feature in mobile devices. Home entertainment involving multichannel surround sound is one of the largest growth areas in consumer electronics, bringing enhanced spatial sound quality into a large number of homes. Home computer systems are increasingly equipped with surround sound replay and recent multimedia players incorporate multichannel surround sound streaming capabilities, for example. Scalable audio coding systems involving multiple data rate delivery mechanisms (e.g. digital broadcasting, internet, mobile comms) enable spatial audio content to be authored once but replayed in many different forms. The range of spatial qualities that may be delivered to the listener will therefore be wide and degradations in spatial quality may be encountered, particularly under the most band-limited delivery conditions or with basic rendering devices.
Systems that record, process or reproduce audio can give rise to spatial changes including the following: changes in individual sound source-related attributes such as perceived location, width, distance and stability; changes in diffuse or environment related attributes such as envelopment, spaciousness and environment width or depth. In order to be able to analyse the reasons for overall spatial quality changes in audio signals it may also be desirable to be able to predict these individual sub-attributes of spatial quality.
Under conditions of extreme restriction in delivery bandwidth, major changes in spatial resolution or dimensionality may be experienced (e.g. when downmixing from many loudspeaker channels to one or two). Recent experiments involving multivariate analysis of audio quality show that in home entertainment applications spatial quality accounts for a significant proportion of the overall quality (typically as much as 30%).
Because listening tests are expensive and time consuming, there is a need for a quality model and systems, devices and methods implementing this model that is capable of predicting perceived spatial quality on the basis of measured features of audio signals. Such a model needs to be based on a detailed analysis of human listeners' responses to spatially altered audio material, so that the results generated by the model match closely those that would be given by human listeners when listening to typical programme material. The model may optionally take into account the acoustical characteristics of the reproducing space and its effects on perceived spatial fidelity, either using acoustical measurements made in real spaces or using acoustical simulations.