It is desirable to use objective, repeatable performance metrics to assess the acceptability of performance at the design, commissioning and monitoring stages of communications services provision. However, a key aspect of system performance is the measurement of subjective quality, which is central in determining customer satisfaction with products and services. The complexity of modern communications and broadcast systems, and in particular the use of data reduction techniques, renders conventional engineering metrics inadequate for the reliable prediction of perceived performance. Subjective testing using human observers is expensive, time-consuming and often impractical, particularly for field use. Objective assessment of the perceived (subjective) performance of complex systems has been enabled by the development of a new generation of measurement techniques, which emulate the properties of the human senses. For example, a poor value of an objective measure such as signal-to-noise ratio may result from a distortion that is in fact inaudible. A model of the masking that occurs in hearing is capable of distinguishing between audible and inaudible distortions.
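The point about signal-to-noise ratio and masking can be illustrated with a deliberately simplified sketch. It is not the model used in the applications cited in this document: the per-band power values, the function names, the 10 dB masking offset and the quiet threshold are all assumptions chosen for illustration, and real masking also spreads between neighbouring bands, which this toy ignores.

```python
import math


def snr_db(signal_power, noise_power):
    """Conventional engineering metric: total signal power over total noise power."""
    return 10.0 * math.log10(sum(signal_power) / sum(noise_power))


def audible_bands(signal_power, noise_power,
                  masking_offset_db=10.0, quiet_threshold=1e-6):
    """Toy masking model (assumed, not from the source).

    Noise in a band is treated as audible only if it exceeds a masked
    threshold set a fixed number of dB below the signal in the same band,
    with a floor standing in for the threshold of hearing in quiet.
    """
    audible = []
    for band, (s, n) in enumerate(zip(signal_power, noise_power)):
        masked_threshold = max(s * 10 ** (-masking_offset_db / 10.0),
                               quiet_threshold)
        if n > masked_threshold:
            audible.append(band)
    return audible


# Hypothetical per-band powers (arbitrary linear units).
signal = [1.0, 0.5, 0.001, 0.0]
noise_under_signal = [0.05, 0.0, 0.0, 0.0]  # noise sits under the loudest band
noise_in_gap = [0.0, 0.0, 0.0, 0.05]        # same noise power in an empty band

for name, noise in [("under signal", noise_under_signal),
                    ("in gap", noise_in_gap)]:
    print(name, round(snr_db(signal, noise), 2), audible_bands(signal, noise))
```

Both cases give the same signal-to-noise ratio, yet the toy model flags only the second as audible, which is the distinction a conventional metric cannot make.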
The use of models of the human senses to provide improved understanding of subjective performance is known as perceptual modelling. The present applicants have a series of previous patent applications referring to perceptual models, and test signals suitable for non-linear speech systems, including WO 94/00922, WO 95/01011 and WO 95/15035.
To determine the subjective relevance of errors in audio systems, and particularly speech systems, assessment algorithms have been developed based on models of human hearing. The prediction of audible differences between a degraded signal and a reference signal can be thought of as the sensory layer of a perceptual analysis, while the subsequent categorisation of audible errors according to their subjective effect on overall signal quality can be thought of as the perceptual layer.
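The two-layer structure described above might be sketched as follows. This is a minimal illustration under stated assumptions, not the assessment algorithm itself: the per-band level representation, the names `AudibleError`, `sensory_layer` and `perceptual_layer`, the 1 dB detection threshold, the 6 dB category boundary and the category labels are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class AudibleError:
    band: int
    excess_db: float  # amount by which the error exceeds the detection threshold


def sensory_layer(reference_db, degraded_db, threshold_db=1.0):
    """Predict which differences between reference and degraded signal are audible.

    Assumption: a level difference in a band is audible when it exceeds a
    fixed threshold. A real sensory layer would model hearing in far more detail.
    """
    errors = []
    for band, (ref, deg) in enumerate(zip(reference_db, degraded_db)):
        diff = abs(deg - ref)
        if diff > threshold_db:
            errors.append(AudibleError(band, diff - threshold_db))
    return errors


def perceptual_layer(errors, annoying_db=6.0):
    """Categorise the audible errors by their assumed effect on overall quality."""
    if not errors:
        return "transparent"
    worst = max(e.excess_db for e in errors)
    return "annoying" if worst >= annoying_db else "perceptible"


# Hypothetical per-band levels in dB.
reference = [60.0, 55.0, 40.0]
print(perceptual_layer(sensory_layer(reference, [60.5, 55.0, 40.0])))
print(perceptual_layer(sensory_layer(reference, [63.0, 55.0, 40.0])))
print(perceptual_layer(sensory_layer(reference, [60.0, 55.0, 50.0])))
```

Keeping the two stages as separate functions mirrors the division in the text: the first decides only what can be heard, and the second decides how much that matters.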
An approach similar to this auditory perceptual model has also been adopted for a visual perceptual model. In this case the sensory layer reproduces the gross psychophysics of the sensory mechanisms, in particular spatio-temporal sensitivity (known as the human visual filter), and masking due to spatial frequency, orientation and temporal frequency.
A number of visual perceptual models are under development and several have been proposed in the literature.
The subjective performance of multi-modal systems depends not only on the quality of the individual audio and video components, but also on interactions between them. Such effects include “quality mis-match”, in which the quality presented in one modality influences perception in another modality. This effect increases with the degree of mis-match.
The information content of the signal is also important. This is related to the task undertaken but can vary during the task. For present purposes, “content” refers to the nature of the audio-visual material during any given part of the task.
The type of task or activity undertaken also has a substantial effect on perceived performance. As a simple example, if the video component dominates for a given task then errors in the video part will be of greatest significance. At the same time, audio errors that have high attentional salience (that is, are “attention grabbing”) will also become important. The nature of the task undertaken influences the split of attention between the modalities, although this split may also vary more randomly if the task is undemanding.
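One way to picture the task-dependent split of attention is as a weighting applied to the error in each modality. The sketch below is purely illustrative and is not a method stated in this document: the function `combined_error`, its parameters, the linear weighting and the way salience raises the audio weight are all assumptions.

```python
def combined_error(video_error, audio_error, video_weight, audio_salience=0.0):
    """Toy task-weighted combination of per-modality errors (assumed form).

    video_weight   -- share of attention the task gives to video, in [0, 1]
    audio_salience -- how attention grabbing the audio error is, in [0, 1]
    """
    if not 0.0 <= video_weight <= 1.0:
        raise ValueError("video_weight must lie in [0, 1]")
    if not 0.0 <= audio_salience <= 1.0:
        raise ValueError("audio_salience must lie in [0, 1]")
    audio_weight = 1.0 - video_weight
    # A salient audio error pulls attention towards the audio channel,
    # so its weight rises towards 1 as salience rises.
    audio_weight += audio_salience * (1.0 - audio_weight)
    return video_weight * video_error + audio_weight * audio_error


# Hypothetical video-dominated task: the same audio error counts for little
# when it is unobtrusive and for much more when it grabs attention.
unobtrusive = combined_error(0.2, 0.5, video_weight=0.9, audio_salience=0.0)
salient = combined_error(0.2, 0.5, video_weight=0.9, audio_salience=1.0)
print(unobtrusive < salient)
```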
However, important though these factors are, they are in general difficult to define and difficult to use as a basis for objective measurements. Nevertheless, the inventor has identified some cross-modal effects which can be derived from objective measurements.