Prior art systems for measuring voice quality, as described by Douskalis (Hewlett Packard 2000), Royer (U.S. Pat. No. 5,710,791) and Di Pietro (U.S. Pat. No. 5,867,813), use centralized test equipment which samples the voice quality from various conversion points. A loopback condition is established at a conversion point; the test equipment then transmits a known signal and compares the received (looped back) signal with the original, thereby estimating delay, distortion and other impairments. This approach provides an accurate measure of voice distortion, but only for the sampled conversion point and only under the network conditions that existed at the time of the test. No provision is made in these systems for estimating the effects of the temporal location of impairments on quality.
Another approach currently used for estimating voice quality is to estimate the subjective performance of the voice connection from objectively measured parameters. Models such as the E-Model described by Johannesson (IEEE Communications Magazine 1997) produce R ratings which can be correlated to user perceived voice quality. This process is applied by a central management system which gathers statistics on noise and delay and then produces an estimate of voice quality. This method, as described by Johannesson, does not consider the effects of the temporal location of impairments.
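By way of illustration only, the relationship between an R rating and a perceived quality score can be sketched as follows. The sketch assumes the additive form of the E-Model and the R-to-MOS conversion published in ITU-T Recommendation G.107, neither of which is reproduced in the text above; the function and parameter names are hypothetical. Note that nothing in the calculation depends on when during the call an impairment occurred.

```python
def r_rating(base_snr, simultaneous_impairment, delay_impairment,
             equipment_impairment, advantage=0.0):
    """Additive E-Model form: R = Ro - Is - Id - Ie + A.

    All parameter names are illustrative; the inputs are assumed to have
    been derived already from measured noise, delay and codec statistics.
    """
    return (base_snr - simultaneous_impairment - delay_impairment
            - equipment_impairment + advantage)


def r_to_mos(r):
    """Map an R rating to an estimated MOS using the G.107 conversion."""
    if r <= 0:
        return 1.0
    if r >= 100:
        return 4.5
    return 1.0 + 0.035 * r + r * (r - 60.0) * (100.0 - r) * 7.0e-6


if __name__ == "__main__":
    # Hypothetical impairment values chosen only to exercise the functions.
    r = r_rating(base_snr=93.2, simultaneous_impairment=1.4,
                 delay_impairment=5.0, equipment_impairment=11.0)
    print(round(r, 1), round(r_to_mos(r), 2))
```

Because the inputs are aggregate statistics, two calls with the same totals receive the same estimate regardless of whether the impairments fell at the beginning or the end of the call.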
Rosenbluth (T1 committee contribution T1A1.7/98-031) described the result of a subjective voice quality test conducted by AT&T. This test employed a set of 60 second voice samples which were corrupted at the beginning, middle and end. Rosenbluth reported that the mean opinion score (MOS) typically ranged from 3.8 for a sample corrupted at the beginning to 3.2 for a sample corrupted at the end. Rosenbluth proposed a mathematical model for determining the estimated voice quality of the speech sample which comprised the steps of dividing the sample into segments, computing a MOS score for each segment, computing a weight for each segment that was a function of the proportional position of the segment within the overall sample, and then computing the sum of the segment MOS scores multiplied by their respective weights. This approach suffers from several major drawbacks:
(i) Rosenbluth determined the weight for each impairment based on the proportional location of the impairment within the speech sample, that is, determined a number between 0 and 1 that represented the position of the impairment between the start and the end of the sample. This is counterintuitive, as it would give the same weight to an impairment occurring 30 seconds before the end of a one minute call as to an impairment occurring 30 minutes before the end of a one hour call, whereas the process by which information is lost from human memory is time dependent.
(ii) The equations used by Rosenbluth were computationally complex, which would be acceptable in an off-line determination of voice quality based on recorded data but not for real-time computation.
(iii) Rosenbluth proposed a testing methodology in which a series of short test messages of known time duration could be sent and the results from each test message then combined. Rosenbluth did not consider a test methodology in which quality was continually estimated during the normal operation of the system.
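The segment-weighting steps attributed to Rosenbluth above, and drawback (i) in particular, can be illustrated with a minimal sketch. The actual weighting equations are not given in the text above, so the linear weight used here is purely an assumption chosen for simplicity, and all function and parameter names are hypothetical.

```python
def proportional_position(impairment_time_s, call_duration_s):
    """Position of an impairment as a fraction of the call (0 = start, 1 = end)."""
    return impairment_time_s / call_duration_s


def segment_weights(n_segments, recency_bias=1.0):
    """Assumed weights that grow linearly with proportional position.

    Each weight depends only on the segment's fractional position in the
    sample, not on elapsed time. The weights are normalized to sum to 1.
    """
    raw = [1.0 + recency_bias * (i + 0.5) / n_segments
           for i in range(n_segments)]
    total = sum(raw)
    return [w / total for w in raw]


def weighted_mos(segment_mos, recency_bias=1.0):
    """Sum of the per-segment MOS scores multiplied by their weights."""
    weights = segment_weights(len(segment_mos), recency_bias)
    return sum(w * m for w, m in zip(weights, segment_mos))


if __name__ == "__main__":
    # The same impairment placed early or late gives different estimates.
    print(weighted_mos([2.0, 4.0, 4.0]), weighted_mos([4.0, 4.0, 2.0]))
    # Drawback (i): 30 s before the end of a 1 minute call and 30 min
    # before the end of a 1 hour call map to the same position.
    print(proportional_position(30, 60), proportional_position(1800, 3600))
```

Under this assumed weighting a late impairment lowers the estimate more than an early one, which is in the direction Rosenbluth reported, but the weight is blind to how much absolute time separates the impairment from the end of the call.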
Accordingly, there is a need for a means of estimating subjective quality in a voice communications system that considers the temporal location of impairments and is computationally efficient.
Furthermore, there is a need for a means of estimating subjective quality in a voice communications system that considers the temporal location of impairments and is able to continuously monitor voice quality during the normal operation of the communications system.
Furthermore, there is a need for a means of estimating subjective quality in a video transmission system that considers the temporal location of impairments and is computationally efficient.
Furthermore, there is a need for a means of estimating subjective quality in an audio transmission system that considers the temporal location of impairments and is computationally efficient.
Furthermore, there is a need for a means of estimating subjective quality in a distributed applications software or client-server system that considers the temporal location of impairments and is computationally efficient.