This application claims the benefit of United Kingdom Patent Application No. 0326043.7, filed Nov. 7, 2003, the entirety of which is incorporated herein by reference.
This invention relates to a new parameter suitable for use in a non-intrusive speech quality assessment system.
Signals carried over telecommunications links can undergo considerable transformations, such as digitisation, encryption and modulation. They can also be distorted due to the effects of lossy compression and transmission errors.
Objective processes for the purpose of measuring the quality of a signal are currently under development and are of application in equipment development, equipment testing, and evaluation of system performance.
Some automated systems require a known (reference) signal to be played through a distorting system (the communications network or other system under test) to derive a degraded signal, which is compared with an undistorted version of the reference signal. Such systems are known as “intrusive” quality assessment systems, because whilst the test is carried out the channel under test cannot, in general, carry live traffic.
Conversely, non-intrusive quality assessment systems are systems which can be used whilst live traffic is carried by the channel, without the need for test calls.
Non-intrusive testing is required because for some testing it is not possible to make test calls. This could be because the call termination points are geographically diverse or unknown. It could also be that the cost of capacity is particularly high on the route under test. A non-intrusive monitoring application, by contrast, can run all the time on the live calls to give a meaningful measurement of performance.
A known non-intrusive quality assessment system uses a database of distorted samples, each of which has been assessed by panels of human listeners to provide a Mean Opinion Score (MOS).
MOSs are generated by subjective tests which aim to find the average user's perception of a system's speech quality by asking a panel of listeners a directed question and providing a limited response choice. For example, to determine listening quality, users are asked to rate “the quality of the speech” on a five-point scale from Bad to Excellent. The MOS is calculated for a particular condition by averaging the ratings of all listeners.
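By way of illustration only, the averaging described above may be sketched as follows. The intermediate labels (Poor, Fair, Good), the numeric values 1 to 5, the function name and the panel ratings are assumptions made for the example and are not taken from this description.

```python
from statistics import mean

# Five-point listening-quality scale. Only the end points (Bad, Excellent)
# are named in the text; the intermediate labels and the 1..5 numbering
# are assumptions made for this illustration.
SCALE = {"Bad": 1, "Poor": 2, "Fair": 3, "Good": 4, "Excellent": 5}


def mean_opinion_score(ratings):
    """Average the listeners' category ratings for one test condition."""
    if not ratings:
        raise ValueError("at least one rating is required")
    return mean(SCALE[r] for r in ratings)


# Made-up ratings from a hypothetical panel of five listeners.
panel = ["Good", "Fair", "Good", "Excellent", "Poor"]
mos = mean_opinion_score(panel)
```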
In order to train the quality assessment system, each sample is parameterised and a combination of the parameters is determined which provides the best prediction of the MOSs indicated by the human listeners. International Patent Application number WO 01/35393 describes one method for parameterising speech samples for use in a non-intrusive quality assessment system.
This invention relates to improved parameters for a speech quality assessment system.
According to the invention there is provided a method of generating a parameter from a signal comprising a sequence of values measured from voiced portions of said signal at a sampling frequency, said parameter suitable for use in a quality assessment tool, said method comprising the steps of:
a) selecting a section of said signal;
b) performing a frequency transform on said section to provide a sequence of frequency values;
c) generating a pitch frequency estimate;
d) selecting a plurality of portions of said sequence of frequency values in dependence upon said pitch frequency estimate, said portions having a frequency range and a central frequency;
e) generating an average value for each of said plurality of portions;
f) generating a section parameter in dependence upon the difference between the average value for one portion of said sequence of frequency values and the average value for a subsequent portion of said sequence of frequency values;
g) repeating steps a)-f) to provide a plurality of said section parameters and generating said parameter by generating an average in dependence upon said plurality of said section parameters.
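Purely as a non-limiting sketch, steps a) to g) might be arranged as shown below. Several choices are assumptions made for the example rather than requirements of the method: a slow discrete Fourier transform stands in for the frequency transform, each section is centred on a pitch mark, the portions alternate between harmonics and the midpoints between harmonics, each section parameter is the mean difference between a harmonic portion and the midpoint portion that follows it, and the test signal is synthetic. All function and variable names are hypothetical.

```python
import cmath
import math


def dft_magnitudes(section):
    """Magnitudes of DFT bins 0..N/2 (a slow stand-in for an FFT)."""
    n = len(section)
    return [
        abs(sum(x * cmath.exp(-2j * math.pi * k * i / n)
                for i, x in enumerate(section)))
        for k in range(n // 2 + 1)
    ]


def section_parameter(signal, marks, idx, fs, half_len):
    """Steps a) to f) for the section centred on pitch mark marks[idx]."""
    mark = marks[idx]
    section = signal[mark - half_len:mark + half_len]            # step a)
    spectrum = dft_magnitudes(section)                           # step b)
    period = min(mark - marks[idx - 1], marks[idx + 1] - mark)
    pitch_hz = fs / period                                       # step c)
    bin_hz = fs / len(section)
    averages = []
    n = 2  # centres at n * pitch_hz / 2: harmonic, midpoint, harmonic, ...
    while (n + 0.5) * pitch_hz / 2 <= fs / 2:                    # step d)
        lo, hi = (n - 0.5) * pitch_hz / 2, (n + 0.5) * pitch_hz / 2
        band = [m for k, m in enumerate(spectrum) if lo <= k * bin_hz < hi]
        if not band:
            raise ValueError("section too short to resolve the portions")
        averages.append(sum(band) / len(band))                   # step e)
        n += 1
    # Step f): harmonic portion minus the midpoint portion that follows it.
    diffs = [h - m for h, m in zip(averages[0::2], averages[1::2])]
    return sum(diffs) / len(diffs)


def harmonic_parameter(signal, marks, fs, half_len):
    """Step g): average the section parameters over all usable pitch marks."""
    values = [
        section_parameter(signal, marks, i, fs, half_len)
        for i in range(1, len(marks) - 1)
        if marks[i] - half_len >= 0 and marks[i] + half_len <= len(signal)
    ]
    return sum(values) / len(values)


# Synthetic "voiced" signal: three harmonics of 100 Hz sampled at 8 kHz,
# with a pitch mark every 80 samples (one pitch period).
fs = 8000
signal = [sum(math.cos(2 * math.pi * h * 100 * t / fs) for h in (1, 2, 3))
          for t in range(800)]
marks = list(range(80, 800, 80))
param = harmonic_parameter(signal, marks, fs, half_len=160)
```

For a strongly harmonic signal such as this one, the energy sits in the harmonic portions and the sketch yields a positive value; a signal whose harmonic structure has been degraded would be expected to yield a smaller one.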
Said section of said sequence of values may be selected such that a pitch mark is associated with a value central to said section.
The frequency transform may comprise a Fast Fourier Transform.
The step of generating a pitch frequency estimate may comprise the steps of using pitch marks associated with said sequence of values; comparing the number of values between a value associated with a pitch mark and a value associated with an immediately preceding pitch mark with the number of values between the value associated with the pitch mark and a value associated with an immediately following pitch mark; and generating said pitch frequency estimate in dependence upon the minimum number of said values, and the sampling frequency.
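One possible reading of this step, offered as an assumption rather than as the only form the dependence may take, is that the pitch period is taken as the smaller of the two spacings and the estimate is the sampling frequency divided by that period. The function name and the sample positions below are hypothetical.

```python
def pitch_frequency_estimate(prev_mark, mark, next_mark, sampling_frequency):
    """Estimate the pitch in Hz from the sample indices of three pitch marks."""
    before = mark - prev_mark   # values back to the preceding pitch mark
    after = next_mark - mark    # values forward to the following pitch mark
    period = min(before, after)
    if period <= 0:
        raise ValueError("pitch marks must be strictly increasing")
    # Assumed relationship: frequency = sampling frequency / period in samples.
    return sampling_frequency / period


# Hypothetical pitch marks at samples 100, 180 and 264 of an 8 kHz signal.
estimate = pitch_frequency_estimate(100, 180, 264, 8000)
```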
The portions of said sequence of frequency values may be selected by generating multiples of said pitch frequency estimate, said multiples representing harmonics of said pitch frequency estimate; and selecting portions in which the frequency range of the portion is substantially equal to half said pitch frequency estimate, and in which the central frequency of each portion is either a frequency substantially equal to one of said multiples, or a frequency substantially half way between two of said multiples.
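The selection of portions described above may be illustrated, by way of example only, as follows. The function name, the tuple layout and the upper frequency limit are assumptions made for the illustration.

```python
def select_portions(pitch_hz, max_hz):
    """Return (low, centre, high, on_harmonic) tuples, in Hz, up to max_hz."""
    if pitch_hz <= 0:
        raise ValueError("pitch_hz must be positive")
    width = pitch_hz / 2  # frequency range of each portion
    portions = []
    n = 2  # centres at n * pitch_hz / 2: harmonic, midpoint, harmonic, ...
    while n * pitch_hz / 2 + width / 2 <= max_hz:
        centre = n * pitch_hz / 2
        on_harmonic = (n % 2 == 0)  # even n lands on a multiple of the pitch
        portions.append((centre - width / 2, centre,
                         centre + width / 2, on_harmonic))
        n += 1
    return portions


# Portions for a 100 Hz pitch estimate, limited here to 400 Hz for brevity.
portions = select_portions(100.0, 400.0)
```

With this layout the portions are contiguous, alternating between one centred on a harmonic and one centred half way between two harmonics.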
The invention also provides a method of training a quality assessment tool comprising the step of training a mapping for use in a method of assessing speech quality in a telecommunications network, such that a fit between a quality measure generated from a plurality of parameters for a signal and the mean opinion score associated with said signal is optimised by said mapping, wherein said plurality of parameters includes a parameter generated according to any one of the preceding claims.
The invention also provides a method of assessing speech quality in a telecommunications network comprising the steps of generating a parameter according to any one of the preceding claims; and generating a quality measure in dependence upon said parameter.
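As a minimal sketch of the training and assessment steps, a linear mapping fitted by least squares is assumed below; the form of the mapping is not specified in this description, and other mappings may equally be used. The function names, parameter values and MOS values are made up for the example.

```python
def fit_linear_mapping(parameter_rows, mos_values):
    """Least-squares weights (bias first) mapping parameters to MOS."""
    rows = [[1.0] + list(r) for r in parameter_rows]
    n = len(rows[0])
    # Normal equations (X^T X) w = X^T y, held as an augmented matrix.
    a = [[sum(r[i] * r[j] for r in rows) for j in range(n)]
         + [sum(r[i] * y for r, y in zip(rows, mos_values))]
         for i in range(n)]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        if abs(a[pivot][col]) < 1e-12:
            raise ValueError("parameters are linearly dependent")
        a[col], a[pivot] = a[pivot], a[col]
        for r in range(n):
            if r != col:
                factor = a[r][col] / a[col][col]
                a[r] = [x - factor * p for x, p in zip(a[r], a[col])]
    return [a[i][n] / a[i][i] for i in range(n)]


def quality_measure(weights, parameters):
    """Apply the trained mapping to the parameters of one signal."""
    return weights[0] + sum(w * p for w, p in zip(weights[1:], parameters))


# Made-up training data: two parameters per sample and the associated MOS.
training_parameters = [(2.0, 1.0), (4.0, 0.0), (6.0, 5.0), (1.0, 3.0)]
training_mos = [2.5, 3.2, 3.3, 2.0]
weights = fit_linear_mapping(training_parameters, training_mos)
predicted = quality_measure(weights, (3.0, 2.0))
```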
Embodiments of the invention will now be described, by way of example only, with reference to the accompanying drawings, in which: