This invention relates generally to methods and apparatus for objective perceptual quality measurement of an audio signal, and more particularly to methods and apparatus for measuring distortions introduced in silent passages by processing of speech signals.
Some objective measures of speech signal quality are known. For example, International Telecommunication Union (ITU) standard P.861 for Perceptual Speech Quality Measurement (PSQM) defines a perceptual objective algorithm for measuring the quality of voice signals. This quality measurement is of interest, for example, when compressing and decompressing a voice signal through speech codecs.
Known perceptual speech quality measurement algorithms require both an original and a processed signal to be available. For example, PSQM computes a “perceptual difference” between an original and a processed signal to give an objective value that can be mapped to a Mean Opinion Score (MOS). PSQM and other known algorithms operate on active speech portions of the original signal. However, the assumption that only active speech portions contribute to an MOS value is correct only under special conditions. For example, when one attempts to characterize distortion introduced by a new speech compression algorithm, one simply processes an original speech signal through a codec and measures a difference between the original speech signal and the processed signal. There is very little distortion content during silent periods in such processing, resulting in no contribution by such periods to a MOS value.
However, when one is attempting to characterize an effect of other types of processors, for example, noise cancelers, distortions introduced during silent periods of speech signals are of considerable interest. It is of interest, for example, to determine whether a noise canceler blocks, removes, or reduces background noise in an original signal. More particularly, effects of noise cancellation are most noticeable during non-active, or silent, portions of a speech signal, as these are the portions in which annoying background noise is most readily perceived. Therefore, an unmodified PSQM algorithm does not provide a satisfactory indication of noise cancellation effectiveness in a MOS.
It would therefore be desirable to provide methods and apparatus that provide a satisfactory indication of noise cancellation effectiveness. It would further be desirable to provide methods and apparatus that provide a MOS indication of noise cancellation effectiveness. More generally, it would be desirable to provide methods and apparatus for evaluating a measure of MOS for silent periods of any processed speech signal to evaluate the effectiveness and/or usefulness of the processing applied to a speech signal.
The present invention is therefore, in one aspect, a method for evaluating perceptual quality of a processed signal obtained by processing an original signal having silent periods. The method includes steps of determining silent portions and speech portions of the original signal and corresponding silent portions and speech portions of the processed signal, and evaluating the silent portions of the processed signal as a function of amounts of energy contained in the silent portions of the processed signal, corresponding silent portions of the original signal, and an amount of energy in speech portions of the original signal. In one embodiment, the original signal and the processed signal are segmented into frames, frames of the original signal that represent speech and frames of the original signal that represent silence are identified, and the evaluation produces a mean opinion score (MOS). The present invention is, in another aspect, a corresponding device configured to perform steps of an embodiment of the method, and in another aspect, a machine-readable medium configured to instruct a processor to perform steps of an embodiment of the method.
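By way of illustration only, the framing and energy-gathering steps described above might be sketched as follows. The sketch assumes that the original and processed signals are time-aligned sequences of samples, and it classifies frames with a simple energy threshold relative to the peak frame energy; that rule, the function names, the frame length of 160 samples, and the threshold ratio are hypothetical choices made for the example and are not taken from this description. The sketch stops at the three energy quantities named above and does not attempt the mapping from those quantities to a MOS, which is not specified in this summary.

```python
def frame_energies(signal, frame_len):
    """Split samples into non-overlapping frames; return each frame's energy.

    Trailing samples that do not fill a whole frame are ignored.
    """
    n_frames = len(signal) // frame_len
    return [
        sum(x * x for x in signal[i * frame_len:(i + 1) * frame_len])
        for i in range(n_frames)
    ]


def silent_frame_flags(original_energies, threshold_ratio=0.01):
    """Flag frames of the ORIGINAL signal as silent (True) or speech (False).

    Assumed rule (hypothetical): a frame is silent when its energy is below
    threshold_ratio times the peak frame energy of the original signal.
    """
    peak = max(original_energies, default=0.0)
    return [e < threshold_ratio * peak for e in original_energies]


def silence_energy_summary(original, processed, frame_len=160,
                           threshold_ratio=0.01):
    """Gather the three energy quantities used to evaluate silent portions.

    Silence and speech are determined on the original signal; the same frame
    indices are then treated as the corresponding portions of the processed
    signal, which assumes the two signals are time-aligned.
    """
    orig_e = frame_energies(original, frame_len)
    proc_e = frame_energies(processed, frame_len)
    n = min(len(orig_e), len(proc_e))  # compare only frames present in both
    orig_e, proc_e = orig_e[:n], proc_e[:n]
    silent = silent_frame_flags(orig_e, threshold_ratio)
    return {
        "silent_frames": sum(silent),
        "speech_frames": n - sum(silent),
        "processed_silence_energy": sum(
            p for p, s in zip(proc_e, silent) if s),
        "original_silence_energy": sum(
            o for o, s in zip(orig_e, silent) if s),
        "original_speech_energy": sum(
            o for o, s in zip(orig_e, silent) if not s),
    }
```

For a noise canceler, one would expect the processed silence energy to fall below the original silence energy, while the original speech energy supplies a reference level against which the residual background can be judged.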
It will be recognized that the present invention, in each of its aspects and embodiments, can be employed to provide measures of noise cancellation effectiveness, and can be used to provide a MOS indication of noise cancellation effectiveness. More generally, the present invention provides evaluations, such as a MOS evaluation, for silent periods of any processed speech signal to evaluate the effectiveness and/or usefulness of the processing applied to a speech signal.