In the testing of digital speech-transmission equipment, objective and automatic measurement techniques are increasingly preferred to subjective techniques requiring human listeners, chiefly because of their greater speed and lower cost. The disadvantage of electronic measurement techniques is that human listeners are generally more accurate in judging the quality of the systems tested.
In automatic techniques, a suitable test signal is fed to the speech-transmission system and the resulting output signal is compared instant by instant with the input signal. The crucial problems are the choice of the input signal and of the quantities or characteristics to be analyzed. Sinusoidal input signals have yielded satisfactory results in the testing of pulse-code-modulation (PCM) systems; however, in systems having a lower bit rate, such as those of the differential type, it is preferable to use a signal resembling speech as closely as possible, e.g. one having spectral characteristics similar to those of the average voice.
Signal-to-noise ratios, e.g. the ratio of the power of the input signal to the power of an error signal such as the difference between the output and input signals, have proven to be a good indication of system quality for PCM devices. In the case of differential systems two other quantities have provided better results: (1) the segmental signal-to-noise ratio, i.e. an average of the signal-to-noise ratios for a predetermined number of intervals into which the test period is divided, and (2) a frequency-weighted segmental signal-to-noise ratio in which the signal spectrum is divided into a multiplicity of frequency bands and signal-to-noise ratios are computed for each band, the segmental ratio being a weighted mean of the calculated values. The latter measure is particularly useful in testing low-bit-rate coders which exhibit a spectral shaping of quantization noise.
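By way of illustration only, the conventional and segmental signal-to-noise ratios described above may be sketched as follows. The function names, the use of non-overlapping segments of equal length, and the expression of each ratio in decibels before averaging are assumptions made for this sketch. The splitting of the spectrum into frequency bands is not shown; the helper `weighted_mean_db` merely assumes that per-band ratios have already been computed by some other means.

```python
import math


def snr_db(reference, degraded):
    """Conventional SNR in dB: input-signal power over error-signal power."""
    signal_power = sum(x * x for x in reference)
    error_power = sum((y - x) ** 2 for x, y in zip(reference, degraded))
    if error_power == 0.0:
        return math.inf   # identical signals: no measurable error
    if signal_power == 0.0:
        return -math.inf  # silent input with a nonzero error
    return 10.0 * math.log10(signal_power / error_power)


def segmental_snr_db(reference, degraded, segment_len):
    """Average of the SNRs of consecutive, non-overlapping segments."""
    starts = range(0, len(reference) - segment_len + 1, segment_len)
    values = [
        snr_db(reference[s:s + segment_len], degraded[s:s + segment_len])
        for s in starts
    ]
    if not values:
        raise ValueError("signal is shorter than one segment")
    return sum(values) / len(values)


def weighted_mean_db(band_snrs_db, weights):
    """Weighted mean of per-band SNRs (band splitting assumed done elsewhere)."""
    return sum(w * s for w, s in zip(weights, band_snrs_db)) / sum(weights)


# Hypothetical test data: a quiet interval followed by a loud interval,
# both corrupted by the same small constant error.
reference = [1.0] * 4 + [10.0] * 4
degraded = [x + 0.1 for x in reference]

overall = snr_db(reference, degraded)
segmental = segmental_snr_db(reference, degraded, segment_len=4)
```

With these hypothetical data the conventional ratio is about 37 dB, being dominated by the loud interval, whereas the segmental ratio is about 30 dB, since the quiet interval (about 20 dB) counts equally with the loud one (about 40 dB).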
It has been found that the various quality-measuring procedures require particular input signals to achieve optimal results. Thus, for example, frequency weighting is incompatible with input signals having stationary or uniform characteristics corresponding to those of average-speech signals. Optimal results in the use of frequency weighting are obtained by utilizing actual human-speech segments or preshaped noise segments; however, a plurality of voiced-speech signals must be used and the results averaged to obtain reliable outcomes.
Speech-transmission equipment introduces attenuations (or gains), linear filtering, delays, etc. which do not affect the signal quality but do affect the objective measurement of that quality. This is one reason for not using the input test signal in the computation of transmission characteristics.
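The effect of a simple attenuation on the measurement may be illustrated by the following sketch, in which the output signal is an exact copy of the input signal at half amplitude. The test signal, the attenuation factor and the function name are assumptions chosen for the example.

```python
import math


def snr_db(reference, degraded):
    """Conventional SNR in dB: input-signal power over error-signal power."""
    signal_power = sum(x * x for x in reference)
    error_power = sum((y - x) ** 2 for x, y in zip(reference, degraded))
    if error_power == 0.0:
        return math.inf
    return 10.0 * math.log10(signal_power / error_power)


# Hypothetical test signal: four periods of a sinusoid, 16 samples per period.
reference = [math.sin(2.0 * math.pi * k / 16.0) for k in range(64)]

# Output of an idealized system that only halves the amplitude (no added noise).
attenuated = [0.5 * x for x in reference]

snr_attenuated = snr_db(reference, attenuated)
```

Although nothing has been added to the signal, the error signal here has one quarter of the input power, so the measured ratio is only about 6 dB.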
Another factor frequently introducing error into objective quality measurements is the necessity of maintaining perfect synchronization between the input and output signals of the speech-transmission equipment.
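The sensitivity to synchronization may be illustrated by the following sketch, in which the output signal is an exact copy of the input signal delayed by three samples. The search for the lag that maximizes the cross-correlation is a common alignment technique adopted here as an assumption; the function names and test data are likewise hypothetical.

```python
import math


def error_power(reference, degraded):
    """Power of the sample-by-sample difference between the two signals."""
    return sum((y - x) ** 2 for x, y in zip(reference, degraded))


def estimate_delay(reference, degraded, max_lag):
    """Return the lag in 0..max_lag that maximizes the cross-correlation."""
    best_lag, best_score = 0, -math.inf
    for lag in range(max_lag + 1):
        # zip stops at the shorter sequence, so the shifted tail is ignored.
        score = sum(x * y for x, y in zip(reference, degraded[lag:]))
        if score > best_score:
            best_lag, best_score = lag, score
    return best_lag


# Hypothetical test signal: a short decaying pulse followed by silence.
reference = [1.0, -0.5, 0.25, -0.125] + [0.0] * 12

# Output of an idealized system that only delays the signal by three samples.
delay = 3
degraded = [0.0] * delay + reference[:-delay]

unaligned_error = error_power(reference, degraded)

lag = estimate_delay(reference, degraded, max_lag=8)
aligned_error = error_power(reference, degraded[lag:])
```

Without alignment the delayed copy yields a substantial error signal even though the waveform is unaltered; once the estimated lag is removed, the error vanishes for these data.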
At the present state of the art, components of speech-transmission systems are generally studied through computer-implemented simulation. Errors arise because a mathematical model of the system is being tested rather than the component itself. The characteristics of the transmission equipment under test are rarely known to the extent that they can be accurately quantified in a model.