The invention relates to speech coding, such as for computerized speech recognition systems.
In computerized speech recognition systems, an acoustic processor measures the value of at least one feature of an utterance during each of a series of successive time intervals to produce a series of feature vector signals representing the feature values. For example, each feature may be the amplitude of the utterance in each of twenty different frequency bands during each of a series of 10-millisecond time intervals. A twenty-dimensional acoustic feature vector represents the feature values of the utterance for each time interval.
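The following is a minimal sketch of how such a series of twenty-dimensional feature vectors might be produced. The text does not say how band amplitudes are measured, so the naive discrete Fourier transform, the equal-width bands, and the names `band_amplitudes` and `feature_vectors` are assumptions made only for illustration.

```python
import cmath
import math


def band_amplitudes(frame, num_bands=20):
    """Average DFT magnitude in each of `num_bands` equal-width bands.

    Assumption: equal-width bands over the non-negative frequency bins.
    A real acoustic processor may use a different filter bank.
    """
    n = len(frame)
    half = n // 2
    bins_per_band = half // num_bands
    if bins_per_band == 0:
        raise ValueError("frame is too short for the requested number of bands")
    # Naive DFT magnitudes for bins 0 .. half-1 (O(n^2), fine for a sketch).
    mags = []
    for k in range(half):
        acc = sum(x * cmath.exp(-2j * math.pi * k * t / n)
                  for t, x in enumerate(frame))
        mags.append(abs(acc))
    # Bins left over when `half` is not a multiple of `num_bands` are ignored.
    return [
        sum(mags[b * bins_per_band:(b + 1) * bins_per_band]) / bins_per_band
        for b in range(num_bands)
    ]


def feature_vectors(samples, sample_rate, frame_ms=10, num_bands=20):
    """Split `samples` into successive frames and return one vector per frame."""
    frame_len = sample_rate * frame_ms // 1000
    return [
        band_amplitudes(samples[start:start + frame_len], num_bands)
        for start in range(0, len(samples) - frame_len + 1, frame_len)
    ]
```

Under these assumptions, 20 milliseconds of audio sampled at 8000 Hz yields two feature vectors of twenty values each, one per 10-millisecond interval.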
In discrete parameter speech recognition systems, a vector quantizer replaces each continuous parameter feature vector with a discrete label from a finite set of labels. Each label identifies one or more prototype vectors having one or more parameter values. The vector quantizer compares the feature values of each feature vector to the parameter values of each prototype vector to determine the best-matched prototype vector for each feature vector. The feature vector is then replaced with the label identifying the best-matched prototype vector.
For example, for prototype vectors representing points in an acoustic space, each feature vector may be labeled with the identity of the prototype vector having the smallest Euclidean distance to the feature vector. For prototype vectors representing Gaussian distributions in an acoustic space, each feature vector may be labeled with the identity of the prototype vector having the highest likelihood of yielding the feature vector.
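The two labeling rules just described can be sketched as follows. This is an illustrative sketch, not a definitive implementation: the function names are hypothetical, each label is assumed to identify exactly one prototype, and the Gaussian prototypes are assumed to have diagonal covariance (one variance per dimension) with no prior weighting.

```python
import math


def euclidean_label(feature, prototypes):
    """Return the label of the prototype point nearest to `feature`.

    `prototypes` maps a label to a point (a sequence of floats).
    """
    best_label, best_dist = None, math.inf
    for label, point in prototypes.items():
        # Squared distance gives the same ordering as Euclidean distance.
        dist = sum((f - p) ** 2 for f, p in zip(feature, point))
        if dist < best_dist:
            best_label, best_dist = label, dist
    return best_label


def gaussian_log_likelihood(feature, means, variances):
    """Log density of a diagonal-covariance Gaussian evaluated at `feature`."""
    return sum(
        -0.5 * (math.log(2.0 * math.pi * v) + (f - m) ** 2 / v)
        for f, m, v in zip(feature, means, variances)
    )


def gaussian_label(feature, prototypes):
    """Return the label of the Gaussian prototype most likely to yield `feature`.

    `prototypes` maps a label to a `(means, variances)` pair.
    """
    return max(
        prototypes,
        key=lambda label: gaussian_log_likelihood(feature, *prototypes[label]),
    )
```

Both functions visit every prototype for every feature vector. The two rules can also disagree: a feature vector equidistant from two means may be assigned to the prototype with the larger variance under the likelihood rule.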
For large numbers of prototype vectors (for example, a few thousand), comparing each feature vector to each prototype vector consumes significant processing resources by requiring many time-consuming computations.
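A rough count illustrates the scale of this cost. The sketch below assumes an exhaustive search, the 10-millisecond intervals and twenty dimensions of the earlier example, and one multiply-add per dimension per comparison. The figure of 2000 prototypes is a hypothetical stand-in for "a few thousand".

```python
def comparisons_per_second(num_prototypes, frame_ms=10):
    """Prototype comparisons needed per second of speech (exhaustive search)."""
    frames_per_second = 1000 // frame_ms
    return frames_per_second * num_prototypes


def multiply_adds_per_second(num_prototypes, dims=20, frame_ms=10):
    """Assumes one multiply-add per dimension for each comparison."""
    return comparisons_per_second(num_prototypes, frame_ms) * dims


# Hypothetical example: 2000 prototypes, 20 dimensions, 10 ms intervals.
cost = multiply_adds_per_second(2000)  # 100 * 2000 * 20 = 4,000,000
```

Under these assumptions, each second of speech requires 200,000 prototype comparisons, or about four million multiply-add operations, before any recognition step begins.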