A vector quantizer is a coding/decoding method and apparatus for representing data in a highly compact form for transmission or storage purposes. The vector quantizer can also reconstruct a close approximation of the original data when necessary. The process of coding data via a vector quantizer is referred to as vector quantization.
Conceptually, a vector quantizer accepts a large set of input vectors and, in lieu of transmitting or storing each vector, chooses the nearest approximation to each input vector from a set of predetermined output vectors and transmits or stores a representation of that output vector. To state it another way, vector quantization is a mapping of input vectors to a set of output vectors. Further, instead of transmitting or storing the entire output vector, it is generally the case that a code which represents the output vector is employed. For example, if the output set contains 128 vectors, then a binary code of as few as 7 bits suffices to identify a particular output vector. Similarly, if the output set contains 256 vectors, then a binary code of at least 8 bits is required.
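The bit counts in these examples follow from a simple relation, sketched below. The symbols K (the number of output vectors in the output set) and b (the length of a fixed-length binary code) are introduced here for illustration only and do not appear in the original text.

```latex
b = \lceil \log_2 K \rceil, \qquad K = 128 \Rightarrow b = 7, \qquad K = 256 \Rightarrow b = 8
```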
The relationships between the input vectors, the output vectors and the binary code comprise a codebook. Using this codebook, the original input signal is associated with an output vector, which can be coded into the binary code and transmitted or stored. Reconstruction of the original signal is then accomplished by receiving or retrieving the binary code and, by use of the codebook once again, mapping the binary code to the appropriate output vector. It will be appreciated that the receiver does not reconstruct the exact original input signal, but rather a very close approximation.
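The coding and reconstruction steps described above can be sketched in a few lines of Python. This is a minimal illustration only: the four-entry codebook, the squared Euclidean distance, and the function names `encode` and `decode` are all assumptions chosen for the example, not details taken from the text.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def encode(vector, codebook):
    """Return the index of the output vector nearest to `vector`."""
    return min(range(len(codebook)),
               key=lambda i: squared_distance(vector, codebook[i]))


def decode(index, codebook):
    """Map an index back to its output vector."""
    return codebook[index]


# Hypothetical codebook of four 2-dimensional output vectors (2-bit codes).
codebook = [(0.0, 0.0), (1.0, 1.0), (4.0, 4.0), (4.0, 0.0)]

x = (0.9, 1.2)                      # input vector
idx = encode(x, codebook)           # nearest output vector is index 1
code = format(idx, "02b")           # binary code that is transmitted or stored
reconstructed = decode(int(code, 2), codebook)

print(code, reconstructed)
```

Only the 2-bit code crosses the channel; the receiver recovers (1.0, 1.0), a close approximation of the input (0.9, 1.2) rather than the input itself.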
Recently, vector quantization has received considerable attention due to the dramatic bit rate reduction it can produce for complex signals while retaining quality and intelligibility close to those of the original. In particular, signals that have a high level of redundancy are especially well suited to vector quantization data compression methods. A set of signal input vectors is considered to have a high level of redundancy if the vectors tend to form "clusters" in an N-dimensional signal input vector space, where N is the number of scalar elements comprising an input vector. Clustering is defined as the grouping of vectors close to one another in the vector space. Therefore, in the N-dimensional vector space, certain regions contain large concentrations of signal input vectors. Clustering may be contrasted with a uniform distribution, where the signal input vectors are evenly distributed within the N-dimensional vector space.
Signal redundancy may be exploited to compress the input vectors and thus save appreciable bandwidth if transmission or storage of the input vectors is required. For example, if a set of signal vectors is highly redundant it is efficient to represent a "clustered" vector group by one representative output vector. Instead of transmitting or storing a large number of slightly different input vectors, one representative output vector close to all of the input vectors in a clustered group can be transmitted or stored.
Clearly, using a vector quantization method, a representation of a set of input vectors can be transmitted or stored much more efficiently than the original set. It will be appreciated that a crucial step in effective vector quantization is the design of the vector quantizer, i.e., determining the particular output vectors that will represent the input vectors. One goal of the design process is to produce a set of output vectors that closely represent the input vectors so that the decoded vector information exhibits a minimum loss of fidelity from the original information.
Generally, the design of a vector quantizer is accomplished by feeding a long training sequence of signal input vectors into a vector quantizer design system. From a training sequence, a vector quantizer codebook is constructed to code and decode future input signals. The training sequence is meant to statistically represent future sets of input vectors as accurately as possible. It will be appreciated then that each vector quantizer codebook is highly specialized for a particular type of signal set.
Typically, the design of a vector quantizer begins with the initialization of a set of output vectors that are based upon the training sequence of input vectors. A total distortion measure, generally a summation of the error or distance between each input vector in the training sequence and its representative output vector, is calculated. If the total distortion level is greater than a predetermined acceptable distortion threshold, the output vectors are readjusted in accordance with the training sequence of input vectors such that the total distortion is decreased. Alternatively, the number of output vectors may be increased and all output vectors readjusted to lessen the total distortion. The readjustment process continues until the total distortion is at the acceptable level.
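The design loop just described can be sketched as follows. This is a hedged illustration of a conventional, centroid-based readjustment and should not be read as the method of the present invention. The initialization from the first training vectors, the squared Euclidean distortion, the iteration cap, and the names `design` and `readjust` are all assumptions made for the example.

```python
def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest(vector, codebook):
    """Index of the output vector closest to `vector`."""
    return min(range(len(codebook)),
               key=lambda i: squared_distance(vector, codebook[i]))


def total_distortion(training, codebook):
    """Sum of distances between each training vector and its output vector."""
    return sum(squared_distance(v, codebook[nearest(v, codebook)])
               for v in training)


def readjust(training, codebook):
    """Move each output vector to the centroid of the vectors mapped to it."""
    groups = [[] for _ in codebook]
    for v in training:
        groups[nearest(v, codebook)].append(v)
    return [tuple(sum(c) / len(g) for c in zip(*g)) if g else old
            for old, g in zip(codebook, groups)]


def design(training, size, threshold, max_iterations=100):
    # Assumed initialization: the first `size` training vectors.
    codebook = [tuple(v) for v in training[:size]]
    for _ in range(max_iterations):
        if total_distortion(training, codebook) <= threshold:
            break
        updated = readjust(training, codebook)
        if updated == codebook:   # no further improvement at this size
            break
        codebook = updated
    return codebook, total_distortion(training, codebook)


# Hypothetical training sequence forming two clusters.
training = [(0.0, 0.0), (5.0, 5.0), (0.2, 0.0),
            (0.0, 0.2), (5.2, 5.0), (5.0, 5.2)]
codebook, distortion = design(training, size=2, threshold=0.12)
print(codebook, distortion)
```

If the loop stalls above the threshold, the alternative mentioned in the text applies: call `design` again with a larger `size`.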
As the available number of output vectors increases, the accuracy of the representation will increase. However, as the number of output vectors increases, the corresponding data compression rate decreases, because a longer code is needed to identify each output vector. In addition, the computational requirement during vector quantization is increased, due to the larger number of output vectors the vector quantizer must compare to each input vector. Thus, a constrained optimization problem is presented, with the constraints including required data compression, acceptable distortion level, and computational limitations.
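A short calculation makes the trade-off concrete. The sketch below assumes a fixed-length binary code, an exhaustive search over the codebook, and hypothetical input vectors of 8 elements at 8 bits each; none of these figures come from the text.

```python
def tradeoff(codebook_size, dimension, bits_per_element):
    """Compression ratio and search cost for a given codebook size."""
    if codebook_size < 2:
        raise ValueError("codebook_size must be at least 2")
    index_bits = (codebook_size - 1).bit_length()   # bits per code
    raw_bits = dimension * bits_per_element         # bits per input vector
    return {
        "index_bits": index_bits,
        "compression_ratio": raw_bits / index_bits,
        # Exhaustive search compares the input with every output vector.
        "comparisons_per_vector": codebook_size,
    }


for size in (128, 256, 1024):
    print(size, tradeoff(size, dimension=8, bits_per_element=8))
```

Under these assumptions, growing the codebook from 128 to 1024 output vectors lowers the compression ratio from about 9.1 to 6.4 while multiplying the comparisons per input vector by eight.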
Another measure of the performance of a vector quantizer is its ability to handle "outliers". Outliers are input vectors that lie far from any clustered group of input vectors. Conventional vector quantizer design systems generally rely on the use of a centroid measure of the training sequence to determine or readjust the output vectors. For example, given a group of input vectors, an output vector will be set equal to the centroid of the group. The concept is that a centroid measure minimizes distortion for a given group. However, for certain distributions, a centroid measure may not accurately represent the population. For example, in a heavy-tailed distribution, the centroid measure produces an output vector approximation that does not accurately characterize the center ("location") of the distribution. Therefore, to produce a robust vector quantizer, one which can adequately handle outliers, a method other than centroid representation for determining the output vectors must be used.
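The sensitivity of the centroid to outliers is easy to demonstrate. The sketch below compares the centroid with a coordinate-wise median, used here purely as one familiar robust location estimate. It is an assumption for illustration and is not the alternative the text goes on to propose. The sample group is also hypothetical.

```python
from statistics import median


def centroid(vectors):
    """Component-wise mean of a group of vectors."""
    return tuple(sum(c) / len(vectors) for c in zip(*vectors))


def coordinate_median(vectors):
    """Component-wise median of a group of vectors."""
    return tuple(median(c) for c in zip(*vectors))


# Four tightly clustered vectors plus one outlier.
group = [(1.0, 1.0), (1.1, 0.9), (0.9, 1.1), (1.0, 1.2), (50.0, 50.0)]

print("centroid:", centroid(group))
print("median:  ", coordinate_median(group))
```

The single outlier drags the centroid to roughly (10.8, 10.84), far from every clustered vector, while the coordinate-wise median stays at (1.0, 1.1) inside the cluster.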
The present invention employs a neural-network simulation method and apparatus for adjusting output vectors to produce a robust vector quantizer with high data compression and low distortion.