Presently there is an increasing trend toward the development of data transmission networks that can carry digital voice, data, and image information on the same network. These are generally called Integrated Services Digital Networks (ISDN). Since these networks have a limited bandwidth for transmission of information, the available bandwidth must be used as efficiently as possible. For many years, the analysis, modeling, and coding of information to reduce the overall bit rate have been actively studied. In order to achieve the best results, different techniques are applied to different types of information. "Vector quantization" is one specific technique that has applications in both voice compression and image compression. Image data is particularly challenging to compress because of the large amount of digital information required to accommodate the human eye. A variation of vector quantization, often referred to as color/spatial quantization, has been developed to efficiently encode such images as color maps. Vector quantization is not limited to any specific type of data but has applications wherever there is redundant information that can be removed with a lossy compressor.
The primary goal of network services is to allow distributed processing and exchange of information in an environment in which central locations are responsible for maintaining databases. Some networks, such as the telephone system, have been developed for voice transmission. Other types of computer networks operate with digital data and with file transfers. The need for special purpose networks and transmission links will therefore continue to be a rapidly developing subject. The growing trend is to provide and operate integrated networks which carry digital voice data, information data, and image data. These integrated digital networks will provide the basis for efficiently exchanging information and maintaining databases. Regardless of the size of the network or the type of information being processed, there will always be a need for efficient storage and transmission. Thus the compression of voice data, information data, and image data will be a key technology for ISDN.
Many techniques have been developed for the compression of digital voice, data, and images. Each method takes advantage of specific characteristics of the data. Consideration must also be given to the purpose or final use of the data. For example, voice information and image information do not require perfect replication so long as the introduced distortion is not misleading or disturbing to the listener or observer. On the other hand, a computer file that contains even a single error may no longer be usable. When a compression algorithm is able to restore the encoded data to its original form, with no degradation, the algorithm is referred to as a "lossless" data compression technique. Other algorithms that introduce an acceptable amount of distortion are referred to as "lossy" data compression techniques. Thus the requirements of the user will dictate which approach is best suited for compression of the data.
When a compression algorithm is chosen, the advantages of reduced storage and transmission charges must be compared to the cost and complexity of the implementation. Today's hardware operates at higher speeds, allows greater complexity, and is considerably less expensive than ever before. These hardware advances allow complex algorithms to process data before and after storage and transmission. Special purpose hardware also allows algorithms to be directly implemented at reduced costs. Regardless of the approach, compression techniques are continually being reevaluated, and many algorithms that were not feasible in the past are now realizable.
LOSSY COMPRESSION: Much of the data that is transmitted on an information channel is for use by the human sensory system. Minor alterations or infrequent errors in this data are undetectable or tolerable to human senses. Many compression techniques capitalize on this phenomenon. When a technique is able to reduce the data rate and bandwidth required to send the information by controlling the distortion without intolerable changes to the data, it is referred to as lossy compression. Vector quantization is a lossy technique for reducing the amount of information to be transmitted or stored. This is accomplished by removing information that is perceived as useless in the particular application being considered. Presently a considerable amount of anticipation exists because of the gains that are being realized in the area of image compression by using vector quantization. A description of vector quantization and examples as applied to image compression follow.
VECTOR QUANTIZATION: Vector quantization is a technique for mapping vectors from a given vector space into a reduced set of vectors within the original vector space or some other representative vector space. The reduced set of vectors, along with the associated mapping, is chosen to minimize error according to some distortion measure. This representative set of vectors is referred to as a codebook and is stored in a memory table. Efficient transmission of vector quantized data occurs by sending a codebook index location from the memory table, rather than sending the vector itself. The computation required to evaluate the distortions, and thereby find the codebook entry of minimum distortion, has limited the availability of the technique. Advances in hardware allowing cost-efficient implementations of vector quantization have generated renewed interest in its implementation during the last few years.
An optimal vector quantizer is designed around a probability distribution, placing the codevectors in the space according to vector probabilities. Vector probability distributions vary with different data. The LBG algorithm (discussed hereinbelow) either uses a known probability distribution or trains the codevectors on a select set of training vectors. If the probability distribution is known, the codevectors are placed in the N dimensional vector space according to the probability distribution. Areas of high probability contain a larger population of codevectors; low probability areas contain a sparse population. If the probability distribution is not known, the codevectors are distributed according to a select set of training vectors. Iteratively selecting codevectors to minimize distortion results in a locally optimal set of codevectors. The algorithm guarantees convergence to a local minimum of the distortion, but not convergence to an absolute minimum for all vectors.
The vector quantization encoding process searches the representative codevectors and replaces the input vector from the data source with an index. The index represents the codebook vector of minimum distance from the incoming vector. Distance between vector and codevector is proportional to the amount of degradation that will occur from vector quantization. Distance is most often measured by using a squared error criterion but many others are discussed in the literature.
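The encoding and decoding steps described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation of any particular system: the function names, the four-entry codebook, and the sample vectors are all hypothetical, and the squared error criterion is used as the distance measure.

```python
from typing import List, Sequence

Vector = Sequence[float]


def squared_error(x: Vector, y: Vector) -> float:
    """Squared error distortion between an input vector and a codevector."""
    return sum((xi - yi) ** 2 for xi, yi in zip(x, y))


def encode(vectors: List[Vector], codebook: List[Vector]) -> List[int]:
    """Replace each input vector with the index of its minimum-distance codevector."""
    return [
        min(range(len(codebook)), key=lambda j: squared_error(x, codebook[j]))
        for x in vectors
    ]


def decode(indices: List[int], codebook: List[Vector]) -> List[Vector]:
    """Reconstruct by table lookup; the result only approximates the source (lossy)."""
    return [codebook[j] for j in indices]


# Hypothetical 2-dimensional codebook with four entries (assumed for illustration).
codebook = [(0.0, 0.0), (0.0, 10.0), (10.0, 0.0), (10.0, 10.0)]
source = [(1.0, 2.0), (9.0, 8.5), (0.5, 9.0)]

indices = encode(source, codebook)          # only these indices are transmitted
reconstructed = decode(indices, codebook)   # receiver looks the vectors up again
```

Only the index list crosses the channel; the receiver is assumed to hold an identical copy of the codebook.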
Recent developments in vector quantization have shown the technique to be useful for voice and for image compression. Because advances in hardware allow cost-efficient implementations, the development of vector quantization methods has expanded.
A fundamental result of rate distortion theory is that better overall compression performance can be achieved when encoding a vector (a group of scalars) than when encoding the scalars individually. This development has been presented in an article by R. Gray entitled "Vector Quantization" in IEEE ASSP Magazine, April 1984. Vector quantization takes advantage of this theory by compressing groups of scalars and treating each scalar as a vector coefficient. As an image compression scheme, vector quantization has both theoretically and experimentally outperformed methods of image compression using scalar quantization.
Methods of compression attempt to remove redundancies while causing minimal distortion. Vector quantization uses four properties of vector parameters for redundancy removal, namely: correlation; nonlinear dependency; probability density function shape; and vector dimension. Scalar quantization takes advantage of correlation and probability density function shape only. By using the properties of nonlinear dependencies and vector dimensionality, vector quantization is able to outperform scalar quantization even when compressing totally uncorrelated data. An optimal vector quantizer is designed around a probability distribution, placing the "code vectors" in the space according to vector probabilities. Vector probability distributions vary with different data. For example, an article in the IEEE Transactions on Communications, January 1980, entitled "An Algorithm for Vector Quantizer Design" by Y. Linde, A. Buzo, and R. Gray, developed an algorithm designated as the "LBG" algorithm, which either uses a known probability distribution or trains the code vectors on a select set of training vectors. If the probability distribution is known, the code vectors are placed in the N dimensional vector space according to the probability distribution. Areas of high probability contain a larger population of code vectors, while low probability areas contain a sparse population. If the probability distribution is not known, the code vectors are distributed according to a select set of "training vectors". Iteratively selecting code vectors that minimize the distortion caused by encoding the training vectors results in a "locally optimal set" of code vectors. The algorithm guarantees convergence to a local minimum of the distortion, but not convergence to an absolute minimum for all vectors of the training sequence.
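The training-vector case can be illustrated with a short sketch. This is a simplified iteration in the spirit of the LBG algorithm (alternating a nearest-code-vector partition with a centroid update), not the published procedure itself: the initial codebook is simply supplied by the caller, and the function names, tolerance, and sample data are assumptions made for illustration.

```python
from typing import List, Tuple

Vector = Tuple[float, ...]


def squared_error(x: Vector, y: Vector) -> float:
    return sum((a - b) ** 2 for a, b in zip(x, y))


def nearest(x: Vector, codebook: List[Vector]) -> int:
    """Index of the code vector with minimum squared error to x."""
    return min(range(len(codebook)), key=lambda j: squared_error(x, codebook[j]))


def average_distortion(training: List[Vector], codebook: List[Vector]) -> float:
    """Mean distortion incurred by encoding the training vectors."""
    total = sum(squared_error(x, codebook[nearest(x, codebook)]) for x in training)
    return total / len(training)


def train_codebook(training, initial, tol=1e-9, max_iter=100):
    """Iterate partition and centroid steps until the distortion stops improving."""
    codebook = list(initial)
    previous = float("inf")
    for _ in range(max_iter):
        current = average_distortion(training, codebook)
        if previous - current <= tol:  # no further improvement: local minimum
            break
        previous = current
        # Partition step: assign each training vector to its nearest code vector.
        cells = [[] for _ in codebook]
        for x in training:
            cells[nearest(x, codebook)].append(x)
        # Centroid step: move each code vector to the mean of its cell.
        # An empty cell keeps its old code vector (an assumption of this sketch).
        codebook = [
            tuple(sum(c) / len(cell) for c in zip(*cell)) if cell else old
            for cell, old in zip(cells, codebook)
        ]
    return codebook, average_distortion(training, codebook)


# Hypothetical training set with two obvious clusters.
training = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0),
            (10.0, 10.0), (11.0, 10.0), (10.0, 11.0)]
initial = [(0.0, 0.0), (10.0, 10.0)]

initial_distortion = average_distortion(training, initial)
trained, final_distortion = train_codebook(training, initial)
```

Each pass can only lower or hold the average distortion on the training set, which is why the result is locally optimal; a different initial codebook may settle at a different local minimum.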
The "vector quantization encoding process" operates to match a representative code vector with each input vector. The code vector that is the minimum distance from the incoming vector is chosen as the representative code vector. The distance between the incoming vector and the code vector is proportional to the amount of degradation that will occur from vector quantization. This distance is measured by finding the Euclidean distance between the incoming vectors and code vectors. The Euclidean distances are then measured using a mean squared error distortion formula as follows: ##EQU1##
where x_i is the image vector coefficient and y_i is the code vector coefficient. Minimizing the term d(x,y) over all of the code vectors selects the "closest" code vector, and thus gives the best possible match between the incoming vector and the code vector.
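For reference, a mean squared error distortion consistent with these definitions is commonly written as shown below. The symbol N for the vector dimension, the 1/N normalization, and the codebook symbol C are assumptions introduced here for illustration; only x_i, y_i, and d(x,y) come from the text.

```latex
d(x, y) = \frac{1}{N} \sum_{i=1}^{N} \left( x_i - y_i \right)^2 ,
\qquad
\hat{y} = \arg\min_{y \in C} \, d(x, y)
```

Because the factor 1/N is the same for every code vector, it does not affect which code vector attains the minimum.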
Representing this larger set of incoming vectors with a smaller subset of code vectors enables a reduction in the amount of information required. The rate of compression realized is a function of the vector dimension X^r, the code vector subset size L (2^k), and the scalar size k (2^8).
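The dependence of the compression rate on these three quantities can be made concrete with a small calculation. The sketch below assumes fixed-length index coding and ignores any cost of conveying the codebook itself; the parameter names and the example figures (4x4 pixel blocks, 8-bit pixels, 256 code vectors) are hypothetical and are not taken from the text.

```python
def bits_per_index(codebook_size: int) -> int:
    """Bits needed for a fixed-length index into a codebook of the given size."""
    if codebook_size < 1:
        raise ValueError("codebook_size must be at least 1")
    # (n - 1).bit_length() equals ceil(log2(n)) for n >= 1, without floating point.
    return (codebook_size - 1).bit_length()


def compression_ratio(vector_dim: int, bits_per_scalar: int, codebook_size: int) -> float:
    """Ratio of uncoded bits per vector to index bits per vector."""
    return (vector_dim * bits_per_scalar) / bits_per_index(codebook_size)


# Assumed example: 16 scalars per vector (a 4x4 block), 8 bits each, 256 code vectors.
ratio = compression_ratio(vector_dim=16, bits_per_scalar=8, codebook_size=256)
rate = bits_per_index(256) / 16  # bits per scalar after quantization
```

Under these assumptions each 128-bit block is replaced by an 8-bit index. Raising the vector dimension or shrinking the codebook increases the ratio, at the price of greater distortion.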