The background of the present invention is described herein in the context of pay television systems, such as cable television systems or direct broadcast satellite (DBS) systems, that distribute program material to subscribers, but the invention is by no means limited thereto except as expressly set forth in the accompanying claims.
In a typical cable television system, cable television operators receive much of their program material from remote earth station transmitters via a plurality of geosynchronous orbit satellites. The cable operator selects the program material to be made available to its subscribers by making arrangements with the satellite distributors of that program material. The cable operator receives the transmitted program material at its "cable head-end," where it then re-transmits the data to individual subscribers. Frequently, cable operators also provide their own local programming at the site of the head-end, and further include network broadcasts as well.
In a DBS system, individual subscribers are provided with their own satellite receiver. Each subscriber establishes a down-link which receives the signal broadcast by the satellite directly. Thus, there is no need, as with cable systems, for re-transmission from a cable head-end.
Typically, in both types of systems (cable and DBS), the program material (both video and audio) is originally in analog form. Conventional transmission techniques place substantial limitations on the maximum number of viewer channels that can be transmitted over any given transponder on a satellite. Each channel requires a minimum bandwidth to avoid noticeable degradation, so the total number of channels that can be transmitted over a given satellite transponder is limited by the bandwidth of the transponder and the bandwidth of each signal. Also, in cable systems, the electrical properties of the coaxial cable and associated amplifiers limit its bandwidth and therefore place substantial limitations on the number of channels that can be delivered to cable television subscribers using conventional transmission techniques.
As a result of the desire to provide more program channels to subscribers over existing distribution channels, the pay television industry has begun to investigate digital image transmission techniques. Although the desire is to minimize the transmission bandwidth of program material, thus allowing more channels to be transmitted over an existing broadcast channel, digital image transmission further offers the advantage that digital data can be processed at both the transmission and reception ends to improve picture quality. Unfortunately, the process of converting the program material from analog form to digital form results in data expansion which increases the transmission bandwidth of the program material rather than decreasing it. Therefore, digital transmission alone does not solve the bandwidth problem, but instead makes it worse. However, through the application of digital data compression techniques, large bandwidth reductions can be achieved.
Data compression techniques minimize the quantity of data required to represent each image. Thus, more program material, or more channels, can be offered over an existing broadcast channel. However, any data compression achieved is offset by the data expansion which occurs during the analog to digital conversion. Therefore, to be practical, the compression technique employed must achieve a compression ratio large enough to provide a net data compression. Digital data compression techniques, such as Huffman encoding and LZW (Lempel, Ziv and Welch) encoding, offer, at best, compression ratios of 2.5 to 1 and do not sufficiently compensate for the amount of data expansion that occurs in converting data from analog to digital form.
In response to the need for large compression ratios, a number of so-called "lossy" compression techniques have been investigated for digital image compression. Unlike the Huffman and LZW encoding techniques, these "lossy" compression techniques do not provide exact reproduction of the data upon decompression. Thus, some degree of information is lost; hence the label "lossy." One such "lossy" compression technique is called DCT (discrete cosine transform) data compression. Another method, which, until recently, has been used principally for speech compression, is vector quantization. Vector quantization has shown promise in image compression applications by offering high image compression rates, while also achieving high fidelity image reproduction at the receiving end. It has been demonstrated, for example, that using vector quantization (hereinafter sometimes referred to as "VQ"), compression rates as high as 25:1, and even as high as 50:1, can be realized without significant visually perceptible degradation in image reproduction.
Compression of video images by vector quantization involves dividing the pixels of each image frame into smaller blocks of pixels, or sub-images, and defining a "vector" from relevant data (such as intensity and/or color) reported by each pixel in the sub-image. The vector (sometimes called an "image vector") is really nothing more than a matrix of values (intensity and/or color) reported by each pixel in the sub-image. For example, a black and white image of a house might be defined by a 600×600 pixel image, and a 6×4 rectangular patch of pixels, representing, for example, a shadow, or part of a roof line against a light background, might form the sub-image from which the vector is constructed. The vector itself might be defined by a plurality of gray scale values representing the intensity reported by each pixel. While a black and white image serves as an example here, vectors might also be formed from red, green, or blue levels of a color image, or from the Y, I and Q components of a color image, or from transform coefficients of an image signal.
Numerous methods exist for manipulating the block, or sub-image, to form a vector. R. M. Gray, "Vector Quantization", IEEE ASSP Mag., pp. 4-29 (April, 1984), describes formation of vectors for monochrome images. E. B. Hilbert, "Cluster Compression Algorithm: A Joint Clustering/Data Compression Concept", Jet Propulsion Laboratory, Pasadena, CA, Publ. 77-43, describes formation of vectors from the color components of pixels. A. Gersho and B. Ramamurthi, "Image Coding Using Vector Quantization", Proc. IEEE Int. Conf. Acoust., Speech, Signal Processing, pp. 428-431 (May, 1982), describes vector formation from the intensity values of spatially contiguous groups of pixels. All of the foregoing references are incorporated herein by reference.
By way of example, a television camera might generate an analog video signal in a raster scan format having 600 scan lines per frame. An analog to digital converter could then digitize the video signal at a sampling rate of 600 samples per scan line, each sample being a pixel. Digital signal processing equipment could then store the digital samples in a 600×600 pixel matrix. The 600×600 pixel matrix could then be organized into smaller blocks, for example 6×4 pixel blocks, and then each block could be converted into a vector.
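The organization of a pixel matrix into block vectors described above can be sketched as follows. This is only an illustrative sketch: the function name `image_to_vectors`, the synthetic test frame, and the reading of a 6-by-4 block as six rows by four columns are assumptions made for the example and are not taken from the cited references.

```python
def image_to_vectors(image, block_rows=6, block_cols=4):
    """Split a 2-D list of pixel intensities into flattened block vectors.

    Assumes (for illustration) that a block is block_rows high and
    block_cols wide, and that the image divides evenly into blocks.
    """
    rows, cols = len(image), len(image[0])
    if rows % block_rows or cols % block_cols:
        raise ValueError("image dimensions must be multiples of the block size")
    vectors = []
    for top in range(0, rows, block_rows):
        for left in range(0, cols, block_cols):
            # Read the block row by row to form one vector of 24 values.
            vectors.append([image[r][c]
                            for r in range(top, top + block_rows)
                            for c in range(left, left + block_cols)])
    return vectors


# Synthetic 600 x 600 gray-scale frame (values 0..255) standing in for
# the digitized video frame; the pixel values themselves are arbitrary.
frame = [[(r + c) % 256 for c in range(600)] for r in range(600)]
vectors = image_to_vectors(frame)
print(len(vectors), len(vectors[0]))
```

Under these assumptions a 600-by-600 frame yields 100 block rows by 150 block columns, that is, 15,000 vectors of 24 values each.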
In an image vector quantizer, a vector quantization "codebook" is created from training data comprising a representative sample of images which the quantizer is likely to encounter during use. The codebook consists of a memory containing a set of stored "codevectors," each representative of commonly encountered image vectors. For example, one codevector might be a 6×4 pixel solid black patch. Another codevector might have all white pixels in the top three rows, and all black pixels in the bottom three rows. Yet another codevector might have a gradient made up of white pixels in the top row, black pixels in the bottom row, and four rows of pixels in between having shades of gray from light to dark. Typically, a codebook of representative codevectors is generated using an iterative clustering algorithm, such as described in S. P. Lloyd, "Least Squares Optimization in PCM", Bell Lab. Tech. Note (1957) (also found in IEEE Trans. Inform. Theory, Vol. IT-28, pp. 129-137, March 1982); and J. T. Tou and R. C. Gonzalez, "Pattern Recognition Principles", pp. 94-109, Addison-Wesley, Reading, Mass. (1974). Both of these references are incorporated herein by reference.
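By way of illustration only, an iterative clustering procedure of the general kind referred to above might be sketched as follows. The sketch is a generic Lloyd-style (k-means) iteration and is not asserted to be the exact procedure of either cited reference. The squared-error distortion measure, the random initialization, the fixed iteration count, and all function names are assumptions made for the example.

```python
import random


def squared_error(a, b):
    """Sum of squared differences; an assumed distortion measure."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def train_codebook(training_vectors, codebook_size, iterations=20, seed=0):
    """Generic Lloyd-style clustering sketch for building a codebook."""
    rng = random.Random(seed)
    # Assumed initialization: pick distinct training vectors at random.
    codebook = [list(v) for v in rng.sample(training_vectors, codebook_size)]
    for _ in range(iterations):
        # Step 1: assign every training vector to its nearest codevector.
        clusters = [[] for _ in codebook]
        for v in training_vectors:
            nearest = min(range(len(codebook)),
                          key=lambda i: squared_error(v, codebook[i]))
            clusters[nearest].append(v)
        # Step 2: move each codevector to the centroid of its cluster.
        for i, members in enumerate(clusters):
            if members:  # an empty cluster keeps its previous codevector
                codebook[i] = [sum(col) / len(members) for col in zip(*members)]
    return codebook


# Toy training set with two well-separated groups of 2-element vectors.
training = [[0, 0], [0, 2], [10, 10], [10, 12]]
codebook = sorted(train_codebook(training, codebook_size=2))
print(codebook)
```

In practice the training vectors would be image vectors drawn from representative frames rather than the toy two-element vectors used here.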
Each codevector in the codebook is assigned a unique identification code, sometimes called a label. In practice, the identification codes, or labels, are the memory addresses of the codevectors. For each input image vector, data compression is achieved by selecting the codevector in the codebook that most closely matches the input image vector, and then transmitting the codebook address of the selected codevector rather than the input image vector itself. Compression results because the addresses of the selected codevectors are much smaller than the image vectors. At the receiving end, an identical codebook is provided. Data recovery is achieved by accessing the receiver codebook with the transmitted address to obtain the selected codevector. Because the selected codevector closely resembles the original input vector, the input vector is substantially reproduced at the receiver. The reproduced input vector can then be converted back to the block of pixels that it represents. Thus, in this manner, an entire image can be reconstructed at the receiver.
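The address-based encoding and decoding described above can be sketched as follows. The four-element codevectors, the squared-error distortion measure, the function names, and the 4096-entry codebook size used in the bit-count arithmetic are all hypothetical values chosen for illustration and do not appear in the foregoing description.

```python
def squared_error(a, b):
    """Sum of squared differences; an assumed distortion measure."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def encode(vectors, codebook):
    """Replace each input vector by the address of its closest codevector."""
    return [min(range(len(codebook)),
                key=lambda i: squared_error(v, codebook[i]))
            for v in vectors]


def decode(addresses, codebook):
    """Look up each received address in an identical receiver codebook."""
    return [codebook[a] for a in addresses]


def address_bits(codebook_size):
    """Bits needed to address every entry of a codebook of the given size."""
    return max(1, (codebook_size - 1).bit_length())


# Toy codebook of four 4-pixel codevectors (list index = address).
codebook = [[0, 0, 0, 0],
            [255, 255, 255, 255],
            [255, 255, 0, 0],
            [0, 0, 255, 255]]
inputs = [[10, 5, 0, 3], [250, 240, 20, 10]]

addresses = encode(inputs, codebook)         # what would be transmitted
reconstructed = decode(addresses, codebook)  # what the receiver rebuilds

# Hypothetical bit accounting: a 24-pixel vector at 8 bits per pixel
# versus one address into an assumed 4096-entry codebook.
ratio = (24 * 8) / address_bits(4096)
print(addresses, reconstructed, ratio)
```

With these assumed figures, each 192-bit image vector is represented by a 12-bit address, which illustrates why transmitting addresses instead of vectors yields compression.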
Some distortion of the original image does result, however, due to inexact matches between the input vectors and the selected codevectors. Remember, the codevectors in the codebook are only a representative sample of possible input vectors, and therefore, exact matches rarely occur during actual use of the quantizer. Increasing the size of the codebook used for compression and decompression generally decreases the distortion.
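The relationship between codebook size and distortion can be illustrated with a deliberately simplified sketch. Here each "vector" has a single element (a gray level), and the codebooks are evenly spaced rather than trained; both simplifications, together with the function names and the squared-error measure, are assumptions made only for this example.

```python
def uniform_codebook(size, lo=0, hi=255):
    """Evenly spaced one-element codevectors (an assumed, untrained codebook)."""
    step = (hi - lo) / size
    return [[lo + step * (i + 0.5)] for i in range(size)]


def mean_distortion(vectors, codebook):
    """Average squared error between each vector and its best codevector."""
    total = 0.0
    for v in vectors:
        total += min(sum((x - y) ** 2 for x, y in zip(v, c)) for c in codebook)
    return total / len(vectors)


# Every gray level 0..255 appears once as a one-element input vector.
samples = [[float(p)] for p in range(256)]
distortions = {size: mean_distortion(samples, uniform_codebook(size))
               for size in (2, 4, 16, 64)}
for size, d in distortions.items():
    print(size, round(d, 2))
```

For this toy input the average distortion falls as the codebook grows, at the cost of more codevector memory.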
Many different techniques for searching a codebook to find the codevector that best matches the image vector have been proposed, but generally the methods can be classified as either a full search technique, or a branching (or tree) search technique. In a full search technique, the vector quantizer sequentially compares an input image vector to each and every codevector in the codebook. The vector quantizer computes a measure of distortion for each codevector and selects the one having the smallest distortion. The full search technique ensures selection of the best match, but involves the maximum number of computational steps. Thus, while distortion can be minimized using a full search technique, it is computationally expensive. Y. Linde, A. Buzo and R. Gray, "An Algorithm For Vector Quantizer Design", IEEE Transactions on Communications, Vol. COM-28, No. 1 (January 1980), incorporated herein by reference, describes the full search technique and the computational steps involved in such a search. The full search technique is sometimes called "full search vector quantization" or "full search VQ".
The tree search technique can be thought of as one that searches a sequence of small codebooks, instead of one large codebook. The codebook structure can be depicted as a tree, and each search and decision corresponds to advancing one level or stage in the tree, starting from the root of the tree. Thus, the input vector is not compared to all the codevectors in the codebook, as with the full search technique. Consequently, the tree search technique reduces the number of codevectors that must be evaluated (and thus reduces search time). However, the more limited search generally does not guarantee selection of the optimum codevector. Therefore, a tree search vector quantizer requires a larger codebook memory to achieve the same level of distortion as the full-search technique. A detailed description of the tree search technique may be found in R. M. Gray and H. Abut, "Full Search and Tree Searched Vector Quantization of Speech Waveforms," Proc. IEEE Int. Conf. Acoust., Speech, Signal Processing, pp. 593-96 (May 1982), and R. M. Gray and Y. Linde, "Vector Quantization and Predictive Quantizers For Gauss Markov Sources", IEEE Trans. Comm., Vol. COM-30, pp. 381-389 (February 1982), both of which are incorporated herein by reference. The tree search technique is sometimes referred to as "tree-search vector quantization", "tree-search VQ" and "TSVQ." This technique has found favor for compressing dynamic images, because it is computationally faster than the full search technique. However, as mentioned, tree-search VQ does not guarantee selection of the optimum codevector, and therefore, a larger codebook memory is required to achieve a given level of distortion than is required for full search VQ.
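The trade-off between the two search techniques can be sketched as follows. The binary tree built here (splitting a pre-sorted codebook in half and labeling each node with the mean of its codevectors) is a hypothetical construction chosen for brevity, not the construction of the cited references. The squared-error measure, the two-element codevectors, and all names are likewise assumptions.

```python
def squared_error(a, b):
    """Sum of squared differences; an assumed distortion measure."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def mean_vector(vectors):
    return [sum(col) / len(vectors) for col in zip(*vectors)]


def build_tree(entries):
    """entries: list of (address, codevector), pre-sorted so neighbors are similar."""
    node = {"vector": mean_vector([v for _, v in entries]),
            "children": [], "address": None}
    if len(entries) == 1:
        node["address"] = entries[0][0]   # leaf holds a codebook address
    else:
        mid = len(entries) // 2
        node["children"] = [build_tree(entries[:mid]), build_tree(entries[mid:])]
    return node


def tree_search(vector, root):
    """Descend one level per decision; returns (address, comparisons made)."""
    node, comparisons = root, 0
    while node["children"]:
        comparisons += len(node["children"])
        node = min(node["children"],
                   key=lambda child: squared_error(vector, child["vector"]))
    return node["address"], comparisons


def full_search(vector, codebook):
    """Compare against every codevector; returns (address, comparisons made)."""
    best = min(range(len(codebook)), key=lambda i: squared_error(vector, codebook[i]))
    return best, len(codebook)


# 16 evenly spaced codevectors: the tree needs fewer comparisons.
codebook16 = [[g, g] for g in range(0, 256, 16)]
tree16 = build_tree(list(enumerate(codebook16)))
print(tree_search([100, 100], tree16), full_search([100, 100], codebook16))

# Unevenly spaced codevectors: the tree can miss the best match.
skewed = [[0, 0], [10, 10], [20, 20], [120, 120]]
skewed_tree = build_tree(list(enumerate(skewed)))
print(tree_search([30, 30], skewed_tree)[0], full_search([30, 30], skewed)[0])
```

In the first case both searches select the same codevector, but the tree search evaluates 8 candidates instead of 16. In the second case the first branching decision steers the tree search away from the codevector that the full search identifies as the best match, which illustrates why tree search does not guarantee selection of the optimum codevector.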
In a pay television system, the program material is typically compressed at a transmitter location and transmitted to cable operators via satellite. In digital TV distribution systems, cable operators may simply receive the compressed data from the satellite and retransmit the data via a cable television network to individual cable subscribers. Alternatively, cable operators may choose to decompress the data at the cable head-end and send the program material to individual subscribers in analog form. The device that performs vector quantization and compression at the transmitter is called an encoder, and the device that performs decompression and image reproduction at the receiving end is called a decoder.
Where the cable operator simply retransmits the compressed data to the cable subscriber (i.e., the cable operator does not decompress the data), the compressed nature of the data allows the cable operator to offer more channels over the cable distribution network. However, each television subscriber must have a vector quantization decoder in his home to decode (decompress) the program material for display on a television set, or for recording on a VCR. The large memory required for a tree-search vector quantization codebook can be prohibitive from a cost standpoint. DBS subscribers who receive the compressed data directly via satellite face the same problem, since they too must have VQ decoders near their television sets. Cable operators who decompress the received data at the cable head-end also may not wish to invest in a decoder having a large codebook memory. Also, cable operators who convert the compressed signal to NTSC at the cable head-end need higher picture quality than subscribers, since the final images received by the subscriber will have the picture impairments of both VQ (to the head-end) and the present-day cable impairments of NTSC (noise and distortion). If cable operators were to pass the compressed digital signal through the cable system, the subscriber would not have the present-day NTSC impairments of the cable system. Consequently, there is a need for a system that can deliver subscriber quality image data to a low cost decoder at the subscriber location while also delivering higher quality to the cable head-end, where the cost of the decoder is less significant.
In other words, because of the prohibitive cost of a large VQ codebook memory in a subscriber decoder, there exists a need for a pay television system wherein the transmitter location employs a large VQ codebook for encoding the image data, but wherein individual subscribers can employ codebooks of differing memory sizes depending on the picture quality desired and their ability to pay. As mentioned above, decreasing the size of the VQ codebook at the receiver location will decrease the quality of the reproduced image. However, the decrease in cost offsets the decrease in picture quality. As the ability of a subscriber to pay increases, it would be desirable for the subscriber to be able to increase the image quality simply by replacing the existing decoder with a decoder having a larger codevector memory.
At the same time, there is a need for a vector quantization method which results in a reduction in the amount of memory needed for vector quantization while maintaining at least the same level of image reproduction fidelity. Also, due to the ever increasing demand for more program channels, there is a need for a method of still further increasing compression ratios available with tree search vector quantization. The present invention satisfies these needs.