In low-rate digital video systems such as those used for videoconferencing, differential pulse code modulation (DPCM) is often employed to remove redundant information from the image. In DPCM, a prediction of the current (incoming) pixel is formed from other pixels. The prediction is then subtracted from the current pixel to form the difference pixel. The resultant difference image is then quantized and encoded for digital transmission at the lower data rate. Numerous prediction schemes have been suggested and implemented. Among them, the most common are temporal prediction and spatial prediction. A temporal predictor is one in which the prediction is formed using pixels from the previous frame. A spatial predictor, on the other hand, uses only pixels from the current frame to form the prediction of the current pixel.
A transform coded system is described in U.S. Pat. No. 4,302,775 issued Nov. 24, 1981 to Widergren et al., in which the transform coefficients are compressed for each transform block depending upon the fill condition of a buffer for that block.
"Pixel" is often used to denote either the data word representing a pixel or the value of the data word by which a pixel is represented. The transmitter knows the value of the predicted pixel in a predictive coding system, because the transmitter includes a prediction circuit which is identical to the prediction circuit in the receiver which is producing the predicted pixel. The predicted pixel is often a corresponding pixel from a previous frame, or a weighted linear combination of pixels lying near the corresponding pixel of either the current or the previous frame. In this context, "near" means close physical proximity in the two-dimensional picture or raster of which the pixels are a part.
FIG. 1 illustrates a communication system using prior art DPCM predictive coding techniques. In FIG. 1, a transmitter 10 communicates by way of a narrow bandwidth data channel 30 with a receiver 38. Transmitter 10 receives a source 12 of frame-sequential, line-scanned analog television signals to be transmitted. The analog television signals are applied to an analog-to-digital converter (ADC) 14 in transmitter 10. ADC 14 samples the analog signals, quantizes them (represents the infinite range of values by a finite set of values), ordinarily into 256 levels, which is a fine enough division so that the eye cannot perceive the error in a displayed picture, and digitizes the finely quantized signals (represents each value of the set by a different digital number) to form finely quantized digital signals.
The digital signals produced by ADC 14 are applied to the noninverting (+) input of subtractor 16, which receives a predicted signal at its inverting (-) input terminal. The predicted signal applied to the inverting input terminal of subtractor 16 is subtracted from the current value of the signal then being applied to the noninverting input terminal of subtractor 16 from conductor 15. A difference signal is generated at the output of subtractor 16. The difference signal is often known as an error signal. Since ADC 14 finely quantized the source signal, the difference signal at the output of subtractor 16 is also finely quantized.
While not essential to operation of predictive coding systems, a coarse quantizer 18 is often inserted after the output of subtractor 16 to coarsely quantize the difference signal into a number of "bins", which aids in reducing the data rate by restricting the number of possible levels transmitted over the channel to the receiver, and by increasing the length of zero runs. The bin is itself represented by a digital number, so the output of quantizer 18 is a quantized difference signal. Thus, the difference signal at the input of coarse quantizer 18 is finely quantized, while the output signal is coarsely quantized, if the degree of quantization is not specified.
The coarsely quantized error signal is applied to a predictor loop 20. A circuit 40 in the receiver 38 is a replica of loop 20 and regenerates each pixel to be displayed in succession from the signal transmitted over channel 30. Predictor loop 20 includes an adder 22 which adds to the error signal from quantizer 18 the delayed value of the predicted signal received from predictor and delay circuit 24. This produces a new predicted signal which is applied to circuit 24. Predictor and delay circuit 24 delays the new predicted signal for a predetermined length of time, and may perform other processing steps. The delay associated with predictor and delay circuit 24 may be one frame interval. With a delay duration of one frame interval, the intensity value of a pixel of a frame is generally expected to be the same as the value of the corresponding pixel of the preceding frame. For a still picture, this will be true for every pixel, and in a picture having some motion, it will be true for many pixels. The delayed predicted signal is also applied to the inverting input terminal of subtractor 16.
The coarse quantizer 18 causes all values of the finely quantized difference signal applied thereto to be coarsely quantized into "bins". That is, all finely quantized difference signals lying within a range of values or "bin" are processed by coarse quantizer 18 to produce a single coarsely quantized value. In effect, all values of finely quantized difference signals lying within the range of values defining a bin are treated as though they had only one value, namely the bin value. The bin value may be near the center of the range of finely quantized values defining the bin. The number of non-zero bins is often a power of 2 such as 16 or 32. In addition to these non-zero bins, there is a center or "coring" bin into which finely quantized difference signals having zero or near zero magnitude will fall, and which produces a coarsely quantized value of zero at the output of the quantizer. The designer expects that a large number, or all, of the finely quantized difference pixels will fall into this coring bin. When the finely quantized differences are at or near zero, this indicates that the predicted signal is a faithful representation of the image or picture currently to be transmitted.
For television signals, the arithmetic value of a pixel is represented by one of 2^8 or 256 possible numbers, each separate possible value corresponding to a unique distribution of logic highs and logic lows on eight conductors. For example, the 8-bit digital word 10010010 represents the arithmetic value 146, which in turn represents a luminance for that pixel which is 146/256 of the maximum luminance. The result of the subtraction process is such that the resulting signal is represented by 9 bits, or 512 possible numbers. The coarse quantizer reduces the number of possible levels of the difference signal to a smaller value than 512, as for example 16 non-zero bins plus the coring bin.
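The arithmetic in this paragraph can be checked with a short Python sketch. The variable names are illustrative only; the sketch assumes unsigned 8-bit pixels, so that a difference of two pixels lies between -255 and +255.

```python
# The 8-bit word from the text, read as an unsigned binary number.
word = "10010010"
value = int(word, 2)               # arithmetic value of the pixel
luminance_fraction = value / 256   # fraction of maximum luminance

# Subtracting one 8-bit pixel (0..255) from another gives -255..+255.
diff_min, diff_max = 0 - 255, 255 - 0
distinct_differences = diff_max - diff_min + 1

# Number of bits needed to give each possible difference its own code.
bits_needed = (distinct_differences - 1).bit_length()

print(value, luminance_fraction, distinct_differences, bits_needed)
```

Under these assumptions the difference takes 511 distinct values, which is why a 9-bit word (512 possible numbers) is sufficient.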
Thus, all finely quantized difference pixel arithmetic values ranging from (for example) 0 to 5 are forced by the quantizer to fall into the coring bin (zero value of the coarsely quantized signal), and values ranging from 6 to 10 are forced into bin 1. Finely quantized difference pixel arithmetic values of negative sign are also assigned to bins of the coarsely quantized difference signal. Assuming there are only 16 non-zero bins in this example, they can be represented by as few as 4 conductors (4-bits). If 4-bit digital numbers are used to represent bins, the bin numbers are not actual arithmetic values, and cannot be applied directly to adder 22. The desire to process relatively small 4-bit "bin" numbers, coupled with the need to apply proper arithmetic values often results in an arrangement (not illustrated) in which the coarse quantizer has two output conductor sets, one set coupled to the circuit adder which has a large number of conductors or bits (such as nine) for representing actual arithmetic values, and the other set having fewer conductors or bits (3 or 4) for coupling the corresponding "bin" number to a "coder". The close relationship between the arithmetic value and the bin value is known to those in the art and described, for example, in U.S. Pat. No. 3,761,613 issued Sept. 25, 1973 to Limb. The description hereinafter assumes that the exemplary quantizer has a single output which produces arithmetic values in the form of parallel 9-bit difference signals, which are used by both the adder and the coder.
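A minimal Python sketch of such a coarse quantizer follows. Only the coring range (0 to 5) and the range of bin 1 (6 to 10) come from the example above; the remaining bin boundaries in `UPPER`, the choice of the bin midpoint as the bin value, and the signed bin numbering are assumptions made for illustration. The function returns both the bin number (for a coder) and the arithmetic value (for the adder), mirroring the two-output arrangement described above.

```python
import bisect

CORING_MAX = 5                                # magnitudes 0..5 fall in the coring bin
# Upper magnitude limit of bins 1..8. Only the first entry (10) is from the
# text; the other limits are hypothetical.
UPPER = [10, 18, 30, 46, 70, 110, 170, 255]

def coarse_quantize(diff):
    """Return (bin_number, arithmetic_value) for a finely quantized difference."""
    mag = abs(diff)
    if mag > UPPER[-1]:
        raise ValueError("difference outside the 9-bit range assumed here")
    if mag <= CORING_MAX:
        return 0, 0                           # coring bin: output is exactly zero
    idx = bisect.bisect_left(UPPER, mag)      # first bin whose upper limit >= mag
    lo = CORING_MAX + 1 if idx == 0 else UPPER[idx - 1] + 1
    hi = UPPER[idx]
    value = (lo + hi) // 2                    # bin value near the center of the bin
    sign = 1 if diff > 0 else -1
    return sign * (idx + 1), sign * value

print(coarse_quantize(3), coarse_quantize(7), coarse_quantize(-7))
```

With eight positive and eight negative bins, this table gives the 16 non-zero bins of the example plus the coring bin.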
The coarsely quantized difference signal is applied to coder 26. Coder 26 encodes the coarsely quantized difference signals on conductor 19 in known fashion, as by run-length coding, by statistical coding such as Huffman coding or the like, or by run-length coding in combination with statistical coding. Run-length coding drastically reduces the number of bits which are required to be transmitted over the data channel by counting the number of successive pixels from the coarse quantizer 18 which are at zero value (which are in the coring bin). In the above example, finely quantized difference signal pixel amplitudes or values from 0 to 5 are in the first or coring bin, and are assumed to be near enough to the predicted value so that the coarsely quantized difference signal can be zero. If the prediction and delay circuit of the system is effective, and especially if there is little motion in the television scene, it produces a signal which is very similar to the signal currently to be transmitted, so both the finely and coarsely quantized differences are mostly zero. If there are long runs (for example, A pixels in length) of zero-value coarsely quantized differences, the run of A pixels can be represented by a single codeword which means "the current image is the same as the predicted image for these A pixels". Thus, one codeword of, say, 20 bits, can represent any number of 9-bit pixel difference values. If the run length is for example 100 pixels, the amount of data required to be transmitted to represent the image is reduced from 900 bits (9 bits per pixel x 100 pixels) to 20 bits (the number of bits in a representative maximum-length codeword). In addition to signals representing zero run lengths, signals representing the amplitudes of at least some coarsely quantized difference pixels must be sent over channel 30 to the receiver.
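The zero-run counting described above can be sketched in Python as follows. The token format `("run", n)` / `("amp", v)` is a hypothetical representation chosen for illustration, and the bit counts simply reuse the 9-bit and 20-bit figures from the example.

```python
def run_length_encode(diffs):
    """Collapse runs of zero differences into ("run", length) tokens.

    Non-zero differences pass through as ("amp", value) tokens.
    """
    tokens = []
    run = 0
    for d in diffs:
        if d == 0:
            run += 1                        # extend the current zero run
        else:
            if run:
                tokens.append(("run", run)) # close the run before the amplitude
                run = 0
            tokens.append(("amp", d))
    if run:
        tokens.append(("run", run))         # a run can only be emitted once it ends
    return tokens

still_line = [0] * 100                      # 100 pixels that match the prediction
tokens = run_length_encode(still_line)
raw_bits = 9 * len(still_line)              # 9 bits per difference pixel
coded_bits = 20 * len(tokens)               # assumes a 20-bit run codeword
print(tokens, raw_bits, coded_bits)
```

Note that the run token is appended only when the run ends, which is the reason a very long run produces no coder output until its last pixel has been seen.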
Such amplitudes are often coded by Huffman coding, in which the frequency of occurrence of various amplitudes or bins is evaluated, and codewords are assigned to each amplitude, with the codewords being shorter for the more frequently-occurring values and longer for infrequently-occurring amplitudes.
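As an illustration of this assignment of shorter codewords to more frequent amplitudes, the following Python sketch computes Huffman codeword lengths with the standard merge-the-two-rarest procedure. The sample amplitude counts are invented for the example and are not taken from any measured picture.

```python
import heapq
from collections import Counter

def huffman_code_lengths(symbols):
    """Return {symbol: codeword length in bits} for a Huffman code."""
    counts = Counter(symbols)
    if len(counts) == 1:                    # degenerate case: one symbol, one bit
        return {next(iter(counts)): 1}
    # Heap entries: (count, tie_breaker, {symbol: depth so far}).
    heap = [(n, i, {s: 0}) for i, (s, n) in enumerate(counts.items())]
    heapq.heapify(heap)
    tie = len(heap)
    while len(heap) > 1:
        n1, _, a = heapq.heappop(heap)      # the two least frequent subtrees
        n2, _, b = heapq.heappop(heap)
        merged = {s: depth + 1 for s, depth in {**a, **b}.items()}
        heapq.heappush(heap, (n1 + n2, tie, merged))
        tie += 1
    return heap[0][2]

# Hypothetical amplitude statistics: small differences dominate.
amplitudes = [0] * 8 + [8] * 4 + [-8] * 2 + [24] + [-24]
lengths = huffman_code_lengths(amplitudes)
print(lengths)
```

For these counts the most frequent amplitude receives a 1-bit codeword and the two rarest receive 4-bit codewords.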
The rate of generation of codewords is highly variable and depends upon the picture which is represented by the coarsely quantized difference signals being coded. In a still image, there will be very long runs of zero difference signals. Each very long run of zero values can be represented by a single codeword, which codeword can only be generated at the end of the run. If the image is highly variable, as when a transition occurs between two very different scenes, when there is violent motion in a scene, or when the camera pans or zooms rapidly, there will be few long runs of zeroes, and many unlike amplitude values will occur, requiring frequent generation of relatively long Huffman codewords.
In order to eliminate the variability of the data rate, a rate buffer 28 is coupled to coder 26 for receiving or being laden (loaded) with coded difference data at a variable rate, for temporarily storing the coded difference data, and for applying the coded difference data at a constant rate through the channel 30 to a receiver 38. This type of buffer is often known as a first-in, first-out (FIFO) memory.
Receiver 38 receives coded difference data at a constant rate from channel 30, and stores the coded difference data in a rate buffer 48. Data is supplied therefrom as required to a decoder 46, which accepts the run length and Huffman-coded difference data at a variable rate, and decodes it into difference signals exactly corresponding to the coarsely quantized difference signals which were available at the transmitter 10 (except for transmission errors, which are not considered herein). The decoded difference signals are applied to an input terminal of an adder 42 of a predictor loop 40. Adder 42 adds together the decoded difference signal and the delayed predicted signal to produce a new predicted signal which is applied to a digital-to-analog converter (DAC) 54 for generating an analog signal, which is applied to a television display circuit 52 for display of the picture. The new predicted signal is also applied to predictor and delay circuit 44 which is identical to predictor and delay circuit 24 of transmitter 10. Since predictor and delay circuit 44 is identical to predictor and delay circuit 24, the new predicted signal on conductor 43 appears on conductor 45 after a corresponding delay, which in the example is one frame interval. The coarsely quantized difference signal of transmitter 10 and the signal on conductor 59 of receiver 38 are identical (except for a time lag due to the time required for transmission therebetween).
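The way the transmitter and receiver predictor loops track each other can be sketched in Python as follows. This is a simplified model under stated assumptions: the predictor is a pure one-frame delay, the coarse quantizer truncates toward zero with a hypothetical step size `STEP`, and the coder, rate buffers, channel and decoder are taken to be lossless and are omitted.

```python
STEP = 8  # hypothetical coarse quantizer step size

def coarse_quantize(diff):
    """Truncate toward zero to a multiple of STEP; |diff| < STEP gives zero."""
    sign = -1 if diff < 0 else 1
    return sign * (abs(diff) // STEP) * STEP

def transmit_frame(frame, stored):
    """Transmitter loop: return quantized differences and the new frame store."""
    diffs, new_stored = [], []
    for pixel, predicted in zip(frame, stored):
        q = coarse_quantize(pixel - predicted)  # subtractor 16, quantizer 18
        diffs.append(q)
        new_stored.append(predicted + q)        # adder 22 feeds circuit 24
    return diffs, new_stored

def receive_frame(diffs, stored):
    """Receiver loop: adder 42 rebuilds each pixel from the delayed prediction."""
    return [predicted + q for q, predicted in zip(diffs, stored)]

# Two identical frames followed by a frame in which one pixel changes.
frames = [[146, 10, 200, 37], [146, 10, 200, 37], [150, 90, 200, 37]]
tx_store = [0, 0, 0, 0]   # contents of predictor and delay circuit 24
rx_store = [0, 0, 0, 0]   # contents of predictor and delay circuit 44
sent = []
for frame in frames:
    diffs, tx_store = transmit_frame(frame, tx_store)
    rx_store = receive_frame(diffs, rx_store)
    sent.append(diffs)

print(sent)
print(tx_store == rx_store)
```

Because the transmitter predicts from its own reconstructed pixels rather than from the source pixels, both frame stores hold the same values after every frame, and the repeated still frame produces only zero differences.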
Predictive systems such as that illustrated in FIG. 1 can achieve very large reductions in data rate, especially on still pictures. However, when the picture has motion, the predicted signal may at times be most unlike the actual current value. When there is substantial motion in the television picture, the error signals tend to be large in value and to change rapidly. Run length coding tends to be relatively less effective in reducing data rate, and Huffman coding tends to produce relatively longer code words. Since the data rate of channel 30 is preestablished and rate buffer 28 of transmitter 10 can only transmit data at the maximum rate allowed by channel 30, it is possible for rate buffer 28 to become overfull or to "overflow" when the average size of the code word length is large, and code words are applied to the rate buffer for a long period of time at a high rate. The terms "overfull" and "overflow" may not be sufficiently descriptive. The rate buffer is "laden" or loaded by the difference between the variable flow of code words into the buffer and the fixed flow of code words out of the buffer, which forms a "lading" or loading which varies with time. The lading may, from moment to moment, vary from zero (empty buffer) to the maximum capacity of the buffer (corresponding to a full buffer). Any attempt to further increase the lading beyond the maximum capacity, even by one word, creates an "overflow" condition. "Underflow" occurs when the buffer writes or attempts to write to the outside world a number of bits which exceeds the number of bits in the lading, with the result that meaningless zero values are transmitted as meaningful data. When the lading is such that underflow or overflow occurs, some code words may not be stored in the rate buffer 28, or are corrupted, and are therefore lost.
The loss of code words is very serious in a predictive encoding type of communication system, and leads to substantial errors (mistracking) in data transmission and consequent distortions of the transmitted picture.
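The behavior of the lading over time can be illustrated with a small Python simulation. It is a sketch only: the lading is counted in bits, time is divided into equal ticks, and the arrival pattern, channel rate and capacity are hypothetical values chosen to provoke one overflow and one underflow.

```python
def simulate_rate_buffer(bits_in_per_tick, channel_bits_per_tick, capacity_bits):
    """Track the lading of a FIFO rate buffer, one entry per tick.

    Returns (lading history, ticks with overflow, ticks with underflow).
    """
    lading = 0
    history, overflow_ticks, underflow_ticks = [], [], []
    for tick, bits_in in enumerate(bits_in_per_tick):
        lading += bits_in                    # variable-rate input from the coder
        if lading > capacity_bits:
            overflow_ticks.append(tick)      # excess bits cannot be stored
            lading = capacity_bits
        if lading < channel_bits_per_tick:
            underflow_ticks.append(tick)     # channel demands more than is stored
            lading = 0
        else:
            lading -= channel_bits_per_tick  # constant-rate output to the channel
        history.append(lading)
    return history, overflow_ticks, underflow_ticks

# Quiet picture, then a burst of long codewords, then silence.
arrivals = [10, 10, 10, 40, 40, 0, 0, 0, 0, 0]
history, overflows, underflows = simulate_rate_buffer(arrivals, 10, 50)
print(history, overflows, underflows)
```

In this run the burst overfills the 50-bit buffer at tick 4, and the following silence drains it until the channel finds too few bits at tick 9.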
It should be noted that the coarse quantizer (18) in these loops is a nonlinear element, which makes rigorous analysis difficult. Furthermore, the quantizer may have quantizing steps of different sizes as described below, which increases the nonlinearity. However, ignoring the nonlinearity in the analysis produces results which, while not rigorous, indicate trends, and which can therefore be useful.
A known method for stabilizing the lading of the rate buffer (and therefore preventing exceeding the capacity of the buffer by underflow or overflow) is to sense the occupancy or the amount of lading of the rate buffer. A control signal generated in response to the sensed occupancy is applied to at least one of the elements of the predictive coding system which produces the coded difference signal to reduce the rate of generation of the code words when the control signal indicates that the buffer is above or below a certain lading level.
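A minimal Python sketch of occupancy-based control follows. The two thresholds and the table mapping each control level to a quantizer step size are hypothetical; they only illustrate the idea of deriving a control signal from the sensed lading and using it to make the coding coarser as the buffer fills.

```python
def control_level(lading_bits, capacity_bits, thresholds=(0.5, 0.8)):
    """Map sensed buffer occupancy to a control level 0, 1 or 2.

    The level is the number of (hypothetical) thresholds that the
    occupancy fraction has reached.
    """
    occupancy = lading_bits / capacity_bits
    return sum(occupancy >= t for t in thresholds)

# Hypothetical response: a coarser quantizer step at higher control levels,
# which lengthens zero runs and so lowers the codeword generation rate.
QUANTIZER_STEP = {0: 4, 1: 8, 2: 16}

for lading in (100, 500, 900):
    level = control_level(lading, 1000)
    print(lading, level, QUANTIZER_STEP[level])
```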
U.S. Pat. No. 4,691,233 and U.S. Pat. No. 4,700,226 issued to A. A. Acampora, describe adaptive control of filters for reducing image resolution, and decimators and interpolators for reducing data rate, both under the control of the fill or occupancy of the rate buffer, for the purpose of prevention of overflow of the rate buffer at the transmitter. U.S. Pat. No. 4,093,962 issued June 6, 1978, to Ishiguro et al., describes adaptive control of the amplitude of the difference signal in response to rate buffer occupancy. U.S. Pat. No. 4,200,866 describes a method employing an adjustable quantizer which is switched between different quantizing characteristics.
When a still image has been transmitted for a substantial time, the difference signals tend towards zero, and the encoding becomes very efficient. This efficiency results in transmission of relatively short codewords at infrequent intervals. If the scene of the image changes drastically and thereafter contains motion, large numbers of relatively large codewords are generated. Thus, the rate buffer 28 receives, on average at this time, long codewords which represent few pixels. An unobvious problem results. The bits of the codewords must be sent from rate buffer 28 over channel 30 serially. Thus, each long codeword, for example, 20 bits in length, may take as long as 20 modem or channel clock intervals to be transmitted through the channel and loaded into rate buffer 48 at the receiver. The receiver rate buffer 48 must form the serially received long codewords into parallel format, and supply them to its decoder 46 as demanded by decoder 46. The demand is at the pixel clock rate, which is much higher than the channel clock rate. The decoder 46 must decode codewords supplied to it from its buffer 48. The receiver rate buffer 48 may run out of codewords or underflow, and thereby not be able to supply codewords to decoder 46 fast enough to keep up with the demand. Parallel processing cannot shorten this time, because the bits of each codeword are received at receiver 38 and its rate buffer 48 sequentially, and the codeword cannot be supplied by the rate buffer 48 to the decoder 46 until the bits have all arrived. Thus, the long codewords, which require a long transmission time (up to 20 channel clock intervals), tend to be read out of the buffer very quickly. Because of their length, the long codewords tend to occupy more buffer space than short codewords. Even if buffer 48 is relatively full of these long codewords, they tend to be read out so quickly that the buffer occupancy falls during the interval in which they are read out.
If the rate buffer does not contain a sufficient backlog lading, the reading of the large codewords will deplete its contents to an empty condition, and then either attempt to underflow, causing errors, or be unable to supply the next pixel, which also results in errors. Since the receiver 38 must generate pictures by sequentially producing pixels at the clock rate, the rate buffer 48 must always have a codeword available on demand from the decoder when it is needed to produce the next pixel. If rate buffer 48 is empty at the moment of demand by decoder 46, a codeword may not become available for up to another 20 clock (pixel) intervals, and even when it completes its arrival at buffer 48 and is supplied to decoder 46, it may only satisfy the demand for one pixel or clock interval. Thus, the long codewords which occur during rapid motion of the image represented by the scene being televised may result in receiver 38 being unable to replicate in predictor and delay circuit 44 the signal stored in predictor and delay circuit 24 of the transmitter, because pixels have been corrupted or missed. It is very undesirable to allow the receiver to corrupt or to miss pixels, because in that event the predictors at the transmitter and receiver do not contain the same information, and the image displayed at the receiver will thereafter not correspond to the image being transmitted.
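The depletion of the receiver rate buffer can be estimated with a short Python calculation. It is a rough model under stated assumptions: every codeword has the same length and yields exactly one pixel, the decoder removes one codeword per pixel interval, and the channel delivers a fixed number of bits per pixel interval. The numeric values are hypothetical.

```python
from fractions import Fraction

def pixels_until_empty(backlog_codewords, codeword_bits, channel_bits_per_pixel):
    """Pixel intervals until the receiver buffer runs out of codewords.

    Returns None when the channel delivers codewords at least as fast as
    the decoder consumes them, so the backlog never empties.
    """
    # Codewords completing their arrival per pixel interval.
    arrival = Fraction(channel_bits_per_pixel) / codeword_bits
    net_drain = 1 - arrival          # decoder removes one codeword per pixel
    if net_drain <= 0:
        return None
    return backlog_codewords / net_drain

# Hypothetical case: 20-bit codewords, channel delivering half a bit per
# pixel interval, and a backlog of 975 codewords at the start of the motion.
intervals = pixels_until_empty(975, 20, Fraction(1, 2))
print(intervals)
```

With these figures only one codeword arrives for every 40 that are consumed, so the backlog lasts barely longer than the number of codewords it held at the start.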
The above-mentioned problem can be solved by keeping a long backlog, for example many frames, stored in the receiver rate buffer 48. However, there is roughly a two-second inherent delay in such a system which is attributable to the passage of the signal through four rate buffers, even for a relatively moderate 10-frame storage in the rate buffers. In a conversational context, this is a very undesirable delay.
There is an additional unobvious problem which can occur even when there is a substantial backlog of information in the receiver rate buffer. This problem is related to the speed with which the receiver decoder 46 can decode long codewords. In principle, the decoder can operate at the pixel clock rate. However, for images with violent motion, the decoder may lag far behind the decoder buffer as discussed above, to the point that the decoder buffer may underflow or be unable to supply difference signals at the pixel rate, which results in erroneous displayed pictures. This problem can also be ameliorated by keeping a large backlog in the decoder buffer, but adds to the conversational delay.
Thus, there is a conflict between the need for short delays and the need to keep the receiver and/or coder buffer from running out of pixels. It is highly desirable to minimize the number of frames stored in each rate buffer in order to minimize the overall system delay. Thus, it is very undesirable to ameliorate the problem of running out of pixels in the rate buffer by increasing the average number of frames stored in the rate buffers.
Copending U.S. Pat. application Ser. No. 928,042 filed Nov. 1, 1986 in the names of N. Fedele and A. Acampora entitled "DPCM System with Rate of Fill Control of Buffer Occupancy", describes a DPCM system in which a control signal is generated which represents at least in part, the rate at which the transmitter rate buffer (e.g. buffer 28 of FIG. 1) is filling, and which controls the processing when the rate of fill exceeds certain thresholds to reduce the number of codewords required to be stored in the rate buffer for transmission over the limited bandwidth channel to the receiver. The processing can be controlled to produce fewer codewords by providing increased temporal decimation or spatial filtering. U.S. Pat. No. 4,023,199 issued May 10, 1977 to Netravali et al. describes a DPCM system in which both fine and coarse quantization laws are used to quantize signals for transmission.
U.S. Pat. No. 4,583,114 issued Apr. 15, 1986 to Catros describes a DPCM system in which the receiver receives information relating to which of a plurality of quantizers is being used at the transmitter. In order to control the rate of fill of the buffer, it is desirable to initiate the reduction of codewords as soon as possible after it becomes evident that the rate of generation must be reduced. Temporal decimation must be entered into only at frame sync, to prevent initiation of decimation half-way down the raster. It is very desirable to avoid the need to send the receiver a codeword representing a change in the signal processing when the buffer is too full or filling too fast, because such a codeword occurs relatively infrequently and would therefore be Huffman coded as a relatively long codeword, and the long codeword occurs at just the wrong time, i.e., when the channel transmission tends to be lagging behind the generation (at the transmitter) and use (at the receiver) of codewords. Furthermore, the use of a plurality of coders and multiplex switches is costly and reduces system reliability. Thus, the above-mentioned Catros system is disadvantageous.
A need is seen, therefore, for a system which will more efficiently process data for transmission to the receiver without adding additional codewords to the transmission. Because DPCM systems use a predictive loop in the receiver which is required to accurately recreate the transmitted information based on adjacent pixel data, it is important that data processing in the transmitter can be faithfully reproduced at the receiver, subject to acceptable errors. Thus, systems which attempt to reduce the data in a way which requires transmitted control words to permit the receiver to operate effectively actually increase the data rate and, therefore, are not desirable. Further, quantizers whose ensemble of output words changes increase the complexity of the system, because the receiver needs to be able to respond to the change in the ensemble of output words. It is, therefore, important that the quantizer output words remain unchanged. This presents the problem of changing the characteristics of a quantizer without changing the output words.
In U.S. Pat. No. 4,093,962, a scheme is described in which the predictive error signal is multiplied by a controllable factor before being quantized into information codes. The factor is determined in response to a buffer status signal for varying the amplitude of the predictive error signal relative to the quantization levels of the quantizer. However, this scheme suffers the disadvantage that, where the ensemble of output words of the quantizer changes, added complexity is required in the receiver in order to respond to such changes, as already discussed.