The transmission of television signals in our society is widespread. The type of television transmission most familiar to the public is conventional broadcast television which occurs on VHF and UHF television channels. These television channels each have an assigned bandwidth of 6 megahertz (MHz). In some areas of the country, it would be desirable to have additional channel capacity available, as by the use of narrower channel bandwidth. While it is technologically feasible to significantly reduce the bandwidth required for conventional television broadcasting by modern coding methods, the enormous cost of changing millions of television receivers to accommodate this coding is prohibitive.
In addition to terrestrial broadcast, there are many other uses of broadcast or point-to-point transmitted television in our society. For example, international satellite television links transmit live programs around the world, television networks distribute network programming to their affiliates, and weather and earth-resource satellites transmit television signals representing their pictures. Furthermore, video teleconferencing and facsimile transmission of newspapers and printed material are receiving increasing attention. In many of these applications, it is highly desirable to reduce the required transmission bandwidth or data rate to the minimum possible, in order that a satellite or other transmission link may carry the maximum number of individual television pictures. A large body of art has arisen which is directed toward schemes for coding television signals in various manners to take advantage of the redundancy of the television signals for data rate reduction, as described for example in the article "Picture Coding: A Review" by Netravali et al., published at pages 366-406 of the Proceedings of the IEEE, Volume 68, No. 3, March 1980.
According to Netravali, in addition to pulse code modulation (PCM, often known simply as digital television signals), coding is classified in the major categories of (a) transform coding, (b) interpolate/extrapolate coding, (c) predictive coding and (d) miscellaneous coding. Pulse code modulation merely transforms the television signal into a digital signal, which, in general, is not a bandwidth efficient code. Transform coding breaks the television signal into blocks of data which may be considered to be subpictures, and represents the subpictures as linear combinations of certain standard sub-pictures. The proportion of each standard picture is termed a coefficient. The interpolate/extrapolate coder attempts to send certain samples to the receiver and to either interpolate or extrapolate the remainder of the samples. The miscellaneous schemes include conditional replenishment, in which individual line element sample signals from a field of information are compared with the corresponding line elements in the previous field, and the difference therebetween is tested against a fixed threshold. If the difference exceeds the threshold value the new value is encoded and transmitted to a receiving station, along with an appropriate address code, as described in U.S. Pat. No. 4,541,012 issued Sep. 10, 1985, to Tescher. In general, conditional replenishment techniques are not optimum because the addresses of the transmitted samples must be transmitted.
The predictive coding technique is effective for reducing the data rate. In predictive coding, the transmitter generates a difference or error signal for transmission to the receiver which represents the difference between a current data word representing a picture element (pixel or pel) which the transmitter is currently receiving and a reference or "predicted" data word representing a pixel which is generated by the receiver. It should be noted that the word "pixel" is often used to denote either the data word representing a pixel or the value of the data word by which a pixel is represented. The transmitter knows the value of the predicted data word or pixel in a predictive coding system, because the transmitter includes a prediction circuit which is identical to the prediction circuit in the receiver which is producing the predicted pixel. The predicted reference pixel is often a corresponding pixel from a previous frame, or a weighted linear combination of pixels lying near the corresponding pixel of either the current or the previous frame. In this context, "near" means close physical proximity in the two-dimensional picture or raster of which the pixels are a part.
FIG. 1 illustrates in block diagram form a communication system using prior art predictive coding techniques. In FIG. 1, a transmitter 10 communicates by way of a narrow bandwidth data channel 30 with a receiver 38. Transmitter 10 is coupled to a source 12 of frame-sequential, line-scanned analog television signals which applies the analog television signals to an analog-to-digital converter (ADC) 14 in transmitter 10. ADC 14 samples the analog signals, quantizes them (represents the infinite range of values by a finite set of values) and digitizes them (represents each value of the set by a different digital number) to form digital signals which are made available on a conductor 15. Those skilled in the art understand that digital signals may be in either serial or parallel form, and that serial digital signals may be carried on a single conductor (together with its associated ground), while parallel signals must be carried by a set (a plurality) of conductors. Since this is well known, no distinction is made hereinafter between single conductors and sets of conductors, unless relevant to the discussion. The digital signals produced by ADC 14 on conductor 15 are applied to the noninverting (+) input terminal of a subtracting circuit or subtractor 16 which receives a predicted signal from conductor 25 at its inverting (-) input terminal. The predicted signal applied to the inverting input terminal of subtractor 16 is subtracted from the current value of the signal then being applied to the noninverting input terminal of subtractor 16 from conductor 15. A difference signal is generated at the output of subtractor 16. The difference signal is often known as an error signal. Since ADC 14 quantized the signal, the error signal at the output of subtractor 16 is also quantized.
While not absolutely necessary to an understanding of predictive coding systems and not essential to operation of predictive coding systems, a coarse quantizer illustrated as a block 18 is often coupled to the output of subtractor 16 to coarsely quantize the difference signal into a number of "bins". The bin is itself represented by a digital number, so the output of quantizer 18 on conductor 19 is a quantized difference signal, just as is the signal on conductor 17. The term difference (or error) signal hereinafter refers to the difference (error) signal on either conductors 17 or 19, without regard to the magnitude of the quantizing steps.
The difference or error signal on conductor 19 is applied to a predictor loop designated generally as 20. Predictor loop 20 is a replica of the circuit 40 in receiver 38 which regenerates each pixel to be displayed in succession from the signal transmitted over channel 30. Predictor loop 20 includes a summer or adder 22 which receives the difference or error signal from conductor 19. Adder 22 adds to the difference or error signal the delayed value of the predicted signal received from conductor 25 to produce a current or new predicted signal which is coupled by a conductor 23 to a predictor and delay circuit 24. Predictor and delay circuit 24 delays the new predicted signal for a predetermined length of time, and may perform other processing steps, as mentioned, such as averaging together nearby pixels. For example, the delay associated with predictor and delay circuit 24 may be one frame interval. A delay magnitude of one frame interval indicates that the intensity value of a pixel of a frame is generally expected to be the same as the value of the corresponding pixel of the preceding frame. For a still picture, this will be true for every pixel. Even in a picture having some motion, it will be true for many pixels. The new value of the predicted signal appearing on conductor 23 is a current predicted signal, which is delayed by the frame interval in predictor delay circuit 24 to become a delayed predicted signal on conductor 25. The delayed predicted signal on conductor 25 is applied to the inverting input terminal of subtractor 16 and to the input terminal of adder 22, as mentioned. As described, each pixel is characterized by a single value, which may be considered to be the luminance of a monochrome (black-and-white) picture. Those skilled in the art will realize that it may also represent the intensity of any one of a plurality of components of a color signal.
As mentioned, coarse quantizer 18 causes all input values of the difference signal to be coarsely quantized into "bins". The number of such non-zero bins is often a power of 2 such as 16 or 32. In addition to these non-zero bins, there is a center or "coring" bin into which difference signals having zero or near zero magnitude will fall, and will appear as a zero at the output of the quantizer. The designer expects that a large number, or all, of the difference pixels will fall into this coring bin. When the differences are at or near zero, this indicates that the predicted signal produced on conductor 23 at the output of adder 22 is a faithful representation of the image or picture currently to be transmitted. Subtractor 16 ordinarily receives digital signals which represent each pixel of the image to be transmitted by digital words having an arithmetic value (ordinarily expressed as the decimal value corresponding to the digital value). The arithmetic value is nominally equal to the normalized image luminance at the pixel location. The delayed predicted signal applied to the inverting input terminal of subtractor 16 from predictor and delay circuit 24 must, in order to be consistent with the digital signal representing the image which is applied to the noninverting input terminal, also represent the predicted image by the arithmetic value. Similarly, adder 22 must receive at its inputs the arithmetic value of each pixel pair to be added. The arithmetic value of each pixel is represented by the sequence of binary (two-level) logic levels (high and low, or logic one and logic zero) which appear simultaneously on each of a plurality of conductors (one "bit" for each conductor). 
For television signals, the number of conductors 15 connected to the non-inverting input of subtractor 16 is often eight, and the arithmetic value of a pixel is represented by one of 2^8 or 256 possible numbers, each separate possible value corresponding to a unique distribution of logic highs and logic lows on the eight conductors. The result of the subtraction process in subtractor 16 is such that the resulting signal on conductor 17 is represented by 9 bits, or 512 possible numbers. However, the function of the coarse quantizer is to reduce the number of possible levels of the difference signal to a smaller value than 512, as for example 16 non-zero bins plus the coring bin. Thus, all difference pixel arithmetic values ranging from (for example) 0 to 5 are forced by quantizer 18 to fall into the coring bin, and values ranging from 6 to 10 are forced into bin 1. Difference pixel arithmetic values of negative sign are also assigned bins. Since there are only 16 non-zero bins in this example, they can be represented by as few as 4 conductors (4 bits). If four-bit digital numbers are used to represent bins, the bin numbers are not actual arithmetic values, and cannot be applied directly to adder 22. The desire to process relatively small 4-bit "bin" numbers, coupled with the need to apply proper arithmetic values to adder 22, often results in an arrangement (not illustrated) in which the coarse quantizer has two output conductor sets, one set coupled to the adder which has a large number of conductors or bits (such as nine) for representing actual arithmetic values, and the other set having fewer conductors or bits (3 or 4) for coupling the corresponding "bin" number to a "coder".
The close relationship between the arithmetic value and the bin value is known to those in the art, and the description hereinafter assumes that the quantizer has a single output which produces arithmetic values in the form of parallel 8-bit difference signals which are used by both the adder and the coder. However, the invention is not so limited, and may be used in systems in which bin numbers are used.
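The relationship between bin numbers and arithmetic values may be illustrated by the following minimal sketch. Only the coring range (0 to 5) and the range of bin 1 (6 to 10) are taken from the example above. The uniform bin width, the even split of the 16 non-zero bins between positive and negative differences, the use of the bin midpoint as the arithmetic value, and the signed bin number returned by the hypothetical function quantize are all assumptions made for illustration.

```python
CORE_MAX = 5        # |difference| of 0..5 falls in the coring bin (from the text)
BIN_WIDTH = 5       # assumed uniform width: 6..10 is bin 1, 11..15 is bin 2, ...
BINS_PER_SIGN = 8   # assumed: 16 non-zero bins, half for each sign


def quantize(diff):
    """Return (bin_number, arithmetic_value) for a signed difference value.

    bin_number is a signed index used here only for readability; a real
    coder would map it onto a small unsigned code.
    """
    magnitude = abs(diff)
    if magnitude <= CORE_MAX:
        return 0, 0  # coring bin: treated as "no change"
    # Index of the bin holding this magnitude; the outermost bin is open-ended.
    index = min((magnitude - CORE_MAX - 1) // BIN_WIDTH + 1, BINS_PER_SIGN)
    low_edge = CORE_MAX + 1 + (index - 1) * BIN_WIDTH
    representative = low_edge + BIN_WIDTH // 2  # assumed: midpoint of the bin
    sign = 1 if diff > 0 else -1
    return sign * index, sign * representative


for d in (0, 5, 6, 10, 11, -7, 255):
    print(d, quantize(d))
```

In this sketch the first element of each result corresponds to the small "bin" number that would be passed to the coder, and the second element corresponds to the arithmetic value that would be applied to the adder.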
The difference or error signal on conductor 19 is applied, as mentioned, to a coding circuit illustrated as a coder 26. Coder 26 encodes the difference signal in known fashion, as by run-length coding and/or Huffman coding. Run-length coding drastically reduces the number of bits which are required to be transmitted over data channel 30, by counting the number of successive pixels from coarse quantizer 18 which are at or near zero value (which are in the coring bin). In the above example, pixel amplitudes or values from 0 to 5 are in the first or coring bin, and are assumed to be near enough to the predicted value so that the difference signal can be zero. If prediction and delay circuit 24 is very effective, and especially if there is little motion in the television scene, it produces a signal which is very similar to the signal currently to be transmitted, so the differences are mostly zero. If there are long runs (for example, A pixels in length) of zero-value differences, the run of A pixels can be represented by a single codeword which means "the current image is the same as the predicted image for these A pixels". Thus, one codeword of, say, 20 bits, can represent any number of 9-bit pixel difference values. If the run length is for example 100 pixels, the amount of data required to be transmitted to represent the image is reduced from 900 bits (9 bits per pixel × 100 pixels) to 20 bits (the number of bits in a representative maximum-length codeword). In addition to signals representing zero run lengths, signals representing the amplitudes of at least some difference pixels must be sent over channel 30 to the receiver. Such amplitudes are often coded by Huffman coding, in which the frequency of occurrence of various amplitudes or bins is evaluated, and codewords are assigned to each amplitude, with the codewords being shorter for the more frequently-occurring values and longer for infrequently-occurring amplitudes.
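The run-length arithmetic of the foregoing example may be checked with the following sketch. The function zero_runs and the symbol names "run" and "amp" are hypothetical; the 9-bit difference width and the 20-bit run codeword are the figures given in the example, and every run is assumed here to cost one maximum-length codeword.

```python
def zero_runs(diffs):
    """Split quantized differences into ("run", n) and ("amp", value) symbols."""
    symbols = []
    run = 0
    for d in diffs:
        if d == 0:
            run += 1            # extend the current run of coring-bin pixels
        else:
            if run:
                symbols.append(("run", run))
                run = 0
            symbols.append(("amp", d))
    if run:
        symbols.append(("run", run))  # the codeword is emitted at the end of the run
    return symbols


BITS_PER_DIFF = 9        # width of an uncoded difference value (from the text)
RUN_CODEWORD_BITS = 20   # representative maximum-length codeword (from the text)

still_line = [0] * 100
symbols = zero_runs(still_line)
uncoded_bits = BITS_PER_DIFF * len(still_line)
coded_bits = RUN_CODEWORD_BITS * len(symbols)
print(symbols, uncoded_bits, coded_bits)

# A line with two isolated non-zero differences breaks into shorter runs.
print(zero_runs([0, 0, 8, 0, -13, 0, 0, 0]))
```

The second call shows why motion is costly: each non-zero difference both requires an amplitude codeword of its own and terminates the run preceding it.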
It should be clear that the rate of generation of codewords in coder 26 is highly variable and depends upon the picture which is represented by the signals being coded. In a completely still image, there will be very long runs of zero difference signals, which can be represented by a single codeword, which occurs at the end of the run. On the other hand, if the image is highly variable, as for example when a transition occurs between two very different scenes, each in violent motion, there will be few long runs of zeroes, and many unlike amplitude values will occur, requiring frequent generation of relatively long Huffman codewords.
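The dependence of codeword length on frequency of occurrence, which underlies the variable rate just described, may be illustrated by the following sketch of a textbook Huffman length assignment. The function huffman_code_lengths and the sample amplitude values are hypothetical and are not taken from any particular coder.

```python
import heapq
from collections import Counter


def huffman_code_lengths(values):
    """Return {value: codeword length in bits} for a Huffman code over values."""
    counts = Counter(values)
    if len(counts) == 1:
        return {next(iter(counts)): 1}
    # Heap entries are (count, tie_breaker, {value: depth}); the unique
    # tie_breaker ensures the dictionaries themselves are never compared.
    heap = [(n, i, {v: 0}) for i, (v, n) in enumerate(counts.items())]
    heapq.heapify(heap)
    tie = len(heap)
    while len(heap) > 1:
        n1, _, a = heapq.heappop(heap)   # the two least frequent groups...
        n2, _, b = heapq.heappop(heap)
        merged = {v: depth + 1 for v, depth in {**a, **b}.items()}
        heapq.heappush(heap, (n1 + n2, tie, merged))  # ...are joined one level deeper
        tie += 1
    return heap[0][2]


# Assumed sample: small amplitudes are frequent, large amplitudes are rare.
amplitudes = [0] * 8 + [8] * 4 + [-8] * 2 + [13] + [-13]
lengths = huffman_code_lengths(amplitudes)
print(lengths)
```

With such an assignment, a scene producing mostly rare, large amplitudes draws predominantly on the longest codewords, which is the condition under which the output rate of coder 26 is highest.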
In order to eliminate the variability of the data rate, a rate buffer 28 is coupled to coder 26 for receiving or being laden (loaded) with coded difference data at a variable rate, for temporarily storing the coded difference data, and for applying the coded difference data at a constant rate through channel 30 to receiver 38. This type of buffer is often known as a first-in, first-out (FIFO) memory.
Receiver 38 receives coded difference data at a constant rate from channel 30, and stores the coded difference data in a rate buffer 48. Data is supplied therefrom as required to a decoder 46, which accepts the run length and Huffman-coded difference data at a variable rate, and decodes it into difference or error signals available on conductor 59, exactly corresponding to the signals which were available on conductor 19 of transmitter 10 (except for transmission errors, which are not considered herein). The decoded difference signals are applied to an input terminal of a summer or adder 42 of a predictor loop designated generally as 40. Adder 42 adds together the difference signal appearing on conductor 59 and the delayed predicted signal appearing on conductor 45, to produce a new predicted signal on a conductor 43, which is applied to a digital-to-analog converter (DAC) 54 for generating an analog signal, which is applied to a television display circuit illustrated as a block 52 for display of the picture. The new predicted signal is also applied from conductor 43 to a predictor and delay circuit 44 which is identical to predictor and delay circuit 24 of transmitter 10. Since predictor and delay circuit 44 is identical to predictor and delay circuit 24, the new predicted signal on conductor 43 appears on conductor 45 after a corresponding delay, which in the example is one frame interval. The resulting delayed predicted signal on conductor 45 is applied to adder 42, as mentioned.
The signal on conductor 19 of transmitter 10 and the signal on conductor 59 of receiver 38 are identical (except for a time lag due to the time required for transmission therebetween), because decoder 46 performs a transformation which is the precise inverse of that performed by coder 26. Difference signals applied by conductor 19 to adder 22 are therefore identical to the signals applied from conductor 59 to adder 42, and since predictor 20 is identical to predictor 40, the new predicted signals produced on conductors 23 and 43 are identical, except for the transmission time lag. Since predictor and delay circuits 24 and 44 are identical, and each receives the new predicted signal at its input, each produces identical delayed predicted signals on its output conductor (25 and 45). Thus, transmitter 10 produces on conductor 23 a signal identical to that which receiver 38 currently produces for display. For this purpose, the term "currently" does not refer to concurrence in time, but rather to concurrence of television frame number and raster position. Consequently, transmitter 10 always has available to it at the inverting input of subtractor 16 a delayed predicted signal identical to that generated by receiver 38 for the corresponding pixel of the previous frame. Therefore, the difference signal being transmitted at any moment from transmitter 10 is the difference between the television signal then being applied on conductor 15 to subtractor 16, from which is subtracted a signal corresponding to that produced and displayed by receiver 38 for the previous frame. It should be noted that during system design experimentation relating to predictor and quantizer effects, a receiver 38 may not be used; the signal on conductor 23 of the transmitter is considered to be a replica of the signal produced on conductor 43 by such a receiver.
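The identity of the signals in the two predictor loops may be illustrated by the following sketch, which models each frame as a short list of pixel values and uses a one-frame delay as the predictor. The functions coarse, transmit and receive are hypothetical. The quantizer is reduced, as an assumption, to a coring operation with no other bins, and coding, buffering and the channel are omitted, so that the difference values pass unchanged from transmitter to receiver.

```python
def coarse(diff, core=5):
    """Assumed stand-in for quantizer 18: differences of magnitude <= core become zero."""
    return 0 if abs(diff) <= core else diff


def transmit(frames):
    """Return (difference frames sent, transmitter's own reconstruction of each frame)."""
    predicted = [0] * len(frames[0])  # delayed predicted signal (conductor 25)
    sent, reconstructed = [], []
    for frame in frames:
        # Subtractor 16 followed by the quantizer.
        diffs = [coarse(x - p) for x, p in zip(frame, predicted)]
        # Adder 22: new predicted signal, held for one frame interval.
        predicted = [p + d for p, d in zip(predicted, diffs)]
        sent.append(diffs)
        reconstructed.append(predicted)
    return sent, reconstructed


def receive(diff_frames):
    """Rebuild the displayed frames from the received difference frames."""
    predicted = [0] * len(diff_frames[0])  # delayed predicted signal (conductor 45)
    shown = []
    for diffs in diff_frames:
        predicted = [p + d for p, d in zip(predicted, diffs)]  # adder 42
        shown.append(predicted)
    return shown


frames = [[100, 100, 100, 100], [100, 102, 100, 100], [100, 102, 180, 100]]
diffs, tx_reconstruction = transmit(frames)
rx_shown = receive(diffs)
print(diffs)
print(tx_reconstruction == rx_shown)
```

In this sketch the small change from 100 to 102 falls within the coring range and is never transmitted, so both loops hold 100 for that pixel. The displayed picture therefore differs slightly from the source, but the transmitter and receiver remain in agreement with each other.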
Predictive systems such as that illustrated in FIG. 1 can achieve very large reductions in data rate, especially on still pictures. However, when the picture has motion, the predicted signal may at times be most unlike the actual current value. When there is substantial motion in the television picture, the difference or error signals on conductor 19 tend to be large in value and to change rapidly. As mentioned, run length coding tends to be relatively less effective in reducing data rate, and Huffman coding tends to produce relatively longer code words. Since the data rate of channel 30 is preestablished and rate buffer 28 of transmitter 10 can only transmit data at the maximum rate allowed by channel 30, it is possible for rate buffer 28 to become overfull or to "overflow" when the average size of the code word length is large, and code words are applied to the rate buffer for a long period of time at a high rate. The terms "overfull" and "overflow," may not be sufficiently descriptive. The rate buffer is "laden" or loaded by the difference between the variable flow of code words into the buffer and the fixed flow of code words out of the buffer, which forms a "lading" or loading which varies with time. The capacity of the buffer is the maximum lading which it can hold. The lading may from moment to moment vary from zero (empty buffer) to the maximum capacity of the buffer (corresponding to a full buffer). Any attempt to further increase the lading beyond the maximum capacity, even by one word, creates an "overflow" condition. "Underflow" occurs when the buffer writes or attempts to write to the outside world a number of bits which exceeds the number of bits in the lading, with the result that meaningless zero values are transmitted as meaningful data. When the lading is such that underflow or overflow occurs, some code words may not be stored in rate buffer 28, or are corrupted, and are therefore lost. 
The loss or corruption of code words is very serious in a predictive encoding type of communication system, and leads to substantial errors in data transmission and consequent distortions of the transmitted picture.
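The behavior of the lading under a variable input rate and a fixed output rate may be illustrated by the following sketch. The function simulate_lading, the tick-by-tick model, and all numerical values are assumptions chosen for illustration; lading is counted in bits.

```python
def simulate_lading(bits_in_per_tick, bits_out_per_tick, capacity, start=0):
    """Track buffer lading; return (history, overflow ticks, underflow ticks)."""
    lading = start
    history, overflow_ticks, underflow_ticks = [], [], []
    for tick, bits_in in enumerate(bits_in_per_tick):
        lading += bits_in                  # variable flow in from the coder
        if lading > capacity:
            overflow_ticks.append(tick)    # bits beyond capacity are lost
            lading = capacity
        lading -= bits_out_per_tick        # fixed flow out to the channel
        if lading < 0:
            underflow_ticks.append(tick)   # channel is fed meaningless bits
            lading = 0
        history.append(lading)
    return history, overflow_ticks, underflow_ticks


# Assumed input: a quiet scene, then a burst of long codewords, then quiet again.
bits_in = [2, 2, 2, 20, 20, 20, 20, 2, 2, 2]
history, overflows, underflows = simulate_lading(
    bits_in, bits_out_per_tick=8, capacity=32, start=16
)
print(history, overflows, underflows)
```

The same buffer underflows during the quiet interval and overflows during the burst, which is why a single fixed capacity cannot by itself accommodate both extremes of picture content.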
It should be noted that the quantizer (18) in these loops is recognized as being a nonlinear element, which makes rigorous analysis difficult. Furthermore, the quantizer may have quantizing steps of different sizes, which increases the nonlinearity. However, ignoring the nonlinearity in the analysis produces results which, while not rigorous, indicate trends, and which can therefore be useful.
A known method for stabilizing the lading of the rate buffer (and therefore preventing exceeding the capacity of the buffer by underflow or overflow) is to sense the occupancy or the amount of lading of the rate buffer, and to generate a control signal in response thereto which is applied to at least one of the elements of the predictive coding system which produces the coded difference signal to reduce the rate of generation of the code words when the control signal indicates that the buffer is above or below a certain lading level.
Copending patent application Ser. No. 913,692, filed Sep. 30, 1986, entitled "Rate Buffer Control Of Difference Signal Decimation And Interpolation For Adaptive Differential Pulse Code Modulator", and Ser. No. 920,294, filed Oct. 17, 1986, and entitled "Rate Buffer Control Of Predicted Signal Decimation and Interpolation For Adaptive Differential Pulse Code Modulator", both in the name of A. A. Acampora, describe adaptive control of filters for reducing image resolution, and decimators and interpolators for reducing data rate, both under the control of the fill or occupancy of the rate buffer, for the purpose of prevention of overflow of the rate buffer at the transmitter. U.S. Pat. No. 4,093,962 issued June 6, 1978, to Ishiguro et al., describes adaptive control of the amplitude of the difference signal in response to rate buffer occupancy. U.S. Pat. No. 3,670,096 issued June 13, 1972, to Candy et al., describes a system in which a rate buffer overload signal causes encoding to stop for a period, and also causes cropping of the picture edges.
When a still image has been transmitted for a substantial time, the difference signals tend towards zero, and the encoding becomes very efficient. This efficiency results in transmission of relatively short codewords at infrequent intervals. If the scene of the image changes drastically and thereafter contains motion, large numbers of relatively large codewords are generated by coder 26, as described. These codewords are generated at the pixel clock rate, and may include codewords 10, 13 or as many as 20 bits in length. Thus, during intervals in which a very high degree of image motion occurs, coder 26 generates codewords which may have an average number of bits equal to or greater than the number of bits in uncoded, ordinary pulse code modulation (PCM). Since the codewords are relatively long and the rate of transmission over channel 30 is fixed by a modem or channel clock rate, relatively few codewords per unit time are transmitted from transmitter 10 and received at receiver 38. Consequently, during periods of intense image motion, rate buffer 48 receives, on average, long codewords which represent few pixels. An unobvious problem results from this situation. The bits of the codewords must be sent from rate buffer 28 over channel 30 one at a time or serially. Thus, each long codeword may take as long as 20 modem or channel clock intervals to be transmitted through the channel and loaded into rate buffer 48 at the receiver. This time is the time required to clock rate buffers 28 and 48, excluding any transit delay time through the channel. Rate buffer 48 must form the serially received long codewords into parallel format, and supply them to decoder 46 as they are demanded by decoder 46. The demand is at the pixel clock rate, which is much higher than the channel clock rate. Decoder 46 must decode codewords supplied to it from buffer 48.
The codewords may range in size up to 20 bits, and are supplied in parallel from rate buffer 48 to decoder 46. Under the described condition of intense motion, rate buffer 48 in receiver 38 may run out of codewords or underflow, and thereby not be able to supply codewords to decoder 46 fast enough to keep up with the demand. This occurs because, when codewords are long, reception by buffer 48 of each coded pixel may require as many as 20 channel clock intervals. Parallel processing cannot shorten this time, because the bits of each codeword are received at receiver 38 and rate buffer 48 sequentially, and the codeword cannot be supplied by rate buffer 48 to decoder 46 until the bits have all arrived. The basic reason that rate buffer 48 runs out of codewords is that the long codewords most often carry information relating to one or a very few pixels. Thus, the long codewords require a long transmission time (up to 20 channel clock intervals), but they tend to be read out of buffer 48 very quickly, namely at the pixel clock rate, and up to 20 at a time. Because of their length, the long codewords tend to occupy more buffer space than short codewords. Even if buffer 48 is relatively full of these long codewords, they tend to be read out so quickly (as much as 20 bits at a time, during each pixel clock cycle) that the buffer occupancy falls during the interval in which they are read out. If the rate buffer does not contain a sufficient backlog lading, the reading of the large codewords will deplete its contents to an empty condition, and then either attempt to underflow, causing errors, or be unable to supply the next pixel, which also results in errors. Since receiver 38 must generate pictures by sequentially producing pixels at the clock rate, rate buffer 48 must always have a codeword available on demand from the decoder when it is needed to produce the next pixel. 
If rate buffer 48 is empty at the moment of demand by decoder 46, a codeword may not become available for up to another 20 clock (pixel) intervals, and even when it completes its arrival at buffer 48 and is supplied to decoder 46, it may only satisfy the demand for one pixel or clock interval. Thus, the long codewords which occur during rapid motion of the image represented by the scene being televised may result in receiver 38 being unable to replicate in predictor and delay circuit 44 the signal stored in predictor and delay circuit 24 of the transmitter, because pixels have been corrupted or missed. It is very undesirable to allow the receiver to corrupt or to miss pixels, because in that event the predictors at the transmitter and receiver do not contain the same information, and the image displayed at the receiver will thereafter not correspond to the image being transmitted.
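The manner in which long codewords representing few pixels exhaust the receiver rate buffer may be illustrated by the following sketch. The function first_underflow is hypothetical, and the model rests on several assumptions: the channel delivers bits without interruption from time zero, each codeword is described only by its length and the number of pixels it represents, the decoder begins demanding codewords after a fixed startup delay, and the channel is assumed to deliver one half bit per pixel interval. Time is measured in pixel intervals.

```python
from fractions import Fraction


def first_underflow(codewords, channel_bits_per_pixel, startup_delay):
    """Return the index of the first codeword not fully received when demanded.

    codewords is a list of (length_in_bits, pixels_represented) pairs.
    Returns None if every codeword arrives before the decoder needs it.
    """
    bits_received = 0
    pixels_decoded = 0
    for index, (bits, pixels) in enumerate(codewords):
        bits_received += bits
        # Time at which the last bit of this codeword reaches the buffer.
        arrival = Fraction(bits_received) / channel_bits_per_pixel
        # Time at which the decoder needs this codeword for the next pixel.
        demand = startup_delay + pixels_decoded
        if arrival > demand:
            return index
        pixels_decoded += pixels
    return None


RATE = Fraction(1, 2)              # assumed: one channel bit per two pixel intervals
still = [(20, 100)] * 10           # 20-bit codewords, each covering a 100-pixel run
motion = [(20, 1)] * 10            # 20-bit codewords, each covering a single pixel

print(first_underflow(still, RATE, startup_delay=50))
print(first_underflow(motion, RATE, startup_delay=50))
print(first_underflow(motion, RATE, startup_delay=400))
```

Under these assumptions the still sequence never starves the decoder, the motion sequence starves it at the second codeword, and the starvation is avoided only by a much longer startup delay, that is, by a larger backlog in the buffer.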
The above mentioned problem can be solved by keeping a long backlog, for example many frames, stored in rate buffer 48. For example, a buffer having a maximum capacity of about 20 coded average frames might keep an average backlog of 10 frames. Such a long backlog can guarantee that even for several rapid-motion frames, receiver rate buffer 48 will always have pixels available, even though its occupancy drops. However, this introduces another problem, that of delay in the transmission time. In a teleconferencing system using a satellite transmission path, a delay in each rate buffer of 10 television frame intervals, each 1/30 second long, results in an undesirably long net system delay. This can be explained by imagining a teleconferencing system in which a question is asked at one transmitter-receiver, and a reply is expected from the user of the other transmitter-receiver. The buffer at the first transmitter-receiver of the system delays by 1/3 second, and the round-trip delay to and from the satellite is about 1/4 second. Thus, the image generated (and the associated sound bearing the question) at the first transmitter-receiver arrives at the second transmitter-receiver after about 7/12 of a second, but is not displayed (or heard) for another 1/3 second, so that total delay between generation of an image and its display is about one second. The return image and sound also takes about one second to be reproduced at the first transmitter-receiver. Thus, there is roughly a two-second inherent delay in such a system which is attributable to the passage of the signal through four rate buffers, even for a relatively moderate 10-frame storage in the rate buffers. In a conversational context, this is a very undesirable delay.
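The delay figures of the foregoing example may be verified by the following short calculation, which uses exact fractions of a second. The variable names are hypothetical; the numerical values are those given in the example.

```python
from fractions import Fraction

FRAME_INTERVAL = Fraction(1, 30)   # seconds per television frame
BACKLOG_FRAMES = 10                # average backlog held in each rate buffer

buffer_delay = BACKLOG_FRAMES * FRAME_INTERVAL   # delay in one rate buffer
satellite_delay = Fraction(1, 4)                 # to and from the satellite

arrival = buffer_delay + satellite_delay         # image reaches the far receiver
one_way = arrival + buffer_delay                 # image is displayed at the far end
question_and_reply = 2 * one_way                 # question out, reply back

print(buffer_delay, arrival, one_way, question_and_reply)
print(float(one_way), float(question_and_reply))
```

The exact one-way figure is 11/12 second and the exact figure for question and reply is 11/6 seconds, consistent with the approximate values of one second and two seconds stated above. Of the 11/6 seconds, 4/3 seconds is contributed by the four rate buffers and only 1/2 second by the satellite path.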
There is an additional unobvious problem which can occur even when there is a substantial backlog of information in the receiver rate buffer. This problem is related to the speed with which decoder 46 can decode long codewords. In principle, the decoder can operate at the pixel clock rate. In practice, it may be desirable to design decoder 46 to operate in such a manner that its average decoding speed is sufficient for typical picture content, but the decoding of very long codewords takes more than one pixel duration, and to further provide a buffer memory to temporarily store the decoded difference words and to provide them to the receiver adder at the pixel rate. Thus, even though an occasional long codeword takes more than a pixel interval to decode, the decoder buffer supplies the difference words at the pixel rate during the lag. However, for images with violent motion, a frame will arrive in which almost every coded difference signal received by decoder 46 is long, which will result in the decoder lagging far behind the decoder buffer, to the point that the decoder buffer may underflow or be unable to supply difference signals at the pixel rate, which as mentioned results in erroneous displayed pictures. This problem can also be ameliorated by keeping a large backlog in the decoder buffer, but this adds to the conversational delay.
Thus, there is a conflict between the need for short delays and the need to keep the receiver and/or coder buffer from running out of pixels. It is highly desirable to minimize the number of frames stored in each rate buffer in order to minimize the overall system delay. Thus, it is very undesirable to ameliorate the problem of running out of pixels in the rate buffer by increasing the average number of frames stored in the rate buffers.