The present invention relates to the compression of digital data, and more particularly to a method and apparatus for refreshing motion compensated digitized video signals.
Television signals are conventionally transmitted in analog form according to various standards adopted by particular countries. For example, the United States has adopted the standards of the National Television System Committee ("NTSC"). Most European countries have adopted either PAL (Phase Alternating Line) or SECAM standards.
Digital transmission of television signals can deliver video and audio services of much higher quality than analog techniques. Digital transmission schemes are particularly advantageous for signals that are broadcast by satellite to cable television affiliates and/or directly to home satellite television receivers. It is expected that digital television transmitter and receiver systems will replace existing analog systems just as digital compact discs have largely replaced analog phonograph records in the audio industry.
A substantial amount of digital data must be transmitted in any digital television system. This is particularly true where high definition television ("HDTV") is provided. In a digital television system, a subscriber receives the digital data stream via a receiver/descrambler that provides video, audio, and data to the subscriber. In order to most efficiently use the available radio frequency spectrum, it is advantageous to compress the digital television signals to minimize the amount of data that must be transmitted.
The video portion of a television signal comprises a sequence of video images (typically "frames") that together provide a moving picture. In digital television systems, each line of a video frame is defined by a sequence of picture elements ("pixels"), each of which is represented by digital data bits. A large amount of data is required to define each video frame of a television signal. For example, 7.4 megabits of data are required to provide one video frame at NTSC resolution. This assumes a 640 pixel by 480 line display is used with 8 bits of intensity value for each of the primary colors red, green, and blue. High definition television requires substantially more data to provide each video frame. In order to manage this amount of data, particularly for HDTV applications, the data must be compressed.
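The frame-size figure cited above follows directly from the stated display parameters, as this short calculation illustrates (the variable names are purely illustrative):

```python
# Frame size for a 640 x 480 display with 8 bits per primary color (R, G, B).
width, height = 640, 480
bits_per_pixel = 3 * 8                  # red, green, and blue at 8 bits each
bits_per_frame = width * height * bits_per_pixel
print(bits_per_frame)                   # 7,372,800 bits, i.e. about 7.4 megabits
```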
Video compression techniques enable the efficient transmission of digital video signals over conventional communication channels. Such techniques use compression algorithms that take advantage of the correlation among adjacent pixels in order to derive a more efficient representation of the important information in a video signal. The most powerful compression systems not only take advantage of spatial correlation, but can also utilize similarities among adjacent frames to further compact the data.
Motion compensation is one of the most effective tools for accounting for and reducing the amount of temporal redundancy in sequential video frames. One of the most effective ways to apply motion compensation in video compression applications is by differential encoding. In this case, the differences between two consecutive images (e.g., "frames") are attributed to simple movements. The encoder estimates or quantifies these movements by observing the two frames and sends the results to the decoder. The decoder uses the received information to transform the first frame, which is known, in such a way that it can be used to effectively predict the appearance of the second frame, which is unknown.
The encoder reproduces the same prediction frame as the decoder, and then sends the differences between the prediction frame and the actual frame. In this way, the amount of information needed to represent the image sequence can be significantly reduced, particularly when the motion estimation model closely resembles the frame to frame changes that actually occur. This technique can result in a significant reduction in the amount of data that needs to be transmitted once simple coding algorithms are applied to the prediction error signal. An example of such a motion compensated video compression system is described by Ericsson in "Fixed and Adaptive Predictors for Hybrid Predictive/Transform Coding", IEEE Transactions on Communications, Vol. COM-33, No. 12, December 1985.
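The differential-encoding loop described above can be sketched in simplified form. The one-dimensional "frames" and the whole-pixel `motion_compensate` model below are illustrative assumptions, standing in for a real block-matching motion estimator:

```python
def motion_compensate(prev_frame, dx):
    """Shift a 1-D 'frame' of samples by a whole-pixel displacement (toy model)."""
    n = len(prev_frame)
    return [prev_frame[(i - dx) % n] for i in range(n)]

def encode(prev_decoded, current, dx):
    """Encoder side: form the prediction, transmit only the residual."""
    prediction = motion_compensate(prev_decoded, dx)
    return [c - p for c, p in zip(current, prediction)]

def decode(prev_decoded, residual, dx):
    """Decoder side: rebuild the same prediction, add the residual back."""
    prediction = motion_compensate(prev_decoded, dx)
    return [p + r for p, r in zip(prediction, residual)]

# A 'frame' shifted right by 2 samples and slightly brightened:
prev = [10, 20, 30, 40, 50, 60, 70, 80]
cur = [v + 1 for v in motion_compensate(prev, 2)]

residual = encode(prev, cur, 2)
assert decode(prev, residual, 2) == cur   # identical reference frames: exact match
assert all(r == 1 for r in residual)      # good motion estimate: tiny residual
```

When the motion model matches the actual frame-to-frame change, the residual carries very little energy, which is precisely why simple coding of the prediction error yields large savings.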
A problem with differential encoding is that it is impossible to ensure that the prediction signals derived independently at the encoder and decoder sites are identical at all times. Differences can arise as a result of transmission errors or whenever one of the two units is initialized. Thus, for example, a television channel change will render the prior frame data meaningless with respect to a first frame of a new program signal.
To deal with this problem, it is necessary to provide some means of periodic refreshing. Two such methods are common. The first method is to scale the prediction signal by some constant α which is less than but almost equal to one. The difference between the actual image and the scaled prediction is then computed and transmitted to the decoder as before. If the encoder and decoder images are identical at a first frame interval, then they will remain identical after the next frame interval. However, if transmission errors or initial acquisition cause the initial error to be non-zero, the error will persist into the following frame, attenuated by the constant α. Thus all errors decay by a factor of α with each passing frame, until eventually they are no longer visible. The duration of each error is therefore controlled by α. At one extreme, α can be forced toward zero in order to prevent error propagation entirely. However, this eliminates the predictive element and its associated improvement in coding efficiency. The other extreme occurs when α approaches one, in which case errors persist indefinitely. Note that if α were to exceed one, the magnitude of the errors would increase with time and the system would be unstable.
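The geometric decay of a decoder error under such a "leaky" predictor can be sketched as follows. The α value of 0.95, the initial error of 10, and the visibility threshold of 0.5 are illustrative assumptions, not values taken from the source:

```python
# A transmission error leaves the decoder off by 10 units at some pixel.
# Under a leaky predictor, each frame multiplies the residual error by alpha,
# so the error after n frames is 10 * alpha**n.
alpha = 0.95
error = 10.0
frames_until_invisible = 0
while error > 0.5:              # treat errors below 0.5 as visually negligible
    error *= alpha
    frames_until_invisible += 1
print(frames_until_invisible)   # at 30 frames/s, roughly two seconds of decay
```

This illustrates the trade-off described above: a smaller α shortens the decay but weakens the prediction, while α near one preserves coding efficiency at the cost of long-lived errors.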
Unfortunately, the efficiency of most image compression algorithms decreases markedly as α is decreased below one. Such compression algorithms usually seek to compact the information in the signal into a small number of coefficients or samples. A small reduction in α can cause the magnitude of many of these coefficients to exceed their respective transmission thresholds. In most cases, coding efficiency depends not only on the size of the coefficients, but also on the number that must be transmitted.
Another method of refreshing the image is to periodically switch from differential mode ("DPCM") to nondifferential mode ("PCM"). For example, in a thirty frame per second system, the screen could be completely refreshed at one second intervals by inserting a PCM frame after every twenty-nine DPCM frames. In this way, channel acquisition and the correction of transmission errors could be guaranteed after a delay of no more than one second. It is assumed here that the switch to PCM coding can be done without affecting the perceived quality of the reconstructed video. However, this is only possible in a variable bit rate encoding system using rate buffers to control fluctuations in the input and output data rates. Such a system is described by Chen and Pratt in "Scene Adaptive Coder", IEEE Transactions on Communications, Vol. COM-32, No. 3, March 1984. Unfortunately, the large number of bits produced by the less efficient PCM encoding is difficult for the encoder buffer to absorb, and the measures used to control it may cause visible artifacts to appear in the reconstructed image.
To overcome this problem, segments or blocks of the image can be refreshed on a distributed basis. By assigning a different counter to each segment and systematically or randomly setting the initial count for each one, it is possible to attain the same refresh interval while maintaining a constant distribution of bits. It is even possible to eliminate the counters and instead, randomly refresh each segment based on a suitable probability distribution.
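The counter-based distributed refresh described above can be sketched as follows. The segment count, the 30-frame refresh interval, and all names are illustrative assumptions chosen so that exactly one segment is PCM-coded per frame:

```python
NUM_SEGMENTS = 30        # e.g. one horizontal stripe per segment (assumed layout)
REFRESH_INTERVAL = 30    # each segment is refreshed once per 30-frame cycle

# Stagger the initial counts so refreshes are spread evenly across frames.
counters = list(range(NUM_SEGMENTS))

def segments_to_refresh():
    """Advance every counter by one frame; return the segments due for PCM coding."""
    due = []
    for seg in range(NUM_SEGMENTS):
        counters[seg] = (counters[seg] + 1) % REFRESH_INTERVAL
        if counters[seg] == 0:
            due.append(seg)
    return due

# Over one full cycle, every segment is refreshed exactly once, and the
# per-frame bit load stays roughly constant (one PCM segment per frame).
schedule = [segments_to_refresh() for _ in range(REFRESH_INTERVAL)]
assert all(len(frame) == 1 for frame in schedule)
assert sorted(s for frame in schedule for s in frame) == list(range(NUM_SEGMENTS))
```

The same refresh interval is thus achieved as with a full PCM frame, but without the one-frame burst of bits that troubles the rate buffer.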
These methods work well if the predictor frame is set to be identical to the previous frame. However, once motion compensation is introduced, a new problem arises. The motion estimator does not limit the block displacements in such a way as to prevent overlap between refreshed and nonrefreshed regions of the image. For example, if one region of the image is refreshed during the transmission of a given frame, then there will exist an adjacent region in the same frame that has not yet been refreshed but is due to be refreshed during the next frame interval. Obviously, this unrefreshed region is much more likely to contain at least one error. Therefore, if this less reliable data in the unrefreshed region is used to predict the appearance of certain segments of the next frame, then those segments will also be subject to errors. It is therefore possible that a recently refreshed region will cease to be accurate after only one frame. In a motion compensated system, this result tends to occur whenever there is movement from an unrefreshed region to a refreshed region, causing a recently refreshed segment of the decoded image to immediately diverge from the corresponding encoder segment, even though no transmission errors occur. Once again, the acquisition time and the duration of artifacts due to transmission errors can become unbounded.
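The overlap problem above can be illustrated with a toy check: a motion vector that pulls prediction data from a not-yet-refreshed region contaminates a block that was refreshed only a frame earlier. The row geometry and function names are illustrative assumptions:

```python
def prediction_contaminated(block_rows, dy, refreshed_rows):
    """True if the block's prediction reads any row outside the refreshed set.

    dy is the vertical displacement: a block occupying block_rows is
    predicted from rows (r + dy) of the previous reconstructed frame.
    """
    source_rows = {r + dy for r in block_rows}
    return not source_rows <= refreshed_rows

refreshed = set(range(0, 16))    # rows 0-15 refreshed this frame; 16-31 due next
block = set(range(8, 16))        # a block lying inside the refreshed region

assert not prediction_contaminated(block, 0, refreshed)   # no motion: safe
assert prediction_contaminated(block, 4, refreshed)       # reads rows 12-19,
                                                          # partly unrefreshed
```

Because the motion estimator is free to choose such displacements, a single moving object crossing the refresh boundary is enough to re-corrupt a freshly refreshed segment.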
It would be advantageous to provide a method for refreshing motion compensated sequential video frames that does not suffer from the above-mentioned problems. In particular, it would be advantageous to provide a solution that avoids large fluctuations in the compression rate while limiting the refresh interval to a reasonable bound. The present invention provides such a solution.