1. Field of the Invention
The invention relates to a method of prediction coding utilizing advance knowledge that is already present in a video receiver in order to describe a current frame with as little information as possible, and more particularly to an improved method of prediction coding in which the frame-to-frame changes resulting from motion in a scene depicted in a frame are detected spatially and temporally and also included in the calculation of the predicted frame, that is, prediction coding with motion compensation.
2. Prior Art
For transmission of video telephone scenes at transmission rates between 64K bit/s and 384K bit/s, two methods of motion compensation have proved to be particularly successful. The first method for motion compensation, known for example from European patent No. 0 123 616, is block-by-block motion compensation. It scans schematically through the entire current frame to be predicted (this is known as iconic frame processing), dividing the frame into a regular grid of blocks of n×n pixels and assigning to each block the frame-to-frame displacement for which a distance measure (correlation measure) between the gray value pattern of the observed block and the gray value pattern of a correspondingly displaced block in the previous frame assumes a minimum (maximum). For prediction, a block from the most recently reconstructed frame (already present at the receiver) is used, which is offset with respect to the block to be predicted by the ascertained displacement.
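The block-by-block search described above can be sketched as follows. This is only an illustrative sketch, not the method of the cited patent: the sum of absolute gray-value differences is assumed as the distance measure, an exhaustive search over a small window is assumed, and the names `sad`, `block_match`, `predict`, `n` and `search` are hypothetical.

```python
def sad(cur, prev, bx, by, dx, dy, n):
    """Sum of absolute gray-value differences between the n x n block of
    `cur` at (bx, by) and the block of `prev` displaced by (dx, dy)."""
    total = 0
    for y in range(n):
        for x in range(n):
            total += abs(cur[by + y][bx + x] - prev[by + y + dy][bx + x + dx])
    return total


def block_match(cur, prev, n=4, search=2):
    """Assign to every n x n block of `cur` the displacement (dx, dy),
    within +/- `search` pixels, that minimises the distance measure.
    Frames are lists of rows of integer gray values (assumed layout)."""
    height, width = len(cur), len(cur[0])
    vectors = {}
    for by in range(0, height - n + 1, n):
        for bx in range(0, width - n + 1, n):
            best = None
            for dy in range(-search, search + 1):
                for dx in range(-search, search + 1):
                    # Skip candidates that reach outside the previous frame.
                    if not (0 <= bx + dx and bx + dx + n <= width
                            and 0 <= by + dy and by + dy + n <= height):
                        continue
                    cost = sad(cur, prev, bx, by, dx, dy, n)
                    # Ties are broken in favour of the shorter displacement.
                    key = (cost, abs(dx) + abs(dy))
                    if best is None or key < best[0]:
                        best = (key, (dx, dy))
            vectors[(bx, by)] = best[1]
    return vectors


def predict(prev, vectors, n=4):
    """Build the predicted frame by copying, for each block, the displaced
    block from the previous (reconstructed) frame."""
    height, width = len(prev), len(prev[0])
    pred = [[0] * width for _ in range(height)]
    for (bx, by), (dx, dy) in vectors.items():
        for y in range(n):
            for x in range(n):
                pred[by + y][bx + x] = prev[by + y + dy][bx + x + dx]
    return pred
```

Note that only integer displacements are searched here, which corresponds to the whole-pixel restriction on block displacements discussed further below.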
The other method is object-related motion compensation, known, for instance, from German Patent Disclosure Document DE-OS No. 33 28 341. In principle it operates not schematically but in terms of picture content (known as semantic frame processing): an attempt is made to separate the portions of the frame sequence that have moved from those that have not moved, and to characterize them in terms of their movement behavior. In this manner, self-contained objects are obtained that are delimited from one another and have a describable movement behavior. Finally, with the aid of the description of the movement behavior thus obtained, a motion-compensated predicted frame of the current frame is provided. This method is also known as object matching. For further information on this prior art, see among others the paper entitled "Coding Television Signals at 320 and 64K bit/s" (Second Int. Tech. Symp. on Optical and Electro-Optical Appl. Science and Engineering, SPIE Conf. B594 Image Coding, Cannes, France, Dec. 1985).
Coders for the extreme reduction of the transmission rate, such as are required for the above-mentioned bit rates, must be adapted to the restricted scene material that they have to process. Accordingly they cannot be universally used, nor can they be evaluated with respect to arbitrary scenes. It is characteristic of video telephone scenes and video conference scenes that the regularity of the frame-to-frame changes varies between large-area, uniform changes, such as when zooming by increasing the focal length of the camera lens, and predominantly random changes, such as those presented by a person making large gestures with rapidly changing facial expressions, who is moreover wearing highly patterned clothing with a great amount of drapery.
Each of these two methods for motion compensation reveals its weaknesses completely when such extreme changes are coded. This will be explained below with respect to three criteria for quality.
A first criterion for the quality of motion compensation is the ratio between two square deviations from the current original frame, each averaged over all pixels: that of the motion-compensated predicted frame and that of the most recently reconstructed frame.
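As an illustration, this first criterion can be computed as in the following sketch. The direction of the ratio (predicted frame in the numerator, so that values below 1 indicate a gain from motion compensation) is an assumption, as are the function names and the representation of frames as lists of rows of gray values.

```python
def mean_square_deviation(frame, reference):
    """Square deviation between two equally sized frames,
    averaged over all pixels."""
    total = 0
    count = 0
    for row_f, row_r in zip(frame, reference):
        for f, r in zip(row_f, row_r):
            total += (f - r) ** 2
            count += 1
    return total / count


def compensation_ratio(original, predicted, previous_reconstructed):
    """Ratio of the mean square deviation of the motion-compensated
    predicted frame to that of the uncompensated previous frame, both
    measured against the current original frame (assumed direction)."""
    msd_pred = mean_square_deviation(predicted, original)
    msd_prev = mean_square_deviation(previous_reconstructed, original)
    if msd_prev == 0:
        # The previous frame already equals the original: nothing to gain.
        return 1.0 if msd_pred == 0 else float("inf")
    return msd_pred / msd_prev
```

Under this convention, a ratio of 0.25 would mean that motion compensation has reduced the mean square prediction error to one quarter of that obtained by simply repeating the previous frame.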
Object-related motion compensation attains the greatest yield in the prediction of frame-to-frame changes that are brought about by the motion of large objects in the scene space and that predominantly obey the laws of translation in the scene space: lateral, forward or backward movements of the entire reproduced portion of an otherwise unmoving person. Object matching is likewise advantageous when the camera is panning or zooming, because all the pixels already present in the previous frame appear once again, displaced, in the current frame in accordance with a uniform, simple principle.
Block-by-block motion compensation, by contrast, is advantageous when the frame variations are random.
The second decisive criterion of quality for motion compensation is the data rate required for transmitting the control data for the motion-compensated predictor and for transmitting the so-called prediction error signal for correcting the errors remaining in the predicted frame.
In object matching, a set of control data (known as the characteristic motion vector) can be transmitted for each closed moved object with an accuracy of at least 1/10 of the pixel interval, which enables optimal calculation of the motion-compensated estimated frame. Such accuracy is not feasible with block matching, because block matching requires the transmission of a displacement vector for each block that has varied from one frame to the next; vectors of this precision would increase the control data flow excessively, to the disadvantage of the transmissible prediction error signal. Consequently, the only realizable block displacements are integral multiples of the pixel interval in each coordinate direction, with the consequence of a smaller yield from motion compensation.
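The trade-off between vector accuracy and control data can be made concrete with a rough count, sketched below. All figures are hypothetical and are not taken from the cited prior art: a 352 x 288 pixel frame, 16 x 16 pixel blocks, displacements of at most 7 pixels in each direction, fixed-length coding of each vector component, and the worst case in which every block has changed.

```python
import math


def bits_per_component(max_disp, step):
    """Bits needed to code one displacement component in the range
    -max_disp..+max_disp with the given step (fixed-length code assumed)."""
    levels = int(round(2 * max_disp / step)) + 1
    return math.ceil(math.log2(levels))


def block_control_bits(width, height, n, max_disp, step):
    """Control data per frame if every n x n block sends one vector."""
    blocks = (width // n) * (height // n)
    return blocks * 2 * bits_per_component(max_disp, step)


def object_control_bits(num_objects, max_disp, step):
    """Control data per frame if each moved object sends one vector."""
    return num_objects * 2 * bits_per_component(max_disp, step)


# Hypothetical figures, for illustration only.
integer_bits = block_control_bits(352, 288, 16, 7, 1.0)
tenth_pel_bits = block_control_bits(352, 288, 16, 7, 0.1)
object_bits = object_control_bits(3, 7, 0.1)
```

With these assumed figures, whole-pixel block vectors cost 3168 bits per frame and 1/10-pixel block vectors cost 6336 bits per frame, whereas three object vectors at 1/10-pixel accuracy cost 48 bits per frame. At an assumed 10 frames per second, the 1/10-pixel block vectors alone would take up nearly all of a 64K bit/s channel, leaving almost nothing for the prediction error signal.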
A third criterion of quality, which in contrast to the two objective criteria given above is subjective, concerns the errors still remaining in the reconstructed frame that have not been corrected because of the limited data rate. Among these remaining errors, the ones considered disruptive are those that are artificially introduced into the reconstructed frames by misadaptation of the rigid block structure to the natural boundaries of the various moved portions of the frame; they are therefore also called artefacts.
In object matching, these artefacts occur only at the boundaries of the closed objects, while in block matching they can in principle occur anywhere that differently motion-compensated blocks are adjacent to one another.