The invention relates to a circuit arrangement for recognizing a human face in a sequence of video pictures comprising
first means provided for structuring the video pictures block-wise and for subtracting corresponding blocks of two consecutive pictures one from the other,
second means for post-processing the difference pictures produced by the first means, and having the further means and features mentioned in the precharacterizing part of claim 1.
The mode of operation of a circuit arrangement having these properties is disclosed in a lecture given at the Picture Coding Symposium held on March 26 to 28, 1990 in Cambridge, Mass., U.S.A. (cf. Eric Badique: Knowledge-based Facial Area Recognition and Improved Coding in a CCITT-Compatible Low-Bitrate Video-Codec. Proceedings of Picture Coding Symposium 90, Cambridge, Mass.).
A comparable circuit arrangement is described in EP-A2-0 330 455 and is likewise used in, for example, view telephones. The reason for using circuit arrangements of this type in view telephones having a low transmission rate (for example 64 kbit/s) is that, during a conversation, a speaker mainly watches the face, and more specifically the eyes and the mouth, of the other speaker, and that this fact can be utilized to obtain a subjective improvement of the picture quality without an increase in the transmission rate. Such an improvement is obtained when the eye and mouth regions of a face are encoded with a higher accuracy, i.e. with more bits, at the expense of other regions of the picture. Utilizing this effect is, however, only possible when it is known in advance whether or not a face is present in a sequence of video pictures.
The arrangement described in EP-A2-0 330 455 utilizes for this purpose the difference between two consecutive video pictures. Signal portions differing from zero appear in the difference picture only when the picture sequence includes moving objects. In the said patent application, the sequence of difference pictures is post-processed in a manner that makes recognition errors likely.
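The block-wise differencing of two consecutive pictures can be illustrated with a minimal sketch. It is not the circuit of the cited application or of the invention. The function name `block_difference_mask`, the block size of 8, the use of a sum of absolute differences per block, and the `threshold` parameter are all assumptions chosen for illustration; the text above only states that corresponding blocks are subtracted and that non-zero signal portions indicate moving objects.

```python
def block_difference_mask(prev, curr, block=8, threshold=0):
    """Return one 0/1 flag per block: 1 where the two pictures differ.

    prev, curr: equally sized 2-D lists of integer luminance values.
    block, threshold: illustrative assumptions, not taken from the source.
    """
    rows, cols = len(curr), len(curr[0])
    mask = []
    for by in range(0, rows, block):
        mask_row = []
        for bx in range(0, cols, block):
            # Sum of absolute pixel differences inside this block.
            total = 0
            for y in range(by, min(by + block, rows)):
                for x in range(bx, min(bx + block, cols)):
                    total += abs(curr[y][x] - prev[y][x])
            mask_row.append(1 if total > threshold else 0)
        mask.append(mask_row)
    return mask


# Two 16x16 pictures: the second contains a small bright patch that
# was absent in the first, standing in for a moving object.
prev = [[0] * 16 for _ in range(16)]
curr = [[0] * 16 for _ in range(16)]
for y in range(8, 12):
    for x in range(0, 4):
        curr[y][x] = 200

mask = block_difference_mask(prev, curr)
print(mask)  # [[0, 0], [1, 0]]
```

Only the lower-left block is flagged, because the pictures differ only there; a static scene yields a mask of all zeros, which is why a difference picture alone cannot reveal a face that does not move.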