1. Field of the Invention
The present invention relates to an object recognition apparatus and an object recognition method. More particularly, the present invention relates to an apparatus and method for recognizing one or more objects that have entered a moving picture, using the output of a moving picture encoding device or a moving picture decoding device.
2. Description of the Related Art
In general, detecting and recognizing a specific object in a moving picture requires examining pixel values. For example, Akio Okazaki, “Beginners Guide to Image Processing Technique”, Kogyo Chosakai, pp. 102–103, 2000, introduces a process for isolating a moving object based on a background differential. In this technique, the difference in pixel values between a reference background image and an input image is binarized by a threshold, thereby isolating the moving object. However, such pixel-level processing requires a large amount of computation. For example, in the CIF format, which is frequently used in standard moving picture encoding schemes such as ITU-T H.261, H.263, and ISO/IEC MPEG-4, a total of 101376 pixels (352 horizontal by 288 vertical) must be processed. A process with such a large amount of computation has required dedicated hardware, which poses a serious problem in terms of cost efficiency.
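The background differential described above can be illustrated with a minimal sketch. It is not taken from the cited reference: the function name `background_difference_mask`, the threshold of 30, and the synthetic test frame are assumptions chosen only for illustration. The sketch shows that every one of the 101376 pixels of a CIF frame has to be visited.

```python
from typing import List

Image = List[List[int]]  # grayscale image stored as rows of 0-255 pixel values

CIF_WIDTH, CIF_HEIGHT = 352, 288


def background_difference_mask(background: Image, frame: Image, threshold: int) -> Image:
    """Return a binary mask that is 1 where |frame - background| exceeds the threshold."""
    if len(background) != len(frame) or any(
        len(b) != len(f) for b, f in zip(background, frame)
    ):
        raise ValueError("background and frame must have the same dimensions")
    return [
        [1 if abs(f - b) > threshold else 0 for b, f in zip(bg_row, fr_row)]
        for bg_row, fr_row in zip(background, frame)
    ]


# Hypothetical data: a flat background and a frame containing a 10x10 bright patch.
background = [[100] * CIF_WIDTH for _ in range(CIF_HEIGHT)]
frame = [row[:] for row in background]
for y in range(50, 60):
    for x in range(20, 30):
        frame[y][x] = 200

mask = background_difference_mask(background, frame, threshold=30)  # assumed threshold
pixels_examined = CIF_WIDTH * CIF_HEIGHT  # every pixel is compared once
moving_pixels = sum(map(sum, mask))       # pixels classified as the moving object
```

Because the comparison runs once per pixel, the cost grows with the frame area, which is the computational burden noted above.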
As a technique for detecting a moving object in a moving picture, Jpn. Pat. Appln. KOKAI Publication No. 9-252467, “Moving Object Detecting Apparatus”, proposes a method that uses the motion vectors produced by a moving picture encoding device. This method employs the motion vector that the encoding device produces for each macro-block. Thus, there is no need to separately examine the motion of individual pixels in order to detect a moving object, and the amount of computation can be significantly reduced.
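The saving in computation can be sketched as follows. This is only an illustration of the general idea of thresholding per-macro-block motion vectors, not the method of the cited publication; the 16x16 macro-block size is the one used by the standards named earlier, while the function name `moving_macroblocks`, the squared-magnitude threshold, and the sample vectors are assumptions.

```python
from typing import List, Tuple

MotionVector = Tuple[int, int]  # (dx, dy) reported by the encoder for one macro-block

MACROBLOCK_SIZE = 16
CIF_WIDTH, CIF_HEIGHT = 352, 288


def moving_macroblocks(
    vectors: List[List[MotionVector]], min_magnitude_sq: int
) -> List[Tuple[int, int]]:
    """Return (row, col) of macro-blocks whose squared vector length reaches the threshold."""
    return [
        (r, c)
        for r, row in enumerate(vectors)
        for c, (dx, dy) in enumerate(row)
        if dx * dx + dy * dy >= min_magnitude_sq
    ]


cols = CIF_WIDTH // MACROBLOCK_SIZE   # 22 macro-blocks per row
rows = CIF_HEIGHT // MACROBLOCK_SIZE  # 18 macro-block rows

# Hypothetical encoder output: all vectors zero except two neighbouring blocks.
vectors = [[(0, 0)] * cols for _ in range(rows)]
vectors[3][5] = (4, -3)  # clearly moving
vectors[3][6] = (1, 0)   # small motion, below the assumed threshold

flagged = moving_macroblocks(vectors, min_magnitude_sq=4)
values_examined = rows * cols  # one vector per macro-block instead of one value per pixel
```

For a CIF frame this examines 396 motion vectors rather than 101376 pixel values.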
However, the following problem has occurred with the conventional technique for detecting a moving object using encoded data. That is, a macro-block whose motion vector is large, or a macro-block that has been rewritten, does not always belong to a moving object. Conversely, even within the moving object, there are macro-blocks that have not been rewritten. Thus, when this technique is used for monitoring, the necessary video image has not always been acquired successfully.
Further, in detection performed for each macro-block, when a target object is only partly included in a macro-block, there has been a problem that the object is missed because the error for that macro-block is too small. Specifically, in the case shown in FIG. 8, only a small error occurs in the macro-blocks containing parts of the head, left leg, and left arm of an invader 500, so it has been difficult to determine that these parts belong to an invading object.
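The effect of partial inclusion can be illustrated with a small numerical sketch. The error measure (mean absolute difference over a 16x16 block), the pixel values, and the threshold of 10 are all assumptions made for illustration; the source does not specify how the per-macro-block error is computed.

```python
from typing import List

Block = List[List[int]]  # one macro-block as rows of grayscale pixel values


def mean_abs_error(reference: Block, current: Block) -> float:
    """Mean absolute pixel difference between two blocks of equal size."""
    total = sum(
        abs(c - r)
        for ref_row, cur_row in zip(reference, current)
        for r, c in zip(ref_row, cur_row)
    )
    count = sum(len(row) for row in reference)
    return total / count


N = 16
reference = [[100] * N for _ in range(N)]

# Block fully covered by the object: every pixel differs by 100.
fully_covered = [[200] * N for _ in range(N)]

# Block that the object only clips: 8 of 256 pixels differ by 100.
partly_covered = [row[:] for row in reference]
for y in range(4):
    for x in range(2):
        partly_covered[y][x] = 200

THRESHOLD = 10.0  # assumed per-macro-block detection threshold

full_error = mean_abs_error(reference, fully_covered)      # 25600 / 256
partial_error = mean_abs_error(reference, partly_covered)  # 800 / 256
partial_detected = partial_error > THRESHOLD
```

Under these assumptions the partly covered block falls below the threshold even though the object is present in it, which mirrors the missed head, leg, and arm regions described for FIG. 8.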
As has been described above, there has been a problem that a large amount of computation is required to detect a specific object and recognize what the object is. On the other hand, in the conventional technique using encoded data, there has been a problem that the recognition precision is not sufficient.