This invention relates to a video analysis method and a video analysis device for analyzing a succession of video data or images to detect a plurality of moving objects, for example up to about fifty objects moving in the video data.
Such a video analysis device is used in a monitoring system for automatically recognizing moving vehicles and walking persons. For example, a method of detecting walking persons in a cinematic sequence of frames is described by Akio Shio and Jack Sklansky in "Konpyuta Bizyon" (Computer Vision), 75-5, (1991-11-22, eight pages), an article in the Japanese language with the English title "Segmentation of People in Motion" and an abstract in English. In this video analysis method, a background image is first obtained from frame images of the sequence and is used to extract at least two region images. Motion vectors are then detected in the region images by a block matching technique. On the assumption that region images with similar motion vectors represent one moving object with high probability, the region images are divided into regions of walking persons.
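The block matching step mentioned above can be illustrated with a minimal sketch. It is not the implementation of the Shio et al article; it assumes grayscale frames stored as nested lists of integers, an exhaustive search over a small window, and the sum of absolute differences as the matching cost. The names `sad` and `block_match` and the parameters `block` and `search` are hypothetical.

```python
def sad(prev, curr, top, left, dy, dx, block):
    """Sum of absolute differences between a block in `prev` at (top, left)
    and the block in `curr` displaced by (dy, dx)."""
    total = 0
    for y in range(block):
        for x in range(block):
            total += abs(prev[top + y][left + x]
                         - curr[top + dy + y][left + dx + x])
    return total


def block_match(prev, curr, top, left, block=4, search=2):
    """Return the motion vector (dy, dx) that minimizes the SAD cost.

    Assumption: exhaustive search over displacements in [-search, search];
    candidate blocks that would fall outside the frame are skipped.
    """
    height, width = len(curr), len(curr[0])
    best, best_cost = (0, 0), None
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            y0, x0 = top + dy, left + dx
            if y0 < 0 or x0 < 0 or y0 + block > height or x0 + block > width:
                continue  # displaced block leaves the frame
            cost = sad(prev, curr, top, left, dy, dx, block)
            if best_cost is None or cost < best_cost:
                best_cost, best = cost, (dy, dx)
    return best
```

Blocks whose returned vectors are nearly equal could then be grouped as candidates for a single moving object, in the spirit of the assumption described above. The exhaustive search also suggests why such processing is slow without dedicated hardware: the cost grows with the block area times the number of candidate displacements, for every block of every frame.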
Video analysis has recently also been used in editing motion video images. An example is described in a paper contributed by Hirotaka Ueda and two others to the 1993 Spring General Meeting of the Electronic Information and Communication Institute of Japan, Paper No. SD-9-1, under the English title of "Automatic Object Linkage by Video Analysis and its Application" as translated by the contributors. According to this paper, a user first specifies an object in one of successive frames. A video analysis device then automatically detects objects of identical color combinations in other frames to form a hyperlink.
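The source does not say how the Ueda et al paper compares color combinations. As one plausible illustration only, the sketch below matches a user-specified object against candidate regions by comparing coarsely quantized color histograms with histogram intersection. The functions `color_histogram`, `histogram_intersection` and `best_match`, and the parameter `levels`, are hypothetical and are not taken from the paper.

```python
from collections import Counter


def color_histogram(pixels, levels=4):
    """Normalized histogram of (r, g, b) pixels quantized to `levels` per channel.

    Assumption: `pixels` is a non-empty iterable of 8-bit RGB tuples.
    """
    step = 256 // levels
    counts = Counter((r // step, g // step, b // step) for r, g, b in pixels)
    total = sum(counts.values())
    return {color_bin: n / total for color_bin, n in counts.items()}


def histogram_intersection(h1, h2):
    """Similarity in [0, 1]; 1.0 means identical color distributions."""
    return sum(min(p, h2.get(color_bin, 0.0)) for color_bin, p in h1.items())


def best_match(target_pixels, candidates, levels=4):
    """Index of the candidate region whose colors best match the target."""
    target = color_histogram(target_pixels, levels)
    scores = [histogram_intersection(target, color_histogram(c, levels))
              for c in candidates]
    return max(range(len(candidates)), key=scores.__getitem__)
```

A comparison of this kind treats every frame independently: it takes no account of where the object was in the preceding frame, so two regions with similar colors can be confused.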
The technique of the Shio et al article aims at correct detection of the regions of walking persons and accurate tracking of these regions. Consequently, it takes much time to process the frame images of the cinematic sequence. In order to process the frame images at a high speed, it is necessary in most cases to use dedicated image processing hardware.
The technique of the Ueda et al paper is characterized by simple processing. This technique is, however, prone to errors and gives results of insufficient precision. This is because no consideration is given to the correspondence between each moving object and successive frame data.