The present invention relates to a method for classifying the motion of an object such as a human being in a moving picture, and particularly to a method for classifying the motion of a non-rigid object in a moving picture by using a wavelet transformation.
Recently, with the widespread use of personal computers, video images are processed in various fields and applications. The processing of video images, including moving pictures, is expected to advance further as communication speed and the processing speed and storage capacity of computers increase.
However, since a moving picture contains non-rigid objects, that is, objects whose shape is not fixed but changes with time, there is a problem that the technique for processing a still picture cannot be directly applied. For instance, an object in a still picture can easily be classified by comparing the object image taken out from the still picture with the reference images in a template, and determining the reference image that is most closely analogous to the object to be processed. However, since an object in a moving picture is in motion, its change with time needs to be taken into consideration to classify it.
The identification and classification of an object in a moving picture can be used in various applications. One example is automatic indexing and image retrieval by the contents of an image. For instance, motions of an athlete, such as a jump or a kick, can be input as contents to automatically retrieve scenes including those contents from a video image sequence. This may be used for retrieving highlight scenes, making a digest image, or searching a database of moving pictures. Further, by classifying the time-varying motion of each player in a moving picture of a sport, the motion of the player can also be analyzed.
Another example is automatic monitoring. It can be used to monitor the movements of people in a security area and automatically detect suspicious movements, thereby preventing crimes. Still another example is a function as a man-machine interface. The inputting of data or control information to a computer by gesture instead of by a keyboard or voice, and the conversion of sign-language motions to a visible output, voice output, or braille output may be made possible.
Accordingly, it is considered that the identification or classification of an object in a moving picture will open up various applications in the future. Specifically, automatic image indexing is considered to be an elemental technique for standardizing the contents description in the MPEG-7 standard (the draft is scheduled to be prepared in the year 2001), and there is a demand for the establishment of such a technique.
Image processing usually requires a tremendous amount of calculation, and the processing of a moving picture needs more calculation than the processing of a still picture. Accordingly, a technique for classifying an object in a moving picture should preferably detect the motion of the object accurately and easily with a small amount of data processing.
Although a technique for identifying or classifying an object in a moving picture by using a wavelet transformation has not been proposed as far as the present inventors know, the references concerning image processing using the wavelet transformation include, for instance, the following.
(1) H. Nakano et al., "Method for Detection and Visualization of Macro Defects in Color Liquid Crystal Displays by Using Gabor Wavelets," PROCEEDINGS OF SPIE REPRINT, reprinted from Wavelet Applications in Signal and Image Processing V, Jul. 30-Aug. 1, 1997, San Diego, Calif., Vol. 3169, pp. 505-516.
This reference is a paper written jointly with one of the present inventors, and it shows a method for detecting macro defects of a liquid crystal display by using a 2D (two-dimensional) Gabor wavelet. The reference discloses equations 1 to 3 which are described later in this specification. However, it neither refers to the classification of the motion of an object in a moving picture, nor suggests how to apply equations 1 to 3 to the detection of the motion of the object.
(2) M. Oren et al., "Pedestrian Detection Using Wavelet Templates," Proceedings of the 1997 IEEE Computer Society Conference on Computer Vision and Pattern Recognition, 1997, pp. 193-199.
This reference shows a technique for classifying the pattern of a pedestrian by a wavelet transformation using a Haar function. However, the reference uses the wavelet transformation to detect a pedestrian from the background which is a still picture, and classify the pattern of the pedestrian. It does not detect and classify the motion of an object in a moving picture. Further, it also does not address the wavelet transformation using the Gabor wavelet in the present invention.
(3) A. Gorghi et al., "Sequence Matching Using a Spatio-Temporal Wavelet Decomposition," Proceedings of SPIE, Vol. 3024, pt. 2, pp. 938-952, Feb. 1997.
This reference shows a technique for performing the indexing and retrieval of an image sequence by using a wavelet transformation. However, the reference performs the wavelet transformation on the whole image of each key frame, and indexes an image scene by the normalized correlation of wavelet expansion coefficients among the key frames. It does not discuss the classification of an object itself in a moving picture, and shows nothing on the wavelet transformation using the Gabor wavelet in the present invention.
(4) Published Unexamined Patent Application No. 9-231375
This reference shows a technique for detecting the motion of an image by using a wavelet transformation. The image of one frame is decomposed to eight pixel blocks, and each pixel block is transformed to a multi-resolution block by the wavelet transformation. The wavelet coefficients of the corresponding multi-resolution blocks of the previous and current frames are compared, and the existence of a motion is determined based on the difference between them. The motion detected in this way is used to specify a region to be updated preferentially. The reference does not refer to the classification of the motion of an object in a moving picture. In addition, it also does not show the wavelet transformation using the Gabor wavelet in the present invention.
Problems to be Solved by the Invention
Accordingly, it is an object of the present invention to provide an effective method for classifying the motion of an object in a moving picture.
It is a further object of the present invention to provide a method for detecting and classifying an unknown pattern of an object by a wavelet transformation using a specific Gabor wavelet function.
The method for classifying an object in a moving picture according to the present invention comprises a step of preparing a template including the wavelet expansion coefficients of an image of the object in a plurality of frames of a video image sequence representing each of a plurality of reference motions of the object, a step of obtaining the wavelet expansion coefficients of an image of said object in a plurality of frames of a video image sequence representing an unknown motion of the object, a step of calculating the matching factors between the unknown motion and the reference motions based on the wavelet expansion coefficients for the unknown motion and the wavelet expansion coefficients for the reference motions in the template, and a step of classifying the unknown motion based on the matching factors.
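The four steps above can be illustrated by the following minimal sketch. It assumes that each motion is already represented by a list of per-frame coefficient lists, and it uses a normalized correlation as the matching factor; that measure, the function names `matching_factor` and `classify_motion`, and the sample values are all hypothetical choices made for illustration, not the definitions given in this specification.

```python
import math


def matching_factor(unknown, reference):
    """Normalized correlation between two coefficient sequences (an assumed measure)."""
    # Flatten the frame-by-frame coefficient lists into single vectors.
    u = [c for frame in unknown for c in frame]
    r = [c for frame in reference for c in frame]
    if len(u) != len(r):
        raise ValueError("sequences must hold the same number of coefficients")
    dot = sum(a * b for a, b in zip(u, r))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in r))
    return dot / norm if norm else 0.0


def classify_motion(unknown, template):
    """Return the best-matching reference label and all matching factors."""
    factors = {label: matching_factor(unknown, ref) for label, ref in template.items()}
    best = max(factors, key=factors.get)
    return best, factors


# Hypothetical template: two reference motions, two frames each,
# three expansion coefficients per frame.
template = {
    "jump": [[1.0, 0.0, 0.5], [0.8, 0.1, 0.4]],
    "kick": [[0.0, 1.0, 0.2], [0.1, 0.9, 0.3]],
}
# Hypothetical coefficients for an unknown motion.
unknown = [[0.9, 0.1, 0.5], [0.7, 0.2, 0.4]]

label, factors = classify_motion(unknown, template)
print(label, factors)
```

In this sketch the unknown motion is assigned to the reference motion with the largest matching factor; a practical classifier might also reject the result when even the largest factor falls below some threshold.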
Wavelet expansion coefficients are obtained based on a Gabor wavelet function. Preferably, the wavelet expansion coefficients are obtained at a plurality of selected sampling points of the object image, with the coordinate origin being set to approximately the center of the object. Further, the wavelet expansion coefficients are obtained at a plurality of scale transformation levels, and the number of sampling points is set to a different number for each of the levels. The calculation of the matching factors is preferably performed by assigning a predetermined weight to the expansion coefficients according to the scale transformation levels. Furthermore, the wavelet expansion coefficients are preferably obtained at a plurality of predetermined rotation positions with each sampling point being the center of rotation.
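As a rough illustration of the sampling scheme just described, the sketch below computes weighted coefficient magnitudes at grid points laid out around the object center, at several scale levels and rotation positions. The kernel is a generic 2D Gabor function (a Gaussian envelope multiplied by a complex plane wave) and is an assumption, not the equations of this specification; the parameter `k0`, the grid layout, the weights, and the names `gabor_coefficient`, `feature_vector` and `LEVELS` are likewise hypothetical.

```python
import cmath
import math


def gabor_coefficient(image, cx, cy, scale, theta, k0=math.pi / 2):
    """Magnitude of a generic 2D Gabor response centred on pixel (cx, cy)."""
    radius = int(2 * scale)  # truncate the kernel at two envelope widths (assumed)
    cos_t, sin_t = math.cos(theta), math.sin(theta)
    height, width = len(image), len(image[0])
    total = 0j
    for dy in range(-radius, radius + 1):
        for dx in range(-radius, radius + 1):
            x, y = cx + dx, cy + dy
            if not (0 <= x < width and 0 <= y < height):
                continue  # pixels outside the image contribute nothing
            along = dx * cos_t + dy * sin_t  # coordinate along the rotated wave direction
            envelope = math.exp(-(dx * dx + dy * dy) / (2.0 * scale * scale))
            carrier = cmath.exp(1j * k0 * along / scale)
            total += image[y][x] * envelope * carrier
    return abs(total) / (scale * scale)


# Hypothetical levels: (scale, sampling points per grid side, weight).
# Coarser levels use fewer sampling points and a smaller weight.
LEVELS = [(1, 5, 1.0), (2, 3, 0.5), (4, 1, 0.25)]


def feature_vector(image, center, levels=LEVELS, n_rotations=4):
    """Weighted coefficients at every sampling point, level and rotation position."""
    cx0, cy0 = center  # coordinate origin placed at the object center
    features = []
    for scale, per_side, weight in levels:
        half = per_side // 2
        step = 2 * scale  # assumed grid spacing, proportional to the scale
        for iy in range(-half, half + 1):
            for ix in range(-half, half + 1):
                for r in range(n_rotations):
                    theta = math.pi * r / n_rotations
                    c = gabor_coefficient(image, cx0 + ix * step, cy0 + iy * step, scale, theta)
                    features.append(weight * c)
    return features


# Synthetic 32 x 32 test image: vertical stripes with a period of four pixels.
image = [[1.0 if x % 4 < 2 else 0.0 for x in range(32)] for y in range(32)]
features = feature_vector(image, (16, 16))
print(len(features))
```

With these hypothetical settings the three levels contribute 25, 9 and 1 sampling points respectively, each evaluated at four rotation positions. Applying the weights while the vector is built is one possible arrangement; they could equally be applied when the matching factors are calculated.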