As the volume of video data grows massively, a user must spend a large amount of time and energy to browse videos one by one and classify them according to the motion information of human figures in the videos. Although videos can currently be classified according to simple motions such as walking and running, motions in videos such as sports videos are usually complex, and classification according to simple motions can no longer satisfy user requirements. To enable videos to be classified according to relatively complex and continuous motions, in the prior art, features such as histograms of oriented gradients (HOG) are extracted from local areas of the videos, and clustering is performed on these features to form motion atoms, where a motion atom is a simple motion pattern whose members share some similarity. Responses between a to-be-detected video and the motion atoms are then calculated, the obtained responses are used to form a vector, and the to-be-detected video is classified according to the obtained vector.
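The motion-atom approach described above can be illustrated with a minimal sketch. It is not the prior-art implementation: toy 2-D vectors stand in for HOG descriptors, plain k-means is assumed as the clustering step, the response is assumed to be exp(-distance), and a nearest-neighbour rule is assumed for the final classification. All function names (`kmeans`, `response_vector`, `classify`) are hypothetical.

```python
import math

def dist(a, b):
    """Euclidean distance between two feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def mean(vectors):
    """Component-wise mean of a non-empty list of vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]

def kmeans(features, k, iterations=10):
    """Plain k-means; the first k features seed the centers (deterministic).
    Each resulting center plays the role of one motion atom."""
    centers = [list(f) for f in features[:k]]
    for _ in range(iterations):
        groups = [[] for _ in range(k)]
        for f in features:
            nearest = min(range(k), key=lambda i: dist(f, centers[i]))
            groups[nearest].append(f)
        # Keep the old center when a cluster receives no features.
        centers = [mean(g) if g else c for g, c in zip(groups, centers)]
    return centers

def response_vector(video_features, atoms):
    """One entry per atom: the strongest response over the video's local
    features. The exp(-distance) response is an assumption of this sketch."""
    return [max(math.exp(-dist(f, atom)) for f in video_features)
            for atom in atoms]

def classify(video_features, atoms, labelled_vectors):
    """Nearest-neighbour classification on response vectors (assumed rule).
    labelled_vectors is a list of (response_vector, label) pairs."""
    query = response_vector(video_features, atoms)
    return min(labelled_vectors, key=lambda item: dist(query, item[0]))[1]

# Toy usage: two clusters of local features yield two motion atoms.
training_features = [[0, 0], [10, 10], [0, 1], [10, 11]]
atoms = kmeans(training_features, k=2)
samples = [
    (response_vector([[0, 0.5]], atoms), "walk"),
    (response_vector([[10, 10.5]], atoms), "run"),
]
print(classify([[0, 1]], atoms, samples))
```

Note that the response vector records only how strongly each atom is present in a video, not when it occurs.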
However, complex motions having a strong time sequence relationship frequently appear in videos, and when a to-be-detected video is classified by using a vector obtained from motion atoms, it is difficult to ensure classification precision, because such a vector does not reflect that time sequence relationship. Therefore, another method is used in the prior art. A complex motion in a video is divided by time into fragments that each include a simple motion, where each fragment corresponds to a time point. During classification, each fragment is compared, in time order, with the corresponding fragment obtained by dividing a sample, to obtain a comparison score for each fragment. A weighted sum of these comparison scores is calculated to obtain a final comparison score, and the video is classified according to the final comparison score.
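The fragment-based comparison can be sketched as follows. This is an illustration under stated assumptions rather than the prior-art method itself: each frame is reduced to a single scalar feature value, the per-fragment similarity is an assumed toy measure, and the names `split_into_fragments`, `fragment_score`, and `final_score` are hypothetical.

```python
def split_into_fragments(frames, time_points):
    """Cut a frame sequence at the given frame indices (the time points).
    Assumes the indices are increasing and leave no fragment empty."""
    bounds = [0] + list(time_points) + [len(frames)]
    return [frames[a:b] for a, b in zip(bounds, bounds[1:])]

def fragment_score(fragment, sample_fragment):
    """Toy similarity (an assumption of this sketch):
    1 / (1 + |difference of the two fragment means|)."""
    mean_a = sum(fragment) / len(fragment)
    mean_b = sum(sample_fragment) / len(sample_fragment)
    return 1.0 / (1.0 + abs(mean_a - mean_b))

def final_score(video, sample, time_points, weights):
    """Compare fragments pairwise in time order, then take the weighted
    sum of the per-fragment comparison scores."""
    video_frags = split_into_fragments(video, time_points)
    sample_frags = split_into_fragments(sample, time_points)
    scores = [fragment_score(v, s) for v, s in zip(video_frags, sample_frags)]
    return sum(w * s for w, s in zip(weights, scores))

# Toy usage: per-frame scalar features, one split point, equal weights.
video = [1, 1, 5, 5]
sample = [1, 1, 1, 1]
print(final_score(video, sample, time_points=[2], weights=[0.5, 0.5]))
```

In this sketch the final score depends directly on where the time points are placed, since they determine which frames fall into each fragment.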
However, for complex motions that are highly continuous and that last for a long time, it is very difficult in the prior art to properly divide the complex motions into fragments that each include a simple motion. In addition, when the time points at which a complex motion in a video is divided are set differently, the comparison scores obtained by comparison with the fragments obtained by dividing the samples also differ. As a result, video classification produces multiple different results that are difficult to unify, and the precision of video classification is relatively low.