In the field of public security, video surveillance systems are an important part of maintaining social order and strengthening public administration. At present, surveillance systems are widely deployed in public places such as banks, shopping malls, bus stations, underground parking lots, and traffic intersections. Real-life surveillance tasks still require considerable manual support. A surveillance camera that monitors continuously, 24 hours per day, generates a large amount of video data. When evidence must be searched for in surveillance video, the search therefore consumes a large amount of labor, time, and material resources, which makes it extremely inefficient and may even cause the best opportunity to solve a case to be missed.

Therefore, in a video surveillance system, the playback time of video events is shortened by synopsizing the video, and an object to be retrieved can be quickly browsed and locked onto by classifying and filtering objects. This greatly improves surveillance efficiency and is of vital importance in helping, for example, the police to solve cases faster and to handle major, more complex cases more efficiently.

The prior art provides a video synopsis method. Scene information in an input video stream is analyzed and foreground information is extracted from it; object information in the scene is obtained by performing a cluster analysis on the acquired foreground information; and a corresponding synopsized video is then generated. In the generated synopsized video, a single frame of image includes object information acquired from different frames of the input video. Because the prior art performs synopsis in units of frames, the synopsized video is displayed in only a single manner, and the different representations of different tracks cannot be reflected.
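For illustration only, the frame-level synopsis described for the prior art might be sketched as follows. This is a minimal sketch under stated assumptions, not the prior-art implementation: frames are assumed to be small grayscale images stored as nested lists, the background is assumed to be the per-pixel median over all frames, foreground is found by simple thresholding, and the cluster-analysis step is omitted. The names `estimate_background`, `foreground_mask`, `synopsize_to_single_frame`, and the `threshold` parameter are hypothetical.

```python
from statistics import median
from typing import List

Frame = List[List[int]]  # grayscale image: rows of pixel intensities (assumed format)


def estimate_background(frames: List[Frame]) -> Frame:
    """Assumed background model: per-pixel median over all input frames."""
    height, width = len(frames[0]), len(frames[0][0])
    return [
        [median(f[y][x] for f in frames) for x in range(width)]
        for y in range(height)
    ]


def foreground_mask(frame: Frame, background: Frame, threshold: int = 30) -> List[List[bool]]:
    """Mark pixels that differ from the background by more than the threshold."""
    return [
        [abs(p - b) > threshold for p, b in zip(frame_row, bg_row)]
        for frame_row, bg_row in zip(frame, background)
    ]


def synopsize_to_single_frame(frames: List[Frame], threshold: int = 30) -> Frame:
    """Overlay foreground pixels from every input frame onto one background image.

    The result is a single frame that contains object information acquired
    from different frames. Later frames overwrite earlier ones on overlap.
    """
    background = estimate_background(frames)
    synopsis = [row[:] for row in background]  # start from a copy of the background
    for frame in frames:
        mask = foreground_mask(frame, background, threshold)
        for y, mask_row in enumerate(mask):
            for x, is_foreground in enumerate(mask_row):
                if is_foreground:
                    synopsis[y][x] = frame[y][x]
    return synopsis


if __name__ == "__main__":
    # Three 1x4 frames: a bright object at the left in the first frame,
    # nothing in the second, and a bright object at the right in the third.
    demo_frames = [[[200, 0, 0, 0]], [[0, 0, 0, 0]], [[0, 0, 0, 200]]]
    print(synopsize_to_single_frame(demo_frames))  # [[200, 0, 0, 200]]
```

In this sketch every object is pasted into the output in the same way, regardless of which track it belongs to, which mirrors the limitation noted above: a frame-level synopsis cannot give different tracks different representations.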