In recent years, to improve security at important sites (such as airports), video surveillance systems have commonly been deployed to monitor these sites. A typical video surveillance system includes a surveillance client, a surveillance platform, and a camera. The surveillance client is a graphical user interface that implements interaction with a user. The surveillance platform is a background service component that performs surveillance-related service functions, such as real-time browsing and intelligent analysis. The camera is a front-end video capture device that captures and analyzes a real-time image of the current site. The video surveillance system is configured with an intelligent analysis rule for perimeter intrusion, and generates an alarm when a moving object intrudes into a pre-configured perimeter range, to prompt a user to handle the intrusion.
For a large-range surveillance scene in which a moving object occupies only a small proportion of the picture and its details cannot be seen clearly, a multi-target tracking system is used in the prior art to implement local magnification of the moving object. In such a system, a fixed box camera and a high-speed pan-tilt-zoom (PTZ) camera are deployed; once the box camera captures an alarm picture, it is immediately linked to the PTZ camera, and the PTZ camera performs short-distance tracking of the moving object.
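For illustration only, the perimeter intrusion rule described above can be sketched as a point-in-polygon test. The sketch assumes that the pre-configured perimeter range is a polygon in image pixel coordinates and that each moving object is represented by a single center point. The function names `point_in_perimeter` and `detect_intrusions` and the object identifiers are hypothetical and are not taken from any particular surveillance platform.

```python
from typing import Dict, List, Tuple

Point = Tuple[float, float]


def point_in_perimeter(point: Point, perimeter: List[Point]) -> bool:
    """Ray-casting test: True if `point` lies inside the polygon `perimeter`.

    Assumption: `perimeter` lists the polygon vertices in order, in the same
    pixel coordinate system as `point`.
    """
    x, y = point
    inside = False
    n = len(perimeter)
    for i in range(n):
        x1, y1 = perimeter[i]
        x2, y2 = perimeter[(i + 1) % n]
        # Only edges that straddle the horizontal line through the point count.
        if (y1 > y) != (y2 > y):
            # x coordinate at which this edge crosses that horizontal line.
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def detect_intrusions(objects: Dict[str, Point], perimeter: List[Point]) -> List[str]:
    """Return the identifiers of moving objects whose center is inside the perimeter.

    A real system would generate an alarm for each returned identifier.
    """
    return [obj_id for obj_id, center in objects.items()
            if point_in_perimeter(center, perimeter)]


if __name__ == "__main__":
    # Hypothetical square perimeter and two tracked objects.
    perimeter = [(0.0, 0.0), (10.0, 0.0), (10.0, 10.0), (0.0, 10.0)]
    objects = {"object-1": (5.0, 5.0), "object-2": (15.0, 5.0)}
    for obj_id in detect_intrusions(objects, perimeter):
        print(f"ALARM: {obj_id} intruded into the perimeter")
```

In this sketch only `object-1` triggers an alarm, because its center lies inside the configured square while the center of `object-2` lies outside it.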
However, the foregoing video surveillance approach requires two cameras to be deployed to implement multi-target tracking, which results in high deployment costs and a relatively heavy installation and commissioning workload. In addition, only analog cameras can be deployed as the box camera and the PTZ camera in the multi-target tracking system; network cameras cannot implement multi-target tracking, and therefore cannot implement local magnification of the moving object.