A wireless local area network (WLAN) has advantages such as low cost and convenient deployment, and can meet the technical and cost requirements of a wireless video surveillance network. In wireless video surveillance, demand for high-definition video streams is increasing. However, a high-definition video stream has a relatively high data rate, which leads to relatively high network load when multiple video surveillance terminals concurrently transmit high-definition video streams. For example, a high-definition video conference in a 720p encoding format has a typical data rate of 0.5 megabits per second (Mbps) to 2.5 Mbps. Assuming that high-definition video streams in the 720p encoding format have an average rate of approximately 1.5 Mbps, the aggregate data rate of 15 video surveillance terminals reaches 22.5 Mbps. However, according to the 8-megahertz (MHz) channel modulation and coding scheme (MCS) design of the physical layer in the 802.11ah standard, even if all 15 stations (STAs) can use high-order 64-quadrature amplitude modulation (QAM) with a coding rate of 2/3, the maximum aggregate throughput of the network is still only 23.4 Mbps. Therefore, when a wireless high-definition video surveillance service runs on the frequency band of 779 MHz to 787 MHz, which is allocated in China and is below 1 gigahertz (GHz), the maximum usable bandwidth is finite and the wireless video surveillance network often runs in a saturated state. A wireless video surveillance network running in the saturated state cannot tolerate an excessively large load change range; otherwise, the peak throughput of the network exceeds the load tolerance of the network.
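The load arithmetic above can be checked with a short sketch. The three constants are the figures quoted in the text; the function and variable names are illustrative assumptions, and the calculation ignores protocol overhead and contention, which would reduce the usable headroom further.

```python
# Figures quoted in the text above.
AVG_STREAM_RATE_MBPS = 1.5             # assumed average 720p stream rate
NUM_TERMINALS = 15                     # concurrent video surveillance terminals
MAX_AGGREGATE_THROUGHPUT_MBPS = 23.4   # 802.11ah, 8 MHz, 64-QAM, coding rate 2/3


def aggregate_load_mbps(num_terminals: int, per_stream_mbps: float) -> float:
    """Aggregate offered load when every terminal sends at the average rate."""
    return num_terminals * per_stream_mbps


load = aggregate_load_mbps(NUM_TERMINALS, AVG_STREAM_RATE_MBPS)
headroom = MAX_AGGREGATE_THROUGHPUT_MBPS - load
utilization = load / MAX_AGGREGATE_THROUGHPUT_MBPS

print(f"offered load: {load:.1f} Mbps")
print(f"headroom:     {headroom:.1f} Mbps")
print(f"utilization:  {utilization:.1%}")
```

With less than 1 Mbps of headroom at the average rate, even a modest deviation above the average on a few streams is enough to exceed the maximum aggregate throughput.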
In a wireless video surveillance application, the mainstream video coding standard is H.264, whose video coding outputs include an intra frame (I frame), a predictive frame (P frame), and a bi-directional interpolated prediction frame (B frame). Because an I frame is generally 8 to 10 times the size of a P frame or a B frame, when picture frame intervals are uniform, the I frame has a much higher encoding output rate than the P frame and the B frame. When multiple STAs send I frames at the same time or at times close to each other, the peak network load rate is excessively high and exceeds the network load tolerance. As a result, the latency quality of service (QoS) requirements of the multiple real-time video data streams cannot all be met.
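The effect of coinciding I frames can be illustrated with a rough sketch. Only the 1.5 Mbps average rate, the 15 STAs, and the 8-to-10-times size ratio come from the text; the frame rate of 30 frames per second, the group of 30 frames containing one I frame followed by P frames only, the ratio of 9, and all function names are assumptions made for illustration. The staggered case is included only for contrast and is not a method described in the text.

```python
def frame_sizes_mbit(avg_rate_mbps, fps, gop_len, i_to_p_ratio):
    """Return (I-frame size, P-frame size) in megabits for one stream.

    Assumes each group of gop_len frames holds one I frame and
    gop_len - 1 equally sized P frames (no B frames).
    """
    bits_per_gop = avg_rate_mbps * gop_len / fps
    p_size = bits_per_gop / (i_to_p_ratio + gop_len - 1)
    return i_to_p_ratio * p_size, p_size


def peak_load_mbps(offsets, avg_rate_mbps=1.5, fps=30, gop_len=30,
                   i_to_p_ratio=9.0):
    """Peak aggregate rate over one frame interval.

    offsets[k] is the frame slot in which STA k sends its I frame.
    """
    i_size, p_size = frame_sizes_mbit(avg_rate_mbps, fps, gop_len,
                                      i_to_p_ratio)
    peak = 0.0
    for slot in range(gop_len):
        # Sum the bits all STAs send in this frame slot.
        slot_mbit = sum(
            i_size if (slot - off) % gop_len == 0 else p_size
            for off in offsets
        )
        peak = max(peak, slot_mbit * fps)  # megabits per slot -> Mbps
    return peak


NUM_STAS = 15
aligned = peak_load_mbps([0] * NUM_STAS)                      # all I frames coincide
staggered = peak_load_mbps([2 * k for k in range(NUM_STAS)])  # one I frame per slot at most

print(f"peak load, aligned I frames:   {aligned:.1f} Mbps")
print(f"peak load, staggered I frames: {staggered:.1f} Mbps")
```

Under these assumptions the average load is the same 22.5 Mbps in both cases, but the per-frame-interval peak is several times higher when all I frames coincide, which is the overload condition the paragraph describes.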