At present, the fat-tree (Fat-Tree) structure is a topology commonly used in data centers. In a fat-tree-based interconnection network, a leaf node represents a processor or a server, an internal node is a switch or a switching chip, and each line corresponds to a bidirectional link between an upper-layer node and a lower-layer node. A route in a fat-tree consists of two stages: a rise stage and a fall stage. For example, assume that a leaf node processor i sends a data packet to a processor j. At the rise stage, the data packet travels up the fat-tree through internal switches or switching chips until it reaches a common “ancestor” of the processor i and the processor j, where there may be more than one common “ancestor”; then, at the fall stage, the data packet is sent downward to the processor j through internal switches or switching chips.
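The two-stage route described above can be illustrated with a minimal sketch. It assumes a simplified logical tree in which every internal node has the same number of children (the hypothetical parameter `arity`) and leaves are numbered from 0, so that the ancestor of leaf i at a given level is found by integer division. The function names are illustrative only, and the sketch models one logical ancestor position per level, whereas a real fat-tree may offer several physical switches at that position.

```python
def common_ancestor_level(i: int, j: int, arity: int) -> int:
    """Return the lowest level at which leaves i and j share an ancestor.

    Leaves are at level 0. Assumption: a uniform tree where each
    internal node has exactly `arity` children.
    """
    level = 0
    while i != j:
        i //= arity  # move both positions one level up
        j //= arity
        level += 1
    return level


def route(i: int, j: int, arity: int) -> list[tuple[str, int, int]]:
    """List the hops of the route as (stage, level, index at that level)."""
    top = common_ancestor_level(i, j, arity)
    # Rise stage: climb from the source leaf to the common ancestor.
    rise = [("rise", lvl, i // arity**lvl) for lvl in range(1, top + 1)]
    # Fall stage: descend from the common ancestor to the destination leaf.
    fall = [("fall", lvl, j // arity**lvl) for lvl in range(top - 1, -1, -1)]
    return rise + fall
```

For example, `route(0, 3, 2)` rises two levels from leaf 0 and then falls two levels to leaf 3, whereas `route(0, 1, 2)` only needs to rise one level because both leaves hang off the same parent.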
It should be noted that, for a network including N nodes, having full bisection bandwidth specifically means that, when the N nodes are arbitrarily divided into two parts of N/2 nodes each, the sum of link bandwidth between the two parts is N/2 times the bandwidth of a single link. Therefore, the two parts may communicate with each other at full speed. If full bisection bandwidth cannot be ensured, no routing scheme can guarantee that there is no congestion. Therefore, to ensure full bisection bandwidth, the fat-tree strictly limits the quantity of switches or switching chips at each layer. For example, assume that in a fat-tree network, the quantity of ports of each switch or switching chip is 2n and the quantity of pods is 2n; in each pod, the quantity of access switches or switching chips is n, the quantity of aggregation switches or switching chips is also n, and the quantity of processors or servers is n²; in addition, the quantity of core switches or switching chips is n². To ensure load balancing in the fat-tree network in this example, 5n² switches or switching chips (4n² in the pods plus n² at the core layer) are needed to interconnect 2n³ processors or servers without congestion. Load balancing is performed on data sent from an access switch or switching chip to a core switch or switching chip (uplink), and the data is then returned from the core switch or switching chip to an access switch or switching chip (downlink), so as to ensure 100% throughput of the entire network. This manner is easy to implement for unicast scheduling, but is difficult to implement for multicast scheduling.
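The counts in the foregoing example can be checked with a short calculation. The following is a minimal sketch that only restates the arithmetic given above; the function name `fat_tree_size` and the dictionary keys are hypothetical names chosen for illustration.

```python
def fat_tree_size(n: int) -> dict[str, int]:
    """Component counts for the example fat-tree built from 2n-port switches."""
    pods = 2 * n
    access = pods * n        # n access switches in each pod
    aggregation = pods * n   # n aggregation switches in each pod
    core = n * n             # n^2 core switches
    servers = pods * n * n   # n^2 servers in each pod
    return {
        "ports_per_switch": 2 * n,
        "pods": pods,
        "access": access,
        "aggregation": aggregation,
        "core": core,
        "switches": access + aggregation + core,  # equals 5 * n^2
        "servers": servers,                       # equals 2 * n^3
        # Full bisection bandwidth: N/2 link-equivalents between any two halves.
        "bisection_links": servers // 2,
    }
```

With n = 2, for instance, the sketch yields 4 pods, 20 switches (8 access, 8 aggregation, 4 core), and 16 servers, which matches 5n² and 2n³.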
In the prior art, a bounded congestion multicast scheduling (BCMS) algorithm is mainly used as a multicast solution for a fat-tree network. The BCMS algorithm is a centralized algorithm, and controls the data center network mainly by using OpenFlow (Open-Flow). In the BCMS algorithm, a central scheduler collects bandwidth requirements of traffic flows, monitors the network status of the data center, that is, the currently available bandwidth of each network link, calculates a routing path for each flow, and configures the switches. A main feature of the BCMS algorithm is that it ensures that the congestion degree of any link in a fat-tree network is bounded. Specifically, under capacity-admissible traffic, the BCMS algorithm limits the maximum congestion degree of any link in a fat-tree network to C, where the value of C depends on the values of m, n, r, and s of the fat-tree network: m is the quantity of core switches at the top layer, r is the quantity of edge switches at the bottom layer, n is the quantity of terminals connected to each edge switch, and s is the link bandwidth of a core switch. That is, for a given fat-tree network, C is a fixed value. In the fat-tree network, each multicast flow must pass through an uplink of an edge switch to reach a core switch. Therefore, a first step of the BCMS algorithm is to determine an uplink of the edge switch from which the multicast flow originates, and to select an applicable core switch. After being sent to the core switch, a multicast flow may be forwarded to all destination edge switches through downlinks. Then, a second step is to iteratively find an appropriate subset of the core switches, and to send downlink data in the multicast manner by traversing each port in the subset. Specifically, the appropriate subset of the core switches may be selected by using a minimum-cardinality greedy policy.
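The minimum-cardinality greedy selection mentioned above may be pictured with the following minimal sketch. It is not the BCMS algorithm itself; it assumes the selection can be modeled as a standard greedy set cover, in which each core switch is associated with the set of destination edge switches it can currently reach over downlinks with sufficient available bandwidth. The function name `greedy_core_subset`, the `reachable` mapping, and the switch labels are all hypothetical.

```python
def greedy_core_subset(
    destinations: set[str],
    reachable: dict[str, set[str]],
) -> dict[str, set[str]]:
    """Greedily pick few core switches that together reach all destinations.

    Assumption: `reachable[core]` is the set of destination edge switches
    that `core` can serve right now. Returns a mapping from each chosen
    core switch to the destinations it is assigned.
    """
    uncovered = set(destinations)
    chosen: dict[str, set[str]] = {}
    while uncovered:
        # Pick the core switch that covers the most still-uncovered
        # destinations; sorting makes tie-breaking deterministic.
        best = max(
            sorted(reachable),
            key=lambda core: len(reachable[core] & uncovered),
            default=None,
        )
        gained = reachable[best] & uncovered if best is not None else set()
        if not gained:
            raise ValueError(f"destinations not reachable: {sorted(uncovered)}")
        chosen[best] = gained
        uncovered -= gained
    return chosen
```

Note that the sketch needs the complete `reachable` mapping before it can choose anything, which reflects the dependence of each step on global information.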
However, if the BCMS algorithm in the prior art is used, each step of multicast scheduling in the fat-tree network requires global information, on which global calculation is then performed. As a result, the central scheduler becomes a single point of failure and severely limits the performance of the entire network.