Software-Defined Networking (SDN) makes network management easier and more flexible, especially in a datacenter environment. It facilitates the implementation of customized network functionality by separating the control plane (which decides how to handle traffic in the network) from the data plane (which forwards traffic as instructed by the control plane), thereby centralizing the network's intelligence and state. SDN thus abstracts the complexity of the underlying physical network and allows network engineers to focus on optimizing network operations. Under the SDN paradigm, network operators can specify high-level network policies, which a logically centralized controller automatically translates into low-level rules/instructions and installs in network switches. The controller can use OpenFlow, which is a standard interface for communication between the SDN controller and the SDN switches. OpenFlow defines primitive instructions for programming SDN switches, so that an external application running in the SDN controller can control their forwarding behavior.
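The translation of a high-level policy into per-switch rules can be illustrated with a minimal sketch. It is not the OpenFlow API: the names `FlowRule`, `compile_path_policy`, and `lookup`, the `ipv4_dst` match key, and the policy format (a destination address plus a list of switch/output-port hops) are assumptions made only for illustration.

```python
from dataclasses import dataclass


@dataclass
class FlowRule:
    """Hypothetical low-level rule: match on header fields, then forward."""
    priority: int
    match: dict    # header field name -> required value
    out_port: int  # forwarding action: send the packet out of this port


def compile_path_policy(dst_ip, hops, priority=10):
    """Translate 'traffic to dst_ip follows these hops' into one rule per switch.

    hops is a list of (switch_id, out_port) pairs along the intended path.
    """
    return {sw: FlowRule(priority, {"ipv4_dst": dst_ip}, port) for sw, port in hops}


def lookup(table, packet):
    """Return the highest-priority rule whose match fields all equal the
    packet's header values, or None on a table miss."""
    candidates = [
        rule for rule in table
        if all(packet.get(field) == value for field, value in rule.match.items())
    ]
    return max(candidates, key=lambda rule: rule.priority, default=None)


# Controller side: compile the policy into per-switch rules.
rules = compile_path_policy("10.0.0.2", [("s1", 2), ("s2", 1)])

# Switch side: s1 consults its table for an arriving packet.
hit = lookup([rules["s1"]], {"ipv4_dst": "10.0.0.2"})
miss = lookup([rules["s1"]], {"ipv4_dst": "10.0.0.9"})
```

In this sketch the controller holds the policy and the switch only performs the lookup, which mirrors the control plane/data plane separation described above.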
Due to the scale and dynamic nature of datacenter networks, a network controller should adapt to the rapid change in network configurations as users/applications come and go, or when the network topology changes. The enhanced control offered by SDN aligns well with the nature and requirements of modern datacenter networks.
While this flexibility enables SDN programmers to develop innovative load balancing techniques, scalable routing schemes, and traffic isolation methods in datacenter networks, it also brings challenges similar to those of debugging complex software. In addition to encountering misconfiguration errors and failures, SDN developers need to validate the correctness of their routing algorithms, identify weaknesses in their load balancing techniques, and/or evaluate the efficiency of their traffic monitoring applications. Therefore, SDN debugging methods and troubleshooting tools are essential for understanding the network forwarding behavior and tracing the flow of traffic along the network paths. Such a debugging tool should help to localize network problems and verify whether the path taken by a packet/flow conforms to the network policies in the controller.
Traditional network tools such as NetFlow, sFlow, SNMP, and traceroute can be insufficient for debugging tasks in the SDN environment. As a result, a number of SDN-specific tools have been developed for such tasks. Some of these tools require maintaining up-to-date snapshots of the network-wide forwarding state. Therefore, they continuously collect the network configuration, either by dumping the flow rules installed at switches or by capturing the control traffic between the controller and switches. However, in addition to the overhead of collecting such information about the network state, analyzing the network configuration alone cannot detect errors that arise in the data plane, such as bugs in switch firmware or insufficient switch memory to enforce the configuration.
An alternative debugging approach for SDN is to trace the path(s) actually taken by a real flow from the source node to the destination node, which is often described as “the ground-truth forwarding behavior” in SDN literature, rather than to infer it from the configurations of the switches. NetSight follows this approach to gather packet histories, but it modifies the existing flow rules in the network so that every switch the traced packet traverses emits a postcard. A postcard contains information about the traced packet, the switch, the matching flow entry, and the output port. The network controller uses these postcards to reconstruct the path of the packet. However, logging the trajectories of packets in this way incurs significant overhead.
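Postcard-based path reconstruction can be sketched as follows. This is not NetSight's implementation: the postcard is reduced to a hypothetical `(switch, out_port)` pair, the `links` map from a switch port to its neighbor is assumed to be known to the controller, and each switch is assumed to appear at most once on the path.

```python
def reconstruct_path(postcards, links):
    """Order unordered postcards into the switch-level path of one packet.

    postcards: iterable of (switch, out_port) pairs, in arbitrary arrival order.
    links:     {(switch, out_port): neighbor} topology map (an assumption here).
    """
    out_port = dict(postcards)
    # Where did each reporting switch send the packet next?
    next_hop = {sw: links.get((sw, port)) for sw, port in out_port.items()}
    downstream = set(next_hop.values())
    # The first switch is the only one that no other postcard points to.
    starts = [sw for sw in out_port if sw not in downstream]
    if len(starts) != 1:
        raise ValueError("postcards do not form a single loop-free path")
    path = [starts[0]]
    while True:
        nxt = next_hop.get(path[-1])
        # Stop at a node that sent no postcard (e.g. a host) or on a repeat.
        if nxt not in out_port or nxt in path:
            break
        path.append(nxt)
    return path


links = {("s1", 2): "s2", ("s2", 3): "s3", ("s3", 1): "h2"}
postcards = [("s3", 1), ("s1", 2), ("s2", 3)]  # arrival order is arbitrary
path = reconstruct_path(postcards, links)
```

Each hop contributes one postcard per traced packet, which illustrates why the logging overhead grows with both path length and traffic volume.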
In contrast, lightweight tools such as PathQuery and PathletTracer provide methods to trace the packet trajectory in a network while minimizing the overhead of data collection. However, both tools trade data plane resources for the ability to collect only the data necessary for pre-determined queries. For example, PathletTracer utilizes “Precise Calling Context Encoding” (PCCE) to minimize the number of bits encoded in the packet header during the path tracing process, but ultimately requires a large number of flow rules, especially in datacenter networks, where there are many paths between each pair of edge nodes. PathQuery has a similar drawback in terms of data plane resources.
CherryPick proposes a simplified technique for tracing packet trajectories in a fat-tree topology. It attempts to minimize the number of flow rules required for the tracing process. CherryPick exploits the fact that datacenter network topologies are often well-structured, so it assigns each network link a unique identifier. These link identifiers are selectively inserted into the packet header along the packet's path using VLAN tags. Although CherryPick selects a minimum number of essential links to represent an end-to-end path, it incurs high header space overhead. For example, it inserts three VLAN tags (i.e., an added 96 bits) into the header of a packet that traverses eight hops (e.g., due to a failure along the shortest path) in a fat-tree topology. Moreover, it relies on the correctness of the network sub-netting scheme in the datacenter.
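The 96-bit figure follows from the size of an IEEE 802.1Q VLAN tag, which is 32 bits (a 16-bit TPID and a 16-bit TCI, of which 12 bits form the VLAN ID). The short calculation below is a sketch under the assumption that each tag carries one link identifier in its 12-bit VLAN ID field; the function names are illustrative only.

```python
VLAN_TAG_BITS = 32  # 802.1Q tag: 16-bit TPID + 16-bit TCI
VLAN_ID_BITS = 12   # VLAN ID field inside the TCI


def header_overhead_bits(num_tags):
    """Total bits added to the packet header by num_tags VLAN tags."""
    return num_tags * VLAN_TAG_BITS


def identifier_bits(num_tags):
    """Bits available for link identifiers, assuming one per VLAN ID field."""
    return num_tags * VLAN_ID_BITS


overhead = header_overhead_bits(3)  # three tags on the eight-hop path
payload = identifier_bits(3)
efficiency = payload / overhead     # fraction of added bits that carry identifiers
```

Under this assumption, only 36 of the 96 added bits carry link identifiers, so roughly five-eighths of the added header space is tag framing.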
The “background” description provided herein is for the purpose of generally presenting the context of the disclosure. Work of the presently named inventors, to the extent it is described in this background section, as well as aspects of the description which may not otherwise qualify as conventional at the time of filing, are neither expressly nor impliedly admitted as conventional against the present disclosure.