A data processing environment comprises a variety of hardware, software, and firmware networking components. A physical network, also called an underlay, is a network defined using such components.
Techniques are available presently to construct a logical network, also known as a software defined network (SDN) overlay (hereinafter “overlay” or “overlay network”), from such networking components. Essentially, networking components are abstracted into corresponding logical or virtual representations, and the abstractions are used to define the overlay. In other words, an overlay is a logical network formed and operated using logical representations of underlying networking components.
Physical networks usually exist within the demarcated boundary of the data processing environment whose networking components are utilized in the physical network. Unlike a physical network, an overlay can be designed to span one or more data processing environments. For example, while a physical network may be contained within a datacenter, an overlay may span one or more datacenters.
As an example, a logical representation of a networking gateway can participate in an overlay, such that a function attributed to the logical representation of the networking gateway in the overlay is actually performed by the underlying networking gateway component in the underlay.
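By way of a non-limiting illustration only, the relationship between a logical representation and its underlying component can be sketched as simple delegation. The class names `PhysicalGateway` and `LogicalGateway`, and the `forward` method, are hypothetical names chosen for this sketch and do not correspond to any particular SDN product or interface.

```python
class PhysicalGateway:
    """Hypothetical underlay component that actually performs forwarding."""

    def __init__(self, name):
        self.name = name
        self.forwarded = []  # packets this physical component has handled

    def forward(self, packet):
        self.forwarded.append(packet)
        return f"{self.name} forwarded {packet}"


class LogicalGateway:
    """Hypothetical overlay representation of a gateway.

    It exposes the gateway function in the overlay but holds no
    forwarding logic of its own (an assumption made for this sketch).
    """

    def __init__(self, overlay_name, backing):
        self.overlay_name = overlay_name
        self._backing = backing  # the underlay component behind this abstraction

    def forward(self, packet):
        # The function is attributed to the logical gateway in the overlay,
        # but the work is performed by the underlying physical gateway.
        return self._backing.forward(packet)


if __name__ == "__main__":
    physical = PhysicalGateway("gw-1")
    logical = LogicalGateway("tenant-a-gw", physical)
    print(logical.forward("pkt-0"))
```

In this sketch, a caller in the overlay interacts only with the logical gateway, while the record of forwarded packets accumulates in the physical gateway of the underlay.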
In an overlay, the actual networking components that perform the networking functions are abstracted into logical entities that represent the networking functionality offered by those components, not the actual implementations of that functionality. A mechanism is therefore needed to organize that networking functionality into a functioning logical network. An SDN controller is a component that manages and operates the logical networking components within an overlay.
Henceforth in this disclosure, any reference to a component within the context of an overlay is a reference to a logical or virtual representation of the component, which participates in the overlay, unless expressly distinguished where the reference is made.
A virtual machine (VM) comprises virtualized representations of real hardware, software, and firmware components available in a data processing system. The data processing system can have any number of VMs configured thereon, and utilizing any number of virtualized components therein. The data processing system is also referred to as a computing node, a compute node, a node, or a host.
In large scale data processing environments, such as in a data center, thousands of VMs can be operating on a host at any given time, and hundreds if not thousands of such hosts may be operational in the data center at any given time. A virtualized data processing environment such as the described data center is often referred to as a “cloud” that provides computing resources and computing services to several clients on an as-needed basis.
Network virtualization by defining overlay networks is an emerging trend in the management and operation of data centers and cloud computing environments. One of the goals of network virtualization is to simplify the network provisioning in multi-tenant data processing environments, as well as dedicated customer data processing environments.
Unicasting is a method of sending data point-to-point, to wit, from a single sender to a single receiver. Multicasting is a method of sending data from one or more sender data processing systems to several receiver data processing systems nearly simultaneously. Internet Protocol (IP) multicast is the process of multicasting IP packets to several receivers in a single transmission of the IP packet. IP multicast is a popular technique used to help conserve bandwidth in the data center and reduce the load on servers.
Hereinafter, the terms “multicast”, “multicasting”, and “Mcast” when used alone refer to IP multicast unless distinguished specifically where used. The terms “multicast”, “multicasting”, and “Mcast” when used as a prefix, a suffix, or in conjunction with another term or artifact, qualify that term or artifact as being usable in IP multicasting within the context of the usage of the term or artifact, unless distinguished specifically where used.
IP multicast operating in an overlay network is called overlay multicast. Overlay multicast can be achieved in different ways, depending on the support for multicasting provided in the underlay network. Multicast based overlay multicast requires the underlay network to provide support for multicasting. Multicasting in underlay networks is not presently prevalent in data processing environments. Multi-unicast based overlay multicast is a method to transmit multicast packets in the overlay network where the underlay supports unicasting but does not support multicasting.
The illustrative embodiments recognize that presently, the multi-unicast based overlay multicast method requires the sender computing node of the data to unicast copies of the data to each intended receiver computing node. The illustrative embodiments recognize that the multi-unicast based overlay multicast method of multicasting is severely limiting. For example, a virtual switch in a computing node in the overlay is responsible for replicating the data into multiple unicast packets and transmitting each unicast packet individually. The multi-unicast based overlay multicast method of multicasting consumes a significant amount of resources of the computing node at least for the purposes of replicating and unicasting the data.
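As a minimal sketch of the replication burden described above, and not an implementation of any particular virtual switch, the sender-side behavior can be modeled as one copy and one unicast transmission per receiver node. The function name `multi_unicast` and the `send_unicast` callback are assumptions introduced for this sketch.

```python
def multi_unicast(payload, receiver_nodes, send_unicast):
    """Replicate `payload` once per receiver node and unicast each copy.

    `send_unicast(node, data)` is a hypothetical callback standing in for
    the underlay's unicast transmission. Returns the number of unicast
    packets the sender had to transmit.
    """
    sent = 0
    for node in receiver_nodes:
        replica = bytes(payload)     # one replica per intended receiver node
        send_unicast(node, replica)  # each replica is transmitted individually
        sent += 1
    return sent


if __name__ == "__main__":
    log = []
    count = multi_unicast(
        b"data",
        ["node-1", "node-2", "node-3"],
        lambda node, data: log.append((node, data)),
    )
    print(count, log)
```

The cost at the sender grows linearly with the number of receiver nodes, because every additional receiver adds one replication and one transmission at the sending computing node.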
Furthermore, this method of overlay multicasting requires each computing node to be aware of the node's neighborhood in the data processing environment. In other words, each computing node has to know the identities of every other active computing node in the data processing environment and maintain a current listing of every node's preference whether that node is willing to receive multicast packets.
Each VM in each computing node can decide whether the VM wants to participate in multicasting. In commonly seen data processing environments, thousands of VMs can be operating on a computing node at any given time, and hundreds if not thousands of such nodes may be operational in the data processing environment at any given time. Furthermore, VMs are frequently created, reconfigured, or destroyed in data processing environments, and computing nodes are routinely brought online and offline. For each computing node to keep accurate and current records of all other receivers interested in multicasting is a monumental task, which requires a significant amount of computing resources at each computing node.
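The growth of this record-keeping burden can be illustrated with simple arithmetic. The sketch below assumes, for illustration only, that each computing node keeps exactly one preference entry for every other computing node; per-VM records, which the environment may also require, would only increase these counts. The function names are hypothetical.

```python
def membership_entries(num_nodes):
    """Return (entries kept per node, entries kept across all nodes).

    Assumes each node tracks one entry for every other node.
    """
    per_node = max(num_nodes - 1, 0)
    total = num_nodes * per_node
    return per_node, total


def entries_added_by_new_node(num_nodes):
    """Additional entries across the environment when one node joins."""
    _, before = membership_entries(num_nodes)
    _, after = membership_entries(num_nodes + 1)
    return after - before


if __name__ == "__main__":
    print(membership_entries(1000))         # (999, 999000)
    print(entries_added_by_new_node(1000))  # 2000
```

Under this assumption, the total number of entries grows roughly with the square of the number of computing nodes, and bringing a single node online requires every existing node to update its records.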
The multi-unicast based overlay multicast method of multicasting is error-prone, maintenance-heavy, and a significant drain on computing resources in a data processing environment. Furthermore, the multi-unicast based overlay multicast method of multicasting is not a scalable method because of the explosive growth in the amount of information that must be kept current at each computing node with every addition or change of a computing node or a VM. The multi-unicast based overlay multicast method of multicasting is a work-around for multicasting in overlays but lacks the ability to meet the performance requirements in any sizeable overlay network.