1. Field of the Invention
Embodiments of the present invention relate to discovery functionalities for a cluster of nodes. More specifically, embodiments of the present invention relate to methods and systems for determining a suitable routing configuration for a fabric of a cluster of server on a chip (SoC) nodes that integrate processing and networking resources and for maintaining that routing configuration, associated network information, and the like.
2. Description of Related Art
Various forms of networks having a plurality of associated data processing nodes are well known. For optimal performance, usability, and reliability in such networks, it is important that there is a means to quickly and reliably determine efficient (e.g., least-cost) routes between nodes and between a node and entities outside the network. Furthermore, the status of routes needs to be maintained and adjusted over time to ensure continued performance and reliability in the face of errors or network congestion. These functionalities are broadly referred to herein as discovery functionalities.
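By way of illustration only, the determination of least-cost routes mentioned above can be sketched with a standard shortest-path computation (Dijkstra's algorithm). This sketch is not taken from any particular known approach. The function name `least_cost_routes`, the adjacency-map representation of the network, and the link costs in the usage example are all assumptions made for the example.

```python
import heapq


def least_cost_routes(graph, source):
    """Return {node: (total_cost, first_hop)} for every node reachable from source.

    graph is a hypothetical adjacency map: {node: {neighbor: link_cost}}.
    first_hop is the neighbor of source through which the route leaves
    (None for the source itself).
    """
    best = {source: (0, None)}
    heap = [(0, source)]
    while heap:
        cost, node = heapq.heappop(heap)
        if cost > best[node][0]:
            continue  # stale heap entry; a cheaper route was already found
        first_hop = best[node][1]
        for neighbor, link_cost in graph.get(node, {}).items():
            new_cost = cost + link_cost
            # Routes leaving the source directly use the neighbor as first hop.
            hop = neighbor if node == source else first_hop
            if neighbor not in best or new_cost < best[neighbor][0]:
                best[neighbor] = (new_cost, hop)
                heapq.heappush(heap, (new_cost, neighbor))
    return best


# Hypothetical three-node network: the direct A-C link is costlier than A-B-C.
example = {
    "A": {"B": 1, "C": 5},
    "B": {"A": 1, "C": 1},
    "C": {"A": 5, "B": 1},
}
routes = least_cost_routes(example, "A")
```

In this example, the route from node A to node C is directed through node B because the two-hop path has a lower total cost than the direct link. Maintaining routes over time would amount to re-running such a computation when link costs or link status change.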
Various approaches for addressing these discovery functionalities are well known. However, these known approaches have been implemented in environments that have substantial resources (e.g., processor capability, available memory, etc.) to apply to the challenges and requirements associated with providing these discovery functionalities. Because such substantial resources are available, it is common for these known approaches to implement the discovery functionalities using system resources (e.g., networking resources) that would otherwise be available for processing user information.
A network switch addresses these discovery functionalities from only the networking side, such that valuable networking resources are consumed in doing so. A network switch typically has considerable hardware resources available for this purpose, such as memory and hardware that is specifically configured for addressing these discovery functionalities. However, a network switch has only a limited ability to interact with the systems whose communication links are being assessed through the discovery functionalities. In this regard, addressing these discovery functionalities with a network switch has a considerable limitation: although the network switch has significant available resources in the way of memory and discovery-specific hardware, it does not have a partner on the other side of a communication link.
Server network interfaces that a network switch can interact with are very limited in how they can respond. Accordingly, this limits the approaches available to a network switch for implementing discovery functionalities. Examples of these approaches include: the network switch filling in a routing table by sniffing packets on each port and identifying which network interfaces are connected on which port; loops being identified through this snooping when a given MAC address (i.e., network interface) is seen on multiple ports; loop avoidance through the network switch calculating a spanning tree based on its knowledge of MAC address versus port and eliminating links that result in loops; resource loop detection/avoidance in cases where resource loops typically do not affect a switch; and, in the case of a plurality of interconnected clusters (referred to as super-clusters), multiple network domains discovering and interacting through multiple switches based on standard protocols for how switches inform each other about their domains.
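By way of illustration only, the switch-side approaches described above (learning which MAC address is reachable on which port, flagging a possible loop when one MAC address is seen on multiple ports, and eliminating links that would close a loop) can be sketched as follows. This is a simplified sketch under stated assumptions, not the behavior of any particular switch or of a standard spanning tree protocol. The names `LearningSwitch` and `prune_loops`, the string MAC addresses, and the port and switch identifiers are hypothetical, and the loop pruning uses a simple union-find pass rather than root election and path costs.

```python
from collections import defaultdict


class LearningSwitch:
    """Hypothetical model of a switch that learns MAC-to-port mappings."""

    def __init__(self):
        self.mac_to_port = {}               # most recent port per MAC address
        self.ports_seen = defaultdict(set)  # every port each MAC was seen on

    def observe(self, src_mac, port):
        """Record the source MAC address of a frame sniffed on a port."""
        self.ports_seen[src_mac].add(port)
        self.mac_to_port[src_mac] = port

    def lookup(self, dst_mac):
        """Return the learned port for a MAC address, or None if unknown."""
        return self.mac_to_port.get(dst_mac)

    def suspected_loops(self):
        """Return MAC addresses seen on more than one port, with those ports."""
        return {
            mac: sorted(ports)
            for mac, ports in self.ports_seen.items()
            if len(ports) > 1
        }


def prune_loops(links):
    """Split undirected links into (kept, blocked) so kept links form no cycle."""
    parent = {}

    def find(node):
        parent.setdefault(node, node)
        while parent[node] != node:
            parent[node] = parent[parent[node]]  # path halving
            node = parent[node]
        return node

    kept, blocked = [], []
    for a, b in links:
        root_a, root_b = find(a), find(b)
        if root_a == root_b:
            blocked.append((a, b))  # this link would close a loop
        else:
            parent[root_a] = root_b
            kept.append((a, b))
    return kept, blocked


# Hypothetical usage: MAC "aa" appears on ports 1 and 3, suggesting a loop.
switch = LearningSwitch()
switch.observe("aa", 1)
switch.observe("bb", 2)
switch.observe("aa", 3)
kept, blocked = prune_loops([("s1", "s2"), ("s2", "s3"), ("s3", "s1")])
```

In this example, the third link between the three hypothetical switches is blocked because the first two links already connect all of them. Note that everything here runs on the switch alone, which reflects the limitation described above: the server network interfaces contribute nothing beyond the frames they happen to send.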
A cluster of traditional servers addresses these discovery functionalities with both networking and processing elements. As such, when addressing discovery functionalities in a traditional cluster, server-side processing power and the cluster's network are used to perform discovery tasks such as establishing routing, detecting loops, and the like. More specifically, when a discovery agent is run on each of the servers in the cluster, it is often required that all or a considerable portion of the cluster's resources be powered up in order to perform actions associated with the discovery functionalities. This is undesirable from the standpoints of power consumption and system resource utilization.