Field of the Invention
This invention is in the field of communications network management methods and devices.
Description of the Related Art
Modern packet-based digital communications networks are typically large, complex, and geographically distributed. Such networks need to be monitored and managed. To do this, network management equipment and protocols are typically used to enable remote observation and control of the various other types of equipment (e.g. computers, servers, routers, and switches) that either make up the network or are attached to it. Virtual network assets such as applications, virtual machines, and virtual switches also need to be monitored and managed.
These monitoring and management equipment and protocols are typically used for various applications such as to find or prevent network faults, understand or change network configuration, account for network usage, understand and ensure high network performance, and to ensure network security. In particular, Network Management Systems (NMS's) are usually the focal point of remote monitoring and management of these various physical and virtual network elements.
One problem that exists in such networks is that a given managed element, such as a packet switch, or even a more complex device such as a server, often may not be compatible with the network management protocols that an NMS wishes to use, or the features supported by the element may be insufficient for the NMS's task. This problem is particularly acute with OpenFlow switches, which, to achieve speed, have very limited computational capability, and thus may either not support popular network management protocols such as SNMP at all, or may provide only limited access to switch data through SNMP or related higher level network management protocols.
As mentioned above, implementing various types of standard network management protocols with devices like OpenFlow switches is particularly difficult. Such switches may have very high switching capacity (e.g. they can often handle and process incoming network packets at very high speeds or line rates, thus generally avoiding becoming a network bottleneck). Such switches achieve these speeds through specialized electronic circuitry, but the design tradeoff is that they have a relatively small supervisory processor to handle management tasks such as NMS requests, configuration tasks, and control plane activity.
OpenFlow switches are described in McKeown et al., “OpenFlow: Enabling Innovation In Campus Networks”, ACM SIGCOMM Computer Communication Review archive Volume 38 Issue 2, April 2008 Pages 69-74. The OpenFlow Switch Specification 1.4.0, provided by the Open Networking Foundation and dated Oct. 14, 2013, defines the specification more precisely as of that date. Other work in this area includes Leong, US patent publication 2013/0272135; and Kawai, US patent publication 2013/0304915.
More specifically, a main goal of an OpenFlow switch is to perform packet switching at the same line speed as the incoming data packets, which will often be arriving at very high speeds. For OpenFlow devices, the design decision was made that, to achieve such high speeds, the actual “intelligence” of the switch is greatly limited. The switch, for example, can be viewed (in a very oversimplified manner) as a flow-table type design that mainly comprises enough hardware functionality to compare the packet headers of incoming packets against a series of previously stored flow-table entries, and to dispatch an incoming packet to an outward bound port according to the flow-table instructions stored for the matching flow-table entry. An OpenFlow switch can also perform a few other functions, such as recognizing exception packets and forwarding them to an OpenFlow controller for further analysis and action, at a huge time penalty. The point, however, is that for this type of device, in order to achieve high speed, the intelligence of the device, including intelligence useful for network diagnosis and monitoring, is greatly limited.
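The oversimplified flow-table view described above can be illustrated with a short sketch. This is not the OpenFlow wire protocol or any actual switch implementation; the names `FlowEntry`, `lookup`, and `CONTROLLER`, the header field names, and the priority-based tie-breaking are all assumptions introduced here for illustration only.

```python
from dataclasses import dataclass

# Hypothetical sentinel meaning "no entry matched; hand the packet to the controller".
CONTROLLER = -1


@dataclass
class FlowEntry:
    """One illustrative flow-table entry (field names are assumptions)."""
    match: dict        # header field -> required value; fields not listed are wildcards
    out_port: int      # port to which a matching packet is dispatched
    priority: int = 0  # higher priority wins when several entries match


def lookup(flow_table, headers):
    """Compare packet headers against stored entries and return an output port.

    Returns CONTROLLER when no entry matches, modelling the slow exception path.
    """
    best = None
    for entry in flow_table:
        matches = all(headers.get(k) == v for k, v in entry.match.items())
        if matches and (best is None or entry.priority > best.priority):
            best = entry
    return best.out_port if best is not None else CONTROLLER


table = [
    FlowEntry(match={"ip_dst": "10.0.0.2"}, out_port=2, priority=10),
    FlowEntry(match={"eth_type": 0x0806}, out_port=3, priority=5),
]

print(lookup(table, {"ip_dst": "10.0.0.2", "eth_type": 0x0800}))  # matched entry's port
print(lookup(table, {"ip_dst": "10.0.0.9", "eth_type": 0x0800}))  # exception path
```

The sketch contains no logic for answering management queries, which reflects the limitation described above: the fast path only matches and forwards.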
Another problem that often exists with higher functionality network managed elements, such as routers and servers, is that even if the element's processing capability is, in theory, sophisticated enough to support network management protocols such as SNMP at a low speed or low volume, in practice such devices may not have enough processing capacity to handle a higher volume of network management requests from one or more NMS's while still performing their routine functions in a timely manner. That is, such devices may have to make an unhappy choice between performing their basic network function and responding to network monitoring requests, because they cannot do both well at the same time.
Further, in this and other situations, often the processor of a networked element may have other constraints, such as a fixed speed that cannot be scaled up if conditions warrant.
Another problem that exists in the current state of the art is that many switches and servers exist in a multi-tenant datacenter environment where they are shared by customers who each demand privacy. In order to provide customers with access to network management, the network management protocol must provide robust authentication, access control and privacy, which may be too expensive to provide on a network switch especially with multiple customers accessing the data simultaneously. Further, the data may need to be converted from a physical view to a virtual view that matches the customer's network service and this conversion might need the services of a general-purpose server and higher level customer data. Finally, security and stability requirements may dictate some separation between the customer's network management requests and the physical infrastructure.
In the past, workers in the field have tried to solve protocol translation, scalability, and security problems in network management through the use of proxy agents or gateways. Such solutions sometimes solved the original problems but achieved limited adoption because they added complexity. In particular, they imposed a significant computational and management burden on network managers, who had to remember and use, for their network management requests, a network address different from the “real” address of the device that they would use for other functions. Further, if a proxy agent provided access to multiple managed devices, little-used and complex facilities like contexts and communities would have to be used to identify the managed device. Further still, critical auto-discovery and auto-configuration processes designed to learn of the existence and configuration of network elements could not perform these tasks in the presence of proxy agents, leaving important parts of the infrastructure unmanaged.
OpenFlow is an architecture that allows OpenFlow switch controllers to remotely control OpenFlow flow-table driven switches. The OpenFlow architecture does not provide a way for Network Management Systems, which primarily use SNMP, to monitor and manage the OpenFlow switches with SNMP. Further, it does not provide visibility into a number of pieces of information that managers need in order to efficiently manage OpenFlow networks.
Virtual services (including virtual switches) on virtual machines may be started, stopped, or moved to other physical locations, possibly requiring changes in addressing. It can be difficult to manage such a service directly when the address of the virtual service, switch, or other machine is changing, or when it is only intermittently reachable.
Network elements are often monitored by polling techniques, in which the status of the network element is assessed whenever the network monitoring equipment issues a poll request. The problem with such approaches, however, is that polling can omit important information about conditions or events that have occurred between polls. Further, there may be important information that a given network element does not return via polling.
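The way polling can omit conditions that occur between polls can be shown with a small simulation. The queue-depth values, the tick-based timeline, and the function names `poll` and `missed_peak` are hypothetical and chosen only to illustrate the effect; they do not model any particular device or management protocol.

```python
def poll(samples, interval):
    """Return the values an observer sees when reading every `interval` ticks."""
    return samples[::interval]


def missed_peak(samples, interval):
    """True if the largest value in `samples` falls between polls and is never seen."""
    return max(samples) > max(poll(samples, interval))


# Hypothetical per-tick queue depth on a managed element, with a brief spike at tick 3.
queue_depth = [1, 2, 1, 95, 2, 1, 1, 2, 1, 1]

print(poll(queue_depth, 5))         # values seen when polling every 5 ticks
print(missed_peak(queue_depth, 5))  # whether the spike went unobserved
print(missed_peak(queue_depth, 1))  # polling every tick sees every value
```

Shortening the poll interval narrows the gap but raises the request load on the managed element, which, as noted earlier, may have little processing capacity to spare.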
Thus, returning again to the OpenFlow example, when one attempts to use polling methods to monitor an OpenFlow switch using the prior-art OpenFlow Protocol, critical resource shortfalls that occur between polls may not be noted. Indeed, in this situation there is also a great deal of other useful and important network monitoring information about the switch's interaction with its controller that cannot be obtained using such prior-art approaches.