2.1 Field
The exemplary, illustrative technology herein relates to systems, software, and methods for managing the operation of networks composed of various and disparate electronic devices. More particularly, the exemplary, illustrative technology herein provides systems, software, and methods for automatically configuring and enabling network management and monitoring software and systems for managing and monitoring the operation of networks composed of various and disparate electronic devices. The technology herein has applications in the areas of network management, computer science, electronics, and electronic commerce.
2.2 Background
Computer network technology has experienced phenomenal growth over the past two decades, from the esoteric experimental defense-related projects known to only a handful of electronics and military specialists in the 1960s and 1970s, to the epicenter of the so-called dot-com stock market boom of the late 1990s. Today, tens, perhaps hundreds, of millions of people all over the globe rely on computer networks for their jobs, education, and entertainment. In the industrialized world, access to computer networks appears to be almost ubiquitous. Examples include not only the traditional TCP/IP-based networks, such as the Internet and home or office Local Area Networks (LANs), but also include building control networks for managing a building's internal environment, networks of sensors for monitoring air quality, factory floor automation, and combined communications systems connecting previously disparate systems. Non-traditional networks, such as those used for monitoring and control of factory automation or building systems are referred to collectively herein as “SCADA” networks. SCADA stands for “Supervisory Control And Data Acquisition”. SCADA network systems provide process supervisory control and data collection capabilities used to operate many industrial systems today. Industrial processes and machines are controlled by SCADA systems using industrial controllers such as programmable logic controllers (PLCs). In recent years, PLCs have become better integrated with TCP/IP-based networks, but often still require custom applications for control and management. Other industrial controllers have not migrated to TCP/IP due to various technical and other considerations. 
Thus, in general, the term “network” or “computer network” includes both “traditional networks,” i.e., those using TCP/IP and/or Simple Network Management Protocol (SNMP) protocols, and “non-traditional” networks that do not have an SNMP (or other TCP/IP) management stack, an SNMP Object ID (OID)-based management data hierarchy, or other aspects required for “traditional” network management functions to operate as understood by those having ordinary skill in the art. Typically, non-traditional networks use protocols such as Controller Area Network (CAN) bus, which is used in vehicles, industrial automation, and medical devices, and IEEE 488, also known as the General Purpose Interface Bus (GPIB). The differentiation of traditional and non-traditional computer networks will be apparent to those persons having ordinary skill in the art.
As used herein, “network” or “computer network” includes both traditional and non-traditional networks as just defined. A “network” is a configuration of devices and software that are in mutual communication and can exchange information, including data and instructions. Such communication is accomplished by the presence of a direct physical connection between devices (i.e., wired communication) and/or indirectly by electromagnetic or other non-physically connected communication (i.e., wireless communication), using whatever protocols are extant between the two devices. A network can include arbitrary numbers and types of devices, systems, and applications, which, in some exemplary, illustrative, non-limiting embodiments, function in accordance with established policies. In some networks the devices, systems and applications comprising the network can change over time, as can their configurations, locations and other parameters as devices are connected or disconnected from the network whether purposely or inadvertently.
Examples of devices, systems, and applications that can comprise a traditional network consistent with the technology described herein include, without limitation:
Traditional network infrastructure devices such as routers, switches, and hubs;
Traditional networked computing assets, such as mainframes, servers and workstations;
Traditional network links, including dedicated and dial-up connections and related devices (e.g., Digital Subscriber Loop (DSL) connections, modems, concentrators, and the like);
Industrial devices, such as those controlled by programmable logic controllers (PLCs), embedded computers, or other controllers that can support traditional network protocols;
Network services, such as Simple Object Access Protocol (SOAP)-based application servers, web services, network infrastructure services such as Domain Name System (DNS) and Dynamic Host Configuration Protocol (DHCP), and file sharing services;
Applications, such as databases (e.g., those sold commercially by Oracle (Redwood City, Calif.), IBM (Armonk, N.Y.), and Microsoft (Redmond, Wash.)), e-mail systems (e.g., sendmail, POP/IMAP servers), customer relationship management (CRM) systems, and enterprise management applications (e.g., those sold commercially by Oracle and SAP (Walldorf, Federal Republic of Germany));
Consumer appliances (e.g., “smart” cell phones, audio/visual equipment, network-connected home lighting controllers); and
Systems acting as “gateways” to non-traditional networks, which allow data to be transferred between traditional and non-traditional networks due to their connectivity to both types of network and their ability to use appropriate networking protocols for each.
Examples of devices, systems, and applications that can comprise a non-traditional network consistent with the technology described herein include, without limitation:
Dedicated building control components, such as thermostats, furnace and chiller controls;
Vehicle, vessel, and aircraft control and communication systems;
Medical device control and communication systems;
Ladder logic controllers, such as those used to operate elevator or other systems;
Scales, flow or pressure gauges, tachometers, or other measurement devices;
Meters and other devices for the display of aspects of system status, usually in “real time”;
Sensors, including various types of embedded sensors and arrays of sensors, including RFID sensors, bar code readers or video scanners;
Industrial device controllers, such as PLCs, embedded computers, Coordinate Measuring Machines (CMMs), and similar devices when connected on non-TCP/IP based networks;
Data acquisition and control networks, such as DeviceNet, CANopen, ModBus, VLXI, VME, IEEE 1394, and IEEE 488;
Process automation robotics;
Telephony-based networks, including analog and digital cellular networks;
Power grid networks for distributing electrical power;
Consumer appliances (e.g., cell phones, audio-visual equipment, information kiosks); and
Dedicated infrastructure components (e.g., Private Branch Exchanges (PBXs), automated dialers, and call routing systems).
The network configuration can be either static (i.e., the devices that comprise the network do not change during network operation) or the configuration can be dynamic (i.e., devices may be connected to, or disconnected from, the network during operation). In some instances connection or disconnection of devices from the network can result in segmentation of the network, where some parts of the network lose connectivity with other parts of the network, while retaining connectivity between devices in each part (e.g. when a router device that connects two network segments is disconnected or fails, the two network segments lose connectivity with each other, but retain connectivity between devices within each segment).
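The segmentation effect described above can be illustrated with a minimal sketch. It assumes a hypothetical topology (the device names and links are invented for illustration and do not come from the figures) and groups devices by mutual reachability once a failed device is removed.

```python
from collections import deque

def segments(links, failed=frozenset()):
    """Return the groups of mutually reachable devices once `failed` devices are removed."""
    # Build an adjacency map, skipping any link that touches a failed device.
    adjacency = {}
    for a, b in links:
        for device in (a, b):
            if device not in failed:
                adjacency.setdefault(device, set())
        if a not in failed and b not in failed:
            adjacency[a].add(b)
            adjacency[b].add(a)

    seen, groups = set(), []
    for start in sorted(adjacency):
        if start in seen:
            continue
        # Breadth-first search collects every device reachable from `start`.
        group, queue = set(), deque([start])
        seen.add(start)
        while queue:
            device = queue.popleft()
            group.add(device)
            for neighbor in adjacency[device]:
                if neighbor not in seen:
                    seen.add(neighbor)
                    queue.append(neighbor)
        groups.append(sorted(group))
    return groups

# Hypothetical topology: two segments joined only by "router".
LINKS = [("pc1", "switchA"), ("pc2", "switchA"), ("switchA", "router"),
         ("router", "switchB"), ("switchB", "server1"), ("switchB", "printer1")]

print(segments(LINKS))                     # a single group while the router is up
print(segments(LINKS, failed={"router"}))  # two groups once the router is removed
```

With the router removed, the devices on each side still reach one another but not the other side, which is the situation in which a monitoring point attached to one segment loses sight of the other.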
Not only have computer networks become more common, but the complexity of these electronic webs has increased as well. Today, a network administrator must deal simultaneously with a myriad of different devices, manufacturers, network types, and protocols, as well as support the ad-hoc attachment and removal of devices from the network as portable wireless devices automatically connect and disconnect from the network infrastructure. Often the coordination among the developers of the software, hardware, and firmware of networked devices is loose at best. Devices must be able to communicate properly across the network without interfering with each other, but this is not always the case whether due to design, malfunction, misconfiguration, or misuse. In particular, administrators must be able to identify warnings and troubleshoot abnormal behavior on the network and network-attached systems before risk to network integrity or availability occurs. Non-traditional networks (e.g. CANbus, IEEE 488) and the devices connected to them are often used in real-time operation of SCADA systems, increasing the urgency that these networks and devices be effectively managed. Traditional management systems, i.e., management systems that are used to manage traditional computer networks, typically do not integrate with non-traditional networks and traditional management paradigms are generally not extensible to support non-traditional networks and devices.
To handle the growing network management workload, various network management devices (“NMDs”) have been developed, examples of which are described in the above-incorporated '407 and '125 Applications. By way of illustration, the network management device (NMD) of U.S. patent application Ser. No. 12/051,125 is a network appliance device, comprising hardware and software components, designed to flexibly operate upon traditional, SCADA, and Statistical Process Control (SPC) networks that are connected using a variety of transports, gateways, and networks. Traditional TCP/IP-based controllers, PC-connected or connections using gateway-style interface applications, and direct device control mechanisms are all supported using the same NMD. Various dynamic application(s) and templates interact with a collector installed upon the NMD to provide capabilities to interact with both traditional and non-traditional networks, either individually or in hybrid networks that combine both traditional- and non-traditional networks. The collector relies upon pre-installed network interface software present on the NMD to permit access to network data and devices through the NMD's network interface hardware. The collector and other NMD software also rely on known NMD operating system capabilities, such as input/output (I/O) libraries or services, and inter-process communication capabilities for access to non-volatile data storage systems, such as file systems on disk drives or relational database servers, to store collected data, retrieve dynamic application programs and templates, and other data resources necessary to the functioning of the NMD. 
NMD software is created with an understanding of the capabilities of the NMD hardware, such as CPU type, processing power, RAM capacity, I/O throughput, non-volatile storage capacity, and number and type of network interfaces, so as to guarantee that NMD software can execute and provide the required level of performance to adequately monitor the network it is installed upon.
Current non-NMD network management systems are often complex and do not operate well for most users. First, these systems can have onerous deployment and operation requirements. Many require specialized expertise just to install and configure the network management software and additional applications. Others require additional expertise-based configurations of the software and applications to monitor a network, including: complex collections of vendor-specific applications to monitor disparate hardware and software and extensive custom programming to monitor applications.
Second, many non-NMD management systems can monitor only a limited number of attributes per network-connected device, use a single network management protocol, or do not monitor system, application status, network performance, or quality of service (“QoS”) attributes. Furthermore, many non-NMD management systems do not cross-correlate between multiple network services or check for discrepancies between network services that provide coordinated services. Moreover, many network management systems are designed under the presumption that the network infrastructure is always functioning, and therefore may not be reliable when network service interruptions or degradations occur. Even the NMD has limitations in this area, since it can monitor the network only as seen from the point at which it is connected. When failures or misconfiguration of network components result in breaks in the network topology, the NMD can no longer monitor those network segments on the far side of a break in the network's connectivity. In many cases, management systems are tied to particular hardware devices, such as “sniffers” (e.g., the Portable Analysis System, sold by Network Instruments of Minneapolis, Minn.), or the NMD referred to above. Having a specific hardware component in the management system simplifies the initial deployment of the system, but places limitations on speed and flexibility of response to changes in network configuration, such as temporary network partitioning due to router failure or misconfiguration, and can involve other adverse factors such as expense, delay, and infrastructure requirements (e.g., space, power and cabling) when networks grow, change topology, or experience changes in traffic load, whether temporarily or permanently.
Third, the day-to-day operation of most current network management products requires skilled network operations staff to configure and maintain the management software and network, including adding and removing devices and device configurations as the network topology changes. Configuration typically requires that the staff manually collect information about network management applications (and management information base (“MIB”) configurations) used to manage the devices that are part of the network from individual device manufacturers, manually install and configure the software, and then manually set the thresholds for sending alerts. Many network management systems and applications are limited to using a single management protocol, for example, the Simple Network Management Protocol (“SNMP”), to collect information from devices, forcing the network operators to reconcile SNMP requirements with their management policies. Furthermore, the tools available to accomplish these tasks are primitive, often overloading network operators with excessive reporting responsibilities and failing to support automatic correlation of information about devices present on the network. For example, limitations in SNMP architecture force network operators to manage networks of devices from a single management station, or clear the same error reports from multiple terminals. Often, network devices report only their own internal status, and provide a network operator with neither critical information on the status of the device's communication with the network nor information regarding the status of applications and services operating on the device.
Current network management systems are typically not responsive to degradations in network performance. They do not adjust their own use or monitoring of the network to alleviate or troubleshoot network issues that might be resulting from hardware failures, denial of service (DoS) attacks, ill-advised changes in network topology, spikes in network usage levels, or breaks in network connectivity.
FIG. 1 displays a diagram of an exemplary prior art network (1000) that includes an NMD (1080) as well as a number of other devices of various types, such as mainframe computers (1015), desktop computers (1010), file servers (1025), and printers (1020). Network 1000 includes a plurality of network segments (1060, 1060′) connected by various technologies, such as Ethernet (1045), or Token Ring (1040), sometimes separated by firewalls (1070, 1070′) and with links to a larger network (1090), such as the Internet, where additional devices such as wireless networking devices (1050) and wireless mobile devices (1030) can exist that can connect with the devices of the managed network segments. Those with skill in the art will realize that the depicted network is exemplary only, and that many configurations of the devices shown, as well as other devices not shown, are possible.
In such networks it is possible to form connections between devices on a first network segment (1060) and devices on a second network segment (1060′) for some purposes while being unable to monitor network or device state or traffic on the second network segment from the first network segment due to the restrictions imposed by firewalls or other limiting devices. For example, continuing with FIG. 1, if the NMD (1080) detects data communication between a first device (1065) on its segment (1060) and a second device (1065′) on the firewalled segment (1060′), this discovery can result in a desire to monitor network use and device status of the second device (1065′), but the firewalls (1070 and 1070′) block all traffic except that involved in the link between the first device (1065) and the second device (1065′). Using NMDs (1080) to monitor the second device (1065′) requires physically connecting the NMD (1080) to the second device's network segment (1060′), which can preclude continued monitoring of the first device's network segment (1060) (depending on the specifics of the firewall restrictions), and might involve relocating the NMD into physical proximity to the second device's network segment, perhaps over a great distance (in the example depicted, from Washington, D.C., USA to Tokyo, Japan). This can result in both lengthy delays and expenditures of money. Alternatively, a second NMD (1080′) can be procured and installed on the second device's network segment (1060′). This would permit simultaneous monitoring of both network segments, but still involves an expenditure of time and money, and may not be practical when there are a large number of network segments and a small budget for network monitoring, or if some network segments are located in areas lacking required resources, such as space, power or management personnel.
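The visibility problem in this example can be sketched in a few lines. This is a simplified model under stated assumptions: the device names merely echo the reference numerals of FIG. 1, and the firewalls are reduced to a hypothetical allow-list of (source, destination) flows. It is not a description of any actual firewall configuration.

```python
# Hypothetical model of FIG. 1; names echo the figure's reference numerals.
SEGMENT_OF = {
    "nmd_1080": "1060",
    "device_1065": "1060",
    "device_1065_prime": "1060'",
}

# Assumption: the firewalls pass only explicitly allowed flows between segments.
ALLOWED_FLOWS = {
    ("device_1065", "device_1065_prime"),
    ("device_1065_prime", "device_1065"),
}

def can_reach(source, destination):
    """True if traffic from source can arrive at destination in this model."""
    if SEGMENT_OF[source] == SEGMENT_OF[destination]:
        return True  # same segment, no firewall in the path
    return (source, destination) in ALLOWED_FLOWS

def monitorable(monitor):
    """Devices the monitor can reach directly from where it is attached."""
    return sorted(d for d in SEGMENT_OF if d != monitor and can_reach(monitor, d))

print(monitorable("nmd_1080"))
```

In this model the NMD can observe that the first device talks to the second, yet the second device never appears in the NMD's own monitorable set, because the only permitted cross-segment flow is the one between the two devices themselves.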
As depicted in FIG. 2, an exemplary prior art NMD (2000) is a network appliance device made up of dedicated hardware and software systems that work together to monitor and manage a network and the devices connected to it. Often such prior art NMDs self-configure once connected to a network through an auto-discovery mechanism using both passive and active techniques to detect, identify, configure and monitor other network devices using embedded and dynamic applications (2400), as well as optionally providing preintegrated applications (2500) such as Domain Name System (DNS), Dynamic Host Configuration Protocol (DHCP), and other such services as required. The exemplary prior art NMDs also provide a user interface (Device Interface) (2200) to the prior art NMD (2000) so as to allow control and configuration of the device (with configuration information stored in a Configuration Policy (2060)), examination of the data collected, generation of reports, receipt of alerts and traps as required, and other required tasks. They can also provide storage (2810) for collected monitoring data (2814) and configuration data for various devices or device types (2816 or 2812), as well as management of the available data storage resources (2800). The prior art NMD additionally has an Operating System (2100) to manage processes and resources of each discovered device in conjunction with a device manager (2050), communications interfaces (2600) for publishing (2620) and receiving (2610) information, a Maintenance Scheduler (2900) for performing periodic or timed activities, and an Error Handler (2910) for dealing with various error conditions. Detection and recognition of other devices, as well as monitoring, is performed by a Recognizer (2700), consisting of a Collector (2720) and its plug-in applications (2730), and three manager functions, which respectively manage dynamic applications (2710), Templates (2750) that describe various devices and device types, and events (2740).
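The relationship between a collector, its plug-in applications, and templates can be pictured with a rough sketch. Everything here is an illustrative assumption: the class names, the `register` and `collect` methods, and the stub plug-in are hypothetical and are not taken from the prior art NMD itself.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict

@dataclass(frozen=True)
class Template:
    """Describes a device type and names the plug-in used to collect from it."""
    device_type: str
    plugin_name: str

@dataclass
class Collector:
    """Routes each collection request to the plug-in named by the template."""
    plugins: Dict[str, Callable[[str], dict]] = field(default_factory=dict)

    def register(self, name: str, plugin: Callable[[str], dict]) -> None:
        self.plugins[name] = plugin

    def collect(self, address: str, template: Template) -> dict:
        plugin = self.plugins.get(template.plugin_name)
        if plugin is None:
            raise KeyError(f"no plug-in registered for {template.plugin_name!r}")
        record = {"address": address, "device_type": template.device_type}
        record.update(plugin(address))
        return record

def stub_status_plugin(address: str) -> dict:
    # Stand-in for a real protocol query; always reports a fixed placeholder value.
    return {"status": "unknown"}

collector = Collector()
collector.register("status", stub_status_plugin)
printer_template = Template(device_type="printer", plugin_name="status")
print(collector.collect("192.0.2.10", printer_template))
```

The point of the separation is that supporting a new device type or protocol means adding a template or a plug-in rather than changing the collector.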
The above-described exemplary prior art NMDs cannot be easily, inexpensively, or quickly replicated to deal with network growth, or be flexibly and dynamically deployed to continue monitoring activities during partial network outages or device failures (including failures of the prior art NMD hardware itself), or relocated to monitor isolated network segments, such as those on the opposite side of a router or switch, without expenses for additional hardware, transport and staff time. Furthermore, such prior art NMDs do not provide automated control and specification of flexibly deployable data collection and device management mechanisms, the specification of a flexibly deployable data storage and retrieval mechanism, or automatic adjustment of a prior art NMD's behavior, its data collection and handling mechanisms, and dynamic application behavior, or use based on network environment factors such as current traffic load, network outages, device failures, or DoS attacks. Furthermore, prior art NMDs do not support flexible trust configurations so as to allow monitoring and management of a given network by a plurality of entities (e.g. IT departments, ISPs or network support companies) without permitting all to have full access to the network and related data.
Thus, there is an immediate need for network management systems that are more robust; that are simpler to install, configure, and maintain; that are responsive to changes in network performance so as to maintain a desired Quality of Service (QoS); and that are capable of monitoring even if the network topology is disrupted or unstable. The exemplary illustrative non-limiting technology described herein meets these and still other needs.