The invention relates generally to methods and systems for linking components of a service available via a network to assess the health of the service and to diagnose problems associated with the service. More particularly, the invention relates to methods and systems for generating a hierarchical structure of a network service.
Originally, computer networks were designed to have a centralized network topology. In such a topology, a centralized mainframe computer is accessed by users at computer terminals via network connections. Applications and data are stored at the mainframe computer, but may be accessed by different users. However, a current trend in network design is to provide a topology that enables distributed processing and peer-to-peer communications. Under this topology, network processing power is distributed among a number of network sites that communicate on a peer-to-peer level. Often, there are a number of servers within the network and each server is accessible by a number of clients. Each server may be dedicated to a particular service, but this is not critical. Servers may communicate with one another in providing a service to a client.
Networks vary significantly in scope. A local area network (LAN) is limited to network connectivity among computers that are in close proximity, typically less than one mile. A metropolitan area network (MAN) provides regional connectivity, such as within a major metropolitan area. A wide area network (WAN) links computers located in different geographical areas, such as the computers of a corporation having business campuses in a number of different cities. A global area network (GAN) provides connectivity among computers in various nations. The most popular GAN is the network commonly referred to as the Internet.
The decentralization of computer networks has increased the complexity of tracking network topology. The network components (i.e., "nodes") may be linked in any one of a variety of schemes. The nodes may include servers, hubs, routers, bridges, and the hardware for linking the various components. Systems for determining and graphically displaying the topology of a computer network are known. U.S. Pat. No. 5,276,789 to Besaw et al. and U.S. Pat. No. 5,185,860 to Wu, both of which are assigned to the assignee of the present invention, describe such systems. As described in Besaw et al., the system retrieves a list of nodes and their interconnections from a database which can be manually built by a network administrator or automatically constructed using computer software. The system can be configured to provide any one of three views. An internet view shows nodes and interconnections of different networks. A network view shows the nodes and interconnections of a single network within the internet view. A segment view displays nodes connected within one segment of one of the networks. Selected nodes on the network, called discovery agents, can convey knowledge of the existence of other nodes. The network discovery system queries these discovery agents and obtains the information necessary to form a graphical display of the topology. The discovery agents can be periodically queried to determine if nodes have been added to the network. In a Transmission Control Protocol/Internet Protocol (TCP/IP) network, the discovery agents are nodes that respond to queries for an address translation table which translates Internet Protocol (IP) addresses to physical addresses.
The Besaw et al. and Wu systems operate well for graphically displaying hardware components and hardware connections within a network. From this information, a number of conclusions can be drawn regarding the present capabilities and future needs of the network. However, the interdependencies of the components in providing a particular service are not apparent from the graphical display that is presented by the system. The complexities of such interdependencies continue to increase in all networks, particularly the Internet.
Another approach is described by J. L. Hellerstein in an article entitled "A Comparison of Techniques for Diagnosing Performance Problems in Information Systems: Case Study and Analytic Models," IBM Technical Report, September 1994. Hellerstein proposes a measurement navigation graph (MNG) in which network measurements are represented by nodes and the relationships between the measurements are indicated by directed arcs. The relationships among measurements are used to diagnose problems. However, the approach has limitations, since MNGs only represent relationships among measurements. A user of a system that is based on this approach must understand the details of the measurements (when, where, and how each measurement is performed) and their relationships to the different service elements. This understanding is not readily available using the MNG approach.
The emergence of a variety of new services, such as World Wide Web (WWW) access, electronic commerce, multimedia conferencing, telecommuting, and virtual private network services, has contributed to the growing interest in network-based services. However, the increasing complexity of the services offered by a particular network causes a reduction in the number of experts having the domain knowledge necessary to diagnose and fix problems rapidly. Within the Internet, Internet Service Providers (ISPs) offer their subscribers a number of complex services. An ISP must handle services that involve multiple complex relationships, not only among their service components (e.g., application servers, hosts, and network links), but also within other services. One example is the web service. This service will be described with reference to FIG. 1. Although it may appear to a subscriber of the ISP 10 that the web service is being exclusively provided by a web application server 12, there are various other services and service elements that contribute to the web service. For instance, to access the web server 12, a Domain Name Service (DNS) server 14 is accessed to provide the subscriber with the IP address of the web site. The access route includes one of the Points of Presence (POP) 16, a hub 18, and a router 20. Each POP houses modem banks, telco connections, and terminal servers. A subscriber request is forwarded to and handled by a web server application. The web page or pages being accessed may be stored on a back-end Network File System (NFS) 22, from which it is delivered to the web server on demand. 
When the subscriber perceives a degradation in the Quality of Service (QoS), the problem may be due to any of the web service components (e.g., the web application server 12, the host machine on which the web application server is executing, or the network links interconnecting the subscriber to the web server), or may be due to the other infrastructure services on which the web service depends (e.g., DNS or NFS). The ISP system 10 of FIG. 1 is also shown to include an authentication server 24 for performing a subscriber authentication service, a mail server 26 for enabling email service (for login and email access), and front-end and back-end servers 28, 30 and 32 for allowing Usenet access.
Subscribers demand that ISPs offer reliable, predictable services. To meet the expectations of subscribers and to attract new subscribers, ISPs must measure and manage the QoS of their service offerings. This requires a variety of tools that monitor and report on service-level metrics, such as availability and performance of the services, and that provide health reports on the individual service components. Unfortunately, the majority of management systems have not kept pace with the service evolution. Available management systems lack the capability to capture and exploit the inter-relationships that exist among services available in a network environment, such as the Internet.
Each network is unique in various respects, such as the configuration of servers, the types of application servers, the service offerings, the organizational topology, and the inter-service dependencies. Therefore, in order to accurately understand the operations of the network, specific models must be crafted for the services provided within the network. Regarding an ISP system, handcrafting models of services available via the ISP requires an enormous effort on the part of a human expert or group of experts. Consequently, inter-service dependencies are typically not fully recognized, even by the experts.
What is needed is a method that facilitates construction of a model of a core service available via the network, including inter-service dependencies that are relevant to providing the core service.
A method and system for modeling a selected service that is available via a network includes utilizing a service model template as a basis for generating a service model instance of the selected service. The service model template anticipates network elements and network services that cooperate in execution of the selected service. The service model template is specific to the service, but is independent of any particular computing environment. When the template is combined with discovery information that is specific to the actual network elements and the actual network services of a particular computing environment, the service model instance is generated. That is, the service model instance is the realization of the template for a particular set of elements, services and inter-dependencies in a specific computing environment.
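The combination of a template with discovered instance information can be illustrated with a minimal Python sketch. The data layout, the names `TemplateNode` and `instantiate`, and the rule that every instance of a type depends on every discovered instance of the types it depends on are all assumptions made for illustration; they are not taken from the described embodiment.

```python
from dataclasses import dataclass, field


@dataclass
class TemplateNode:
    """One element type anticipated by a (hypothetical) service model template."""
    element_type: str
    depends_on: list = field(default_factory=list)  # element types this type relies on


def instantiate(template, discovered):
    """Combine an environment-independent template with discovered instances.

    template:   dict mapping element type -> TemplateNode
    discovered: dict mapping element type -> list of instance names found
                in one particular computing environment
    Returns a dict mapping each instance name to the sorted list of
    instance names it depends on.
    """
    instance = {}
    for element_type, node in template.items():
        for name in discovered.get(element_type, []):
            deps = []
            for dep_type in node.depends_on:
                deps.extend(discovered.get(dep_type, []))
            instance[name] = sorted(deps)
    return instance
```

The same template can be reused unchanged for a second environment simply by passing a different `discovered` mapping, which is the separation the paragraph above describes.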
The discovered instance information that is specific to the actual network elements and actual network services may be acquired using any of a variety of techniques. However, in the preferred embodiment, the discovered instance information is determined using auto-discovery techniques without requiring human involvement. Moreover, the step of generating the service model instance is preferably executed in computer programming and includes accessing at least one memory store in which the service model template and the discovered instance information are stored.
Also in the preferred embodiment, the service model template includes associations between measurable performance parameters (e.g., availability and delay) and the network elements and network services. The step of generating the service model instance may then include indicating states of various network elements and network services based on the measurements, with the states being indicative of the "health" of the elements and services. Once generated, a service model instance can be used to support management functions, such as operational monitoring, capacity planning, customer support, and service-level contact management.
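As a hedged illustration of deriving a state from measurements, the Python sketch below maps an availability and a delay reading to one of three states. The state names, the function `measurement_state`, and all threshold values are hypothetical choices for the example; the text does not specify them.

```python
from enum import IntEnum


class State(IntEnum):
    """Illustrative health states, ordered from best to worst."""
    NORMAL = 0
    WARNING = 1
    CRITICAL = 2


def measurement_state(availability, delay_ms,
                      min_availability=0.999,
                      critical_availability=0.95,
                      max_delay_ms=500.0):
    """Map two measurements to a state using assumed thresholds.

    availability: fraction of successful probes, between 0.0 and 1.0
    delay_ms:     measured response delay in milliseconds
    """
    if availability < critical_availability:
        return State.CRITICAL
    if availability < min_availability or delay_ms > max_delay_ms:
        return State.WARNING
    return State.NORMAL
```

Because the states are ordered integers, the worst of several states for one element can be taken with `max(...)`.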
While not critical, the method may be utilized to model a service provided by an Intranet or Internet Service Provider (ISP). The service model template may be formed to be substantially generic to ISPs, but the service model instance will be specific to the ISP of interest. Depending on the service being modeled and the network elements that are anticipated to be involved, the service model template defines nodes of various element types (e.g., hosts, servers, network links and services) and their associated measurements. Moreover, the template indicates the dependencies among the nodes, such as the dependency of the selected service on other services (e.g., the Read Mail service depends on the authentication and NFS services). The template preferably also includes default state computation rules for specific nodes, so that the state of a particular node can be based upon measurements associated with the node and upon states of dependencies of the node. In the embodiment in which the discovered instance information is acquired using auto-discovery, the process includes a discovery template. The discovery template identifies particular discovery modules that are invoked to discover elements and services of designated types. The discovery template also identifies each dependency of a discovery module on other discovery modules. Thus, the discovery template is used to orchestrate the process of detecting the network elements and network services that are relevant to the selected service. The discovery template may also be used to format the outputs of the discovery modules.
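One way to picture how a discovery template could orchestrate discovery modules is to treat the declared module dependencies as a graph and run the modules in dependency order. The Python sketch below assumes a template expressed as a plain dictionary and modules expressed as callables; the names `run_discovery`, `template`, and `modules` are hypothetical, and the ordering uses the standard-library `graphlib.TopologicalSorter` (Python 3.9 or later) rather than any mechanism named in the text.

```python
from graphlib import TopologicalSorter


def run_discovery(template, modules):
    """Invoke discovery modules so that each runs after the modules it depends on.

    template: dict mapping module name -> set of module names it depends on
    modules:  dict mapping module name -> callable taking the results
              gathered so far and returning that module's discoveries
    Returns a dict mapping module name -> its discovery output.
    Raises graphlib.CycleError if the declared dependencies are circular.
    """
    results = {}
    for name in TopologicalSorter(template).static_order():
        results[name] = modules[name](results)
    return results
```

In this sketch, a module that finds web servers can read the list of hosts produced earlier by a host discovery module, because the ordering guarantees that the host module has already run.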
The system for modeling the selected service includes memory for storing the service model template, a discovery engine for initiating operations specific to identifying the relevant network elements and services, and a model creation engine that maps the discovered network elements and services into a framework identified in the service model template. Preferably, the system includes a view generator that provides a graphical display of the service model instance. The display may be a hierarchical graph of nodes that are inter-linked to indicate inter-dependencies among the nodes. Color coding or other schemes may be used to identify the states of the nodes. The view generator is configurable to enable different views, so that personnel having different domains of interest may focus upon different aspects of the service model instance.
An advantage of the invention is that a service model instance that is tailored to a particular computing environment is generated without requiring complete handcrafting by a highly skilled individual. The customization accounts for differences in the number of services involved in providing a service, geographical locations of services, differences in inter-service dependencies, etc. The use of a service model template renders it easier to modify an existing service model instance. By editing the service model template, an ISP can add a new element type to the service model instance and cause changes to the attributes of all nodes in the service model instance that represent a specific element. Without a service model template, such changes would be difficult to incorporate and may even require extensive changes to the software that generates a service model instance. Moreover, the task of correlating measurements to deduce root-causes of problems has been simplified. The service model instance encapsulates the knowledge of a human expert by reflecting the dependencies that may exist among services and among network elements. Thus, the service model instance represents the structure of the selected service. By traversing the service model using logic to interpret the state of the different services and elements represented in the model, the root-cause of a problem is easily identified.
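The traversal mentioned above can be sketched in Python as a depth-first walk over the dependency links. This is an assumed, simplified rule rather than the logic of the described embodiment: a node is reported as a candidate root cause when it is unhealthy and none of the nodes it depends on are unhealthy. The function name `root_causes` and the dictionary and set inputs are hypothetical.

```python
def root_causes(node, deps, unhealthy):
    """Find candidate root causes of a problem observed at `node`.

    node:      name of the node where degradation is observed
    deps:      dict mapping node name -> list of node names it depends on
    unhealthy: set of node names whose state is currently not normal
    Returns the unhealthy nodes reachable from `node` through unhealthy
    dependencies that have no unhealthy dependencies of their own.
    """
    found = []
    seen = set()

    def visit(current):
        if current in seen or current not in unhealthy:
            return
        seen.add(current)
        bad_deps = [d for d in deps.get(current, []) if d in unhealthy]
        if not bad_deps:
            found.append(current)  # nothing beneath it explains the problem
        for dep in bad_deps:
            visit(dep)

    visit(node)
    return found
```

For example, if a web service depends on an NFS service that runs on a failing host, the walk passes through the web service and the NFS service and reports only the host.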