A common arrangement for a computer system involves one or more client-server machines within a network. For example, a computer system may have thousands of servers operating in a number of different locations. The servers may be located in several different cities, or several different countries. However, it is desirable that the entire server system be manageable from a central location. Managing mission-critical servers in a distributed environment is difficult. In order to properly centralize server management in such a situation, the administrator at the central management location must be able to determine whether each of the servers in the system is running and working properly, and whether each is performing its required tasks at the correct level of service. Additionally, if anything is beginning to deteriorate at any particular server, the administrator at the central management location needs to be informed in a timely fashion so that corrective actions can be taken. Ideally, the administrator should also be proactively warned that a particular server is deteriorating.
One conventional method of central server management involves utilizing management software on each server. However, this software analyzes only certain errors that may occur within certain parts of the server, and generates error logs. The International Organization for Standardization Open Systems Interconnection (ISO/OSI) model, for example, defines seven layers, and the available management software typically looks at only certain layers of the OSI stack. Some software monitors and maintains the hardware aspects of the server, while other software may monitor only the network or application layers of the OSI stack. This conventional approach limits the information gathered and does not always present the true status of the server. Another conventional method of central server management utilizes network management protocols. These protocols are low-level protocols, such as SNMP (Simple Network Management Protocol), that can provide a basic indication of the health and performance of the server. Yet another conventional solution involves utilizing a monitoring and polling agent on each server. This agent sends alerts back to the central control console.
Each of the above conventional solutions presents drawbacks or disadvantages. For example, in order to monitor for a particular error, the type of error must be known. In other words, the possibility of the error must be known before the server can be monitored for that error. Additionally, the act of monitoring the server can create a large amount of monitored data that must be collected and sent to the central management location. The transmission of this monitored data can thus create a large amount of network traffic, placing a strain on network communication resources. Further, each of the above conventional solutions utilizes the network link between the central management location and the server being managed. If the network link is down, possibly as a result of a fault at the server, then the ability to manage the server from the central management location is lost. Likewise, if the server suffers a catastrophic failure, the network connection will be lost. Moreover, once the server suffers the failure, any error logs or other data may be irretrievable. Finally, the conventional software solutions focus only on certain layers of the OSI stack, rather than allowing access to the entire OSI stack.
Another problem arises if the operating system of the server is inoperable, or about to become inoperable, for some reason. If the server being monitored is showing signs of degradation, it may be desirable to perform a low-level reboot of the server. In this instance, under the conventional approach, a technician must be sent to the location of the server to reboot the server and perform any required maintenance. In a server environment in which the server is located remotely from the central management location, this can result in a large amount of server downtime as the technician travels to the server location.
Accordingly, there is a need for an effective system and method for managing a client from a central management location that allows access to all layers of the OSI stack. There is also a need for a system and method for managing a client from a central management location that does not rely upon the network link between the server and the central management machine to remotely manage the client. A need also exists for a system and method for managing a client from a central management location that can be used to reboot and perform certain operations on a remote server computer from the central management location.