The present invention relates to the field of network servers, and more particularly to enabling a crashed computer system to communicate over a network so that an external entity may remotely communicate with, diagnose, and repair the dead system.
Generally, when a computer system experiences a crash, halt, or other impairment, critical and important data may become corrupted. For example, when a server can no longer access certain critical data, that server's state is considered compromised. When a compromised state is detected, the server system may block any further operations to prevent further data corruption. This is often called a server crash. This involves stopping all code execution and preserving the contents of the server's memory at the state they are in when the critical error occurs. At this state, the server is unavailable to handle requests. Critical processes on the server are unable to initiate or complete execution. The server remains inoperative until the system crash is properly addressed and corrected. Such a server is often referred to as a dead server. Where time is of the essence, a system crash may result in irreparable consequences. While a server reboot may bring the system back to running condition, time, data and other information may be irretrievably lost.
Further, even if a server reboot is performed, if the system has impairments, server crashes could continue to occur. Server impairments, whether partial or total, cause downtime and lost productivity. In most cases, nothing can be done to keep the server in constant and proper operation until a resolution is found, which generally involves diagnosis and repair of impaired server conditions.
One element useful in diagnosing a server fault is a core dump. A core dump is a byte-for-byte image of a server's memory, essentially a snapshot of a server's RAM at a particular point in time. The process of copying an image of a system's memory may be referred to as dumping core memory or making a core dump. When an error occurs on the system, the core dump may contain information about system activities and the state they were in when the critical error occurred. System activities may include processes, loaded modules, allocated memory, cache memory, screen shots, and other information. In the past, core dumps were done to disks or other transportable storage media. Due to the size of server memories, however, that technique is generally impractical. Further, that technique requires an operator to be physically present with the server to take the disk to be used for diagnosis.
Accordingly, techniques have been developed to enable a core dump to be performed over a network to a large storage device for diagnosis. Generally, these techniques involve running a LAN or other network driver from the server machine to download data from the server out over a LAN or other network.
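By way of illustration only, transferring a core dump over a network may be pictured as dividing the memory image into numbered packets that a receiving storage device reassembles. The following Python sketch is not taken from any existing system; the packet layout (sequence number, CRC32 checksum, payload length), the function names `packetize` and `reassemble`, and the default payload size are all assumptions chosen for the example.

```python
import struct
import zlib

# Assumed header layout: sequence number, CRC32 of payload, payload length.
HEADER = struct.Struct("!IIH")


def packetize(image: bytes, payload_size: int = 1024):
    """Yield the memory image as a series of self-describing packets."""
    for seq, offset in enumerate(range(0, len(image), payload_size)):
        payload = image[offset:offset + payload_size]
        yield HEADER.pack(seq, zlib.crc32(payload), len(payload)) + payload


def reassemble(packets) -> bytes:
    """Rebuild the image on the receiving side, in any arrival order."""
    chunks = {}
    for packet in packets:
        seq, crc, length = HEADER.unpack_from(packet)
        payload = packet[HEADER.size:HEADER.size + length]
        if zlib.crc32(payload) != crc:
            raise ValueError(f"packet {seq} failed its checksum")
        chunks[seq] = payload
    return b"".join(chunks[seq] for seq in sorted(chunks))
```

Because each packet carries its own sequence number, the receiver in this sketch can tolerate out-of-order arrival; retransmission of lost packets is deliberately left out.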
A dedicated network driver may be hard-coded into an operating system debugger to be used when there is a system crash. However, this technique entails converting a user's currently existing driver to handle the necessary debug-time constraints, and is limited to a specific network topology. With the large number of different types of drivers in commercial use, the conversion of each different type of driver into a debug environment is an unrealistic commercial option.
These and other drawbacks exist with current systems.
An object of the present invention is to overcome these and other drawbacks with existing systems.
Another object of the invention is to provide a debug-time network environment to allow access to a server that has crashed (i.e., all processes have halted) and can no longer communicate over its regular network channels.
Another object of the invention is to enable users to utilize existing LAN drivers to create an environment on a server where packets of information may be transferred in and out, at the time of system failure.
Another object of the invention is to load a copy of a network driver already operating on the server into memory, dynamically link it into a debug-time network environment prior to a system crash, and keep it in reserve for use after a system crash to enable network communications with that server.
Another object of the invention is to provide an emulated operating system ("OS") environment to enable a network driver to initialize, transmit, and receive network packets as well as other functions and operations, at the time of a system crash. The emulated environment may include network protocols and application interfaces so the network driver operates as if it were under control of the server operating system rather than an emulated module.
According to an embodiment of the present invention, a debug-time network environment is provided that enables access to a server system that has crashed where communication over regular network channels is restricted. The debug-time network environment may provide a module loader that may be independent of the run-time OS environment. The module loader may load a copy of a network driver into memory and dynamically link it into the debug-time network environment. The network driver may be one of the drivers operated by the server OS during run-time and normal operations. The copy of the network driver is loaded into memory prior to a system crash where it is kept in reserve. After a system crash, the final initialization of the copy of the network driver kept in reserve may take place. This enables the network driver to operate as it would under normal operating conditions. Other methods of invoking a copy of the network driver may also be employed.
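The sequence described above (load a copy before the crash, link it, hold it in reserve, and finish initialization only after the crash) may be modeled as a small state machine. The Python sketch below is a simplified illustration under stated assumptions: the class `DebugTimeEnvironment`, its method names, the three states, and the symbol-table representation are hypothetical and do not describe an actual loader.

```python
from enum import Enum, auto


class DriverState(Enum):
    UNLOADED = auto()   # no reserve copy exists yet
    RESERVED = auto()   # copy loaded and linked, waiting for a crash
    ACTIVE = auto()     # final initialization done after the crash


class DebugTimeEnvironment:
    """Hypothetical model of the reserve-driver lifecycle."""

    def __init__(self):
        self.state = DriverState.UNLOADED
        self.driver_image = None
        self.symbols = {}

    def load_reserve_copy(self, driver_image: bytes, imports):
        # Runs during normal operation: keep a private copy of the driver
        # and resolve each imported name against the debug-time environment.
        self.driver_image = bytes(driver_image)
        self.symbols = {name: f"debug_env.{name}" for name in imports}
        self.state = DriverState.RESERVED

    def on_crash(self):
        # Runs after the crash: only a copy loaded beforehand can be used,
        # since the run-time OS loader is assumed to be unavailable.
        if self.state is not DriverState.RESERVED:
            raise RuntimeError("no reserve driver copy was loaded before the crash")
        self.state = DriverState.ACTIVE
```

The check in `on_crash` reflects the ordering constraint in the text: the reserve copy must already be in memory when the crash occurs.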
The debug-time network environment also provides an emulated Operating System Application Program Interface ("OS API") where the emulated OS API enables the network driver to initialize, transmit and receive network packets. This emulated environment may also provide one or more network protocols and application interfaces to support debug-time network applications, such as system core dumps, remote debug access, crash diagnostics and repair, and other applications. By utilizing off-the-shelf network drivers, existing drivers do not have to be reconfigured or converted to support the communication to a network from a dead system. The ability to use off-the-shelf drivers provides flexibility and convenience.
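The idea that an unmodified driver can run against an emulated OS API may be sketched as follows. The driver is written only against a set of service names, so it behaves the same regardless of which component supplies them. Everything in this Python example is an assumption made for illustration: the service names `allocate_buffer`, `register_driver`, and `indicate_receive`, the `LoopbackDriver` class, and the in-memory queue standing in for network hardware.

```python
class EmulatedOsApi:
    """Hypothetical stand-in for the services a driver expects from the OS."""

    def __init__(self):
        self.send_handler = None
        self.received = []

    def allocate_buffer(self, size):
        return bytearray(size)

    def register_driver(self, send_handler):
        # The environment remembers how to hand outgoing packets to the driver.
        self.send_handler = send_handler

    def indicate_receive(self, frame):
        # The driver passes incoming packets up through this call.
        self.received.append(bytes(frame))


class LoopbackDriver:
    """Toy driver that knows the API names but not who implements them."""

    def __init__(self, os_api):
        self.os_api = os_api
        self.hardware_queue = []  # stands in for the network hardware

    def initialize(self):
        self.os_api.register_driver(self.send)

    def send(self, payload):
        buf = self.os_api.allocate_buffer(len(payload))
        buf[:] = payload
        self.hardware_queue.append(buf)

    def poll(self):
        # Deliver everything the "hardware" has looped back.
        while self.hardware_queue:
            self.os_api.indicate_receive(self.hardware_queue.pop(0))
```

In this sketch the driver code contains no reference to the emulation, which is the property that allows an off-the-shelf driver to be reused without conversion.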
Other objects and advantages of the present invention will be apparent to one of ordinary skill in the art upon reviewing the specification herein.