Information and the means of exchanging it via computing technology have grown far more sophisticated and complex than the state of the art of a mere 15 years ago. Today, computers are critical to the efficient conduct of business in numerous sectors worldwide, from governments to corporations and small businesses. This increasingly critical role has, in turn, raised concern in various sectors about the reliability and manageability of computing assets. System downtime caused by hardware problems imposes considerable expense on businesses in the retail and securities industries, among others. Moreover, as networked applications take on more essential business roles daily, the cost of system downtime will continue to grow.
Another significant cost of system downtime is that of diagnosing and repairing a hardware-related problem with a system. Many computer systems provide only minimal diagnostic functions, generally limited to indicating whether or not the system is running. Embedded diagnostic code such as the power-on self test (POST) resides within a computer system and performs limited diagnostic tests automatically when the computer is powered up. The particular series of tests varies with the BIOS configuration, but POST typically tests the RAM (random access memory), the keyboard, and access to every disk drive. If these tests are successful, POST initiates loading of the operating system and the computer boots. Otherwise, the fault area is reported or isolated for analysis. However, POST executes its diagnostic functions only at power-up and is not capable of diagnostic monitoring during normal system operation.
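By way of illustration only, the power-up sequence described above may be sketched as follows. This is a simplified model under stated assumptions, not actual BIOS firmware: the function names `run_post` and `power_up`, the check names, and the returned messages are all hypothetical and are introduced here solely to show the control flow (run each test in turn, boot on success, otherwise report the fault area).

```python
from typing import Callable, Dict, Optional

# Hypothetical model of a POST-style sequence. Real POST code runs in
# firmware; the callables here merely stand in for hardware tests.
Check = Callable[[], bool]


def run_post(checks: Dict[str, Check]) -> Optional[str]:
    """Run each check in order; return the first failing area, or None."""
    for area, check in checks.items():
        if not check():
            return area  # fault area identified, stop testing
    return None  # every test passed


def power_up(checks: Dict[str, Check]) -> str:
    """Boot if all checks pass; otherwise report the fault area."""
    fault = run_post(checks)
    if fault is None:
        return "boot: loading operating system"
    return f"halt: fault reported in {fault}"


if __name__ == "__main__":
    # Assumed example checks: RAM and keyboard pass, disk access fails.
    example = {
        "RAM": lambda: True,
        "keyboard": lambda: True,
        "disk drive": lambda: False,
    }
    print(power_up(example))
```

Note that the sketch runs exactly once, when `power_up` is called, which mirrors the limitation identified above: nothing in this sequence monitors the hardware after the operating system has been loaded.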
Many diagnostic routines require a user to know the components of a system and to load the appropriate modules before diagnostic testing can function for all hardware elements of the system. These diagnostic routines do not contain self-managing or dynamic processes that discover failed hardware and permit identification of the system hardware problem without user intervention. In addition, many diagnostic routines cannot be run across partition boundaries, and many, in practice, cannot run across a network or the Internet.
Currently, built-in test modules exist, but no commercially available stand-alone test module is able to run true diagnostics concurrently with normal system operation. This is because the computer's operating system (O/S) generally considers itself to “own” certain system resources, and thus prevents the stand-alone test module's diagnostic routine from enlisting device drivers and O/S cooperation in many of the diagnostic functional tests.
Therefore, what is needed is an improved methodology for diagnostic testing in computer systems that overcomes these problems and provides for dynamic processes without user intervention.