The present invention relates generally to providing management, maintenance, and support of both hardware and software on computers. In particular, it relates to a method and system that enable individual computers to be compared with peer group computers to speed up and to improve the quality of troubleshooting, maintenance, and support.
As the cost of PCs and servers continues to fall, and as more and more PCs and servers are placed into service within businesses and organizations of all kinds, the problem of managing the configurations of those computers, diagnosing the problems within them, repairing them, upgrading them, and keeping them running becomes more and more difficult. Increasingly, particularly with respect to servers, which are often used in very large numbers, the complexity and the cost of the service needed to keep computers running are becoming an important issue.
In recent years, several steps have been taken to cut the cost of managing computers. For example, a user may now click on an icon and type out a “trouble” message on the screen of a PC or workstation. That message, together with a record of the configuration of the computer and the identity (name, telephone number, e-mail address) of the user, is then automatically routed to a central site where service representatives are presented not just with the user's message but also with a detailed report of the current status of the computer. The service representative can then respond with an e-mail message, with a telephone call, or with a live, on-screen “chat.” The service representative may also take over control of the user's computer just as if the service representative were seated at the computer, rather than being at a central site many miles away.
Another advance has been the ability to have software data collectors installed on computers within an enterprise. These can run all manner of software (programs and script files) on each computer within an enterprise, gather all manner of data concerning how the computers are configured, and transmit records containing this data to a central site where sophisticated analyzers can sift through all of this data looking for anomalous conditions or other issues which can then be automatically reported in special reports. Centrally located auditors also may ask for the one-time execution of special sets of collectors to gather data for inclusion in special types of reports. Thus, the configuration and operative state of remotely-located computers can be determined quickly and in an automated fashion.
Computers can also be clustered into groups of computers that back each other up in a fully-automated fashion, with a computer that fails or that is not performing properly automatically switched out of service and replaced with a backup computer. This can keep critical services fully operative even when some computers are placed out of service because of technical problems. Computers can also be arranged to monitor themselves continuously, checking for problems and reporting any problems that develop in essentially the same manner described above whereby users report problems, except that the process is fully automated.
Still, the task of diagnosing the problems in a computer that is malfunctioning remains a difficult and time-consuming one, one that requires considerable ingenuity, and one that also requires considerable experience on the part of service personnel. When faced with a problem the solution to which is not obvious, service personnel frequently guess at possible causes and then try various fixes, continuing this process until a problem finally disappears. This may take a long time and may involve replacing hardware components or re-installing software components or installing software patches that were not actually needed, wasting both time and materials.
What is desired is some way to enable service personnel to take advantage of the expertise represented by the hundreds and thousands of computers that are operating in the field and that are properly configured, as is indicated by their generally acceptable performance. For example, if a first machine is malfunctioning and a second machine of the same type, having more or less the same system configuration and performing the same business function in a similar industry, is available to serve as a properly functioning model, then the configurations of the two machines, as well as their performance, could be compared. Any differences between them would suggest possible causes of the malfunction. But comparing two machines in this manner is not without its risks, for any given machine might be misconfigured even though it appears to be fully operative. It is also difficult to find a comparably configured computer to use for comparative purposes. Accordingly, the present invention proposes new methods and systems for determining whether a computer is properly configured and performing normally.
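By way of illustration only, the two-machine comparison described above might be sketched as follows. This is a minimal sketch and not the method of the invention: it assumes that each machine's configuration has already been collected into a flat mapping of setting names to values, and the function name `diff_configurations` and the sample settings are hypothetical.

```python
def diff_configurations(suspect, reference):
    """Return the settings whose values differ between two machines.

    Both arguments are assumed (for this sketch) to be flat dicts that map
    a setting name to its value. A setting missing from one machine is
    reported with None on that side.
    """
    differences = {}
    for name in sorted(set(suspect) | set(reference)):
        suspect_value = suspect.get(name)
        reference_value = reference.get(name)
        if suspect_value != reference_value:
            differences[name] = (suspect_value, reference_value)
    return differences


# Hypothetical configuration records for a malfunctioning machine and a
# properly functioning machine of the same type.
malfunctioning = {"os_patch_level": "11.00", "swap_gb": 2, "max_files": 1024}
model = {"os_patch_level": "11.11", "swap_gb": 2, "max_files": 4096, "ntp": "on"}

for name, (bad, good) in diff_configurations(malfunctioning, model).items():
    print(f"{name}: malfunctioning={bad!r} model={good!r}")
```

Each reported difference is only a candidate cause. As noted above, the model machine may itself be misconfigured, so a single pairwise comparison of this kind cannot by itself establish which of the two values is correct.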