Over the past decades, information technology (IT) systems have evolved and increased in complexity. In the past, a company would use a single computer with a single operating system and a small number of programs to supply its computational needs. Nowadays, enterprise companies may have hundreds or thousands of computers interconnected over a network. The company may use multiple servers and multiple databases to service the hundreds or thousands of computers connected to them. Essentially, each layer of the IT system has evolved and become more complex to control and manage. In some cases, multiple servers may be installed with identical software, and load balancers may be used to regulate access to the servers. An average business system includes tens or hundreds of thousands of configuration parameters. For example, the Windows OS contains between 1,500 and 2,500 configuration parameters, IBM WebSphere Application Server has about 16,000, and Oracle WebLogic has more than 60,000. If any of these parameters is misconfigured or omitted, the error may impair proper operation of the IT system.
The dependence of IT systems on their configuration can have serious consequences. For example, in November 2014 Microsoft Azure Services suffered a devastating event that interrupted six availability zones in the U.S., two in Europe, and four in Asia for as much as 11 hours. It turned out that a configuration change had been introduced as part of an Azure Storage update to improve performance and reduce the CPU footprint. This change had been deployed to some production clusters in the previous weeks and was performing as expected. However, the configuration change exposed a bug that caused the application to enter an infinite loop, preventing it from taking traffic. Nowadays, especially with the help of the Internet, upgrades for some software packages may be released on a daily basis and even applied automatically. If a problem arises in response to an upgrade, most systems are incapable of presenting an administrator with a list of the changes, let alone suggesting which changes are the most probable cause of the problem.
It is thus desirable to improve the ability to avoid problems in IT system updates and day-to-day operation, and to reduce the mean time to resolution (MTTR) for handling problems that still occur in the IT systems. Preventing problems and reducing the MTTR can help to prevent economic damage to the organization.
A few companies have developed software products that help system administrators keep track of computer configurations. These products detect the values of granular configuration items (CIs). Typically, such products collect and store the CIs in a database so that the current value of a CI may be compared to its prior values or to the values on similar machines. The products may also bundle CIs into composite CIs to enable easier visualization, for example by grouping them by type or content. Once the CIs are collected, an IT user (e.g., an engineer or system administrator) may need to analyze hundreds, thousands, or millions of CIs to detect the source of a problem.
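The comparison and grouping described above can be illustrated with a minimal Python sketch. It is not taken from any particular product; the function names, the CI names, the sample values, and the convention that the text before the first dot in a CI name denotes its type are all assumptions made for illustration. A CI that is absent from one snapshot is represented by None.

```python
from collections import defaultdict


def diff_snapshots(previous, current):
    """Return {ci_name: (old_value, new_value)} for every CI that differs.

    A CI missing from one of the snapshots is reported with None on that side.
    """
    changes = {}
    for name in sorted(previous.keys() | current.keys()):
        old, new = previous.get(name), current.get(name)
        if old != new:
            changes[name] = (old, new)
    return changes


def group_by_type(changes):
    """Bundle changed CIs into composite groups keyed by an assumed type prefix."""
    grouped = defaultdict(dict)
    for name, pair in changes.items():
        # Assumption: the text before the first dot names the CI type.
        ci_type = name.split(".", 1)[0]
        grouped[ci_type][name] = pair
    return dict(grouped)


# Hypothetical snapshots of the same machine taken at two points in time.
previous = {"os.max_open_files": "1024", "db.pool_size": "20", "db.timeout_s": "30"}
current = {"os.max_open_files": "1024", "db.pool_size": "50", "web.threads": "8"}

changes = diff_snapshots(previous, current)
grouped = group_by_type(changes)

for ci_type, items in grouped.items():
    print(ci_type, items)
```

Even in this toy case the output is only a list of differences. Nothing in it indicates which change is the probable cause of a problem, which is the analysis burden that remains with the IT user.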