Over the past decades, information technology (IT) systems have evolved and increased in complexity. Many years ago a company would use a single computer with a single operating system and a small number of programs to meet its computational needs. Nowadays enterprise companies may have hundreds or thousands of computers interconnected over a network. The company may use multiple servers and multiple databases to service the hundreds or thousands of computers connecting to them. Essentially, each layer of the IT system has evolved and become more complex to control and manage. In some cases multiple servers may be installed with identical software, and load balancers may be used to regulate access to the servers. An average business system includes tens or hundreds of thousands of configuration parameters. For example, the Windows OS contains between 1,500 and 2,500 configuration parameters, IBM WebSphere Application Server has about 16,000, and Oracle WebLogic has more than 60,000. If any of these parameters is misconfigured or omitted, the proper operation of the IT system may be affected.
The dependence of IT systems on their configuration can have serious consequences. For example, in April 2011 Amazon Web Services suffered a devastating event that knocked some of its clients offline for as much as four days. It turned out that a network configuration error made during a network upgrade caused the problem. In the past, upgrades were rare and were applied slowly to the client servers. Nowadays, especially with the help of the Internet, upgrades for some software packages may be released on a daily basis and even applied automatically. If a problem arises in response to an upgrade, most systems are incapable of presenting an administrator with a list of changes, let alone suggesting which changes are the most probable cause of the problem.
It is thus desirable to improve the ability to avoid problems in IT system updates and day-to-day operation, and to reduce the mean time to resolution (MTTR) for problems that still occur in the IT systems. Preventing problems and reducing the MTTR can help to prevent economic damage to the organization.