The advent of modern computing technologies, including processors, mass-storage devices, electronic memories, and digital communications, has spawned the development of a vast array of processor-controlled devices and systems during the past 60 years, including many different types of computing systems. As processor speeds, memory and mass-storage-device capacities, and digital-communication bandwidths have rapidly increased, early stand-alone computer systems have evolved into highly complex parallel-processing and distributed computer systems with many orders of magnitude greater computational bandwidths than early mainframe computer systems. Moreover, processor control has been introduced into many, if not most, of the complex devices and systems used and produced in modern social and technological environments, ranging from consumer devices, such as cameras, telephones, and television sets, to automobiles, airplanes, locomotives, power plants, traffic-control and power grids, machine tools, automated manufacturing plants, hospitals, and commercial establishments. Many of these complex systems, including computer systems, include control sub-systems implemented as large control programs, such as operating systems that control computer systems. The control programs may be specified by hundreds of thousands, millions, or more lines of high-level-language computer code that encodes complex and intricate control logic.
During operation, complex processor-controlled systems may inhabit many different operational states. These operational states may be characterized by various types of metrics, the values of which, at any particular point in time, may be considered a state vector that describes the state of the complex system much like state vectors describe the state of physical systems in classical and quantum mechanics. In general, during normal operations, although a system may traverse a multitude of different states and may rapidly transition from one state to another, many systems tend to inhabit relatively small sub-spaces of the total state space, which can be regarded as normal states. For example, when the rate of data traffic between a storage sub-system and a main system bus within a computer system is regarded as one of the state-defining metrics, the instantaneous rate of data exchange may constantly fluctuate, but the time-averaged data rate over seconds or minutes may generally inhabit a relatively small number of data-exchange-rate ranges correlated with a relatively small number of different types of computational loads, in turn correlated with higher-level cycles of system operation. As an illustration of such cycles, a computer system used within an automated factory may inhabit a high-computational-load state at the beginning of each production shift, a modest, steady-state computational load during each production shift, and a very low computational load between production shifts. The clusters of state-vector values that compose higher-level logical system states may depend on a plethora of system design characteristics, operational characteristics, and external environment characteristics, and may be difficult to predict in advance of system operation.
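The relationship between a fluctuating instantaneous metric and its time-averaged value can be illustrated with a minimal sketch. The class name `TimeAveragedMetric`, the function `load_range`, the window size, the range boundaries, and the sample values are all hypothetical and chosen only for illustration; they are not taken from any particular system described here.

```python
from collections import deque
from statistics import fmean


class TimeAveragedMetric:
    """Sliding-window mean of one state-defining metric (hypothetical helper)."""

    def __init__(self, window: int):
        if window <= 0:
            raise ValueError("window must be positive")
        # Only the most recent `window` samples are retained.
        self._samples = deque(maxlen=window)

    def add(self, value: float) -> float:
        """Record one instantaneous sample and return the current windowed mean."""
        self._samples.append(value)
        return fmean(self._samples)


def load_range(avg_rate: float, boundaries=(100.0, 500.0)) -> str:
    """Map a time-averaged rate onto one of a few coarse ranges.

    The boundaries are assumed values, not measured ones.
    """
    low, high = boundaries
    if avg_rate < low:
        return "low"
    if avg_rate < high:
        return "moderate"
    return "high"


if __name__ == "__main__":
    # Illustrative instantaneous data rates: a busy period, then an idle one.
    samples = [480, 620, 510, 590, 550, 20, 5, 30, 10, 15]
    metric = TimeAveragedMetric(window=5)
    for s in samples:
        avg = metric.add(s)
        print(f"sample={s:>4}  windowed mean={avg:7.1f}  range={load_range(avg)}")
```

In this sketch the individual samples vary widely, while the windowed mean settles into a small number of ranges. A state vector, in the sense used above, could be formed by collecting one such averaged value per metric.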
It is also often the case that systems may, for a variety of both predictable and unpredictable reasons, transition into abnormal, infrequently observed higher-level states, such as when system components fail or when an external power supply fails. Because of the immense size of the state space available to a complex system, it may be difficult for system operators and administrators, as well as for automated system-monitoring functionality, to recognize transitions from higher-level logical system states normally inhabited by a complex system to abnormal states. Quite often, entry of a complex system into an abnormal state may precipitate a series of undesirable state transitions leading to ever more destructive abnormal states and to complete system failure. The earlier a transition from a normal higher-level logical state to an abnormal state is recognized, the greater the chance that timely interventions can be carried out to return the system to a normal state in order to prevent total failure and ameliorate damage. Designers and developers of complex systems, manufacturers of complex systems, those who maintain complex systems and, ultimately, users of complex systems all continue to seek methods and monitoring components that facilitate monitoring of the operational characteristics of devices and systems and detection of transitions of devices and systems into abnormal states.