An autonomic computing system is a computing system that senses its operating environment, models its behavior within that environment, and takes actions to change the environment or its own behavior. Autonomic computing systems are typically self-configuring, self-healing, self-optimizing, and self-protecting. They are self-configuring in that such systems have characteristics that enable them to adapt to changing conditions by changing their own configurations, and have functionality that permits the addition and removal of components or resources within the system without service disruption.
Autonomic computing systems are self-healing in that they have the capacity to recognize and diagnose deviations from normal conditions and to take actions to normalize the conditions, as well as the capability to proactively circumvent issues that could cause service disruptions. Such computing systems are self-optimizing in that they have the ability to monitor their state and performance and proactively tune themselves to respond to environmental stimuli. Autonomic computing systems are self-protecting in that they incorporate intelligence to recognize and circumvent security threats, and have the facility to protect themselves from physical harm, such as excessive heat or motion.
FIG. 1 shows the architecture 100 of an aspect that is present in some autonomic computing systems within the prior art, a monitor-analyze-plan-execute (MAPE) loop 102. A manager component 104 performs the MAPE loop 102 in relation to a managed element 106 of a computing system. Based on knowledge 108, the manager component 104 performs four actions: monitoring 110, analysis 112, planning 114, and execution 116. It is noted that the MAPE loop 102 is not performed within the prior art for an autonomic computing system as a whole, on a system-wide basis, but rather is performed individually for each subsystem, or managed element, like the managed element 106.
The monitoring 110 includes receiving one or more events from or regarding the managed element 106. The analysis 112 includes correlating and analyzing these events to determine if a specific known situation exists. The planning 114 includes determining, if such a situation does exist or is otherwise detected, how the situation should be processed or handled via determining one or more actions. The execution 116 thus includes the managed element 106 performing these actions that have been determined.
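The four stages above can be sketched in code. This is a minimal illustrative sketch only; the class and method names (MapeManager, ManagedElement, poll_events, and so on), the situation names, and the knowledge-base representation are all assumptions introduced here for clarity, not part of the architecture 100 as disclosed.

```python
class ManagedElement:
    """Stub for a managed element (106) that reports events and applies actions."""

    def __init__(self, events):
        self._events = set(events)  # events currently observable at this element
        self.applied = []           # record of actions performed

    def poll_events(self):
        return self._events

    def apply(self, action):
        self.applied.append(action)


class MapeManager:
    """Manager component (104) performing one MAPE loop (102) over a managed element."""

    def __init__(self, knowledge):
        # Knowledge (108): maps a known situation to the set of events that signal it.
        self.knowledge = knowledge

    def run_once(self, element):
        events = self.monitor(element)    # monitoring (110)
        situation = self.analyze(events)  # analysis (112)
        actions = self.plan(situation)    # planning (114)
        self.execute(element, actions)    # execution (116)

    def monitor(self, element):
        # Receive events from or regarding the managed element.
        return element.poll_events()

    def analyze(self, events):
        # Correlate the events against known situations in the knowledge base.
        for situation, triggers in self.knowledge.items():
            if triggers.issubset(events):
                return situation
        return None

    def plan(self, situation):
        # Determine one or more actions to handle the detected situation.
        if situation == "overload":
            return ["step_up_performance"]
        if situation == "idle":
            return ["step_down_performance"]
        return []

    def execute(self, element, actions):
        # The managed element performs the determined actions.
        for action in actions:
            element.apply(action)


knowledge = {"idle": {"low_utilization"}, "overload": {"queue_full", "high_cpu"}}
mgr = MapeManager(knowledge)
elem = ManagedElement({"low_utilization"})
mgr.run_once(elem)
```

Note that, consistent with the prior-art arrangement of FIG. 1, this manager sees only events from its own managed element; it has no visibility into other subsystems.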
FIG. 2 shows how the MAPE loop 102 is performed on a subsystem level within an autonomic computing system 200, in accordance with the prior art. The computing system 200 includes a number of subsystems 202A, 202B, . . . , 202N, collectively referred to as the subsystems 202. A MAPE loop 102 is performed at or for each subsystem 202, on a subsystem-by-subsystem basis, where each subsystem 202 can be considered a managed element 106 as in FIG. 1. Thus, each subsystem 202 has the MAPE loop 102 performed as to itself, and not in relation to the other subsystems 202 of the computing system 200.
Such individual performance of the MAPE loop 102 on a subsystem-by-subsystem basis is problematic, however. Individual MAPE loops 102 may make wrong decisions as to what actions for the subsystems 202 to take because they are not informed by events occurring at all the other subsystems 202. For example, a MAPE loop for a processor subsystem may determine that the subsystem should step down its performance level to save energy, even though other subsystems 202 have determined that there are processes which will soon require significant processing time by the processor subsystem. Thus, as soon as the processor subsystem steps down its performance level, it may then be requested to immediately perform processing activity, such that the subsystem may immediately have to step back up its performance level to handle these requests. This can be problematic, because it may take time for the processor subsystem to step back up its performance level once the performance level has been stepped down.
Furthermore, individual performance of the MAPE loop 102 on a subsystem-by-subsystem basis can result in undesirable latency effects that cause performance degradation and energy conservation degradation as to the computing system 200 as a whole. For example, all the subsystems 202 may have stepped their performance level down due to minimal activity being performed by the system 200. Requests that will result in significant activity to be performed by the system 200 as a whole may then be received by a front-tier subsystem. Ideally, once the front-tier subsystem receives these requests, all the subsystems 202 immediately step up their performance levels in anticipation of the requests travelling from the front-tier subsystem, to the middle-tier subsystems, and finally to the back-end subsystems.
However, where the MAPE loop 102 is individually performed on a subsystem-by-subsystem basis, this ideal behavior cannot occur. Rather, the front-tier subsystem first steps up its performance level in response to its own MAPE loop 102 being performed, which takes time before the front-tier subsystem can adequately process the requests. When the requests are transferred to each tier subsystem down the line, this process is repeated, with the tier subsystems having to step up their performance levels sequentially until they are able to adequately process the requests. Where each of N subsystems 202 has the same latency n to step its performance level back up after performance of its MAPE loop 102, this means that processing a request after all the subsystems 202 previously have had their performance levels stepped down results in a delay of N times n, which can be significant. That is, the latency n ripples across the N subsystems 202 sequentially. The ideal case, by comparison, is that all the relevant subsystems 202 are stepped up at the same time, so that just a latency n occurs overall.
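The difference between the rippled delay and the ideal delay can be made concrete with a small arithmetic sketch. The function names and the example figures (four tiers at 250 ms each) are assumptions chosen for illustration, not values taken from the disclosure.

```python
def sequential_step_up_delay(num_subsystems, latency):
    # Each tier steps up only after the previous tier's requests reach it,
    # so the per-subsystem latency n accumulates: total delay = N * n.
    return num_subsystems * latency


def simultaneous_step_up_delay(num_subsystems, latency):
    # Ideal case: all relevant subsystems step up at the same time,
    # so only a single latency n is incurred overall.
    return latency


N = 4    # assumed number of tiers (front, two middle, back-end)
n = 250  # assumed step-up latency per subsystem, in milliseconds

rippled = sequential_step_up_delay(N, n)     # 1000 ms
ideal = simultaneous_step_up_delay(N, n)     # 250 ms
```

Even in this modest four-tier example, the rippled delay is four times the ideal delay, and the gap grows linearly with the number of subsystems N.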
The same latency rippling occurs in the reverse situation within the prior art as well. When all the subsystems 202 are operating at stepped-up performance levels, the back-end subsystem may first step its performance level down in response to less activity being performed, as detected by the MAPE loop 102. Responsive to this, each middle-tier subsystem in sequence may then step its performance level down, until finally the front-tier subsystem steps its performance level down. As before, where each of N subsystems 202 has the same latency n to step its performance level back down after performance of its MAPE loop 102, this means that reaching an energy-efficient state for the computing system 200 as a whole results in a delay of N times n, because the latency n ripples across the N subsystems 202 sequentially. The ideal case is again that all the relevant subsystems 202 are stepped down at the same time, so that just a latency n occurs overall.