1. Field of the Invention
The present invention generally relates to data processing and, more particularly, to handling error notifications in a system with dynamic partitioning.
2. Description of the Related Art
Logical partitioning refers to the ability to make a system run as if it were two or more independent systems. Each logical partition represents a division of resources in the system and operates as an independent logical system. Each partition is logical because the division of resources may be physical or virtual. An example of logical partitions is the partitioning of a multiprocessor computer system into multiple independent servers, each with its own processors, main storage, and I/O devices. Any one of several operating systems, such as AIX, LINUX, and others, can run in each partition.
During operation of any system, errors inevitably occur. In a logically partitioned system, some errors (Local) are reported only to the operating system of the assigned or owning partition. Failures of I/O adapters that are assigned to only a single partition's operating system are examples of such errors. Other errors (Global) are reported to the operating systems of all partitions because such errors potentially affect the operation of multiple partitions. Examples of these types of errors are power supply, fan, memory, and processor failures, and the like. Global errors are typically broadcast to the logical partitions on a system by a partition manager, which is lower-level code residing between the partitions and the hardware resources of the system.
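The distinction between Local and Global errors can be illustrated with a minimal sketch. It is not taken from any actual partition manager; the names `Scope`, `ErrorEvent`, `PartitionManager`, `add_partition`, and `report` are hypothetical and chosen only for illustration. The sketch assumes each partition's operating system exposes a simple list as its error notification point.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Dict, List, Optional


class Scope(Enum):
    LOCAL = "local"    # affects only the owning partition
    GLOBAL = "global"  # may affect every partition


@dataclass
class ErrorEvent:
    source: str                  # e.g. "io_adapter", "power_supply"
    scope: Scope
    owner: Optional[str] = None  # owning partition, for LOCAL errors only


@dataclass
class PartitionManager:
    # Hypothetical stand-in for the lower-level code between the
    # partitions and the hardware; maps partition name -> inbox.
    inboxes: Dict[str, List[ErrorEvent]] = field(default_factory=dict)

    def add_partition(self, name: str) -> None:
        self.inboxes[name] = []

    def report(self, event: ErrorEvent) -> List[str]:
        """Deliver an error and return the partitions that received it."""
        if event.scope is Scope.LOCAL:
            # Local errors go only to the owning partition, if it exists.
            targets = [event.owner] if event.owner in self.inboxes else []
        else:
            # Global errors are broadcast to every known partition.
            targets = list(self.inboxes)
        for name in targets:
            self.inboxes[name].append(event)
        return targets
```

In this sketch, an I/O adapter failure owned by one partition reaches only that partition, whereas a power supply failure reaches every partition known to the manager at the time of the report.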
While broadcasting errors works well in a static environment, such an approach is not suitable for a dynamically partitioned environment in which logical partitions are added and/or removed dynamically during operation of the system. In a system incorporating dynamic logical partitioning, a logical partition may report in shortly after an error notification is broadcast, thus missing the notification even though the partition may be affected by the error condition. A possible solution to this problem is to queue up all error notifications and broadcast them only when a logical partition becomes active. However, if the partition is activated only sporadically, this approach may flood the partition's error notification point with stale errors that are no longer relevant because of the amount of time that has elapsed.
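The two shortcomings described above can be made concrete with a small sketch. It is illustrative only: the class names `BroadcastOnly` and `QueueAll`, the methods `activate` and `notify`, and the integer timestamps are assumptions, and the queueing behavior shown (replaying the full history to a partition on activation) is one plausible reading of the queue-all approach rather than a description of any particular system.

```python
class BroadcastOnly:
    """Broadcast each error to the partitions active at that moment."""

    def __init__(self):
        self.inboxes = {}  # partition name -> list of (time, error)

    def activate(self, name):
        # A newly activated partition starts with an empty inbox.
        self.inboxes[name] = []

    def notify(self, error, now):
        for inbox in self.inboxes.values():
            inbox.append((now, error))


class QueueAll:
    """Queue every error and replay the whole queue on activation."""

    def __init__(self):
        self.inboxes = {}
        self.history = []  # every error ever reported, never pruned

    def activate(self, name):
        # The new partition receives all queued errors, however old.
        self.inboxes[name] = list(self.history)

    def notify(self, error, now):
        self.history.append((now, error))
        for inbox in self.inboxes.values():
            inbox.append((now, error))


# Shortcoming 1: a partition activated just after a broadcast misses it.
b = BroadcastOnly()
b.activate("lpar1")
b.notify("fan failure", now=10)
b.activate("lpar2")  # reports in at a later time
missed = b.inboxes["lpar2"] == []

# Shortcoming 2: a sporadically activated partition is flooded.
q = QueueAll()
for t in range(100):
    q.notify(f"error {t}", now=t)
q.activate("lpar3")
flood_size = len(q.inboxes["lpar3"])
```

Here `missed` is true because the second partition never sees the fan failure, and `flood_size` equals the full count of queued errors, including those long since resolved.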
Therefore, there is a need for a system and method for error notification in a dynamically partitioned environment.