1. Field of the Invention
The present invention relates to managing exceptions in multiprocessor operations.
2. Description of the Related Art
In certain computing environments, multiple host systems may communicate with a control unit, such as an IBM Enterprise Storage Server (ESS)®, to request data in a storage device managed by the control unit. The control unit provides access to storage devices, such as interconnected hard disk drives, through one or more logical paths (IBM and ESS are registered trademarks of IBM). The interconnected drives may be configured as a Direct Access Storage Device (DASD), Redundant Array of Independent Disks (RAID), Just a Bunch of Disks (JBOD), etc. The control unit may be a multiprocessor type system. For example, the control unit may include duplicate and redundant processing complexes, also known as clusters, to allow for failover to a surviving cluster in case one fails.
There are various types of multiprocessor systems. In one type, processors may each have their own memory and cache. The processors may run in parallel and share disks. In this type of multiprocessor system, each processor may run a copy of the operating system and the processors may be loosely coupled through a Local Area Network (LAN), for example. Communication between processors may be accomplished through message-passing.
In another type of multiprocessor system, the processors may be more tightly coupled, such as connected through a switch or bridge. Communication between the processors may be accomplished through a shared memory, for example.
In yet another type of multiprocessor system, only one copy of the operating system may run across all of the processors. These types of multiprocessor systems tend to be tightly coupled inside the same chassis with a high-speed bus or a switch. Moreover, the processors may share the same global memory, disks, and Input/Output (I/O) devices.
In the execution of instructions, various conditions, errors or external signals may arise, which are often referred to as “exceptions.” In many processor architectures, such as the PowerPC® processor marketed by IBM Corporation, the processor typically changes to a supervisor state to handle the exception. In addition, information about the state of the processor prior to the occurrence of the exception may be saved to certain processor registers, and the processor may begin instruction execution at a memory address which is typically predetermined for each type of exception.
The processor may store information about the state of the processor in a “scratch pad” portion of the memory which may be reserved for use by the exception handler. The address of the scratch pad memory may be stored in a “scratch register” to indicate the location of the scratch pad memory to the exception handler.
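By way of illustration only, the scratch pad arrangement described above may be sketched in C as follows. The structure layout, the register count and the names cpu_state, scratch_pad, scratch_register, save_state_on_exception and restore_state_after_exception are hypothetical and are not taken from any particular processor architecture. An ordinary pointer variable stands in for the hardware scratch register.

```c
#include <stdint.h>
#include <string.h>

#define NUM_GPRS 32 /* assumed register count, for illustration only */

/* Hypothetical snapshot of the processor state prior to an exception. */
struct cpu_state {
    uint32_t gpr[NUM_GPRS];    /* general purpose registers */
    uint32_t next_instruction; /* address at which to resume execution */
    uint32_t status;           /* machine status at the time of the exception */
};

/* Memory reserved for use by the exception handler (the "scratch pad"). */
static struct cpu_state scratch_pad;

/* Stand-in for the "scratch register": it holds the scratch pad address. */
static struct cpu_state *scratch_register = &scratch_pad;

/* Copy the interrupted state into the scratch pad located through the
 * scratch register. */
static void save_state_on_exception(const struct cpu_state *current)
{
    memcpy(scratch_register, current, sizeof *current);
}

/* Copy the saved state back once the exception has been handled. */
static void restore_state_after_exception(struct cpu_state *current)
{
    memcpy(current, scratch_register, sizeof *current);
}
```

In this sketch, an exception handler would locate the saved state solely through scratch_register, which mirrors the role of the scratch register described above.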
The predetermined memory address at which an exception handler is stored is often referred to as an “exception vector.” Thus, for example, a “data storage interrupt (DSI)” exception may occur when a data memory access cannot be performed for some reason. A DSI exception may be assigned an exception vector of 0x00300, for example. Accordingly, instruction code to handle a DSI exception is stored at physical memory address 0x00300. Upon occurrence of a DSI exception, the processor begins executing the instruction code at physical memory address 0x00300 to handle the DSI exception.
In many processors, the particular exception vector assigned to a particular exception may be set by the processor architecture. For example, the exception vector may be defined by processor hardware, firmware or some combination of hardware and firmware. Thus, the architecture-defined exception vector is frequently not readily definable or modifiable by the user.
In some processors, the architecture may permit a limited modification of an architecture-defined exception vector. For example, an architecture-defined exception vector may be derived using an architecture-defined vector offset which defines an offset address. This offset address is added to an architecture-defined physical base address which is selected by a processor register that may be set by the user. Thus, for example, a DSI exception may have a hardware-assigned vector offset address of 0x00300. If a particular bit is clear in the appropriate processor register, the physical base address may have a hardware-assigned value of 0x00000 such that the processor begins executing the instruction code for the DSI exception at the summed physical address of 0x00300. However, if the particular bit is set in the appropriate processor register, the physical base address may have a hardware-assigned value of 0xFFF00000, for example, such that the processor may begin executing the instruction code for the DSI exception at the summed physical address of 0xFFF00300, for example.
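The base-plus-offset derivation described above may be sketched in C as follows. The base addresses and the DSI offset are the example values given above. The function name exception_vector_address and the parameter base_select_bit_set are hypothetical, and a boolean argument stands in for the bit of the processor register.

```c
#include <stdbool.h>
#include <stdint.h>

/* Example values taken from the description above. */
#define VECTOR_BASE_BIT_CLEAR 0x00000000u /* base when the bit is clear */
#define VECTOR_BASE_BIT_SET   0xFFF00000u /* base when the bit is set */
#define DSI_VECTOR_OFFSET     0x00300u    /* hardware-assigned DSI offset */

/* Hypothetical helper: returns the physical address at which the processor
 * would begin executing handler code for the given vector offset. */
static uint32_t exception_vector_address(bool base_select_bit_set,
                                         uint32_t vector_offset)
{
    uint32_t base = base_select_bit_set ? VECTOR_BASE_BIT_SET
                                        : VECTOR_BASE_BIT_CLEAR;
    return base + vector_offset;
}
```

With the bit clear, exception_vector_address(false, DSI_VECTOR_OFFSET) yields 0x00300. With the bit set, exception_vector_address(true, DSI_VECTOR_OFFSET) yields 0xFFF00300. In both cases the user selects only between the two hardware-assigned bases; the offset itself remains fixed.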
Other examples of exceptions include a “machine check” exception in which instruction processing may be suspended, an “instruction storage interrupt” (ISI) exception which occurs upon various failures to fetch the next instruction, an “external interrupt” exception which is signaled to the processor upon assertion of an external interrupt signal, an “alignment” exception which occurs upon various failures to perform a memory access, a “program” exception in which an attempt may be made to execute an improper instruction, and a “system call” exception which occurs upon execution of a system call instruction. Other types of exceptions may arise, depending upon the particular processor and the application of that processor.
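For illustration, the exceptions listed above may be collected into a table of vector offsets, sketched in C below. Only the DSI offset of 0x00300 is taken from the description above. The remaining offsets are assumed here to follow the classic PowerPC layout and may differ on other processors. The names exception_vector, vector_table and lookup_vector_offset are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* One table entry: an exception name and its vector offset. */
struct exception_vector {
    const char *name;
    uint32_t offset;
};

/* Offsets other than the DSI offset are assumptions based on the classic
 * PowerPC vector layout; they are not stated in the description above. */
static const struct exception_vector vector_table[] = {
    { "machine check",                 0x00200u },
    { "data storage interrupt",        0x00300u },
    { "instruction storage interrupt", 0x00400u },
    { "external interrupt",            0x00500u },
    { "alignment",                     0x00600u },
    { "program",                       0x00700u },
    { "system call",                   0x00C00u },
};

/* Look up the vector offset for a named exception.
 * Returns 0 on success and -1 if the name is not in the table. */
static int lookup_vector_offset(const char *name, uint32_t *offset_out)
{
    size_t count = sizeof vector_table / sizeof vector_table[0];
    for (size_t i = 0; i < count; i++) {
        if (strcmp(vector_table[i].name, name) == 0) {
            *offset_out = vector_table[i].offset;
            return 0;
        }
    }
    return -1;
}
```

Because such a table is fixed by the architecture, every processor of the same type would be directed to the same offset for a given exception.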
Each exception may have an architecture-defined exception vector to the exception handler code for that exception. Thus, in a multiprocessor system, one processor, upon encountering an exception, may begin executing the exception handling code at the architecture-defined exception vector for that exception. In executing the exception handling code at the exception vector, the processor may access and configure resources shared by the other processor or processors of the system. These shared resources may include processor busses, memory controllers, bridges, interrupt controllers, memory, etc.
Another processor of the system, upon encountering an exception, may be directed by the same exception vector. As a result, a second processor may begin executing the same exception handling code as the first processor and may begin accessing and configuring the same shared resources as the first processor. Such a condition may cause a conflict that could disrupt the operations of the system, particularly if the exception handling operations of the two processors overlap.
In some multiprocessor systems, the various processors of the system may be executing the same operating system which is designed for multiprocessor operation. Hence, the common operating system may be designed to avoid conflicts notwithstanding more than one processor executing the same exception handling code at a particular exception vector.
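One way in which common exception handling code might avoid such conflicts is to serialize access to a shared resource with a lock. The following C sketch is a minimal illustration using a C11 atomic flag as a spin lock. It is an assumption for purposes of explanation and does not describe any particular operating system. The names shared_resource_lock, interrupt_controller_config, handled_count and handle_exception are hypothetical, and a plain variable stands in for a shared hardware resource.

```c
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical lock guarding a resource shared by all processors. */
static atomic_flag shared_resource_lock = ATOMIC_FLAG_INIT;

/* Stand-ins for a shared resource and a count of handled exceptions. */
static uint32_t interrupt_controller_config;
static uint32_t handled_count;

/* Spin until the lock is acquired. */
static void lock_shared_resource(void)
{
    while (atomic_flag_test_and_set_explicit(&shared_resource_lock,
                                             memory_order_acquire)) {
        /* busy-wait while another processor holds the lock */
    }
}

/* Release the lock so that another processor may proceed. */
static void unlock_shared_resource(void)
{
    atomic_flag_clear_explicit(&shared_resource_lock, memory_order_release);
}

/* Common handler code that any processor may enter at the same vector.
 * Only one processor at a time configures the shared resource. */
static void handle_exception(uint32_t processor_id)
{
    lock_shared_resource();
    interrupt_controller_config = processor_id;
    handled_count++;
    unlock_shared_resource();
}
```

In this sketch, a second processor entering handle_exception while the first holds the lock would wait until the first has finished configuring the shared resource, so that the two sets of exception handling operations do not overlap.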
In other multiprocessor systems, each processor of the system may be executing a different operating system. One approach in such a system is to rewrite portions of the various operating systems to accommodate situations where more than one processor is executing the same exception handling code at a particular exception vector. Another approach is to provide hardware to coordinate the operations of the various processors to reduce conflicts.