Given the continually increasing reliance on computers in contemporary society, computer technology has had to advance on many fronts to keep up with both increased performance demands and the increasingly significant positions of trust being placed with computers. In particular, computers are increasingly used in high performance and mission critical applications where considerable processing must be performed on a constant basis, and where any periods of downtime are simply unacceptable.
Increases in performance often require the use of increasingly faster and more complex hardware components. Furthermore, in many applications, multiple hardware components, such as processors and peripheral components such as storage devices, network connections, etc., are operated in parallel to increase overall system performance.
Along with the use of these more complex components, the software that is used to operate these components often must be more sophisticated and complex to effectively manage the use of these components. For example, multithreaded operating systems and kernels have been developed, which permit computer programs to concurrently execute in multiple “threads” so that multiple tasks can essentially be performed at the same time. In an e-commerce computer application, for instance, different threads might be assigned to different customers so that each customer's specific e-commerce transaction is handled in a separate thread.
One logical extension of a multithreaded operating system is the concept of logical partitioning, where a single physical computer is permitted to operate essentially like multiple and independent “virtual” computers (referred to as logical partitions), with the various resources in the physical computer (e.g., processors, memory, input/output devices) allocated among the various logical partitions. Each logical partition executes a separate operating system, and from the perspective of users and of the software applications executing on the logical partition, operates as a fully independent computer.
With logical partitioning, a shared program, often referred to as a “hypervisor” or partition manager, manages the logical partitions and facilitates the allocation of resources to different logical partitions. For example, a partition manager may allocate resources such as processors, workstation adapters, storage devices, memory space, network adapters, etc. to various partitions to support the relatively independent operation of each logical partition in much the same manner as a separate physical computer.
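The allocation role described above can be illustrated with a minimal sketch. It is not taken from any particular hypervisor; the class name `PartitionManager`, its methods, and the resource names such as `cpu0` and `net0` are all hypothetical and chosen only for illustration. The sketch assumes the simplest possible policy, in which each resource is owned by at most one logical partition at a time.

```python
from typing import Dict, Optional, Set


class PartitionManager:
    """Toy partition manager that tracks which partition owns each resource.

    Hypothetical illustration only; not an actual hypervisor interface.
    """

    def __init__(self, resources: Set[str]) -> None:
        self._unassigned: Set[str] = set(resources)
        self._owned: Dict[str, Set[str]] = {}

    def create_partition(self, name: str) -> None:
        self._owned.setdefault(name, set())

    def allocate(self, partition: str, resource: str) -> None:
        if partition not in self._owned:
            raise KeyError(f"unknown partition {partition}")
        # Assumed policy: a resource belongs to at most one partition at a time.
        if resource not in self._unassigned:
            raise ValueError(f"{resource} is not available")
        self._unassigned.remove(resource)
        self._owned[partition].add(resource)

    def release(self, partition: str, resource: str) -> None:
        # Return the resource to the unassigned pool so it can be reallocated.
        self._owned[partition].remove(resource)
        self._unassigned.add(resource)

    def owner_of(self, resource: str) -> Optional[str]:
        for partition, owned in self._owned.items():
            if resource in owned:
                return partition
        return None


manager = PartitionManager({"cpu0", "cpu1", "net0", "disk0"})
manager.create_partition("lpar1")
manager.create_partition("lpar2")
manager.allocate("lpar1", "cpu0")
manager.allocate("lpar1", "net0")
manager.allocate("lpar2", "cpu1")
```

In this sketch, moving a resource from one partition to another is a `release` followed by an `allocate`, which is one simple way to model the kind of reallocation a partition manager performs.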
In both logically-partitioned and non-logically-partitioned computer systems, the management of the peripheral hardware components utilized by such systems also continues to increase in complexity. Peripheral components, e.g., storage devices, network connections, workstations, and the adapters, controllers and other interconnection hardware devices (which are referred to hereinafter as input/output (IO) resources), are typically coupled to a computer via one or more intermediate interconnection hardware components that form a “fabric” through which communications between the central processing units and the IO resources are passed.
In lower performance computer designs, e.g., single user computers such as desktop computers, laptop computers, and the like, the IO fabric used in such designs may require only a relatively simple design, e.g., using an IO chipset that supports a few interconnection technologies such as Integrated Drive Electronics (IDE), Peripheral Component Interconnect (PCI) or Universal Serial Bus (USB). In higher performance computer designs, on the other hand, the IO requirements may be such that a complex configuration of interconnection hardware devices is required to handle all of the necessary communications needs for such designs. In some instances, the communications needs may be great enough to require the use of one or more additional enclosures that are separate from, and coupled to, the enclosure within which the central processing units of a computer are housed.
Often, in more complex designs, peripheral components such as IO adapters (IOA's) are mounted and coupled to an IO fabric using “slots” that are arrayed in either or both of a main enclosure and an auxiliary enclosure of a computer. Other components may be mounted or coupled to an IO fabric in other manners, e.g., via cables and other types of connectors; however, these other types of connections are often referred to as “slots” for the sake of convenience. Irrespective of the type of connection used, an IO slot therefore represents a connection point for an IO resource to communicate with a computer via an IO fabric. In some instances, the term “IO slot” is also used to refer to the actual peripheral hardware component mounted to a particular connection point in an IO fabric, and in this regard, an IO slot, or the IO resource coupled thereto, will also be referred to hereinafter as an endpoint IO resource.
Managing endpoint IO resources coupled to a computer via an IO fabric is often problematic due to the typical capability of an IO fabric to support the concurrent performance of multiple tasks in connection with multiple endpoint IO resources, as well as the relative independence between the various levels of software in the computer that access the IO resources. For example, many IO fabrics are required to support the concept of interrupts, which are asynchronous, and often sideband, signals generated by IO resources to alert the central processing complex of a computer of particular events.
In many conventional IO fabrics, interrupts are level sensitive in nature, whereby interrupt signals are generated by asserting a signal on a dedicated line or pin. With complex IO fabrics, however, the number of dedicated lines or pins that would be required to provide interrupt functionality for all of the IO resources connected to the fabric may be impractical. As a result, many more complex IO fabrics implement message-signaled interrupts (MSI's), which are typically implemented by writing data to specific memory addresses in the system address space.
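The distinction between a pin-based interrupt and a message-signaled interrupt can be illustrated with a toy model in which an adapter signals an interrupt simply by issuing a write to an address that the platform treats as an interrupt target. The sketch below is an assumption-laden illustration, not a model of any real chipset: the names `SystemBus` and `Adapter` are hypothetical, and the address window `MSI_WINDOW_BASE` and its size are arbitrary values picked for the example.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple

# Hypothetical address window reserved for interrupt messages (assumption).
MSI_WINDOW_BASE = 0xFEE0_0000
MSI_WINDOW_SIZE = 0x0010_0000


@dataclass
class SystemBus:
    """Toy system address space with ordinary memory and an MSI window."""
    ram: Dict[int, int] = field(default_factory=dict)
    delivered: List[Tuple[int, int]] = field(default_factory=list)

    def write(self, address: int, data: int) -> None:
        # A write that lands in the MSI window is treated as an interrupt
        # message rather than as a store to memory.
        if MSI_WINDOW_BASE <= address < MSI_WINDOW_BASE + MSI_WINDOW_SIZE:
            self.delivered.append((address, data))
        else:
            self.ram[address] = data


@dataclass
class Adapter:
    """Toy IO adapter configured with an MSI address/data pair."""
    msi_address: int
    msi_data: int

    def signal_interrupt(self, bus: SystemBus) -> None:
        # No dedicated interrupt line or pin: the adapter issues a write.
        bus.write(self.msi_address, self.msi_data)


bus = SystemBus()
adapter = Adapter(msi_address=MSI_WINDOW_BASE + 0x40, msi_data=0x21)
bus.write(0x1000, 0xAB)        # ordinary store to memory
adapter.signal_interrupt(bus)  # interrupt delivered as a memory write
```

Because the interrupt is carried as an address/data pair over the same path as ordinary writes, adding more interrupt sources in this model requires only more address/data values, not more physical lines.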
As an example, the PCI-X and PCI-Express standards support MSI capabilities, with the PCI-Express standard requiring support for MSI for all non-legacy PCI-Express compatible IOA's. To fully support MSI, not only do the IOA's need to support MSI, but MSI must also be supported by the other hardware components in the IO fabric, e.g., PCI host bridges (PHB's), root complexes, etc., as well as by the host firmware, e.g., the BIOS, operating system utilities, hypervisor firmware, etc. Furthermore, these components must be sufficiently flexible to allow varying types of IOA's, and varying configurations and MSI signaling capabilities of both IO fabric hardware and IOA's, to be supported.
Also, when a PHB or root complex in a logically partitioned system supports the partitioning of IOA's or PCI functions within an IOA, administration of MSI interrupt facilities in the PHB or root complex across the partitions and PCI functions sharing them becomes even more complex. Host firmware typically must implement MSI management functions and policies that adapt to varying adapter capabilities and configurations on a single PHB, using the PHB implementation. Furthermore, such management must accommodate the needs of multiple clients, be they operating systems, partitions, device drivers, etc., to avoid inter-client resource conflicts and ensure fair allocation among multiple clients.
One basic function required to provide MSI support is that of creating bindings between MSI resources and an interrupt facility of an underlying hardware platform. A binding represents a mapping between an MSI resource and an interrupt facility to ensure that an interrupt signaled by an MSI resource will be routed to an appropriate client via the interrupt facility. In many designs, for example, an interrupt facility will allocate specific interrupt “ports” to various clients, such that an MSI binding ensures that an interrupt signaled by an MSI resource allocated to that client will be directed to the port in the interrupt facility associated with that client.
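The notion of a binding described above can be sketched as a pair of lookup tables: one recording which client owns each interrupt port, and one recording which port each MSI resource is bound to. The following is a minimal sketch under stated assumptions. The class `InterruptFacility`, its methods, the integer MSI numbers, and the client names are all hypothetical; an actual interrupt facility would be implemented in hardware and firmware with platform-specific details not covered here.

```python
from typing import Dict, List, Optional


class InterruptFacility:
    """Toy interrupt facility: assigns ports to clients, binds MSIs to ports.

    Hypothetical illustration only; not an actual platform interface.
    """

    def __init__(self, num_ports: int) -> None:
        self._free_ports: List[int] = list(range(num_ports))
        self._port_owner: Dict[int, str] = {}  # port -> client
        self._bindings: Dict[int, int] = {}    # MSI number -> port

    def allocate_port(self, client: str) -> int:
        if not self._free_ports:
            raise RuntimeError("no free interrupt ports")
        port = self._free_ports.pop(0)
        self._port_owner[port] = client
        return port

    def bind(self, msi: int, port: int, client: str) -> None:
        # Refuse a binding that would route an MSI to another client's port.
        if self._port_owner.get(port) != client:
            raise PermissionError(f"port {port} is not owned by {client}")
        if msi in self._bindings:
            raise ValueError(f"MSI {msi} is already bound")
        self._bindings[msi] = port

    def route(self, msi: int) -> Optional[str]:
        """Return the client that receives an interrupt from this MSI, if bound."""
        port = self._bindings.get(msi)
        return None if port is None else self._port_owner[port]


facility = InterruptFacility(num_ports=4)
port_a = facility.allocate_port("partition_a")
port_b = facility.allocate_port("partition_b")
facility.bind(msi=0, port=port_a, client="partition_a")
facility.bind(msi=1, port=port_b, client="partition_b")
```

The ownership check in `bind` is one simple way to model the requirement that an interrupt signaled by an MSI resource allocated to a client reaches only the port associated with that client.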
A significant issue with respect to logically partitioned computers as well as more complex non-partitioned computers is that of high availability. Such computers are often required to support dynamic reconfiguration with minimal impact on system availability. In logically partitioned computers, for example, logical partitions may be terminated and reactivated dynamically, without impacting the availability of the services provided by other logical partitions resident on a computer. In addition, it may be necessary to reallocate system resources between logical partitions, e.g., to increase the capabilities of heavily loaded partitions with otherwise unused resources allocated to other partitions. Still further, many designs support the ability to perform concurrent maintenance on IOA's and other resources, including adding, replacing (e.g., upgrading), or removing IOA's dynamically, and desirably with little or no impact on system availability. Error recovery techniques may also dynamically reallocate or otherwise alter the availability of system resources. In each of these instances, MSI facilities may need to be adjusted to accommodate changes in the underlying hardware platform and/or in the allocation of system resources to different partitions in the computer.
An additional concern with respect to MSI support arises due to the wide variety of underlying hardware platforms that may utilize MSI. In many instances, operating systems and device drivers are desirably portable to different hardware platforms. If MSI management responsibility is allocated to an operating system or device driver, portability suffers due to the need for the operating system/device driver to account for variabilities in hardware platforms. Likewise, MSI management via a management facility that is separate from an operating system or device driver, e.g., as might be implemented in firmware, is often unduly complicated due to a need to account for hardware platform variability.