InfiniBand specifications describe the concepts of work queue pairs (QPs) and completion queues (CQs). To enhance scalability to a large number of logical partition and virtual machine resources, event queues (EQs) are added, which the Host Channel Adapter (HCA) uses to record a summation of events associated with QP and CQ resources. FIG. 1 illustrates a prior art InfiniBand system 10 having multiple processor nodes 100 interconnected through a fabric network 102 to, for instance, a Storage Subsystem 104, a RAID Subsystem 106, Consoles 108, and multiple I/O Chassis 110 through which are connected SCSI devices, Ethernet connections, a Fibre Channel (FC) Hub and FC devices, and Graphics and Video devices. The Fabric 102 includes multiple Switches 112 and Routers 114 such that messages and data may be exchanged over the InfiniBand system 10. Each Processor Node 100 includes one or more Central Processor Units (CPUs) 116, a memory 118, and an HCA 120. The InfiniBand System 10 and the HCA 120 are well known and fully explained in the InfiniBand Architecture Specification by the InfiniBand Trade Association, Release 1.0.a (2001).
Logical Partition (LPAR) concepts are discussed in Rogers et al., ABCs OF z/OS SYSTEM PROGRAMMING VOLUME 10, IBM Redbook, SG24-6990-00 (June 2004).
U.S. Pat. No. 6,944,847 B2 issued Sep. 13, 2005 to Desai et al. for VIRTUALIZATION OF INPUT/OUTPUT DEVICES IN A LOGICALLY PARTITIONED DATA PROCESSING SYSTEM discloses a hypervisor layer which synchronizes use of virtualized input/output devices that may regularly be used by multiple partitions of a logically partitioned data processing system by making them callable by any system partition to the hypervisor layer.
U.S. Pat. No. 6,748,460 B2 issued Jun. 8, 2004 to Brice, Jr. et al. for INITIATIVE PASSING IN AN I/O OPERATION WITHOUT THE OVERHEAD OF AN INTERRUPT discloses passing initiative to a processor for handling an I/O request for an I/O operation for sending data between a main storage and one or more devices.
U.S. Pat. No. 6,754,738 B2 issued Jun. 22, 2004 to Brice, Jr. et al. for LOW OVERHEAD I/O INTERRUPT discloses sending data to or receiving data from one or more I/O devices in an I/O operation with a main storage controlled by a processor in a data processing system.
U.S. Pat. No. 6,889,021 B2 issued Apr. 12, 2005 to Easton et al. for INTELLIGENT INTERRUPT WITH HYPERVISOR COLLABORATION discloses controlling the transfer of data in a data processing system having a processor handling an I/O request in an I/O operation, main storage controlled by the processor for storing data, and one or more I/O devices for sending data to or receiving data from the main storage.
U.S. Patent Application Publication US 2001/0049741 A1 published Dec. 6, 2001 by Skene et al. for METHOD AND SYSTEM FOR BALANCING LOAD DISTRIBUTION ON A WIDE AREA NETWORK discloses a system and method for balancing the load on virtual servers managed by server array controllers at separate data centers that are geographically distributed on a wide area network such as the internet.
U.S. Patent Application Publication US 2002/0173863 A1 published Nov. 21, 2002 by Imada et al. for VIRTUAL MACHINE SYSTEM AND VIRTUAL MACHINE CONTROL METHOD discloses a user interface function for a virtual machine system based on a server or a PC by applying software without using a service processor or the like.
U.S. Patent Application Publication US 2003/0126265 A1 published Jul. 3, 2003 by Aziz et al. for REQUEST QUEUE MANAGEMENT discloses a method and apparatus for managing a dynamically sized, highly scalable and available server farm.
U.S. Patent Application Publication US 2003/0133449 A1 published Jul. 17, 2003 by Fitzpatrick et al. for FAST PATH ROUTING IN A LARGE-SCALE VIRTUAL SERVER COMPUTING ENVIRONMENT discloses methods, systems, and computer program products for improving data transfer in complex computing environments. Internal routing enhancements are defined which enable traffic of virtual servers to be processed more efficiently, thereby improving overall data transfer rates.
U.S. Patent Application Publication US 2003/0154236 A1 published Aug. 14, 2003 by Dar et al. for DATABASE SWITCH ENABLING A DATABASE AREA NETWORK discloses a method and system for improving utilization of the typical DBMS client-server configuration and includes a Database Switch situated between the application and database servers in a network capable of dynamically and transparently connecting applications to databases using standard database servers and standard protocols.
U.S. Patent Application Publication US 2004/0143664 A1 published Jul. 22, 2004 by Usa et al. for METHOD FOR ALLOCATING COMPUTER RESOURCE discloses dynamically reallocating a computer resource to a plurality of virtual machine LPARs with optimum quantities of resource allocation being determined so that the virtual machine LPARs will hardly have resource shortages in the near future.
U.S. Patent Application Publication US 2004/0153614 A1 published Aug. 5, 2004 by Bitner et al. for TAPE STORAGE EMULATION FOR OPEN SYSTEMS ENVIRONMENTS discloses a virtual tape server residing on a network connectible on its front end to a plurality of heterogeneous backup hosts with different operating systems and/or backup applications, and on its back end to one or more disk storage devices in an open systems environment.
U.S. Patent Application Publication US 2004/0250254 A1 published Dec. 9, 2004 by Frank et al. for VIRTUAL PROCESSOR METHODS AND APPARATUS WITH UNIFIED EVENT NOTIFICATION AND CONSUMER-PRODUCER MEMORY OPERATIONS discloses a virtual processor that includes one or more virtual processing units which execute on one or more processors, with each virtual processing unit executing one or more processes or threads.
U.S. Patent Application Publication US 2005/0044301 A1 published Feb. 24, 2005 by Vasilevsky et al. for METHOD AND APPARATUS FOR PROVIDING VIRTUAL COMPUTING SERVICES discloses a level of abstraction created between a set of physical processors and a set of virtual multiprocessors to form a virtualized data center.
U.S. Patent Application Publication US 2004/0230712 published Nov. 18, 2004 by Belmar et al. for MANAGING INPUT/OUTPUT INTERRUPTIONS IN NON-DEDICATED INTERRUPTION HARDWARE ENVIRONMENTS discloses input/output interruptions managed in computing environments that do not use dedicated per-guest interruption hardware to present interruptions. Dispatchable guest programs in the environment receive I/O interruptions directly without hypervisor intervention.
Returning to FIG. 1, in the HCA 120 provided by IBM for use with the InfiniBand system, events that are recorded in EQs are classified as either completion events or non-completion events. A completion event occurs when a program-initiated work request, as identified by a work-queue entry (WQE) in a QP, is completed by the HCA. A completion event may be recognized and a completion-queue entry (CQE) recorded in the CQ associated with the QP. If the EQ associated with the CQ does not already contain a pending event-queue entry (EQE) for a completion event, an EQE for a completion event is made pending in the EQ. A non-completion event occurs when an error associated with an HCA resource occurs or when the status or configuration of an HCA resource changes. A non-completion event may be recognized, and if the EQ associated with the resource does not already contain a pending EQE for the type of non-completion event that was recognized, an EQE for that type of non-completion event is made pending in the EQ.
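The pending-EQE rule described above can be illustrated with a minimal Python sketch. It is not the HCA implementation; the class names, the `post` and `pop` methods, and the event-type strings are hypothetical, and the sketch assumes that a pending EQE is tracked per event type within one EQ, which is one reading of the description above.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class EventQueue:
    """Hypothetical EQ model: at most one pending EQE per event type."""
    entries: deque = field(default_factory=deque)    # pending EQEs, oldest first
    pending_types: set = field(default_factory=set)  # event types already pending

    def post(self, event_type: str) -> bool:
        """Make an EQE pending unless one of this type is already pending."""
        if event_type in self.pending_types:
            return False  # suppressed: an EQE of this type is already pending
        self.pending_types.add(event_type)
        self.entries.append(event_type)
        return True

    def pop(self) -> str:
        """Consume the oldest pending EQE so that its type may be posted again."""
        event_type = self.entries.popleft()
        self.pending_types.discard(event_type)
        return event_type


@dataclass
class CompletionQueue:
    """Hypothetical CQ model associated with exactly one EQ."""
    eq: EventQueue
    cqes: list = field(default_factory=list)


def complete_work_request(cq: CompletionQueue, qp_id: int, wqe_id: int) -> bool:
    """Record a CQE for the finished WQE, then try to make a completion EQE pending."""
    cq.cqes.append((qp_id, wqe_id))    # a CQE is recorded for every completion
    return cq.eq.post("completion")    # an EQE is added only if none is pending
```

In this sketch, two back-to-back completions record two CQEs but only one EQE, so the program that drains the EQ is expected to process every CQE in the associated CQ rather than one CQE per EQE.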
For an HCA, an operating system of the Processor Nodes 100 allocates one or more QPs and CQs, allocates a single EQ, and associates the QPs and CQs with that EQ. This forms a hierarchy in which the QPs and CQs are at the bottom and the single EQ (and its associated I/O interrupts) is at the top. Thus, completion and non-completion events for a single HCA may be mapped into a single EQ for each operating system.
This hierarchical design allows an operating system to efficiently demultiplex HCA events back to the individual QPs and CQs as required. Other IBM patents describe how to virtualize a given physical HCA to support multiple, separate logical partitions (within a single central-processing complex (CPC)) concurrently by having each of the partitions own and manage its own separate QPs, CQs, and EQ. In this case, the HCA hardware performs a form of multiplexing by vectoring HCA events into the EQ of the partition which owns the resource (i.e., QP or CQ) for which the event is recognized.
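The vectoring and demultiplexing just described might be sketched as follows. This is an illustrative Python model only: `VirtualizedHCA`, `allocate`, `recognize_event`, and `demultiplex` are hypothetical names, the sketch assumes that each EQE carries the identifier of the resource for which the event was recognized, and it omits the pending-EQE suppression described earlier for brevity.

```python
from collections import defaultdict


class VirtualizedHCA:
    """Hypothetical model of one physical HCA shared by several partitions."""

    def __init__(self):
        self.owner = {}               # resource id -> owning partition
        self.eqs = defaultdict(list)  # partition -> its single EQ (list of EQEs)

    def allocate(self, partition: str, resource_id: str) -> None:
        """Record that a partition owns a QP or CQ."""
        self.owner[resource_id] = partition

    def recognize_event(self, resource_id: str, event_type: str) -> str:
        """Vector the event into the EQ of the partition that owns the resource."""
        partition = self.owner[resource_id]
        self.eqs[partition].append((resource_id, event_type))
        return partition


def demultiplex(eq_entries):
    """Operating-system side: group EQEs back by the QP or CQ they refer to."""
    by_resource = defaultdict(list)
    for resource_id, event_type in eq_entries:
        by_resource[resource_id].append(event_type)
    return dict(by_resource)
```

For example, if partition "LPAR1" owns "CQ1" and "QP7" while "LPAR2" owns "CQ2", an event on "CQ2" lands only in the EQ of "LPAR2", and each partition demultiplexes only its own EQ.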