1. Field of the Invention
This invention relates in general to the field of computer network architecture, and more specifically to an architecture that enables sharing and/or partitioning of input/output (I/O) endpoint devices within a load-store fabric.
2. Description of the Related Art
Modern computer architecture may be viewed as having three distinct subsystems which, when combined, form what most think of when they hear the term computer. These subsystems are: 1) a processing complex; 2) an interface between the processing complex and I/O controllers or devices; and 3) the I/O (i.e., input/output) controllers or devices themselves.
A processing complex may be as simple as a single processing core, such as a Pentium microprocessor, or it might be as complex as two or more processing cores. These two or more processing cores may reside on separate devices or integrated circuits, or they may be part of a single integrated circuit. Within the scope of the present invention, a processing core is hardware, microcode (i.e., firmware), or a combination of hardware and microcode that is capable of executing instructions from a particular instruction set architecture (ISA) such as the x86 ISA. Multiple processing cores within a processing complex may execute instances of the same operating system (e.g., multiple instances of Unix), they may run independent operating systems (e.g., one executing Unix and another executing Windows XP), or they may together execute instructions that are part of a single instance of a symmetric multiprocessing (SMP) operating system. Within a processing complex, multiple processing cores may access a shared memory or they may access independent memory devices.
The interface between the processing complex and I/O is commonly known as the chipset. The chipset interfaces to the processing complex via a bus referred to as the HOST bus. The “side” of the chipset that interfaces to the HOST bus is typically referred to as the “north side” or “north bridge.” The HOST bus is generally a proprietary bus designed to interface to memory, to one or more processing complexes, and to the chipset. On the other side (“south side”) of the chipset are buses which connect the chipset to I/O devices. Examples of such buses include ISA, EISA, PCI, PCI-X, and AGP.
I/O devices allow data to be transferred to or from a processing complex through the chipset on one or more of the buses supported by the chipset. Examples of I/O devices include graphics cards coupled to a computer display; disk controllers (which are coupled to hard disk drives or other data storage systems); network controllers (to interface to networks such as Ethernet); USB and FireWire controllers, which interface to a variety of devices from digital cameras to external data storage to digital music systems, etc.; and PS/2 controllers for interfacing to keyboards/mice. I/O devices are designed to connect to the chipset via one of its supported interface buses. For instance, modern computers typically couple graphics cards to the chipset via an AGP bus. Ethernet cards, SATA, Fiber Channel, and SCSI (data storage) cards, USB controllers, and FireWire controllers all connect to the chipset via a Peripheral Component Interconnect (PCI) bus. PS/2 devices are coupled to the chipset via an ISA bus.
The above description is general, yet one skilled in the art will appreciate from the above discussion that, regardless of the type of computer, its configuration will include a processing complex for executing instructions, an interface to I/O, and I/O devices themselves that allow the processing complex to communicate with the outside world. This is true whether the computer is an inexpensive desktop in a home, a high-end workstation used for graphics and video editing, or a clustered server which provides database support or web services to hundreds within a large organization.
A problem that has been recognized by the present inventors is that the requirement to place a processing complex, I/O interface, and I/O devices within every computer is costly and lacks flexibility. That is, once a computer is purchased, all of its subsystems are static from the standpoint of the user. To change a processing complex while still utilizing the same I/O interface and I/O devices is an extremely difficult task. The I/O interface (e.g., the chipset) is typically so closely coupled to the architecture of the processing complex that swapping one without the other does not make sense. Furthermore, the I/O devices are typically integrated within the computer, at least for servers and business desktops, such that upgrade or modification of the computer's I/O capabilities ranges in difficulty from extremely cost-prohibitive to virtually impossible.
An example of the above limitations is considered helpful. A popular network server produced by Dell Computer Corporation is the Dell PowerEdge 1750®. This server includes a processing core designed by Intel® (a Xeon® microprocessor) along with memory. It has a server-class chipset (i.e., I/O interface) for interfacing the processing complex to I/O controllers/devices. It also has the following onboard I/O controllers/devices: onboard graphics for connecting to a display; onboard PS/2 for connecting a mouse/keyboard; onboard RAID control for connecting to data storage; onboard network interface controllers for connecting to 10/100 and 1 gigabit (Gb) Ethernet; and a PCI bus for adding other I/O such as SCSI or Fiber Channel controllers. It is believed that none of the onboard features is upgradeable.
As noted above, one of the problems with a highly integrated architecture is that if another I/O demand emerges, it is difficult and costly to implement the upgrade. For example, 10 gigabit (Gb) Ethernet is on the horizon. How can 10 Gb Ethernet capabilities be easily added to this server? Perhaps a 10 Gb Ethernet controller could be purchased and inserted onto an existing PCI bus within the server. But consider a technology infrastructure that includes tens or hundreds of these servers. To move to a faster network architecture requires an upgrade to each of the existing servers. This is an extremely cost-prohibitive scenario, which is why it is very difficult to upgrade existing network infrastructures.
The one-to-one correspondence between the processing complex, the interface to the I/O, and the I/O controllers/devices is also costly to the manufacturer. That is, in the example presented above, many of the I/O controllers/devices are manufactured on the motherboard of the server. To include the I/O controllers/devices on the motherboard is costly to the manufacturer, and ultimately to an end user. If the end user utilizes all of the I/O capabilities provided, then a cost-effective situation exists. But if the end user does not wish to utilize, say, the onboard RAID or the 10/100 Ethernet, then s/he is still required to pay for its inclusion. Such one-to-one correspondence is not a cost-effective solution.
Now consider another emerging platform: the blade server. A blade server is essentially a processing complex, an interface to I/O, and I/O controllers/devices that are integrated onto a relatively small printed circuit board that has a backplane connector. The “blade” is configured so that it can be inserted along with other blades into a chassis having a form factor similar to a present-day rack server. The benefit of this configuration is that many blade servers can be provided within the same rack space previously required by just one or two rack servers. And while blades have seen growth in market segments where processing density is a real issue, they have yet to gain significant market share for many reasons, one of which is cost. This is because blade servers still must provide all of the features of a pedestal or rack server, including a processing complex, an interface to I/O, and the I/O controllers/devices. Furthermore, blade servers must integrate all their I/O controllers/devices onboard because they do not have an external bus which would allow them to interface to other I/O controllers/devices. Consequently, a typical blade server must provide such I/O controllers/devices as Ethernet (e.g., 10/100 and/or 1 Gb) and data storage control (e.g., SCSI, Fiber Channel, etc.), all onboard.
Infiniband™ is a recent development which was introduced by Intel Corporation and other vendors to allow multiple processing complexes to separate themselves from I/O controllers/devices. Infiniband is a high-speed point-to-point serial interconnect designed to provide for multiple, out-of-the-box interconnects. However, it is a switched, channel-based architecture that drastically departs from the load-store architecture of existing processing complexes. That is, Infiniband is based upon a message-passing protocol where a processing complex communicates with a Host-Channel-Adapter (HCA), which then communicates with all downstream Infiniband devices such as I/O devices. The HCA handles all the transport to the Infiniband fabric rather than the processing complex itself. Within an Infiniband architecture, the only device that remains within the load-store domain of the processing complex is the HCA. What this means is that it is necessary to leave the processing complex load-store domain to communicate with I/O controllers/devices. And this departure from the processing complex load-store domain is one of the limitations that contributed to Infiniband's demise as a solution to providing shared I/O. According to one industry analyst referring to Infiniband, “[i]t was over-billed, over-hyped to be the nirvana-for-everything-server, everything I/O, the solution to every problem you can imagine in the data center, . . . , but turned out to be more complex and expensive to deploy, . . . , because it required installing a new cabling system and significant investments in yet another switched high speed serial interconnect.”
Accordingly, the present inventors have recognized that separation of a processing complex, its I/O interface, and the I/O controllers/devices is desirable, yet this separation must not impact existing operating systems, application software, or existing hardware or hardware infrastructures. By breaking apart the processing complex from its I/O controllers/devices, more cost-effective and flexible solutions can be introduced.
In addition, the present inventors have recognized that such a solution must not be a channel-based architecture, performed outside of the box. Rather, the solution should employ a load-store architecture, where the processing complex sends data directly to or receives data directly from (i.e., in an architectural sense by executing loads or stores) an I/O device (e.g., a network controller or data storage controller). This allows the separation to be accomplished without disadvantageously affecting an existing network infrastructure or disrupting the operating system.
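The distinction drawn above between a load-store architecture and a channel-based architecture may be illustrated by the following minimal sketch. It is offered only as an aid to understanding and is not part of the described subject matter. All class names, method names, addresses, and the message format (AddressSpace, LoadStoreDevice, ChannelAdapter, map_device, 0x1000, and so on) are hypothetical assumptions introduced for illustration. In the first path, the processing complex reaches a device register simply by executing a store or a load to an address. In the second path, the processing complex interacts only with an adapter, which exchanges messages with the device on its behalf.

```python
# Illustrative sketch only. Every name, address, and message field here is a
# hypothetical assumption; none of it is taken from a real device or fabric.

class LoadStoreDevice:
    """An I/O device whose registers occupy a range of processor addresses."""

    def __init__(self, base, size):
        self.base = base
        self.regs = bytearray(size)


class AddressSpace:
    """Routes each load or store to whichever mapped device owns the address."""

    def __init__(self):
        self.devices = []

    def map_device(self, device):
        self.devices.append(device)

    def _find(self, addr):
        for dev in self.devices:
            if dev.base <= addr < dev.base + len(dev.regs):
                return dev, addr - dev.base
        raise ValueError(f"no device mapped at {addr:#x}")

    def store(self, addr, value):
        dev, offset = self._find(addr)
        dev.regs[offset] = value & 0xFF

    def load(self, addr):
        dev, offset = self._find(addr)
        return dev.regs[offset]


class ChannelAdapter:
    """Stand-in for a channel adapter. The processor hands it messages, and
    the adapter (not the processor) reaches the remote device."""

    def __init__(self):
        self.remote_regs = {}
        self.messages_sent = 0

    def send(self, message):
        self.messages_sent += 1
        if message["op"] == "write":
            self.remote_regs[message["reg"]] = message["value"]
            return {"status": "ok"}
        if message["op"] == "read":
            value = self.remote_regs.get(message["reg"], 0)
            return {"status": "ok", "value": value}
        return {"status": "error"}


# Load-store path: the device is reached directly by address.
mem = AddressSpace()
nic = LoadStoreDevice(base=0x1000, size=16)
mem.map_device(nic)
mem.store(0x1004, 0x2A)
loaded = mem.load(0x1004)

# Channel-based path: every access is wrapped in a message to the adapter.
hca = ChannelAdapter()
hca.send({"op": "write", "reg": 4, "value": 0x2A})
reply = hca.send({"op": "read", "reg": 4})
```

In the sketch, the load-store path requires nothing of the software beyond ordinary loads and stores, whereas the channel-based path requires the software to construct and interpret messages. This is one way to picture why a load-store solution is considered less disruptive to an existing operating system.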
Therefore, what is needed is an apparatus and method which separate a processing complex and its interface to I/O from I/O controllers/devices.
In addition, what is needed is an apparatus and method that allow processing complexes and their I/O interfaces to be designed, manufactured, and sold, without requiring I/O controllers/devices to be provided therewith.
Also, what is needed is an apparatus and method that enable an I/O controller/device to be shared by multiple processing complexes.
Furthermore, what is needed is an I/O controller/device that can be shared by two or more processing complexes using a common load-store fabric.
Moreover, what is needed is an apparatus and method that allow multiple processing complexes to share one or more I/O controllers/devices through a common load-store fabric.
Additionally, what is needed is an apparatus and method that provide switching between multiple processing complexes and shared I/O controllers/devices.
Furthermore, what is needed is an apparatus and method that allow multiple processing complexes, each operating independently and executing an operating system independently (i.e., independent operating system domains) to interconnect to shared I/O controllers/devices in such a manner that it appears to each of the multiple processing complexes that the I/O controllers/devices are solely dedicated to a given processing complex from its perspective. That is, from the standpoint of one of the multiple processing complexes, it must appear that the I/O controllers/devices are not shared with any of the other processing complexes.
Moreover, what is needed is an apparatus and method that allow shared I/O controllers/devices to be utilized by different processing complexes without requiring modification to the processing complexes' existing operating systems or other application software.