The rapid and continual decrease in cost and increase in complexity of silicon devices have allowed multiple processors to be used in specialized compute nodes on a network. Specialized storage nodes can have arrays of disk drives to store databases or other files that are accessed by the compute nodes. Ethernet switches, routers, firewalls, and load-balancer devices can connect the compute nodes to an external network such as the Internet. Several clusters of compute and storage nodes can each have switches to allow connection to other clusters over the Internet, allowing all clusters to operate as a large multi-processing server system even when the clusters are remotely located from one another.
FIG. 1 shows a multi-cluster server system. Two clusters are shown, and these clusters may be located together in a single location or may be remote from each other and connected by network 208, which can be the Internet or another network such as a virtual-private network (VPN), an intranet, leased trunk lines, or other kinds of network.
In cluster A, storage nodes 204 contain databases and other files that are accessed by compute nodes 202. Compute nodes 202 include processing nodes that run server software to respond to client requests received over network 208. Switch chassis 206 contains Ethernet or other Local-Area-Network (LAN) switches and routers that connect compute nodes 202, and a load-balancer to distribute incoming client requests among servers running in compute nodes 202. Firewall or other gateway programs may be running on switch chassis 206. The storage nodes, load-balancer, firewall, and similar devices are each optional, and any or all of them may be absent from a given configuration.
Cluster B also has storage nodes 214 and compute nodes 212. Switch chassis 216 contains switches and routers that connect compute nodes 212, and a load-balancer to distribute incoming client requests among servers running in compute nodes 212.
Compute nodes 202 are typically located together in one or more chassis. Each chassis contains slots or racks, and each rack can have multiple processors on one or more printed-circuit boards (PCBs) that slide into the rack. Storage nodes 204 are typically located in a separate chassis, such as for Network-Attached Storage (NAS) or Storage-Area Networks (SAN), since the rotating disk drives often have a different physical form factor than the compute PCBs in compute nodes 202. Some systems may simply have a disk drive on a rack mount in a shared chassis. Disk controller cards are also located in storage nodes 204, and these controller cards likewise often have different form factors than processor cards in compute nodes 202.
One or more local switches can be placed in each chassis for compute nodes 202 and storage nodes 204. However, switch chassis 206 contains cluster-wide switching devices and a load-balancer, firewall, and gateway devices that are used by the whole cluster. These specialized devices often have differing form factors and may be located in separate specialized chassis or in switch chassis 206. Thus three kinds of chassis or cabinets are often used together in each cluster, for compute nodes 202, storage nodes 204, and switch chassis 206.
FIGS. 2A-C are diagrams of a prior-art data center with three levels of hierarchy. In FIG. 2A, in the lowest level of the data center's hierarchy, three compute nodes 220 and one storage node 224 are located together on rack 230. Storage node 224 could be a controller card to a disk drive located in a separate chassis, or could be in the same chassis as compute nodes 220.
Rack 230 also contains rack switch 222. Rack switch 222 is an Ethernet switch that connects to compute nodes 220 and storage node 224 using Ethernet links 228. Rack switch 222 also has an external link, cluster Ethernet link 226, which links to the next higher level of the data center's hierarchy.
In FIG. 2B, in the middle level of the data center's hierarchy, four racks 230 and aggregation switch 232 are located together in cluster 240. Each rack 230 also contains rack switch 222, which connects to aggregation switch 232 over cluster Ethernet links 226. Aggregation switch 232 is an Ethernet switch that has an external link, data-center Ethernet link 236, which links to the next higher level of the data center's hierarchy.
In FIG. 2C, in the top level of the data center's hierarchy, four clusters 240 and core switch 242 are located together in data center 250. Each cluster 240 also contains aggregation switch 232, which connects to core switch 242 over data-center Ethernet links 236. Core switch 242 is an Ethernet switch that has an external link that connects to Internet 246 through firewall and load-balancer 248, which acts as a gateway device.
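The three-level hierarchy of FIGS. 2A-C can be illustrated with a minimal sketch that tallies the Ethernet switches in data center 250 and the number of switches a message passes through between two nodes. The per-level counts (three compute nodes and one storage node per rack, four racks per cluster, four clusters per data center) are taken from the figures as described; the names NodeAddress, switch_count, and switches_traversed are hypothetical and chosen only for illustration, and the sketch assumes each message takes the shortest path up and back down the hierarchy.

```python
from dataclasses import dataclass

# Counts as described for FIGS. 2A-C.
NODES_PER_RACK = 3 + 1           # three compute nodes 220 plus one storage node 224
RACKS_PER_CLUSTER = 4            # four racks 230 per cluster 240
CLUSTERS_PER_DATA_CENTER = 4     # four clusters 240 per data center 250


@dataclass(frozen=True)
class NodeAddress:
    """Hypothetical address of a node: which cluster, rack, and slot it occupies."""
    cluster: int
    rack: int
    slot: int


def switch_count() -> dict:
    """Tally the Ethernet switches at each level of the hierarchy."""
    return {
        "rack": CLUSTERS_PER_DATA_CENTER * RACKS_PER_CLUSTER,  # one rack switch 222 per rack
        "aggregation": CLUSTERS_PER_DATA_CENTER,               # one aggregation switch 232 per cluster
        "core": 1,                                             # core switch 242
    }


def switches_traversed(a: NodeAddress, b: NodeAddress) -> int:
    """Number of switches on the shortest path between two nodes (assumed routing)."""
    if a == b:
        return 0
    if a.cluster == b.cluster and a.rack == b.rack:
        return 1   # rack switch only
    if a.cluster == b.cluster:
        return 3   # rack switch, aggregation switch, rack switch
    return 5       # rack, aggregation, core, aggregation, rack


if __name__ == "__main__":
    counts = switch_count()
    print(counts, "total:", sum(counts.values()))
    print(switches_traversed(NodeAddress(0, 0, 0), NodeAddress(3, 2, 1)))
```

Under these assumptions the data center holds 64 nodes served by 21 switches, and a message between nodes in different clusters passes through five switches, which is the overhead the following discussion addresses.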
While such aggregated data-center architectures are useful, the multiple levels of hierarchy each have Ethernet or other LAN switches. These switches are expensive and slow the passage of messages and packets. Furthermore, the different form factors of compute nodes and switches may require different kinds of chassis to be used, or even more expensive specialized chassis with local switches such as rack switches 222 in clusters 240.
What is desired is a data center architecture that reduces the number of LAN switches. It is desired to eliminate rack switch 222 and aggregation switch 232 by using a direct interconnect fabric that directly connects processor and storage nodes. It is desired to expand the use of this direct interconnect fabric to include the functions of rack switch 222 and aggregation switch 232. It is further desired to expand the use of a direct interconnect fabric that is used to transparently virtualize peripherals such as network interface cards, Ethernet cards, hard disks, BIOS, and consoles.