The emergence of the cloud for computing applications has increased the demand for off-site installations, known as data centers, that store data and run applications accessed by remotely connected computer device users. Such data centers typically have massive numbers of servers, switches, and storage devices to store and manage data. A typical data center has physical rack structures with attendant power and communication connections. The racks are arranged in rows throughout the room or rooms of the data center. Each rack includes a frame that has slots or chassis between two side walls. The slots may hold multiple network devices such as servers, switches, and storage devices. There are many such network devices stacked in such rack structures found in a modern data center. For example, some data centers have tens of thousands of servers, attendant storage devices, and network switches. Thus, a typical data center may include tens of thousands, or even hundreds of thousands, of devices in hundreds or thousands of individual racks.
In order to efficiently allocate resources, data centers include many different types of devices in a pooled arrangement. Such pooled devices may be assigned to different host servers as the need for resources arises. Such devices may be connected via Peripheral Component Interconnect Express (PCIe) protocol links between the device and the host server that may be activated by a PCIe type switch.
Thus, many modern data centers now support disaggregated architectures with numerous pooled devices. An example of such a data center 10 is shown in FIG. 1. A system administrator may access a composer application 12 that allows configuration data to be sent via a router 14 to a PCIe fabric box 20. The PCIe fabric box 20 includes numerous serial expansion bus devices such as PCIe compatible devices that may be accessed by other devices in the data center. In this example, the PCIe fabric box 20 includes a fabric controller 22 that may receive configuration data through a network from the router 14. The fabric box 20 includes PCIe switches, such as the PCIe switches 24 and 26, that allow host devices such as host servers 30, 32, 34, and 36 to be connected to different PCIe devices in the fabric box 20. The PCIe switch 24 includes upstream ports 40 and 42, and the PCIe switch 26 includes upstream ports 44 and 46. The upstream ports 40, 42, 44, and 46 are connected via cables to the host servers 30, 32, 34, and 36, respectively. The PCIe switch 24 also includes downstream ports 50, 52, 54, and 56. The PCIe switch 26 includes downstream ports 60, 62, 64, and 66. In this example, there are multiple devices in the fabric box 20 coupled to the respective downstream ports of the switches 24 and 26. These devices may be accessed by any of the host servers 30, 32, 34, and 36.
As shown in FIG. 1, two host servers 30 and 32 are directly coupled to the upstream ports 40 and 42 of the switch 24, while two host servers 34 and 36 are directly coupled to the upstream ports 44 and 46 of the switch 26. Devices that are connected to the other switch 24 may be allocated to the host servers 34 and 36 through the switch 26. In this example, devices 70, 72, 74, and 76 are directly coupled to the downstream ports 50, 52, 54, and 56 of the PCIe switch 24. Devices 80, 82, 84, and 86 are directly coupled to the downstream ports 60, 62, 64, and 66 of the PCIe switch 26. Additional devices and host servers may be supported by adding additional PCIe switches. The example system 10 allows certain system resources to be removed from host servers and provided by the outside fabric box 20 instead. Thus, different types of system resources may be allocated to the needs of different servers as they arise. For example, the devices 70, 72, 74, 76, 80, 82, 84, and 86 may each be a resource such as a non-volatile memory express (NVMe) storage device, a graphics processing unit (GPU), a field programmable gate array (FPGA), a network interface card (NIC), or another kind of PCIe compatible device. Each such device may be dynamically assigned to hosts, such as the host servers 30, 32, 34, and 36.
When a user desires to allocate a resource to a host, the user may send a request such as “Allocate One GPU to Host 1” to the composer application 12. The composer application 12 then allocates a GPU device such as the device 70 to the host server 30 by sending a command to the PCIe fabric controller 22. The fabric box 20 will allocate the first GPU device 70 to the host server 30 through the PCIe switch 24, as shown in a box 90 in FIG. 2.
Following the above allocation, a user may allocate another device resource to another host server. For example, the user may send another request such as “Allocate One GPU to Host 3” to the composer application 12 to allocate another GPU, such as the device 72, to the host server 34. The fabric box 20 will then allocate the GPU device 72 to the host server 34 through both the PCIe switch 24 and the PCIe switch 26, as shown in a box 92 in FIG. 3.
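The two requests above can be illustrated with a minimal sketch of a composer that simply hands out the first free device of the requested type. This is not the composer application 12 itself; the function name, the data layout, and every device type in the table are assumptions made for illustration (the text only indicates that the devices 70 and 72 may be GPUs).

```python
# Hypothetical device-type table keyed by the reference numerals of FIG. 1.
# The types are assumed for illustration only.
DEVICE_TYPES = {
    70: "GPU", 72: "GPU", 74: "NVMe", 76: "NIC",
    80: "GPU", 82: "GPU", 84: "FPGA", 86: "NVMe",
}


def allocate_first_free(device_type, host, allocations):
    """Assign the lowest-numbered free device of the requested type.

    Path distance is ignored, mirroring a composer that lacks topology data.
    Returns the device numeral, or None if no free device of that type exists.
    """
    for device in sorted(DEVICE_TYPES):
        if DEVICE_TYPES[device] == device_type and device not in allocations:
            allocations[device] = host
            return device
    return None


allocations = {}  # maps device numeral -> host server numeral
first = allocate_first_free("GPU", 30, allocations)   # "Allocate One GPU to Host 1"
second = allocate_first_free("GPU", 34, allocations)  # "Allocate One GPU to Host 3"
print(first, second)
```

Under these assumptions the second request receives the device 72, which sits behind the PCIe switch 24, even though the host server 34 is attached to the PCIe switch 26.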
One problem with fabric boxes such as the fabric box 20 in FIGS. 1-3 is the use of different fabric switches to allocate the devices to different host servers. In this case, the pooled devices have different path distances to different hosts. For purposes of the system 10, path distance is measured as the number of intermediate devices, such as PCIe switches, required to connect the host to the device. In this example, the composer 12 may allocate a device to a host server where the distance between the device and the host is not optimal. For example, as shown in FIG. 2, when the device 70 is assigned to the host server 30, the PCIe switch 24 alone provides the connection to the host server 30. Thus, the path distance between the host server and the device is optimized at 1. In contrast, when the device 72 is assigned to the host server 34, the connection must span both PCIe switches 24 and 26 to reach the host server 34, as shown by the box 92 in FIG. 3. Thus, the path distance between the host server and the device is 2, which is not an optimized allocation. A better allocation is to allocate one of the devices 80, 82, 84, or 86 to the host server 34, as their paths pass only through the PCIe switch 26. However, the composer application 12 does not have the necessary data to perform optimal allocations and may therefore produce longer path distances than necessary between an allocated device and a host server.
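The path distance comparison described above can be sketched as follows. This is an illustrative model only, not a description of any particular fabric controller: the mapping tables use the reference numerals of FIG. 1, the function names are hypothetical, and the distance rule assumes exactly two directly linked switches, so a path crosses either one switch or two.

```python
# Which PCIe switch each host server and each device is directly coupled to,
# keyed by the reference numerals of FIG. 1.
HOST_SWITCH = {30: 24, 32: 24, 34: 26, 36: 26}
DEVICE_SWITCH = {70: 24, 72: 24, 74: 24, 76: 24,
                 80: 26, 82: 26, 84: 26, 86: 26}


def path_distance(host, device):
    """Count the switches on the path between a host server and a device.

    Assumes a two-switch fabric: 1 if both share a switch, otherwise 2.
    """
    return 1 if HOST_SWITCH[host] == DEVICE_SWITCH[device] else 2


def nearest_device(host, candidates):
    """Pick the candidate with the smallest path distance to the host.

    Ties are broken by the lower reference numeral.
    """
    return min(candidates, key=lambda d: (path_distance(host, d), d))


print(path_distance(30, 70))  # device 70 and host server 30 share switch 24
print(path_distance(34, 72))  # device 72 reaches host server 34 via both switches
print(nearest_device(34, [72, 74, 76, 80]))
```

Given a candidate list containing devices on both switches, the selection for the host server 34 falls on a device behind the PCIe switch 26, which corresponds to the better allocation identified above.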
There is a need for a fabric box system that determines the minimum path distance between assigned devices and a host server. There is also a need for a fabric box that automatically factors in the path distance between a requested type of device and a host when selecting a device for assignment to the host.