As the value and use of information continue to increase, individuals and businesses seek additional ways to process and store information. One option available to users is information handling systems. An information handling system generally processes, compiles, stores, and/or communicates information or data for business, personal, or other purposes, thereby allowing users to take advantage of the value of the information. Because technology and information handling needs and requirements vary between different users or applications, information handling systems may also vary regarding what information is handled, how the information is handled, how much information is processed, stored, or communicated, and how quickly and efficiently the information may be processed, stored, or communicated. The variations in information handling systems allow information handling systems to be general or configured for a specific user or specific use such as financial transaction processing, airline reservations, enterprise data storage, or global communications. In addition, information handling systems may include a variety of hardware and software components that may be configured to process, store, and communicate information and may include one or more computer systems, data storage systems, and networking systems.
Some information handling systems are designed to handle computationally intensive workloads, including deep learning workloads. For purposes of this disclosure, deep learning refers to a subset of machine learning methods, architectures, systems, and applications. Machine learning encompasses a branch of data science that emphasizes methods for enabling information handling systems to construct analytic models that use algorithms that learn from data interactively. It is noted that, although disclosed subject matter may be illustrated and/or described in the context of a deep learning system, method, architecture, or application, such a system, method, architecture, or application is not limited to deep learning and may encompass one or more other computationally intensive solutions.
Some information handling systems, including information handling systems designed for computationally intensive applications, employ computational accelerators in conjunction with a central processing unit (CPU) to improve the computational performance of the applicable solution. In such information handling systems, a graphics processing unit (GPU) and, more typically, multiple GPUs may be used as computational accelerators. For purposes of this disclosure, a GPU is an integrated circuit device featuring a highly parallel architecture that employs large numbers of small but efficient cores to accelerate computationally intensive tasks.
Employing GPUs, or any other computational accelerators, in conjunction with one or more CPUs requires interconnectivity among the GPUs and CPUs. Interconnecting two or more GPUs with one or more CPUs is generally challenging due to a number of factors. Under load, GPUs tend to consume significantly more power and produce significantly more heat than CPUs, thus limiting the number of GPUs that may be included within a defined space or provided on a single circuit board. Using two or more distinct compute nodes, each having its own CPU and GPUs, may address heat and power issues, but the external interconnects coupling distinct compute nodes are generally peripheral component interconnect express (PCIe) interconnects. In such systems, the GPU-to-GPU data transfer rate between GPUs on different compute nodes may be undesirably limited by the data transfer rate of the inter-node PCIe interconnect. If, as an example, a multi-node GPU-based accelerator used in a particular deep learning solution employs PCIe interconnects for inter-node GPU-to-GPU data transfers, the overall performance of the GPU accelerator may be undesirably limited by the interconnect.
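By way of illustration only, the limiting effect of an inter-node interconnect may be sketched as a simple bottleneck model in which the sustained rate of a multi-hop transfer is bounded by the slowest link on the path. The function names and the bandwidth figures below are hypothetical values chosen for the example and are not taken from this disclosure; the model further assumes a streamed transfer and ignores latency and protocol overhead.

```python
def effective_rate(link_rates_gb_per_s):
    """Sustained rate of a streamed transfer: bounded by the slowest link."""
    if not link_rates_gb_per_s:
        raise ValueError("path must contain at least one link")
    return min(link_rates_gb_per_s)


def transfer_seconds(payload_gb, link_rates_gb_per_s):
    """Time to move payload_gb across the path, ignoring latency and overhead."""
    return payload_gb / effective_rate(link_rates_gb_per_s)


# Hypothetical link rates in GB/s, for illustration only.
INTRA_NODE_GPU_LINK = 80.0
INTER_NODE_PCIE_LINK = 16.0

PAYLOAD_GB = 4.0

# GPU-to-GPU transfer within one compute node.
same_node = transfer_seconds(PAYLOAD_GB, [INTRA_NODE_GPU_LINK])

# GPU-to-GPU transfer between nodes: the path crosses the inter-node link.
cross_node = transfer_seconds(
    PAYLOAD_GB,
    [INTRA_NODE_GPU_LINK, INTER_NODE_PCIE_LINK, INTRA_NODE_GPU_LINK],
)

print(f"same node:  {same_node:.3f} s")
print(f"cross node: {cross_node:.3f} s")
```

Under these assumed figures, the cross-node transfer takes five times as long as the same-node transfer, because the slower inter-node link sets the rate of the whole path regardless of how fast the intra-node links are.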