Generally, example embodiments of the present disclosure relate to hardware accelerators, and more particularly to providing a method, system, and computer program product for streaming attachment of hardware accelerators to computing systems.
General purpose processors, such as those from Intel® and AMD® and the IBM POWER® family, are designed to support a wide range of workloads. If processing power beyond existing capabilities is required, hardware accelerators may be attached to a computer system to meet the requirements of a particular application. Examples of hardware accelerators include FPGAs (Field Programmable Gate Arrays), the IBM Cell B.E. (broadband engine) processor, and graphics processing units (GPUs). Hardware accelerators are typically programmable, which allows a hardware accelerator to be specialized for a particular task or function, and consist of a combination of software, hardware, and firmware. Such hardware accelerators may be attached directly to the processor complex or nest, through PCI-express (peripheral component interconnect express) IO (input-output) slots, or over high-speed networks, for example, Ethernet and InfiniBand®.
Systems in which processors have a static 1:1 mapping to accelerators (e.g., processor:accelerator), with the processor cluster and the accelerator cluster packaged separately, may be severely limited in scalability. Such systems nonetheless attempt to provide performance similar to that of systems in which a processor is mapped to several accelerators for larger tasks. It follows, however, that as the workload of each processor increases, the resources available in its single mapped accelerator are quickly exhausted, thereby reducing overall system throughput.
More specifically, if a processor in a system with a 1:1 processor:accelerator ratio requires acceleration, the processor may have to wait until results are returned from the accelerator before passing additional data to the accelerator. Such static 1:1 mappings therefore limit throughput and performance. Accordingly, it may be desirable to provide a solution that overcomes these drawbacks.