This invention relates to an information processing apparatus configured to process an I/O command.
In recent years, technologies for rapidly analyzing large amounts of data have attracted attention for business use. In general, a host processor (hereinafter also referred to as “processor”) of a server reads data from a storage device, for example, a hard disk drive (HDD), and analyzes or operates on the data.
A solid state drive (SSD), which has a flash memory as its storage medium and can be accessed more rapidly than the HDD, is becoming popular for use as the storage device. Further, semiconductor storage media such as a resistive random access memory (ReRAM) and a phase change memory (PCM), which can be accessed more rapidly than the flash memory, are increasingly being put into practical use.
The rise of such storage devices has enabled a large amount of data to be read rapidly. However, bottlenecks such as high processing loads on the processor and the limited bandwidth of a bus coupled to the processor make data transfer time-consuming. As a result, the performance of such rapid storage devices cannot be fully utilized, and the information processing apparatus fails to achieve a corresponding speedup.
Hitherto, there has been known a technology of incorporating an apparatus (hereinafter referred to as “accelerator”) having an arithmetic function into the information processing apparatus and distributing a part of the processing that is normally executed by the processor to that accelerator. For example, there is known a technology of incorporating, as the accelerator, a graphics processing unit (GPU) into a server having a processor and causing the GPU to process a part of the program processing that is normally executed by the processor, to thereby improve a processing speed.
This technology involves a large amount of data transfer: the processor transfers the data to be processed from the storage device to a system memory coupled to the processor, and the processor then transfers the data from the system memory to the accelerator, to thereby enable the GPU to process the data. In particular, the data frequently flows through a bus coupled to the processor, and thus the bandwidth of the bus sometimes becomes a bottleneck for performance improvement.
In order to resolve the data transfer bottleneck, US 2014/0129753 A1 discloses an information processing apparatus in which the accelerator and the storage device communicate directly with each other without intervention of the processor to further improve the processing speed.
In the technology of US 2014/0129753 A1, a pair of a GPU and a non-volatile memory array is mounted on a board, the board is coupled to an information processing apparatus including a processor and a system memory, and the GPU and the non-volatile memory array transfer data directly to and from each other. The data of the non-volatile memory array is transferred to the GPU, and only the result of processing by the GPU is transferred to a bus coupled to the processor. Thus, it is possible to prevent access to the system memory from limiting the bandwidth of the bus.
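As a rough illustration of the difference between the two transfer paths described above, the traffic on the bus coupled to the processor can be estimated as in the following sketch. The function names and the example sizes are hypothetical. The sketch assumes that each transfer crosses the bus exactly once, and that in the conventional configuration the processing result is also returned over the bus; neither assumption is stated in the cited publication.

```python
# Illustrative model only: the names, sizes, and the one-crossing-per-transfer
# assumption are hypothetical and are not taken from US 2014/0129753 A1.

def bus_bytes_via_system_memory(data_bytes: int, result_bytes: int) -> int:
    """Bytes crossing the bus coupled to the processor when the data is staged
    in the system memory before being sent to the accelerator."""
    storage_to_memory = data_bytes      # storage device -> system memory
    memory_to_accelerator = data_bytes  # system memory -> accelerator
    # Assumption: the processing result is also returned over the same bus.
    return storage_to_memory + memory_to_accelerator + result_bytes


def bus_bytes_direct(data_bytes: int, result_bytes: int) -> int:
    """Bytes crossing the same bus when the accelerator reads the storage
    directly on the board and only the processing result is sent out."""
    # data_bytes is unused: the input data never reaches the processor's bus.
    return result_bytes


if __name__ == "__main__":
    data = 1 << 30    # 1 GiB of data to be processed (example value)
    result = 1 << 20  # 1 MiB of processing result (example value)
    print(bus_bytes_via_system_memory(data, result))
    print(bus_bytes_direct(data, result))
```

Under these assumptions, the traffic in the conventional configuration grows with twice the size of the data to be processed, whereas the traffic in the direct configuration depends only on the size of the result.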