(a) Field of the Invention
The described technology relates to a computing device, a data transfer method between a coprocessor and a non-volatile memory, and a computer-readable recording medium.
(b) Description of the Related Art
Data processing coprocessors with high computation parallelism and comparatively low power consumption are becoming increasingly popular. One example of such a coprocessor is a graphics processing unit (GPU). In such a coprocessor, many processing cores share execution control and can perform identical operations on numerous pieces of data via thread-level parallelism and data-level parallelism. A system using the coprocessor together with a central processing unit (CPU) can exhibit significant speedups compared to a CPU-only system.
The coprocessors can process more data than ever before, and the volume of such data is expected to increase. However, the coprocessors employ on-board memory whose size is relatively small compared to a host memory. The coprocessors therefore use a non-volatile memory connected to a host machine to process large sets of data.
However, the coprocessor and the non-volatile memory are completely disconnected from each other and are managed by different software stacks. Consequently, many redundant memory allocations/releases and data copies exist between a user-space and a kernel-space in order to read data from the non-volatile memory or write data to the non-volatile memory. Further, since a kernel module cannot directly access the user-space memory, memory management and data copy overheads between the kernel-space and the user-space are unavoidable. Furthermore, kernel-mode and user-mode switching overheads along with the data copies also contribute to long latency of data movements. As a result of these overheads, the speedup improvement is not significant compared to the coprocessor performance.
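For illustration only, the conventional data path described above can be modeled in a short Python sketch. The function names, the staging buffer, and the use of a bytearray as a stand-in for coprocessor memory are assumptions made for this sketch and are not part of the described technology. The sketch counts only the logical copies on the path and does not measure actual kernel behavior.

```python
import os
import tempfile


def load_to_coprocessor_conventional(path):
    """Model the conventional path: storage -> user buffer -> staging -> device.

    Hypothetical illustration: 'device_mem' is an ordinary bytearray standing
    in for coprocessor on-board memory.
    """
    copies = 0

    # 1. A read system call: the kernel copies the file data from
    #    kernel-space into a newly allocated user-space buffer.
    with open(path, "rb", buffering=0) as f:
        host_buf = f.read()
    copies += 1  # kernel-space -> user-space

    # 2. Assumed staging step: the coprocessor software stack copies the data
    #    into its own transfer buffer, since the two stacks share no memory.
    staging = bytearray(host_buf)
    copies += 1  # user buffer -> staging buffer

    # 3. Transfer from the staging buffer into (simulated) coprocessor memory.
    device_mem = bytearray(len(staging))
    device_mem[:] = staging
    copies += 1  # staging buffer -> coprocessor memory

    return device_mem, copies


def demo():
    """Write a small file, load it through the modeled path, and verify it."""
    payload = bytes(range(256)) * 64
    with tempfile.NamedTemporaryFile(delete=False) as tmp:
        tmp.write(payload)
        path = tmp.name
    try:
        device_mem, copies = load_to_coprocessor_conventional(path)
    finally:
        os.remove(path)
    return bytes(device_mem) == payload, copies
```

In this model, every byte is copied three times and held in two intermediate host buffers before it reaches the coprocessor, which corresponds to the redundant allocations and copies noted above.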