The present invention relates to information processing apparatus, an information processing methods and programs.
In von-Neumann machines a central processor (CPU=central processing unit or GPU=graphics processing unit) may employ several mechanisms to overcome the so called “Memory Wall”, which is a term to denote the growing performance gap between ever faster processors and comparably slower memory technologies. These mechanisms are in particular focused on tolerating longer access latencies of the main memory system in order to minimize the time that the processor's execution units are stalled. Or in other words: to maximize the utilization of the execution unit(s).
One of the features of these mechanisms is the use of a memory hierarchy comprising multiple levels of fast caches. Other mechanisms include support for out-of-order execution of instructions and multi-threading which both allow to continue processing with different instructions and/or threads when certain instructions or threads have been stalled while waiting for data to arrive from the memory system. Another example of a mechanism to reduce the (average) access latency is a prefetching of data from the memory system.
The above-mentioned techniques were disclosed in a time when the processor and memory system designs were not limited by power. Furthermore, the focus was mainly at maximizing the execution pipeline utilization by reducing the memory access latency. As a result, these mechanisms are typically among the most power-hungry components of a computer system, also wasting a considerable amount of memory bandwidth. For example, if the processor only needs a single byte, still a complete cache line may be retrieved from the memory system from which the remaining bytes are not used. The same applies to the prefetching of data that is typically only partially processed, if at all. Both cases do not only waste memory bandwidth, but also waste power for unneeded data accesses and operations.
In recent years, Field Programmable Gate Array (FPGA) technology continued to grow in importance as one of multiple programmable off-the-shelf accelerator technologies that can be used to improve performance and optimize power for selected application domains. As part of ongoing FPGA developments, the concept of dynamic partial reconfiguration support is gaining attention. The latter allows reconfiguring only selected parts of an FPGA during runtime.