While a computer processor is running, the speed at which the processor acquires data from an off-chip memory directly affects the efficiency of the processor.
The read/write speed of an off-chip memory is much lower than the data processing speed of a processor. Therefore, in order to reduce the latency with which a processor reads data, the prior art uses a caching technique that takes advantage of the temporal locality and spatial locality of a program; that is, a cache is disposed on the processor chip to hold data commonly used by the processor. The data read/write speed of the cache is relatively high. When reading data, the processor accesses the cache first. When the accessed data is not in the cache, the processor accesses the off-chip memory through a memory controller. With the cache, the work efficiency of the processor can be effectively improved. To facilitate data management, data in a cache is managed at the granularity of a cache line, such as 64 bytes. When data is read or written between the cache and the off-chip memory, the data is likewise transferred to or from the on-chip cache at the granularity of a cache line.
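The read path described above can be illustrated with a minimal sketch. It assumes a direct-mapped organization, 256 cache lines, and the names `DirectMappedCache` and `run`, none of which come from the text; only the 64-byte line size and the "check the cache first, then fetch a whole line from off-chip memory" behavior are taken from the description.

```python
LINE_SIZE = 64  # bytes per cache line, the example granularity given in the text


class DirectMappedCache:
    """Toy on-chip cache that records only which lines are resident.

    The direct-mapped layout and the default of 256 lines are assumptions
    made for illustration.
    """

    def __init__(self, num_lines=256):
        self.num_lines = num_lines
        self.tags = [None] * num_lines  # line address currently held by each slot
        self.hits = 0
        self.misses = 0
        self.bytes_from_memory = 0      # traffic through the memory controller

    def read(self, addr):
        """Access the cache first; on a miss, fetch the whole line."""
        line_addr = addr // LINE_SIZE        # the 64-byte line containing addr
        slot = line_addr % self.num_lines    # direct-mapped placement (assumed)
        if self.tags[slot] == line_addr:
            self.hits += 1
            return True
        # Miss: the full cache line is read from off-chip memory.
        self.misses += 1
        self.bytes_from_memory += LINE_SIZE
        self.tags[slot] = line_addr
        return False


def run(addresses, num_lines=256):
    """Replay a sequence of byte addresses and return the cache with its counters."""
    cache = DirectMappedCache(num_lines)
    for addr in addresses:
        cache.read(addr)
    return cache


if __name__ == "__main__":
    # Sequential 4-byte reads over 4 KiB: good spatial locality.
    c = run(range(0, 4096, 4))
    print(c.hits, c.misses, c.bytes_from_memory)
```

In this sketch, a sequential sweep misses only once per 64-byte line; every other read in that line is served from the cache without involving the memory controller.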
However, when the foregoing processor reads or writes data for an application program with poor data locality, the processor needs to repeatedly access the off-chip memory through the memory controller, which wastes a relatively large amount of access bandwidth. In addition, when a multi-core processor concurrently sends a large quantity of memory access operations to the memory controller, some memory access requests are congested in the memory controller and cannot be processed in a timely and efficient manner, because the quantity of memory access requests that the memory controller can receive and process concurrently is limited.
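Both problems can be made concrete with a small sketch under stated assumptions. The 4-byte access width, the direct-mapped toy cache, the per-cycle queue model, and the names `bandwidth_use` and `simulate_controller` are illustrative choices and are not taken from the text; the sketch is not a model of any particular memory controller.

```python
LINE_SIZE = 64  # bytes per cache line, as in the text
WORD_SIZE = 4   # bytes the program actually uses per access (assumed)


def bandwidth_use(addresses, cache_lines=256):
    """Return (useful_bytes, transferred_bytes) for a toy direct-mapped cache."""
    tags = [None] * cache_lines
    useful = 0
    transferred = 0
    for addr in addresses:
        useful += WORD_SIZE
        line = addr // LINE_SIZE
        slot = line % cache_lines
        if tags[slot] != line:
            # Miss: a whole line crosses the memory controller,
            # even though only WORD_SIZE bytes of it are used here.
            tags[slot] = line
            transferred += LINE_SIZE
    return useful, transferred


def simulate_controller(issued_per_cycle, queue_capacity, served_per_cycle, cycles):
    """Return how many requests are still held back after `cycles` cycles.

    Each cycle the controller completes up to `served_per_cycle` requests,
    the cores issue `issued_per_cycle` new ones, and the controller accepts
    only as many as fit in its queue of `queue_capacity` entries.
    """
    in_queue = 0  # requests accepted by the controller
    waiting = 0   # requests that could not be accepted yet
    for _ in range(cycles):
        in_queue -= min(served_per_cycle, in_queue)
        waiting += issued_per_cycle
        accepted = min(waiting, queue_capacity - in_queue)
        in_queue += accepted
        waiting -= accepted
    return waiting


if __name__ == "__main__":
    print(bandwidth_use(range(0, 4096, 4)))     # sequential access
    print(bandwidth_use(range(0, 65536, 64)))   # one access per cache line
    print(simulate_controller(8, 16, 2, 10))    # demand exceeds service rate
```

Under these assumptions, the sequential sweep transfers 4096 bytes for 4096 useful bytes, whereas the 64-byte stride transfers 65536 bytes for the same 4096 useful bytes. In the queue model, once the cores issue requests faster than the controller completes them, the backlog of unaccepted requests grows every cycle.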