1. Field
The present invention generally relates to a data transfer system, and more particularly to a burst transfer system for transferring a plurality of data in succession.
2. Description of the Related Art
A system LSI (Large Scale Integration) includes, in addition to a CPU for controlling the entire LSI and a memory controller for accessing a memory provided externally of the LSI, function modules suited to the intended use of the system. For example, an LSI for image processing and data processing includes, as the function modules, a graphic processor for the image processing, a DSP (Digital Signal Processor) for the data processing, a DMAC (Direct Memory Access Controller) for data transfer between memories, etc. Each of those units operates as a master module that requests access to a bus and then executes the access. When one master module issues a data transfer request to an external memory, the memory controller controls the commands issued to the memory, which is disposed externally of the chip (LSI), in response to the data transfer instruction (request). When plural master modules issue data transfer requests at the same time, an arbitration circuit operates to secure the bandwidth needed by the instruction whose data processing must be executed with higher priority. In such a system LSI, the processing performance greatly depends on the transfer processing speed of the memory controller.
A data transfer instruction via an on-chip bus (internal bus) of the system LSI is executed in accordance with a particular interface protocol. Among common interface protocols, AMBA, provided by ARM Ltd., is widely used, and OCP, proposed by OCP-IP (Open Core Protocol International Partnership), is also widely known.
In such common interface protocols, the maximum transfer length of one burst transfer is limited. The maximum transfer length is, e.g., 16 in AXI (Advanced eXtensible Interface) and AHB (Advanced High-Performance Bus) of AMBA. Therefore, when the master module tries to transfer a comparatively large amount of data, it executes a burst transfer of the maximum transfer length a plurality of times.
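The splitting of a large transfer into bursts of limited length can be illustrated with a minimal sketch. The function name split_into_bursts and the convention that an address advances by one per data item are assumptions made only for this illustration; they are not taken from any interface protocol specification.

```python
def split_into_bursts(start_addr, total_data, max_burst_len=16):
    """Split a transfer into (address, length) burst requests.

    Hypothetical helper: addresses are counted in data items, so each
    burst of length n advances the address by n.
    """
    if total_data < 0 or max_burst_len <= 0:
        raise ValueError("total_data must be >= 0 and max_burst_len > 0")

    bursts = []
    addr = start_addr
    remaining = total_data
    while remaining > 0:
        # Each burst carries at most max_burst_len data items.
        length = min(remaining, max_burst_len)
        bursts.append((addr, length))
        addr += length
        remaining -= length
    return bursts
```

For example, under these assumptions a transfer of 40 data items with a maximum transfer length of 16 is issued as three bursts of lengths 16, 16, and 8.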
Suppose here that a CPU serving as the master module issues a read request to an external memory. It is also assumed, for example, that the burst length of the memory is set to 4. First, the CPU designates an address A00 and issues a burst transfer instruction with a transfer length of 4. In response, a memory controller designates the address A00 to the external memory and issues a read command. After several cycles corresponding to the CAS latency have elapsed from the issuance of the read command, four data items are successively read starting from the address A00 of the memory and are received by the memory controller. The memory controller supplies the four received read data items to the CPU through an internal bus. The delay from the issuance of the burst transfer instruction by the CPU to the reception of the first data item by the CPU is called the “initial access latency”. The initial access latency depends on the CAS latency of the external memory, the wiring delay on the board connecting the LSI and the external memory, the delay between the CPU and the memory controller inside the LSI, etc.
When the amount of data to be transferred is larger than 4, the CPU issues further data transfer instructions in succession. More specifically, the CPU designates a subsequent address A04 and issues another burst transfer instruction with a transfer length of 4. In response, the memory controller designates the address A04 to the external memory and issues another read command. After several cycles corresponding to the CAS latency have elapsed from the issuance of the read command, four data items are successively read starting from the address A04 of the memory and are received by the memory controller. The memory controller supplies the four received read data items to the CPU through the internal bus. Subsequently, the burst read of four data items is repeated in a similar manner until the required amount of data has been transferred.
When the required amount of data is transferred by executing the burst transfer a plurality of times as described above, a waiting time of a number of cycles corresponding to the initial access latency occurs each time a burst transfer is executed. In a system for which high-speed data transfer processing is required, it is undesirable that the initial access latency be incurred for each burst transfer.
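The cost of incurring the initial access latency once per burst can be estimated with a simplified cycle model. The sketch below assumes that successive bursts do not overlap, that one data item is received per cycle once a burst starts, and that the latency is the same for every burst; the function names and the latency value used in the example are hypothetical.

```python
import math


def transfer_cycles(total_data, burst_len, initial_latency):
    """Cycles needed when every burst pays the initial access latency.

    Simplified model (assumption): bursts are strictly sequential and
    one data item arrives per cycle after the latency has elapsed.
    """
    bursts = math.ceil(total_data / burst_len)
    return bursts * initial_latency + total_data


def single_latency_cycles(total_data, initial_latency):
    """Cycles needed if the latency were paid only once for the transfer."""
    if total_data == 0:
        return 0
    return initial_latency + total_data
```

With a hypothetical initial access latency of 10 cycles, transferring 16 data items in bursts of 4 takes 4 x 10 + 16 = 56 cycles in this model, whereas paying the latency only once would take 10 + 16 = 26 cycles.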
Japanese Laid-open Patent Publication No. 11-232171 describes a method in which, in order to suppress the reduction in data processing capability caused by the initial access latency, a memory controller calculates the address designated in a data transfer instruction issued from a master module and pre-fetches memory data. With the disclosed method, however, because the memory data is always pre-fetched, the previously pre-fetched data have to be discarded at the end of the data transfer, which generates wasteful memory access.
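The wasteful access caused by unconditional pre-fetching can be illustrated with a toy model. This is not the method of the cited publication; it only assumes that the memory controller reads the next sequential burst after every burst it serves, and the function name and address convention (one address step per data item) are hypothetical.

```python
def always_prefetch_reads(start_addr, num_bursts, burst_len=4):
    """Return (issued, wasted) memory read addresses.

    Toy model (assumption): after serving each burst, the controller
    unconditionally pre-fetches the next sequential burst.
    """
    issued = []  # burst start addresses actually read from memory
    used = []    # burst start addresses actually requested by the master

    for i in range(num_bursts):
        addr = start_addr + i * burst_len
        if addr not in issued:
            # Not pre-fetched yet: ordinary read paying the full latency.
            issued.append(addr)
        used.append(addr)
        # Unconditional pre-fetch of the following burst.
        issued.append(addr + burst_len)

    # Pre-fetched bursts that the master never requested are discarded.
    wasted = [a for a in issued if a not in used]
    return issued, wasted
```

In this model a transfer of three bursts starting at address 0 reads addresses 0, 4, 8, and 12 from the memory, and the burst at address 12 is discarded because the transfer has already ended.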