The present invention relates to the control of access to a shared memory, and more particularly to a device, method, and computer program product for scheduling access requests from each requester to a memory shared by multiple requesters.
In various information processing apparatuses including computers and storage devices, a reduction in power consumption and cost has been a critical issue. For example, recent tape drives have adopted a configuration for sharing an external memory, such as a dynamic random access memory (DRAM), among multiple devices, including one or more processors, e.g., central processing units (CPUs). Sharing a memory may reduce the number of memory chips compared with providing each device with its own dedicated memory, resulting in reduced power consumption and cost and a smaller circuit board.
However, use of a shared memory system may lead to a longer turnaround time for memory access from a requester, such as a processor, and therefore adversely impact the performance of the shared memory system. The performance is also degraded when bus usage in the shared memory system is inefficient. Therefore, there is an urgent need to reduce the turnaround time and improve the bus usage efficiency in shared memory systems.
When an access request from a processor is sent to an external DRAM, the turnaround time of the access request depends on at least the protocol overhead of the DRAM (the time from when an address to be accessed is activated until the address is deactivated after the completion of the access). Further, if a second processor issues another access request while a first access request is being processed, the second processor must wait until the processing of the first access request is completed before its own access request is processed. This increases the turnaround time seen by the second processor. Known techniques for improving the turnaround time in the shared memory system include a bank interleave mode (hereinafter called the BI mode) and a continuous read/write mode (hereinafter called the CN mode).
In the BI mode, for example, an active command may be used to open or activate multiple banks of the DRAM. A controller sends multiple access requests having different bank addresses to the activated multiple banks in an interleaved manner to enable a reduction in turnaround time.
In the CN mode, the controller may issue a write command or a read command having the same bank address and row address as the bank address and row address specified in the previous access request to continue access cycles, and this may lead to a reduction in protocol overhead and turnaround time.
The BI mode and the CN mode contribute to the reduction in protocol overhead and the improvement of the DRAM bus efficiency. However, when the address of an access request meets the conditions of neither the BI mode nor the CN mode, the transfer is performed in a normal mode. In the normal mode, addresses are activated and deactivated for each read or write command.
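The address conditions described above can be illustrated with a minimal sketch. It assumes that each request is compared only with the immediately preceding request, that a matching bank address and row address qualifies for the CN mode, and that a different bank address qualifies for the BI mode; the names `Request` and `select_mode` are hypothetical and do not come from the cited publications.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Request:
    """Hypothetical access request carrying only the address fields used here."""
    bank: int
    row: int


def select_mode(prev: Optional[Request], cur: Request) -> str:
    """Pick a transfer mode for `cur` given the previous request `prev`.

    Assumed rules, taken from the description above:
      - same bank and same row as the previous request -> CN mode
      - different bank from the previous request       -> BI mode
      - anything else (same bank, different row)       -> normal mode
    """
    if prev is None:
        # Nothing to continue or interleave with, so fall back to normal.
        return "normal"
    if cur.bank == prev.bank and cur.row == prev.row:
        return "CN"
    if cur.bank != prev.bank:
        return "BI"
    return "normal"
```

Under these assumptions, a request to the same bank but a different row is the case that falls through to the normal mode, because the open row must be deactivated before the new row can be activated.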
An example of protocol overhead when a double data rate type three synchronous dynamic random access memory (DDR3 SDRAM) is used as the DRAM and two read commands are continuously processed is shown in FIGS. 1A-1C. Each alphabetical letter in FIGS. 1A-1C denotes a command described in U.S. Patent Application Publication No. 2014/0059286 A1, where A denotes an active command, R denotes a read command, and P denotes a precharge command. Further, DQ denotes a data signal and DQS denotes a data strobe signal.
(A) Normal Transfer
In the case of a normal transfer as shown in FIG. 1A, 26 clocks are required for each read command R from the start of the activation of a bank address and a row address by an active command A until completion of the deactivation of these addresses by a precharge command P. Therefore, protocol overhead when two read commands R are continuously processed is 52 clocks.
(B) BI Mode
In the BI mode as shown in FIG. 1B, data is read from two banks: the banks are activated by active commands A, the two reads are then performed in succession, and a precharge command is executed once at the end. The total protocol overhead is 35 clocks.
(C) Normal+CN Mode
Since transfer in the CN mode is performed on the same bank address and row address as those of the preceding command (here, the read command in the normal mode), it is necessary to activate the address by an active command A only once at the beginning and to execute a precharge command only once at the end, as shown in FIG. 1C. Therefore, the protocol overhead is 28 clocks, which is the shortest of the three cases. In terms of protocol overhead alone, the CN mode is the best.
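The clock counts given for FIGS. 1A-1C can be compared with a short calculation. The sketch below uses only the figures stated above (26 clocks per read in the normal mode, 35 clocks for the BI mode, 28 clocks for the normal+CN mode); the variable and function names are illustrative.

```python
# Protocol overhead, in clocks, for two consecutive read commands.
CLOCKS_PER_NORMAL_READ = 26

overhead = {
    "normal": 2 * CLOCKS_PER_NORMAL_READ,  # two full activate/read/precharge cycles
    "BI": 35,                              # value stated for FIG. 1B
    "normal+CN": 28,                       # value stated for FIG. 1C
}


def clocks_saved(mode: str) -> int:
    """Clocks saved relative to two normal transfers."""
    return overhead["normal"] - overhead[mode]


def fraction_saved(mode: str) -> float:
    """Saving expressed as a fraction of the normal-transfer overhead."""
    return clocks_saved(mode) / overhead["normal"]


for mode in ("BI", "normal+CN"):
    print(f"{mode}: {clocks_saved(mode)} clocks saved "
          f"({fraction_saved(mode):.1%} of normal)")
```

Relative to the 52 clocks of two normal transfers, the BI mode saves 17 clocks and the normal+CN mode saves 24 clocks.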
U.S. Patent Application Publication No. 2014/0059286 A1 describes a memory access device for processing multiple access requests in one bus cycle. This memory access device includes a CPU interface connected to multiple CPUs using a memory as a main memory to control access transfer from the multiple CPUs to the memory, and a DRAM controller for arbitrating access transfer to the memory. The CPU interface keeps access requests from the multiple CPUs waiting, receives and stores the addresses, data transfer mode, and data size of each access, and notifies the DRAM controller of the access requests. When receiving a permission signal for an access request, the CPU interface sends information to the DRAM controller in response to the permission signal. The DRAM controller receives an access request signal, specifies a CPU permitted to perform a transfer based on access arbitration, and sends the permission signal to the CPU interface.
Further, U.S. Patent Application Publication No. 2013/0179645 A1 describes a method for equalizing the waiting times of multiple requesters using a shared memory system. According to the method, the longest-waiting access request is selected from among the access requests from the multiple requesters and is sent to the shared memory system after the other access requests, so that the requester that issued the longest-waiting access request may send an additional access request to the shared memory system following the permitted longest-waiting access request.
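The ordering step described for this method can be sketched as follows. This is only an illustration of the ordering as summarized above, not the implementation of the cited publication; the function name `order_requests`, the `(requester, waited_clocks)` tuple layout, and the handling of ties (the first longest-waiting entry is chosen) are assumptions.

```python
from typing import List, Tuple

# Hypothetical representation: (requester name, clocks spent waiting so far).
PendingRequest = Tuple[str, int]


def order_requests(pending: List[PendingRequest]) -> List[PendingRequest]:
    """Place the longest-waiting request after all other pending requests.

    Sending it last leaves room for its requester to append an additional
    request directly behind it, as the description above states.
    """
    if not pending:
        return []
    # Index of the first request with the largest waiting time.
    longest_idx = max(range(len(pending)), key=lambda i: pending[i][1])
    others = [r for i, r in enumerate(pending) if i != longest_idx]
    return others + [pending[longest_idx]]
```

For example, with pending requests `[("cpu0", 3), ("cpu1", 9), ("cpu2", 5)]`, the request from `cpu1` is moved to the end and the relative order of the others is kept.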