Packet buffers, for example queuing devices, are often implemented using a dynamic random access memory (DRAM) because of its low cost and low power. However, a DRAM imposes a challenge due to its long latency and its constraints on random accesses. A DRAM is organized in banks, and physical properties of the DRAM impose restrictions on bank accesses. A row is a part of a bank, and a row must be activated before a read or write to an address within the row can be performed. For example, the access parameter row-cycle time, tRC, gives the minimum time between an access to a row in a DRAM bank and a consecutive access to another row in the same DRAM bank. Another access parameter, tFAW, defines a rolling time frame in which a maximum of four row activations on the same DRAM device may be engaged concurrently, and thus restricts the number of row activate commands within a time window.
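As an illustration of the two access parameters, the following minimal sketch checks whether a new row activation would be allowed. The function name `can_activate`, the history format and the numeric values of tRC and tFAW are assumptions made for this example only; they are not taken from any particular DRAM data sheet, and every activation is treated as targeting a new row.

```python
def can_activate(history, bank, now, t_rc, t_faw):
    """Return True if a row activation in `bank` at time `now` is allowed.

    history: list of (time, bank) tuples for earlier activations, in time order.
    t_rc, t_faw: timing parameters in the same (hypothetical) time unit.
    """
    # tRC: the most recent activation in the same bank must be at least t_rc old.
    for t, b in reversed(history):
        if b == bank:
            if now - t < t_rc:
                return False
            break
    # tFAW: at most four activations per rolling window of length t_faw,
    # so at most three earlier activations may lie inside the window ending now.
    recent = [t for t, _ in history if now - t < t_faw]
    return len(recent) < 4


# Hypothetical example: four activations, one every 10 time units.
history = [(0, 0), (10, 1), (20, 2), (30, 3)]
print(can_activate(history, bank=4, now=35, t_rc=50, t_faw=40))  # blocked by tFAW
print(can_activate(history, bank=0, now=45, t_rc=50, t_faw=40))  # blocked by tRC
print(can_activate(history, bank=4, now=45, t_rc=50, t_faw=40))  # allowed
```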
A memory controller for a DRAM receives read and write requests targeting different banks of the DRAM. As the DRAM bandwidth in terms of accesses per time unit is often a bottleneck, the memory controller for a DRAM may rearrange the order of read and write requests such that the utilization of the memory interface is maximized.
One optimization is to access the banks cyclically in a fixed order, thus ensuring that the time between two consecutive accesses to any DRAM bank is greater than or equal to the row-cycle time, tRC.
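The cyclic access order can be sketched as follows. The sketch assumes that each access occupies one fixed time slot; the names `cyclic_schedule` and `min_same_bank_gap` and the numeric values are hypothetical. Under this assumption the time between two accesses to the same bank equals the number of banks multiplied by the slot time, so the tRC requirement is met only when that product is at least tRC.

```python
def cyclic_schedule(num_banks, slot_time, num_accesses):
    """Assign accesses to banks 0, 1, ..., num_banks - 1, 0, 1, ... in fixed order."""
    return [(i * slot_time, i % num_banks) for i in range(num_accesses)]


def min_same_bank_gap(schedule):
    """Smallest time between two consecutive accesses to the same bank."""
    last_seen = {}
    gaps = []
    for t, bank in schedule:
        if bank in last_seen:
            gaps.append(t - last_seen[bank])
        last_seen[bank] = t
    return min(gaps) if gaps else None


# Hypothetical values: 8 banks, 6 time units per access, tRC of 48 time units.
T_RC = 48
schedule = cyclic_schedule(num_banks=8, slot_time=6, num_accesses=24)
print(min_same_bank_gap(schedule) >= T_RC)
```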
Another optimization is to rearrange read requests and write requests such that multiple read requests are followed by multiple write requests, e.g. rearranging the sequence S1=(R1, W2, R3, W4) to S2=(R1, R3, W2, W4), where R stands for Read, W stands for Write and the number indicates the order in which the requests are received by the memory controller. There is usually a bandwidth penalty for switching between read and write accesses to the DRAM, so S2 is completed in a shorter time than S1.
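The effect of grouping can be illustrated with a simple cost model. In the sketch below, each access is assumed to take one time unit and each switch between read and write is assumed to cost two additional time units; these values, and the function names, are hypothetical and serve only to compare S1 with S2. The grouping function deliberately ignores the addresses of the requests.

```python
def group_reads_then_writes(requests):
    """Reorder requests so that all reads precede all writes, keeping relative order."""
    reads = [r for r in requests if r[0] == "R"]
    writes = [r for r in requests if r[0] == "W"]
    return reads + writes


def completion_time(requests, access_time=1, turnaround=2):
    """Total time for a sequence, charging a penalty on each read/write switch."""
    total = 0
    previous_kind = None
    for kind, _ in requests:
        if previous_kind is not None and kind != previous_kind:
            total += turnaround
        total += access_time
        previous_kind = kind
    return total


# Each request is (kind, arrival order), matching the notation R1, W2, R3, W4.
s1 = [("R", 1), ("W", 2), ("R", 3), ("W", 4)]
s2 = group_reads_then_writes(s1)
print(s2)                    # [('R', 1), ('R', 3), ('W', 2), ('W', 4)]
print(completion_time(s1))   # 10: four accesses plus three turnarounds
print(completion_time(s2))   # 6: four accesses plus one turnaround
```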
The published US application US 2004/0236921 A1 to Bains discloses a method to improve bandwidth on a cache data bus so that cache memories, such as DRAMs, can be more efficiently used. In one embodiment, the read or write accesses are reordered to efficiently utilize the bandwidth on the data bus.
U.S. Pat. No. 6,564,304 B1 to Van Hook et al. discloses a memory processing system for accessing memory in a graphics processing system, wherein a memory controller arbitrates memory access requests from a plurality of memory requesters. Reads are grouped together and writes are grouped together to avoid mode switching.
However, reordering of accesses, such as read and write accesses, may cause logical errors, e.g. if an address in the DRAM bank is read before it has been written. For example, in the sequences S1=(R1, W2, R3, W4) and S2=(R1, R3, W2, W4) mentioned above, W2 and R3 may access the same bank address. If W2 writes an element of a data structure, e.g. a linked list, and R3 accesses the same element of the data structure, a logical error would occur if W2 and R3 are reordered as in S2, since that would make a program read the address before it has been written to. That is, reordering would make a program parsing the linked list use a stale pointer, causing program failure.
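The hazard described above can be detected by comparing the original and the reordered sequence. The following is a minimal sketch; the function name `violates_read_after_write`, the tuple layout (kind, arrival order, address) and the example addresses are assumptions made for illustration, and each request is assumed to be unique.

```python
def violates_read_after_write(original, reordered):
    """Return True if a read was moved ahead of an earlier write to the same address."""
    position = {request: i for i, request in enumerate(reordered)}
    for i, earlier in enumerate(original):
        for later in original[i + 1:]:
            same_address = earlier[2] == later[2]
            write_then_read = earlier[0] == "W" and later[0] == "R"
            if same_address and write_then_read and position[later] < position[earlier]:
                return True
    return False


# W2 and R3 target the same (hypothetical) address 0x20.
s1 = [("R", 1, 0x10), ("W", 2, 0x20), ("R", 3, 0x20), ("W", 4, 0x30)]
s2 = [s1[0], s1[2], s1[1], s1[3]]  # reads grouped before writes
print(violates_read_after_write(s1, s2))  # True: R3 now precedes W2
print(violates_read_after_write(s1, s1))  # False: original order preserved
```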
The published US application US 2007/0156946 A1 to Lakshmanamurthy et al. discloses a memory controller with bank sorting and scheduling. The memory controller comprises a FIFO buffer, an arbiter, a bank FIFO set and a bank scheduler. Outputs from the FIFO buffer are fed into the arbiter, which sorts memory requests into appropriate bank FIFOs. The arbiter may use a round robin arbitration scheme to sort and prioritize the input request streams. The bank scheduler receives the outputs from the bank FIFO sets and processes the requests in rounds. In each round the bank scheduler may select the transactions that optimize read/write efficiency, e.g. the bank scheduler may group reads and/or writes to minimize read-write turn-arounds.
In US 2007/0156946 A1 the problem of logical errors as described above is solved by an “out-of-order” mechanism that ensures that the transaction ordering rules governing reads and writes to the same address are never violated, i.e. that an address cannot be read before it has been written to.
A drawback of the memory controller of US 2007/0156946 A1 is that it does not provide weighted, fair sharing of memory bandwidth. Because read and write memory requests are stored in the same bank FIFO, the sharing between reads and writes is determined by the request arrival process and is not regulated by the memory controller.
Further, the memory controller of US 2007/0156946 A1 has an arbiter which ensures that e.g. a read request is not issued before a write request if these requests are for the same address. Thus the read request to DRAM is issued despite the existence of the data to be read in the internal storage of the memory controller. This means that DRAM bandwidth is not utilized optimally and read latency is not minimized.
Furthermore, the memory controller of US 2007/0156946 A1 has no means to prioritize requests that need low latency, e.g. requests related to control information, such that they are served before requests that tolerate longer latency, e.g. requests related to packet data.