Although dynamic random access memory (DRAM) remains the memory of choice for a broad class of computing and consumer electronics applications, DRAM core access times have not scaled with memory bandwidth demand. For example, the minimum time between activation of different storage rows in the same storage bank, tRC, remains in the neighborhood of 40 nanoseconds for predominant core technologies, a substantial access time penalty for processors operating at gigahertz frequencies. Other core access times, such as the minimum time between activation of rows in different banks of a multi-bank array, tRR, and the minimum time between column access operations (i.e., read or write operations at a specified column address) in the same row, tCC, have also been slow to improve.
Designers have countered core timing limitations through a number of architectural and system-level developments directed at increasing the number of column access operations per row activation (e.g., paging, multi-bank arrays, prefetch operation) and at maximizing the amount of data transferred in each column access. In particular, signaling rate advances have enabled progressively larger amounts of data to be transferred per column access, thereby increasing peak memory bandwidth. However, as signaling rates progress deeper into the gigahertz range while the corresponding core access times remain relatively constant, column transaction granularity (the amount of data transferred per column access) is forced to scale upward and is approaching limits imposed by signal paths within the DRAM itself. Further, the trend in some classes of data processing applications, such as graphics applications, is toward smaller data objects (e.g., triangle fragments of a 3D scene) that are often stored in dispersed memory locations. In such applications, the additional power and resources expended to increase the column transaction granularity may provide only a limited increase in effective memory bandwidth, as much of the fetched data may not be used.
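The scaling pressure described above can be illustrated with a short calculation. The following is a minimal sketch, not a model of any particular device. It assumes that the data interface is kept fully occupied, so that the data moved per column access equals the per-pin signaling rate multiplied by the number of data pins and by tCC. The function names, the 16-pin interface width, the 5 nanosecond tCC, and the 32-byte data object are all hypothetical values chosen for illustration.

```python
def column_granularity_bytes(rate_gbps_per_pin, data_pins, t_cc_ns):
    """Bytes moved per column access if the interface stays fully busy.

    Gb/s multiplied by ns gives bits, so the product is divided by 8.
    """
    bits = rate_gbps_per_pin * data_pins * t_cc_ns
    return bits / 8


def effective_bandwidth_gbytes(rate_gbps_per_pin, data_pins, t_cc_ns, useful_bytes):
    """Useful GB/s when only `useful_bytes` of each column access is needed."""
    peak = rate_gbps_per_pin * data_pins / 8  # peak bandwidth in GB/s
    granularity = column_granularity_bytes(rate_gbps_per_pin, data_pins, t_cc_ns)
    used = min(useful_bytes, granularity)     # cannot use more than was fetched
    return peak * used / granularity


if __name__ == "__main__":
    # Hypothetical parameters: 16 data pins, tCC fixed at 5 ns, 32-byte objects.
    for rate in (1.0, 2.0, 4.0, 8.0):
        g = column_granularity_bytes(rate, 16, 5.0)
        bw = effective_bandwidth_gbytes(rate, 16, 5.0, 32)
        print(f"{rate:4.1f} Gb/s per pin: {g:5.0f} B per access, {bw:5.2f} GB/s useful")
```

Under these assumptions, once the granularity exceeds the size of the data object, each further increase in signaling rate raises peak bandwidth but leaves the useful bandwidth unchanged, which is the limited return the paragraph above describes.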