1. Field of the Invention
The present invention relates generally to methods and apparatus for memory interleaving. More particularly, the invention relates to methods and apparatus for interleaving memory transactions into an arbitrary number of memory banks and to an interleaved memory system that need not have equal size memory banks.
2. Discussion of the Related Art
The history of computing technology has been one of an ever-increasing appetite for memory. Though the unit cost of memory is decreasing, memory system cost is not, primarily because systems need increasing amounts of memory.
Memory is typically a performance bottleneck in a computing system. Complicating this matter, memory may hold both programs and data, and each has unique characteristics pertinent to memory performance. For example, when a program is being executed, memory traffic is typically characterized as a series of sequential reads. On the other hand, when a data structure is being accessed, memory traffic is usually characterized by a stride, i.e., the difference in address from the previous access. A stride may be random or fixed. For example, accessing every other element of an array results in a fixed stride of two.
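By way of illustration only, the notion of a stride can be sketched in a few lines of Python. The function name `strides` and the address traces are hypothetical and chosen solely for this example; addresses are treated as word addresses.

```python
def strides(addresses):
    """Return the address difference between each access and the one before it."""
    return [later - earlier for earlier, later in zip(addresses, addresses[1:])]


# Hypothetical traces (values are illustrative assumptions, not measured data).
instruction_fetches = [100, 101, 102, 103]  # sequential reads during execution
array_walk = [200, 202, 204, 206]           # every other element of an array

print(strides(instruction_fetches))  # [1, 1, 1]
print(strides(array_walk))           # [2, 2, 2]
```

A trace whose differences are all equal has a fixed stride; a trace whose differences vary unpredictably has a random stride.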
FIG. 1A illustrates a crude memory architecture 1 provided for descriptive purposes only. Memory chips 10 are arranged into an independently controllable array 30 ("a memory bank"). Under the control of a memory control circuit 11, a bank can operate on one transaction at a time. The memory chips may be of dynamic storage technology ("DRAMs") or of static RAM technology. DRAM technology is usually slower than the computing circuitry it serves. Static RAM technology, on the other hand, is much faster than DRAM, but is also more costly and, as such, is usually used more sparingly.
A hypothetical operational speed relationship among the devices is illustrated by using a unit time delay τ. In particular, the computing circuitry 20 has a time delay of τ and, consequently, may operate at a frequency of 1/τ. The memory bank 30, on the other hand, has a time delay of 10τ and, at best, may operate at a frequency of 1/(10τ). The effective speed of the system will be something between 1/τ and 1/(10τ) and will depend upon how often memory 30 is accessed. Because, in reality, memory 30 will be accessed nearly every cycle, the effective speed of the system will likely approximate 1/(10τ).
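The dependence of effective speed on memory access frequency may be sketched as follows. The weighted-average model used here is an assumption made for illustration; the description above states only that the effective speed lies between the two limits. The names `effective_time_per_op`, `effective_frequency`, and `memory_fraction` are likewise hypothetical.

```python
def effective_time_per_op(tau, memory_fraction, memory_delay_factor=10):
    """Average time per operation under an assumed weighted-average model.

    memory_fraction is the share of operations that must wait for the memory
    bank; those take memory_delay_factor * tau, the rest take tau.
    """
    if not 0.0 <= memory_fraction <= 1.0:
        raise ValueError("memory_fraction must be between 0 and 1")
    compute_time = (1.0 - memory_fraction) * tau
    memory_time = memory_fraction * memory_delay_factor * tau
    return compute_time + memory_time


def effective_frequency(tau, memory_fraction, memory_delay_factor=10):
    """Effective operating frequency, the reciprocal of the average time."""
    return 1.0 / effective_time_per_op(tau, memory_fraction, memory_delay_factor)


tau = 1.0  # unit time delay
print(effective_frequency(tau, 0.0))  # 1.0, memory never accessed
print(effective_frequency(tau, 1.0))  # 0.1, memory accessed every cycle
print(effective_frequency(tau, 0.9))  # roughly 0.11, close to the memory limit
```

Under this assumed model, even a modest share of memory accesses pulls the effective frequency toward the slower limit, consistent with the observation above.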
To address the performance gap between memory and the computing circuitry, caches of memory data have been used (see FIG. 1B). A cache 12 holds a subset of the memory items stored by memory 30. A cache 12 is typically smaller than memory 30 and is typically constructed with faster technology, such as static RAMs. A cache is characterized by a "hit rate," which indicates how successful the cache is at holding the memory items that are actually needed by the system: the higher the hit rate, the higher the effective speed of the system. Though many cache design features affect the hit rate, it is generally recognized that a larger cache size produces a higher hit rate. Cache memories are well known and will not be described in detail.
To improve the memory system bandwidth, interleaving of memory banks has been used. FIG. 1C illustrates a two-way interleaved memory system having two equal size banks 31 and 32. (Four-way interleaving would have four equal size banks, etc.) The computing circuitry 20 may request a memory item from bank 31 at absolute time τ and then, before bank 31 provides the item, request another item from bank 32 at time 2τ. Bank 31 returns the item at absolute time 11τ, i.e., 10τ after it received the request, and bank 32 returns the other item at absolute time 12τ. Consequently, the resulting memory performance is effectively doubled, because two items, rather than just one, have been returned in a time of roughly 10τ.
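The timing of the two-way example can be reproduced with a small simulation, offered as a sketch under stated assumptions: requests are issued in order, at most one per unit delay, in round-robin fashion across the banks, and each bank services one transaction at a time with a delay of ten unit delays. The function name `completion_times` and its parameters are hypothetical.

```python
def completion_times(num_requests, num_banks, tau=1, bank_delay=10):
    """Return the absolute completion time of each request.

    Assumptions: the first request is issued at time tau, later requests are
    issued in order at least tau apart, request i goes to bank i % num_banks,
    and a busy bank delays the request until it is free again.
    """
    bank_free_at = [0] * num_banks
    next_issue = tau
    finished = []
    for i in range(num_requests):
        bank = i % num_banks
        issue = max(next_issue, bank_free_at[bank])
        finish = issue + bank_delay * tau
        bank_free_at[bank] = finish
        next_issue = issue + tau
        finished.append(finish)
    return finished


print(completion_times(2, num_banks=2))  # [11, 12]
print(completion_times(2, num_banks=1))  # [11, 21]
```

With two banks the second item arrives one unit delay after the first; with a single bank it must wait for the first transaction to complete, which is the bandwidth difference the example above describes.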
So-called "low-order" interleaving uses the least significant memory address bits to select a memory bank. For example, in two-way, low-order interleaving the least significant bit is used to select a bank. The more significant memory address bits are then used as an offset within the selected bank. Alternatively, so-called "high-order" interleaving uses the most significant memory address bits to select a bank and uses the less significant bits as an offset within the selected bank. The type of interleaving, e.g., "low-order", is often referred to as "the interleaving algorithm."
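The two interleaving algorithms can be sketched as follows for a power-of-two number of equal size banks. The function names `low_order` and `high_order`, and the parameters `bank_bits` and `address_bits`, are hypothetical and introduced only for this illustration.

```python
def low_order(address, bank_bits):
    """Least significant bits select the bank; remaining bits are the offset."""
    bank = address & ((1 << bank_bits) - 1)
    offset = address >> bank_bits
    return bank, offset


def high_order(address, bank_bits, address_bits):
    """Most significant bits select the bank; remaining bits are the offset."""
    offset_bits = address_bits - bank_bits
    bank = address >> offset_bits
    offset = address & ((1 << offset_bits) - 1)
    return bank, offset


# Two-way interleaving (one bank-select bit) over an assumed 3-bit address space.
print([low_order(a, 1)[0] for a in range(8)])      # [0, 1, 0, 1, 0, 1, 0, 1]
print([high_order(a, 1, 3)[0] for a in range(8)])  # [0, 0, 0, 0, 1, 1, 1, 1]
print(low_order(5, 1))      # (1, 2)
print(high_order(5, 1, 3))  # (1, 1)
```

In this sketch, low-order interleaving sends consecutive addresses to alternating banks, whereas high-order interleaving keeps each contiguous half of the address space in a single bank.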
U.S. Pat. No. 5,293,607, entitled "Flexible N-Way Interleaving," describes an example of an interleaved memory system.
Although systems often need more memory, the prior art places substantial limitations on the ability to upgrade the memory size or the interleaving organization. This is true for both DRAM memory systems and static RAM systems, such as caches. To add memory banks, most prior art systems require that the number of banks be increased to the next higher power of two. This may be inconvenient, especially when larger jumps are necessitated, such as from eight-way to sixteen-way. Moreover, the prior art requires the banks to be of equal size. This may force a system user to use memory technology that does not have an optimum cost-performance, especially given that memory technology changes frequently. These limitations are especially significant because system users often have a substantial investment in the memory system and do not want to discard it. They would much rather keep their existing memory investment, yet be able to upgrade with new technology.
Accordingly, there is a need in the art for a flexible interleaving system to select and address an arbitrary number of memory banks.