1. Field
The present invention generally relates to the design of computer systems. More specifically, the present invention relates to a technique for efficiently interleaving addresses between a number of entities in a computer system, wherein the number of entities need not be a power of two.
2. Related Art
In order to compensate for the relatively low bandwidth provided by certain types of memory, such as dynamic random-access memory (DRAM), many computer systems provide interleaved memory systems. In such memory systems, data is distributed across multiple memory modules, which enables the memory system to subsequently access the data from multiple memory modules in parallel, thereby increasing memory-system throughput. Conceptually, a memory interleaving can be viewed as a mapping from a set of X consecutive addresses (e.g., cache line addresses) to a set of Y entities (such as processors, DIMMs, ranks, memory banks, cache banks, etc.) such that groups of consecutive addresses tend to map to different entities. Memory interleaving is a useful technique for increasing bandwidth and reducing hot spots caused by spatial locality in programs.
More quantitatively, the “load” of a memory interleaving can be defined as the maximum number of addresses which are mapped to a single entity from any window of Y consecutive addresses. Interleaving techniques can be optimized to minimize this “load” metric, and in doing so to minimize hot spots.
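As an illustration only, the "load" metric defined above can be computed by brute force for any candidate mapping. The following Python sketch is not part of the described techniques; the function name `interleaving_load` and its parameters are hypothetical, and it assumes the mapping is evaluated over a finite range of addresses starting at zero.

```python
from collections import Counter


def interleaving_load(mapping, num_entities, num_addresses):
    """Return the largest number of addresses that land on a single entity
    within any window of `num_entities` consecutive addresses.

    `mapping` is a callable taking an address and returning an entity id.
    Only addresses in range(num_addresses) are examined (an assumption made
    to keep the sketch finite).
    """
    worst = 0
    # Slide a window of Y = num_entities consecutive addresses over the range.
    for start in range(num_addresses - num_entities + 1):
        window = range(start, start + num_entities)
        counts = Counter(mapping(address) for address in window)
        worst = max(worst, max(counts.values()))
    return worst
```

A mapping that spreads every window evenly yields a load of 1, while a mapping that sends every address to the same entity yields a load equal to the number of entities.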
A well-known memory interleaving technique is to map an address A to entity A modulo Y. This “modulo-based” interleaving technique has a load equal to 1 (which is the best possible). However, implementing a modulo Y operation in hardware can be expensive (in terms of latency, design complexity, and area). Thus, designs that use this modulo-based interleaving technique typically constrain Y to be a power of two (in which case the modulo operation simply involves extracting the log2(Y) least-significant bits of A).
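The power-of-two special case can be sketched in software as follows. This is a minimal Python illustration with hypothetical function names, not a hardware description: it shows that when Y is a power of two, masking off the low-order bits of A gives the same result as A modulo Y.

```python
def modulo_entity(address, num_entities):
    """General modulo-based interleaving: entity = A mod Y."""
    return address % num_entities


def power_of_two_entity(address, num_entities):
    """Modulo-based interleaving restricted to Y being a power of two.

    The entity is the log2(Y) least-significant bits of the address,
    extracted with a bit mask instead of a division.
    """
    if num_entities <= 0 or num_entities & (num_entities - 1):
        raise ValueError("num_entities must be a power of two")
    return address & (num_entities - 1)
```

For Y=8, for instance, the mask is 7 (binary 111), so the entity is simply the three low-order bits of the address.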
An alternative technique supports interleaving among Y entities where Y need not be a power of two. In this alternative technique, Y is viewed as a sum of B different powers of two. This alternative interleaving technique partitions the Y entities into B groups, each of which contains a power-of-two number of entities. There is no interleaving between groups; the only interleaving that is done is within each group (which is fully interleaved). This alternative interleaving technique has a load which is a function of the size of the smallest group. For example, if Y=33, the entities are partitioned into a group of 32 and a group of 1. Every address assigned to the smallest group maps to its single entity, so a window of 33 consecutive addresses that falls within that group maps entirely to one entity. The load is therefore 33, which indicates that hot spots will be likely.
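The grouped technique can be sketched as follows in Python. The text above does not say how addresses are divided among the groups, so this sketch assumes each group owns one contiguous address region whose length is proportional to the group size; that assumption, the parameter `lines_per_entity`, and all function names are hypothetical.

```python
from collections import Counter


def power_of_two_groups(num_entities):
    """Split Y into distinct powers of two, largest first (33 -> [32, 1])."""
    return [1 << bit
            for bit in reversed(range(num_entities.bit_length()))
            if (num_entities >> bit) & 1]


def grouped_entity(address, num_entities, lines_per_entity):
    """Map an address to an entity, interleaving only within each group.

    Assumption: a group of `size` entities owns a contiguous region of
    size * lines_per_entity addresses, and regions are laid out in order.
    """
    first_entity = 0
    region_start = 0
    for size in power_of_two_groups(num_entities):
        region_length = size * lines_per_entity
        if address < region_start + region_length:
            offset = address - region_start
            # Full interleaving inside the group: low-order bits of offset.
            return first_entity + (offset & (size - 1))
        first_entity += size
        region_start += region_length
    raise ValueError("address outside the mapped range")


def interleaving_load(mapping, num_entities, num_addresses):
    """Largest per-entity count over all windows of Y consecutive addresses."""
    worst = 0
    for start in range(num_addresses - num_entities + 1):
        window = range(start, start + num_entities)
        counts = Counter(mapping(address) for address in window)
        worst = max(worst, max(counts.values()))
    return worst


def grouped_load(num_entities, lines_per_entity=64):
    """Measure the load of the grouped mapping over its whole address range."""
    total = num_entities * lines_per_entity
    return interleaving_load(
        lambda a: grouped_entity(a, num_entities, lines_per_entity),
        num_entities, total)
```

Under these assumptions, `grouped_load(33)` evaluates to 33, matching the example above, whereas `grouped_load(32)` evaluates to 1 because a single power-of-two group is fully interleaved.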
Hence, what is needed is a technique for interleaving addresses between Y entities, where Y need not be a power of two, which does not suffer from the drawbacks of the above-described techniques.