The invention is directed to a computer system with multiple memory subsystems and, more particularly, to interleaving access to the subsystems.
Historically, main memory was physically situated on a central bus. In this type of system, memory requests consisting of full physical addresses were forwarded to the memory subsystem, and the data was returned. In a distributed memory system, main memory is physically distributed across many different cells. A cell may consist of a number of processors, an input/output (I/O) device, a cell controller, and memory.
In a distributed system, memory can be noninterleaved or interleaved. Prior art systems and methods for interleaving memory are described and set forth in, for example, U.S. Pat. No. 5,530,837 issued Jun. 25, 1996 to Williams et al. and U.S. Pat. No. 5,293,607 issued Mar. 8, 1994 to Brockman et al., both of which are assigned to the owner of the present invention and incorporated herein by reference in their entireties. In a noninterleaved access method wherein memory is divided into or across multiple physical cells, a unified, contiguous block of memory space is addressed by first sequentially accessing all memory of a first cell, followed by sequential access of all memory available in a second cell, etc. If each cell has been configured with its maximum amount of possible memory, the memory will appear, and be addressed, as one contiguous memory block to the system. However, if not every cell is configured to its maximum memory capability, this noninterleaved scheme may result in holes within the memory space corresponding to missing memory blocks within the cells. Noninterleaved memory also requires multiple, sequential accesses to a particular cell, since both instructions and data tend to be used sequentially. While this is a benefit when the data is stored locally, a processor continuously or frequently accessing a remote memory in another cell consumes significant overhead, including processing and communications resources at both the local and remote cells and the connecting network(s). When such accesses are substantially continuous, these resources may become unavailable to other processes and degrade system performance.
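The way holes arise in a noninterleaved layout can be illustrated with a minimal sketch. It assumes, purely for illustration, that every cell is assigned a fixed address slot equal to its maximum capacity; the function name `locate_noninterleaved`, the GB granularity, and the example capacities are hypothetical and do not come from the cited patents.

```python
def locate_noninterleaved(addr_gb, installed_gb, max_per_cell_gb):
    """Map a GB-granular address to (cell, offset_gb) in a noninterleaved layout.

    Assumption: cell i owns the fixed slot
    [i * max_per_cell_gb, (i + 1) * max_per_cell_gb), whether or not
    it is fully populated. Returns None when the address falls in a hole.
    """
    cell, offset = divmod(addr_gb, max_per_cell_gb)
    if cell >= len(installed_gb):
        raise ValueError("address beyond the last cell")
    if offset >= installed_gb[cell]:
        return None  # hole: this part of the slot has no memory installed
    return cell, offset


# Hypothetical configuration: four cells, 8 GB maximum each,
# but cells 2 and 3 are only half populated.
installed = [8, 8, 4, 4]
hit = locate_noninterleaved(17, installed, 8)
hole = locate_noninterleaved(20, installed, 8)
```

Under these assumptions, address 17 resolves to cell 2 while address 20 lands in the unpopulated upper half of cell 2's slot, so the address space is no longer contiguous.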
Alternatively, the memory within a distributed memory system can be accessed through an interleaving protocol. Interleaving memory across several cells allows for more uniform access to memory. For example, consider a system that includes two cells connected together through a bus system, in which each cell includes four separate processors and memory. By interleaving the memory in cell 1 with the memory in cell 2, all eight processors in the system have more uniform access to each memory location. Interleaving memory across the two cells also ensures consistent latency for each processor in accessing memory locations, and it reduces the possibility of bottlenecks when processors attempt to access or retrieve information from memory.
When interleaving is used in a distributed memory system, processors or devices which require access to memory must be able to determine the physical location of the portion of accessible memory.
While systems and methods for interleaving across distributed memory systems are known, their use has included a number of restrictions. For example, some prior distributed systems using interleaving required that the number of cells containing interleaved memory be equal to a power of two. The overall system memory could be contained in two, four, eight, sixteen, etc., different cells. However, problems arose if the overall system memory was contained in a number of cells that was not equal to a power of two. For example, the overall system memory could not be interleaved effectively across seven different cells without difficulty and special processing. Additionally, the amount of memory interleaved in each of the cells also had to be equal to a power of two. So a specific cell could contain 2, 4, 8 or 16 gigabytes (GB) but not, easily, 5 or 13 GB, for example. Also, interleaving across distributed memory cells was easily achievable only when the amount of memory within each cell was equal.
For example, suppose the memory contained within a system is distributed across four cells labeled 0, 1, 2, and 3, respectively. Further suppose each of cell 0 and cell 1 contains 8 GB of memory while cells 2 and 3 each contain 4 GB of memory. The overall system therefore contains 24 GB of memory. The distributed memory could be interleaved as follows. Since each of the four cells contains at least 4 GB of memory, the first interleave entry, entry 0, would contain 4 GB of memory from each of cells 0, 1, 2, and 3 for a total of 16 GB of memory (four from each of the four cells). All of the memory available in cell 2 and cell 3 has now been used in interleave entry 0. Cell 0 and cell 1 each contain 4 GB of unused memory. Interleave entry 1 would contain the remaining 4 GB of memory from cell 0 and the remaining 4 GB of memory from cell 1. Interleave entry 1 therefore contains 8 GB of memory, four from cell 0 and four from cell 1. The 24 GB of memory in the four cells has now been broken out into two interleave groups, and can be viewed as one contiguous block as follows: GB 0 through 15 are located in the lower half of cells 0 and 1 and in all of cells 2 and 3, and GB 16 through 23 are located in the upper portion of cells 0 and 1. This interleaving occurs at the cache line level. To a processor, the 24 GB of memory appears to be one contiguous block, although physically it is distributed among four different cells.
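The grouping in this example can be reproduced with a short sketch. The greedy rule used here (repeatedly take the smallest remaining amount from every cell that still has unused memory) is an assumption chosen because it yields the two groups described above; the function name `build_interleave_groups` and the dictionary keys are hypothetical.

```python
def build_interleave_groups(cell_sizes_gb):
    """Split per-cell memory into interleave groups of equal per-cell share.

    cell_sizes_gb maps a cell label to its installed memory in GB.
    Assumed greedy rule: each group takes, from every cell that still has
    unused memory, the smallest amount remaining in any of those cells.
    """
    remaining = dict(cell_sizes_gb)
    groups = []
    logical_start = 0
    while any(size > 0 for size in remaining.values()):
        members = sorted(c for c, size in remaining.items() if size > 0)
        per_cell = min(remaining[c] for c in members)
        total = per_cell * len(members)
        groups.append({
            "cells": members,           # cells contributing to this group
            "per_cell_gb": per_cell,    # share taken from each member cell
            "start_gb": logical_start,  # first logical GB of the group
            "size_gb": total,           # logical size of the group
        })
        for c in members:
            remaining[c] -= per_cell
        logical_start += total
    return groups


groups = build_interleave_groups({0: 8, 1: 8, 2: 4, 3: 4})
```

For the configuration above, `groups[0]` covers cells 0 through 3 with 16 GB starting at logical GB 0, and `groups[1]` covers cells 0 and 1 with 8 GB starting at logical GB 16.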
In order to successfully access information contained within the memory, a processor would need to determine which cell contains a specific memory address. Prior interleaved distributed memory systems accomplished this through a one-to-one mapping between the logical memory and the physical address. For example, a lookup table could be formed which consisted of 24 rows and four columns. The first column would contain the logical GB block, i.e., the most significant address bits representing address values of 2^30 and greater, which would range from 0 to 23. The second through the fourth columns would contain the physical address of the logical GB blocks within the respective cell. The first 16 rows of this table would identify interleave group 0, i.e., the first 16 GB of memory. Interleave group 1 would start on the 17th row of the table. When a specific processor had to access information stored in memory, the processor could identify the physical location of the memory address from this one-to-one map. These prior art systems have several disadvantages. For example, these methods are inflexible in their mappings, so that each segment of physical memory requires a row entry mapping a contiguous portion of logical memory space to that physical location. The system is further inflexible in requiring predetermined, fixed physical block sizes of memory. These limitations result in both overhead requirements and performance issues if the physical memory is non-uniformly distributed over a large number of remote cells, each having its own configuration and distribution of memory resources.
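A one-row-per-GB lookup of this kind can be sketched for the 24 GB example. The row layout used here (group start, member cells, base offset within each cell) is a simplification and not the column layout of any particular prior system; the 64-byte cache line size and all names (`GROUPS`, `TABLE`, `locate`) are assumptions made for illustration.

```python
GB = 1 << 30
LINE = 64  # assumed cache line size in bytes

# Hypothetical group description for the 24 GB example:
# (start_gb, size_gb, member cells, base offset in GB within each cell)
GROUPS = [
    (0, 16, [0, 1, 2, 3], 0),
    (16, 8, [0, 1], 4),
]

# One row per logical GB block, so 24 rows for 24 GB.
TABLE = [
    (start_gb, cells, base_gb)
    for start_gb, size_gb, cells, base_gb in GROUPS
    for _ in range(size_gb)
]


def locate(addr):
    """Return (cell, byte offset within that cell) for a logical address."""
    block = addr >> 30  # address bits of weight 2**30 and above
    if block >= len(TABLE):
        raise ValueError("address is not mapped")
    start_gb, cells, base_gb = TABLE[block]
    line, byte = divmod(addr - start_gb * GB, LINE)
    # Consecutive cache lines rotate across the cells of the group.
    cell = cells[line % len(cells)]
    line_in_cell = line // len(cells)
    return cell, base_gb * GB + line_in_cell * LINE + byte
```

The sketch makes the cost of the approach visible: the table grows by one row for every GB of logical memory, even though only two distinct groups exist.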
Accordingly, a need exists for a more flexible approach to interleaving memory across the distributed memory system. A need further exists for a memory system and method of configuring and operating memory resources that readily accommodate cell numbers that are not integer powers of 2. Additionally, a need exists for a technique that allows the amount of memory within each cell to be equal to non-powers of 2. A further need exists for a system and method that eliminates the gaps in memory which result from each cell not having its memory configuration maximized so that all cells have equal memory spaces. Further, a need exists for a simplified table, or a simplified method, for determining the physical location of a memory address. A further need exists for an approach that allows the amount of memory in the various cells to differ.
These and other objects, features and technical advantages are achieved by a system and method in which, according to one aspect of the invention, an interleaving method provides for accessing a contiguous logical address space formed by a plurality of m memories having respective overlapping address spaces. The memories are organized into memory segments, at least some of which have identical address spaces. The memory segments are arranged or organized into interleave groups such that all segments of an interleave group have completely overlapping address spaces. Thus, all of the segments are addressable by the same address data. However, not all segments having identical address space need be organized into one interleave group. Instead, groups may be formed to facilitate group address boundaries falling on multiples of a next group size. This allows the minimum number of address bits to be used to identify a group. An initial largest interleave group is selected and a corresponding first interleave entry is generated in a table. The interleave entry maps a corresponding initial logical address space into each of the memory segments corresponding to the first interleave group. A total memory size included thus far in the table is calculated, and a determination is made whether this total memory size places the start of any next group on a multiple of that group size. If not, a search is conducted for an interleave group having a size that divides evenly into the boundary. These steps are repeated until all of the contiguous logical address space has been mapped onto the memories.
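The ordering steps summarized above can be sketched as follows. This is an illustrative reading of the summary, not a definitive implementation: the function name `order_groups` is hypothetical, group sizes are plain integers (for example, GB), and the fallback used when no remaining group divides the current boundary (take the largest remaining group) is an assumption, since the summary does not specify that case.

```python
def order_groups(sizes):
    """Order interleave groups so each starts on a multiple of its own size.

    Returns a list of (start, size) pairs. The largest group is placed
    first; each later slot goes to the largest remaining group whose size
    divides the running total evenly.
    """
    remaining = sorted(sizes, reverse=True)
    first = remaining.pop(0)
    ordered = [(0, first)]
    total = first  # memory mapped so far, i.e. the next group boundary
    while remaining:
        for i, size in enumerate(remaining):
            if total % size == 0:
                break  # this group would start on a multiple of its size
        else:
            i = 0  # assumption: no aligned candidate, take the largest
        size = remaining.pop(i)
        ordered.append((total, size))
        total += size
    return ordered


layout = order_groups([8, 6, 4])
```

With hypothetical group sizes of 8, 6, and 4, the 4-unit group is placed second because 8 is a multiple of 4 but not of 6; the 6-unit group then starts at 12, which is a multiple of 6.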
According to a feature of the invention, the interleave entries include designations of ones of the memories corresponding to memory segments constituting respective ones of the interleave groups.
According to another feature of the invention, each of the interleave entries is organized as one or more whole rows of a two dimensional table. The contiguous logical address space is addressable by a multibit address, such that a portion of the multibit address is used to designate a column of the table. A mask is created to use in combination with another portion of the address to designate a row of the table. The column and row data can then be used to access a value stored in the table and select one of the memories.
According to another feature of the invention, the table includes 2^n columns and p rows, wherein n and p are positive integer values. An ith one of the interleave groups includes s_i memory segments and r rows of the table such that r=2^n/s_i is a positive integer value. Further, each of the interleave groups includes r=2^n/s_i references to each of the memories corresponding to memory segments constituting a respective interleave group.
According to another feature of the invention, at least one of the interleave groups includes a number of memory segments that is not a whole power of two and/or one of the memory segments has an address space that is not a whole power of two.
According to another aspect of the invention, a method maps a contiguous logical address space into a plurality of memories organized into uniform size memory segments. The method includes organizing the memory segments into corresponding interleave groups and generating, for each of the interleave groups, an interleave entry mapping the logical address space into the memories. The interleave entries are ordered based on size, except that smaller ones of the interleave entries are used to complete a block size such that the entries align to integer multiples of the interleave group size.
According to a feature of the invention, a set of masks is used with binary address data to select one of the interleave entries.
According to another feature of the invention, non-overlapping first and second portions of a memory address are extracted from the address. The first portion is combined with a mask to obtain an interleave entry designator. The second portion is used to select a portion of the interleave entry designating one of the memories. With a particular one of the memories selected, a memory address in the designated memory is accessed based on a remaining portion of the memory address.
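A minimal sketch of this kind of decode is given here for the 24 GB, four-cell example from the background. Every detail is an assumption made for illustration: the bit positions, the 64-byte cache line, a table with four columns, the use of a bitwise AND followed by a comparison as the way the mask is "combined" with the first portion, and the names `ENTRIES` and `select_memory`. The sketch stops once a memory is selected and does not compute the address within that memory.

```python
LINE_BITS = 6   # assumed 64-byte cache line
COL_BITS = 2    # assumed table width of 2**2 = 4 columns
GB_SHIFT = 30   # assumed: first portion is the address bits above 2**30

# Hypothetical interleave entries: (mask, match, row of memory designators).
# Entry 0 covers logical GB 0-15 and rotates over cells 0-3.
# Entry 1 covers logical GB 16-23 and rotates over cells 0 and 1,
# so each of its two memories appears twice in the four-column row.
ENTRIES = [
    (~0xF, 0, [0, 1, 2, 3]),
    (~0x7, 16, [0, 1, 0, 1]),
]


def select_memory(addr):
    """Select the memory (cell) that holds the given logical address."""
    first = addr >> GB_SHIFT                               # first portion
    second = (addr >> LINE_BITS) & ((1 << COL_BITS) - 1)   # second portion
    for mask, match, row in ENTRIES:
        if (first & mask) == match:   # mask yields the entry designator
            return row[second]        # column picks one of the memories
    raise ValueError("address is not mapped")
```

Under these assumptions two masked entries cover the same 24 GB that the one-to-one table in the background needed 24 rows to describe.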
The foregoing has outlined rather broadly the features and technical advantages of the present invention in order that the detailed description of the invention that follows may be better understood. Additional features and advantages of the invention will be described hereinafter which form the subject of the claims of the invention. It should be appreciated by those skilled in the art that the conception and specific embodiment disclosed may be readily utilized as a basis for modifying or designing other structures for carrying out the same purposes of the present invention. It should also be realized by those skilled in the art that such equivalent constructions do not depart from the spirit and scope of the invention as set forth in the appended claims.