1. Field of the Invention
The present invention relates to a memory system for a semiconductor memory. More particularly, the present invention relates to a memory system which provides a plurality of virtual access channels to facilitate access by a plurality of memory masters.
2. Description of the Prior Art
Conventional data processing systems typically include multiple processors/processes which share a system memory. The multiple processors/processes (i.e., memory masters) access the system memory (e.g., general system memory or graphic frame buffer memory) in a multi-tasking manner. The memory masters can include central processing units (CPUs), graphics processors, PCI bus masters and EISA/ISA bus masters. Each memory master accesses portions of the system memory which exhibit an address locality, a time locality and/or a particular block size. It would therefore be desirable to have a memory system which allows multiple memory masters to access a system memory in an efficient manner. It would further be desirable if such a memory system could be dynamically modified to accommodate different types of memory masters.
FIG. 1 is a block diagram of a multi-processing system 100 which employs a shared memory architecture. System 100 includes processors 101a-101c, dedicated cache memories 102a-102c, dedicated cache controllers 103a-103c, system bus 104, global main memory 105 and memory controller 106. Processors 101a-101c share main memory 105 through common parallel system bus 104. Cache memories 102a-102c are typically constructed using relatively high speed SRAM arrays. Main memory 105 is typically constructed using relatively low speed and low cost DRAM arrays. Systems such as system 100 are described in the following references: (1) “Protocols Keep Data Consistent”, John Gallant, EDN Mar. 14, 1991, pp. 41-50 and (2) “High-Speed Memory Systems”, A. V. Pohm and O. P. Agrawal, Reston Publishing, 1983, pp. 79-83.
Dedicated cache memories 102a-102c reduce the frequency with which each of processors 101a-101c accesses main memory 105. This reduces the amount of traffic on system bus 104. However, cache memories 102a-102c are relatively expensive. In system 100, an expensive cache memory must be added for each added processor. In addition, system 100 requires control logic to maintain the consistency of data in cache memories 102a-102c and main memory 105 (i.e., cache coherence). The problem of cache coherence is described in more detail in “Scalable Shared Memory Multiprocessors”, M. Dubois and S. S. Thakkar, Kluwer Academic Publishers, 1992, pp. 153-166. The control logic required to provide cache coherence increases the cost and decreases the performance of system 100. In addition, the efficiency of main memory 105 and system bus 104 suffers if the data values fetched into cache memories 102a-102c are not used.
FIG. 2 is a block diagram of another conventional multi-processor system 200 which includes a global main memory 204 which is divided into modules 206a-206c. Each of main memory modules 206a-206c is attached to a single corresponding cache memory module 205a-205c, respectively. Each of cache memory modules 205a-205c is attached to a main memory bus 202. Processors 201a-201c are also attached to main memory bus 202. Processors 201a-201c share cache memory modules 205a-205c and main memory modules 206a-206c. System 200 is described in “High-Speed Memory Systems”, Pohm et al., pp. 75-79. When the number of processors is approximately equal to the number of memory modules (i.e., cache memory modules), cache thrashing can occur. Cache thrashing refers to the constant replacement of cache lines. Cache thrashing substantially degrades system performance.
To minimize the cost of SRAM cache memories, some prior art systems use additional prefetch buffers for instructions and data. These prefetch buffers increase the cache-hit rate without requiring large cache memories. Such prefetch buffers are described in PCT Patent Application PCT/US93/01814 (WO 93/18459), entitled “Prefetching Into a Cache to Minimize Main Memory Access Time and Cache Size in a Computer System” by Karnamadakala Krishnamohan et al. The prefetch buffers are used in a traditional separate cache memory configuration, and memory bandwidth is consumed by both the prefetch operations and the caching operations. A robust prefetch algorithm (with a consistently high probability of prefetching the correct information) and an adequate cache size and organization (to provide a high cache hit rate) are required to deliver any performance improvement over traditional caching schemes.
Other conventional systems use the sense-amplifiers of a DRAM array as a cache memory. (See, e.g., PCT Patent Publication PCT/US91/02590, by M. Farmwald et al.) Using the sense-amplifiers of a DRAM array as cache memory provides low cost, high transfer bandwidth between the main memory and the cache memory. The cache hit access time, equal to the time required to perform a CAS (column access) operation, is relatively short. However, the cache miss access time of such a system is substantially longer than the normal memory access time of the DRAM array (without using the sense amplifiers as a cache memory). This is because when the sense amplifiers are used as cache memory, the DRAM array is kept in the page mode (or activated mode) even when the DRAM array is not being accessed. A cache miss therefore requires that the DRAM array perform a precharge operation followed by RAS (row access) and CAS (column access) operations. The time required to perform the precharge operation (i.e., the precharge time) is approximately twice as long as the time required to perform the RAS operation. The total memory access time is therefore equal to the sum of the precharge time, the RAS access time and the CAS access time of the DRAM array. In contrast, during normal operation of the DRAM array, the DRAM array is in precharged mode when it is not being accessed, and the memory access time is equal to the RAS access time plus the CAS access time of the DRAM array.
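The access-time relationships described above can be illustrated with a short sketch. The nanosecond values used in the usage note are illustrative assumptions and are not taken from any datasheet; the names `DramTiming`, `precharge_ratio` and the helper functions are likewise hypothetical. Only the relationships themselves (hit time equal to the CAS time, miss time equal to precharge plus RAS plus CAS, normal access equal to RAS plus CAS, and precharge roughly twice the RAS time) come from the description.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DramTiming:
    """Illustrative DRAM timings in nanoseconds (assumed values)."""
    t_ras: float                  # row access (RAS) time
    t_cas: float                  # column access (CAS) time
    precharge_ratio: float = 2.0  # precharge is about twice RAS, per the text

    @property
    def t_precharge(self) -> float:
        return self.precharge_ratio * self.t_ras


def hit_time(t: DramTiming) -> float:
    # Sense-amplifier cache hit: only a CAS operation is needed.
    return t.t_cas


def miss_time(t: DramTiming) -> float:
    # Sense-amplifier cache miss: precharge, then RAS, then CAS.
    return t.t_precharge + t.t_ras + t.t_cas


def normal_time(t: DramTiming) -> float:
    # Array idles in the precharged state: RAS, then CAS.
    return t.t_ras + t.t_cas


def average_time(t: DramTiming, hit_rate: float) -> float:
    # Expected access time when the sense amplifiers are used as a cache.
    if not 0.0 <= hit_rate <= 1.0:
        raise ValueError("hit_rate must be between 0 and 1")
    return hit_rate * hit_time(t) + (1.0 - hit_rate) * miss_time(t)


def break_even_hit_rate(t: DramTiming) -> float:
    # Hit rate at which average_time equals normal_time.
    return t.t_precharge / (t.t_precharge + t.t_ras)
```

Under the assumed 2:1 precharge-to-RAS ratio, `break_even_hit_rate` evaluates to 2/3 regardless of the absolute timings, so in this simplified model the sense-amplifier cache only pays off when more than about two thirds of accesses hit the open row.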
Another prior art cache memory system includes an SRAM cache memory which is integrated into a DRAM array. The DRAM array includes four banks which collectively serve as the main system memory. The SRAM cache memory includes a cache row register which has the capacity to store a complete row of data from one of the banks of the DRAM array. A last row read (LRR) address latch stores the address of the last row read from the DRAM array. When the row address of a current read access is equal to the row address stored in the LRR address latch, the requested data values are read from the row register, rather than the DRAM array. Thus, there is one cache entry in the cache row register which is shared by each of the four banks in the DRAM array. This prior art memory system is described in more detail in DM 2202 EDRAM 1 MB×4 Enhanced Dynamic RAM, Preliminary Datasheet, Ramtron International Corp., pp. 1-18.
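The behavior of a single row register shared by several banks can be sketched as follows. This is a simplified read-only model written for illustration, not a description of the referenced device: the class name `RowRegisterCache`, the nested-list memory layout, and the choice to latch the bank number together with the row address are all assumptions.

```python
class RowRegisterCache:
    """One cache row register shared by all banks, tagged by an LRR latch."""

    def __init__(self, banks):
        # banks[b][r] is a list of column values for row r of bank b.
        self.banks = banks
        self.lrr = None         # (bank, row) of the last row read, if any
        self.row_register = []  # copy of the last row read
        self.hits = 0
        self.misses = 0

    def read(self, bank, row, col):
        if self.lrr == (bank, row):
            # Address matches the LRR latch: serve from the row register.
            self.hits += 1
        else:
            # Mismatch: fetch the full row from the DRAM bank and
            # overwrite the single shared row register.
            self.misses += 1
            self.row_register = list(self.banks[bank][row])
            self.lrr = (bank, row)
        return self.row_register[col]
```

In this model, two masters that alternate between rows in different banks evict each other's row on every access, because only one entry exists for all four banks.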
It is therefore desirable to have a memory system which overcomes the previously described shortcomings of the prior art memory systems.
In accordance with the present invention, a memory system includes a main memory and a plurality of virtual access channels connected in parallel to the main memory. The main memory typically includes a plurality of memory banks. Each of the virtual access channels includes a set of memory access resources for accessing the main memory. These memory access resources can include, for example, cache resources, burst access control resources, and memory precharge resources. Each of the virtual access channels is independently addressable by an external memory master.
By enabling the virtual access channels to be addressed by external memory masters, the virtual access channels can be flexibly assigned to serve different memory masters as required by the data processing system to which the memory system is connected. For example, one memory master can be assigned to access two virtual access channels, while several other memory masters can be assigned to share the access of a single virtual access channel. These assignments can be static or can be changed dynamically during normal operation of the memory system. These assignments can also be modified for connection to different data processing systems.
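One way to picture the assignment of memory masters to virtual access channels is as a small mapping table that can be rewritten at run time. The sketch below is illustrative only; the class `ChannelAssignment`, its methods, and the master names used in the usage note are hypothetical and do not appear in the description.

```python
class ChannelAssignment:
    """Maps each memory master to one or more virtual access channels."""

    def __init__(self, num_channels):
        self.num_channels = num_channels
        self._map = {}  # master name -> tuple of channel numbers

    def assign(self, master, channels):
        channels = tuple(channels)
        if not channels:
            raise ValueError("a master must be assigned at least one channel")
        for ch in channels:
            if not 0 <= ch < self.num_channels:
                raise ValueError(f"channel {ch} does not exist")
        # Re-assigning an existing master replaces its previous channels,
        # which models a dynamic change during normal operation.
        self._map[master] = channels

    def channels_for(self, master):
        return self._map[master]

    def masters_on(self, channel):
        return sorted(m for m, chs in self._map.items() if channel in chs)
```

For example, `assign("cpu", [0, 1])` gives one master two channels, while `assign("pci", [2])` and `assign("isa", [2])` make two masters share a single channel; a later `assign("pci", [3])` moves that master without affecting the others.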
In one embodiment, the virtual access channels include a plurality of cacheable virtual access channels which perform caching operations. In such an embodiment, each cacheable virtual access channel includes a cache data memory for storing one or more cache data entries, and a corresponding cache address memory for storing one or more corresponding cache address entries. By assigning a cacheable virtual access channel to service each of the memory masters, each of the memory masters is advantageously provided with a dedicated cache memory resource. The virtual access channels can also include a non-cacheable virtual access channel which enables the cacheable virtual access channels to be bypassed when a cache miss occurs.
The present invention also includes a method of accessing a memory array which includes the steps of: (1) coupling a virtual access system to the memory array, wherein the virtual access system has a plurality of virtual access channels connected in parallel to the memory array, each virtual access channel providing a set of memory access resources for accessing the memory array, (2) assigning each of the memory masters to access one or more of the virtual access channels, (3) providing an access address from the memory masters to the virtual access system, and (4) accessing a selected one of the virtual access channels in response to the access address.
This method can also include the steps of (5) storing a cache entry and a corresponding cache address entry in the selected virtual access channel, (6) comparing the access address with the cache address entry, and (7) accessing the cache entry if the access address matches the cache address entry. If the access address does not match the cache address entry, then the memory array can be accessed through a bus bypass circuit. In this case, the cache entry of the selected virtual access channel is updated to reflect the data value accessed through the bus bypass circuit, and the cache address entry is updated to reflect the address accessed.
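Steps (5) through (7) and the miss handling described above can be sketched in software as follows. This is a minimal model under stated assumptions: each channel holds exactly one cache entry of one row, the access address is split into a channel number, a row and a column, and the bus bypass circuit is modeled as a direct read of the memory array. The names `CacheableChannel` and `VirtualAccessSystem` are hypothetical.

```python
class CacheableChannel:
    """One virtual access channel with a single cache entry (assumed size)."""

    def __init__(self):
        self.cache_address = None  # cache address entry (row address)
        self.cache_entry = None    # cache data entry (copy of that row)


class VirtualAccessSystem:
    def __init__(self, memory_array, num_channels):
        self.memory = memory_array  # list of rows, each a list of values
        self.channels = [CacheableChannel() for _ in range(num_channels)]

    def read(self, channel_id, row, col):
        """Return (value, hit) for a read through the selected channel."""
        channel = self.channels[channel_id]
        # Step (6): compare the access address with the cache address entry.
        if channel.cache_address == row:
            # Step (7): the addresses match, so access the cache entry.
            return channel.cache_entry[col], True
        # Miss: access the memory array directly (bypass path) ...
        row_data = self.memory[row]
        value = row_data[col]
        # ... then update the cache entry and the cache address entry
        # (step (5): storing both in the selected channel).
        channel.cache_entry = list(row_data)
        channel.cache_address = row
        return value, False
```

Because each channel keeps its own entry, a miss on one channel leaves the entries of the other channels untouched, which is the property that distinguishes this arrangement from a single shared row register.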
In a variation of this method, two of the virtual access channels can be activated at the same time, with one of the virtual access channels performing operations at the interface between the virtual access channels and the memory masters, while another one of the virtual access channels is performing operations at the interface between the virtual access channels and the memory array. This advantageously provides for improved concurrency of operations within the memory system.
In another variation of this method, the operating modes of each of the virtual access channels are independently programmed. For example, each virtual access channel can be individually programmed to have specific cache chaining modes, burst lengths and precharge modes. This enables the virtual access channels to be individually tailored to best serve the operating needs of the corresponding memory master.
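Independent programming of each channel can be pictured as a per-channel mode record. The sketch below is hypothetical throughout: the field names, the default values, the set of permitted burst lengths, and the two precharge modes are assumptions chosen for illustration, since the description names only the categories of setting (cache chaining mode, burst length and precharge mode).

```python
from dataclasses import dataclass, replace
from enum import Enum


class PrechargeMode(Enum):
    AUTO = "auto"      # assumed: precharge automatically after an access
    MANUAL = "manual"  # assumed: precharge only when explicitly requested


@dataclass(frozen=True)
class ChannelMode:
    burst_length: int = 4    # assumed default
    chained: bool = False    # assumed on/off form of a cache chaining mode
    precharge: PrechargeMode = PrechargeMode.AUTO

    def __post_init__(self):
        if self.burst_length not in (1, 2, 4, 8):  # assumed legal values
            raise ValueError("unsupported burst length")


class ModeRegisters:
    """Holds one independently programmable mode record per channel."""

    def __init__(self, num_channels):
        self._modes = [ChannelMode() for _ in range(num_channels)]

    def program(self, channel_id, **fields):
        # Only the selected channel changes; all others keep their modes.
        self._modes[channel_id] = replace(self._modes[channel_id], **fields)

    def mode(self, channel_id):
        return self._modes[channel_id]
```

A channel serving a master that streams long sequential bursts might be programmed with `program(1, burst_length=8, precharge=PrechargeMode.MANUAL)`, while a channel serving a master with scattered accesses keeps the defaults.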