1. Field of the Invention
The present invention relates to a memory system for a semiconductor memory. More particularly, the present invention relates to a memory system which provides a plurality of virtual access channels to facilitate access by a plurality of memory masters.
2. Description of the Prior Art
Conventional data processing systems typically include multiple processors/processes which share a system memory. The multiple processors/processes (i.e., memory masters) access the system memory (e.g., general system memory or graphic frame buffer memory) in a multi-tasking manner. The memory masters can include central processing units (CPUs), graphics processors, PCI bus masters and EISA/ISA bus masters. Each memory master accesses portions of the system memory which exhibit an address locality, a time locality and/or a particular block size. It would therefore be desirable to have a memory system which allows multiple memory masters to access a system memory in an efficient manner. It would further be desirable if such a memory system could be dynamically modified to accommodate different types of memory masters.
FIG. 1 is a block diagram of a multi-processing system 100 which employs a shared memory architecture. System 100 includes processors 101a-101c, dedicated cache memories 102a-102c, dedicated cache controllers 103a-103c, system bus 104, global main memory 105 and memory controller 106. Processors 101a-101c share main memory 105 through common parallel system bus 104. Cache memories 102a-102c are typically constructed using relatively high speed SRAM arrays. Main memory 105 is typically constructed using relatively low speed and low cost DRAM arrays. Systems such as system 100 are described in the following references: (1) "Protocols Keep Data Consistent", John Gallant, EDN Mar. 14, 1991, pp. 41-50 and (2) "High-Speed Memory Systems", A. V. Pohm and O. P. Agrawal, Reston Publishing, 1983, pp. 79-83.
Dedicated cache memories 102a-102c reduce the frequency with which each of processors 101a-101c accesses main memory 105. This reduces the amount of traffic on system bus 104. However, cache memories 102a-102c are relatively expensive. In system 100, an expensive cache memory must be added for each added processor. In addition, system 100 requires control logic to maintain the consistency of data in cache memories 102a-102c and main memory 105 (i.e., cache coherence). The problem of cache coherence is described in more detail in "Scalable Shared Memory Multiprocessors", M. Dubois and S. S. Thakkar, Kluwer Academic Publishers, 1992, pp. 153-166. The control logic required to provide cache coherence increases the cost and decreases the performance of system 100. In addition, the efficiency of main memory 105 and system bus 104 suffers if the data values fetched into cache memories 102a-102c are not used.
FIG. 2 is a block diagram of another conventional multi-processor system 200 which includes a global main memory 204 which is divided into modules 206a-206c. Each of main memory modules 206a-206c is attached to a single corresponding cache memory module 205a-205c, respectively. Each of cache memory modules 205a-205c is attached to a main memory bus 202. Processors 201a-201c are also attached to main memory bus 202. Processors 201a-201c share cache memory modules 205a-205c and main memory modules 206a-206c. System 200 is described in "High-Speed Memory Systems", Pohm et al., pp. 75-79. When the number of processors is approximately equal to the number of memory modules (i.e., cache memory modules), cache thrashing can occur. Cache thrashing refers to the constant replacement of cache lines. Cache thrashing substantially degrades system performance.
To minimize the cost of SRAM cache memories, some prior art systems use additional prefetch buffers for instructions and data. These prefetch buffers increase the cache-hit rate without requiring large cache memories. Such prefetch buffers are described in PCT Patent Application PCT/US93/01814 (WO 93/18459), entitled "Prefetching Into a Cache to Minimize Main Memory Access Time and Cache Size in a Computer System" by Karnamadakala Krishnamohan et al. The prefetch buffers are used in a traditional separate cache memory configuration, and memory bandwidth is consumed by both the prefetch operations and the caching operations. A robust prefetch algorithm (with a consistently high probability of prefetching the correct information) and an adequate cache size and organization (to provide a high cache hit rate) are required to deliver any performance improvement over traditional caching schemes.
Other conventional systems use the sense-amplifiers of a DRAM array as a cache memory. (See, e.g., PCT Patent Publication PCT/US91/02590, by M. Farmwald et al.) Using the sense-amplifiers of a DRAM array as cache memory provides low cost, high transfer bandwidth between the main memory and the cache memory. The cache hit access time, equal to the time required to perform a CAS (column access) operation, is relatively short. However, the cache miss access time of such a system is substantially longer than the normal memory access time of the DRAM array (without using the sense amplifiers as a cache memory). This is because when the sense amplifiers are used as cache memory, the DRAM array is kept in the page mode (or activated mode) even when the DRAM array is not being accessed. A cache miss therefore requires that the DRAM array perform a precharge operation followed by RAS (row access) and CAS (column access) operations. The time required to perform the precharge operation (i.e., the precharge time) is approximately twice as long as the time required to perform the RAS operation. The total memory access time is therefore equal to the sum of the precharge time, the RAS access time and the CAS access time of the DRAM array. In contrast, during normal operation of the DRAM array, the DRAM array is in precharged mode when it is not being accessed, and the memory access time is equal to the RAS access time plus the CAS access time of the DRAM array.
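The access time relationships described above can be illustrated with a minimal sketch. The names DramTiming, sense_amp_cache_access_time and normal_access_time, as well as the nanosecond values, are hypothetical and are chosen only for illustration; the precharge time is assumed to be twice the RAS access time, in keeping with the approximate ratio stated above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DramTiming:
    """Hypothetical DRAM timing parameters in nanoseconds (illustrative only)."""
    t_precharge: float
    t_ras: float
    t_cas: float


def sense_amp_cache_access_time(t: DramTiming, hit: bool) -> float:
    """Access time when the sense amplifiers act as the cache memory.

    The array stays in page mode, so a hit needs only a CAS operation,
    while a miss needs a precharge, then a RAS, then a CAS operation.
    """
    if hit:
        return t.t_cas
    return t.t_precharge + t.t_ras + t.t_cas


def normal_access_time(t: DramTiming) -> float:
    """Access time when the idle array is left precharged: RAS plus CAS."""
    return t.t_ras + t.t_cas


# Assumed example values; the precharge time is set to twice the RAS time.
timing = DramTiming(t_precharge=60.0, t_ras=30.0, t_cas=20.0)

print(sense_amp_cache_access_time(timing, hit=True))   # CAS only
print(sense_amp_cache_access_time(timing, hit=False))  # precharge + RAS + CAS
print(normal_access_time(timing))                      # RAS + CAS
```

With these assumed values, a cache hit takes 20 ns, a cache miss takes 110 ns, and a normal access to a precharged array takes 50 ns, so each cache miss costs more than twice the normal access time.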
Another prior art cache memory system includes an SRAM cache memory which is integrated into a DRAM array. The DRAM array includes four banks which collectively serve as the main system memory. The SRAM cache memory includes a cache row register which has the capacity to store a complete row of data from one of the banks of the DRAM array. A last row read (LRR) address latch stores the address of the last row read from the DRAM array. When the row address of a current read access is equal to the row address stored in the LRR address latch, the requested data values are read from the row register, rather than the DRAM array. Thus, there is one cache entry in the cache row register which is shared by each of the four banks in the DRAM array. This prior art memory system is described in more detail in DM 2202 EDRAM 1MB×4 Enhanced Dynamic RAM, Preliminary Datasheet, Ramtron International Corp., pp. 1-18.
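The read behavior of such a row register can be sketched as a small behavioral model. This sketch is not taken from the referenced datasheet: the class name RowRegisterCache, the array dimensions, the stored data values and the choice to compare a (bank, row) pair against the LRR address latch are all assumptions made for illustration, and write accesses are not modeled.

```python
class RowRegisterCache:
    """Toy model of a single cache row register shared by several DRAM banks."""

    def __init__(self, num_banks=4, rows_per_bank=8, cols_per_row=4):
        # Fill each cell with a value derived from its address so reads can be checked.
        self.banks = [
            [[b * 10000 + r * 100 + c for c in range(cols_per_row)]
             for r in range(rows_per_bank)]
            for b in range(num_banks)
        ]
        self.row_register = None  # holds a copy of one complete row
        self.lrr_latch = None     # assumed to hold (bank, row) of the last row read
        self.hits = 0
        self.misses = 0

    def read(self, bank, row, col):
        """Serve a read from the row register, reloading the register on a miss."""
        if self.lrr_latch == (bank, row):
            self.hits += 1
        else:
            self.misses += 1
            self.row_register = list(self.banks[bank][row])
            self.lrr_latch = (bank, row)
        return self.row_register[col]


cache = RowRegisterCache()
cache.read(0, 1, 0)  # miss: row 1 of bank 0 is loaded into the row register
cache.read(0, 1, 2)  # hit: same row as the last row read
cache.read(1, 1, 0)  # miss: an access to bank 1 displaces the single entry
cache.read(0, 1, 0)  # miss: the row from bank 0 must be reloaded
print(cache.hits, cache.misses)
```

In this sketch, accesses which alternate between rows in different banks miss every time, because the four banks share the one cache entry.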
It is therefore desirable to have a memory system which overcomes the previously described shortcomings of the prior art memory systems.