1. Field of the Invention
This invention relates generally to a cache memory system and, more particularly, to a dynamic random access memory used as a cache memory in a cache memory system and having an improved cache data access time.
2. Discussion of the Related Art
Historically, stand-alone dynamic random access memory (DRAM) chips begin an array access by the activation of a Row Address Strobe (RAS) signal. With reference now to FIG. 1, a block diagram of a standard stand-alone DRAM chip 10 is shown. DRAM chip 10 includes address buffers 12 for receiving address inputs 14 (A0-An), a row pre-decoder 16, row drivers 18, a row redundancy decoder 20, redundant row drivers 22, a memory array or bank 24, and a sense amplifier and bit decoder 26. Sense amplifier and bit decoder circuit 26 is coupled to data I/O buffers 28. Data input/output (I/O) buffers 28 receive data on I/O lines 30 to be written into memory array 24. Data I/O buffers 28 further output data read from the memory array 24 on I/O lines 30. A read/write signal input (not shown) determines whether a read operation or a write operation is carried out.
Referring still to FIG. 1, DRAM chip 10 also includes a RAS input 32, buffer 34, Column Address Strobe (CAS) input 36, and buffer 38. Basic control signals for the standard stand-alone DRAM 10 include the RAS input 32, the CAS input 36, and address inputs 14. A RAS signal 32 is presented to the DRAM chip 10 once an appropriate memory control logic (not shown) has decided to access the array 24 of DRAM chip 10 in accordance with the row address and column address presented to the chip via address inputs 14. In the typical DRAM chip 10, an entire row access proceeds from the RAS signal 32. If an activation of a memory array access by the memory control logic circuit (not shown) was speculative and is subsequently canceled, then an entire chip cycle time must pass before the chip 10 (i.e., array 24) can be accessed again.
In continuation of the above discussion, two things must happen before data in a DRAM array can be accessed, i.e., before a wordline is actually activated. First, the address is decoded down to a wordline or a group of wordlines. Then, an appropriate wordline driver or group of wordline drivers must be enabled. With typical DRAMs, the memory array is so dense, and its rows and columns make up so many cells in total, that the likelihood of having a defective cell is high. Thus, a typical DRAM will have a small number of redundant rows and redundant columns. Most of the time, there are only a few defective rows in an array. In any case, accessing data in the DRAM array also includes looking at redundant decoder outputs. That is, after the row address is given, an additional function must be performed before a row driver is activated: the input address must be checked to determine that it does not correspond to a defective row. In the event that the row address is that of a defective row, the redundant row decoder must recognize it as such. Basically, the redundant row decoder of the DRAM is programmed with the defective addresses when the DRAM part is manufactured. If the redundant row decoder detects a match with an incoming address, the redundant row decoder outputs a positive match signal and inhibits driving the wordline associated with the incoming address because it is known to be defective. The redundant row decoder substitutes a redundant wordline instead. The redundant row driver receives the decoded redundant address and accesses a special section of the DRAM array that contains only redundant rows.
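The redundancy check described above can be illustrated with a minimal software sketch. This is only an analogy for the hardware behavior, not circuitry disclosed herein; the names RowDecoder, repair_map, and select_wordline, and the example addresses, are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class RowDecoder:
    """Toy model of row selection with redundancy (all names hypothetical)."""
    num_rows: int
    # Maps each defective row address to the index of its redundant row.
    # In a real part this mapping is programmed when the DRAM is manufactured.
    repair_map: dict = field(default_factory=dict)

    def select_wordline(self, row_address: int):
        """Return ("redundant", index) on a match, else ("normal", row_address)."""
        if not 0 <= row_address < self.num_rows:
            raise ValueError("row address out of range")
        if row_address in self.repair_map:
            # Positive match: the normal wordline is inhibited and a
            # redundant wordline is driven in its place.
            return ("redundant", self.repair_map[row_address])
        # No match: the normal row driver for this address is enabled.
        return ("normal", row_address)


# Assumed example: rows 17 and 600 were found defective at test time.
decoder = RowDecoder(num_rows=1024, repair_map={17: 0, 600: 1})
print(decoder.select_wordline(17))
print(decoder.select_wordline(18))
```

The point of the sketch is that the lookup against the programmed defective addresses sits on the path between receiving the row address and enabling any row driver.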
In U.S. Pat. No. 5,469,559, Parks et al., assigned to DELL USA, of Austin, Tex., a method and apparatus for refreshing a selected portion of a dynamic random access memory of a computer is disclosed. The '559 DRAM subsystem includes a memory controller having a RAM device for storing a plurality of region descriptors used to inhibit the refresh of address ranges of the DRAM that do not contain valid data. Logic circuitry is connected between a refresh period timer and the RAM device for inhibiting receipt by a RAS generator of a refresh pulse when a generated refresh address falls within the refresh address range defined by the region descriptor. A refresh address output by a refresh address counter is compared to the region descriptors in the RAM device, and if the region descriptors indicate that the row addressed by the refresh address does not contain valid data, the RAS generator is inhibited from producing a RAS pulse. Logic instructions are inserted into memory allocation and memory deallocation subroutines of the computer's operating system for writing the region descriptors to the RAM device. Note that while the '559 patent discloses inhibiting receipt of a refresh pulse by a RAS generator, the '559 patent does not provide an ability to terminate a DRAM access without suffering a full DRAM cycle time. The '559 patent is thus concerned with a way of gating the RAS signal (under certain circumstances) to the DRAM and controlling the timing of when a refresh operation is carried out. The '559 patent does not deal with accessing information in the DRAM faster; that is, it is not related to DRAM performance.
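The refresh gating attributed to the '559 patent above can be pictured with the following rough sketch. It reflects only the summary given here, not the patent's actual circuitry; the function name refresh_sweep and the representation of a region descriptor as an inclusive (start, end) pair of row addresses are assumptions made for illustration.

```python
def refresh_sweep(num_rows, invalid_regions):
    """Return the rows that receive a RAS refresh pulse in one sweep.

    invalid_regions is a list of inclusive (start, end) row ranges that,
    by assumption, describe rows holding no valid data.
    """
    refreshed = []
    # The refresh address counter steps through every row address.
    for refresh_address in range(num_rows):
        # Compare the refresh address against each region descriptor.
        inhibited = any(start <= refresh_address <= end
                        for start, end in invalid_regions)
        if not inhibited:
            # Only rows outside every invalid region get a RAS pulse.
            refreshed.append(refresh_address)
    return refreshed


# Assumed example: an 8-row array in which rows 2 through 4 hold no valid data.
print(refresh_sweep(8, [(2, 4)]))
```

Note that the gating affects only which rows are refreshed; nothing in the sketch shortens or aborts an access that has already begun.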
A current direction in the industry is to merge DRAM with logic. For example, ways are being sought to use DRAM instead of SRAM, since DRAM is denser and less expensive to manufacture. One problem, however, is that the DRAM must be fast enough to take the place of the SRAM.
DRAM array circuits considered for use in merged DRAM/logic applications can require very high performance, especially when considered for use in place of a fast static random access memory (SRAM). When a conventional DRAM architecture is used in such a DRAM/logic application, two problems arise. First, an abort of a DRAM access requires a full cycle before the DRAM can be accessed again. Second, performance is limited by requiring that all row operations for an access be performed in sequence based on the strobe signal RAS. In addition, the problem of allowing RAS to trigger a DRAM access cycle prematurely has always existed. A controller must either decide to speculatively access the memory and risk throwing away a full DRAM cycle (if the accessed address holds other data) or wait to begin an access until after the data is known to be present in the DRAM array.
With respect to the use and operation of a typical DRAM chip, according to known DRAM specifications, an address can be presented to the DRAM chip with respect to the RAS input for use in selecting a wordline or a group of wordlines of the DRAM chip's memory array. Typically, the address input is allowed to be presented to the DRAM chip as late as the receipt of the RAS strobe signal. The DRAM chip thus has a zero setup time relative to the RAS strobe. The address input can change up until the receipt of the RAS strobe. At receipt of the RAS strobe, the address input becomes valid and an array access occurs. Because valid address input information cannot be relied upon earlier than the occurrence of the RAS strobe, no operation can be performed by the conventional DRAM concerning the address input until receipt of the RAS strobe. The RAS strobe is the only indication to the conventional DRAM that the address input is valid. At any time before the occurrence of the RAS strobe, the address input is not guaranteed to be valid.
Furthermore, in a conventional memory system, a DRAM memory is not accessed until the memory controller has searched the DRAM tag array to determine if the desired data resides within the DRAM memory. Therefore, even though the address of the desired data may be available, an access is not begun. If a memory controller searches the DRAM tag array and speculatively accesses the DRAM memory at the same time (i.e., issues the RAS strobe signal), then the time to access data from the DRAM memory is improved, but at the cost of much worse DRAM memory availability. When a DRAM is accessed unsuccessfully because the DRAM tag array determines that the desired data is not in the DRAM memory, many processor cycles pass before the DRAM memory array is next available to process another access request. As a result, in a hierarchical memory system where the cache has a ninety percent (90%) hit rate, unsuccessful accesses to the DRAM memory approach 90%, thus greatly diminishing DRAM memory availability and also wasting power.
A high speed cache application can be carried out using a memory hierarchy 50 including a processor 51 having a CPU 52 and a level 1 (L1) SRAM 54. Memory hierarchy 50 further includes a level 2 (L2) SRAM 56 and a level 3 (L3) DRAM 58, wherein the L3 DRAM is a conventional DRAM. See, for example, the memory hierarchy of FIG. 2. With the memory hierarchy of FIG. 2, when there is a request for data, the memory controller 60 will check to see if the requested data is in the SRAM 56 or DRAM 58 using appropriate tag arrays or region descriptors. If the requested data is in the L2 SRAM 56, then it is necessary to inhibit, interrupt, or prevent the access to the DRAM 58 to achieve a performance benefit as discussed herein. If the access to the DRAM 58 is not inhibited, then a wait of one complete DRAM access cycle is required before the respective prior art DRAM 58 is ready or available to be accessed again (i.e., for a next occurring access). Thus, at a minimum, a wait of one complete DRAM access cycle (i.e., on the order of 80-100 ns), which corresponds to many processor cycles (i.e., many multiples of a cycle on the order of 5 ns), may be required. As a result, the prior art DRAM 58 cannot allow the high speed cache application to operate the memory hierarchy system 50 at a highest optimal frequency.
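To give a sense of scale, the figures quoted above can be combined in a short worked calculation. The sketch assumes a processor cycle of 5 ns and a DRAM cycle of 80 to 100 ns (both read from the orders of magnitude given above), and further assumes that every request launches a speculative DRAM access and that every L2 hit (90% of requests, per the hit rate discussed earlier) wastes one full DRAM cycle. The function name aborted_access_cost is hypothetical.

```python
DRAM_CYCLE_NS = (80, 100)      # assumed bounds on one full DRAM access cycle
PROCESSOR_CYCLE_NS = 5         # assumed processor cycle time
L2_HIT_RATE_PERCENT = 90       # share of requests satisfied by the L2 SRAM


def aborted_access_cost(dram_cycle_ns):
    """Return (processor cycles lost per aborted access,
    expected DRAM time wasted per request in ns)."""
    # Whole processor cycles that elapse while the DRAM finishes its cycle.
    processor_cycles_lost = dram_cycle_ns // PROCESSOR_CYCLE_NS
    # Average wasted DRAM time per request if every L2 hit aborts an access.
    expected_wasted_ns = L2_HIT_RATE_PERCENT * dram_cycle_ns / 100
    return processor_cycles_lost, expected_wasted_ns


for cycle_ns in DRAM_CYCLE_NS:
    print(cycle_ns, aborted_access_cost(cycle_ns))
```

Under these assumptions each aborted access occupies the DRAM for 16 to 20 processor cycles, and the DRAM spends on average 72 to 90 ns per request on accesses whose data is never used.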
Referring still to FIG. 2, when a cache memory controller 60 is given a request for data from a processor 51, the memory controller 60 first determines where the data resides. The cache memory controller 60 must check the region descriptors for the L2 cache 56 (i.e., SRAM) and the L3 cached DRAM 58. Often, the memory controller 60 will check both places simultaneously. Checking whether the requested data is in the SRAM 56 takes less time, since the SRAM represents a much smaller address space. In other words, the region descriptors for data in SRAM are fewer than those for data in DRAM, so the search completes more quickly for the SRAM. In parallel, searching occurs with the region descriptors of DRAM 58. Because the memory controller 60 gets an answer as to whether the data is in the SRAM 56 cache sooner, the memory controller 60 can then abort a speculative access to the DRAM 58. It is a disadvantage if such an abort occurs after an access to the DRAM memory has started, because a full DRAM cycle must then be completed before a new memory access to the DRAM can begin.
In further discussion of the above, as mentioned, the memory controller 60 can determine if the requested data is in SRAM 56 while searching the descriptors of the DRAM 58. If it turns out that the SRAM 56 is a miss (i.e., the data is not in the L2 cache) and the requested data address is in the region descriptor for the DRAM 58, then the memory controller 60 knows to go ahead and access the DRAM 58. Still further, if it turns out that the address misses in the L2 cache 56 and is also not in the region descriptor for the DRAM 58, the memory controller 60 will determine not to issue an access to the DRAM 58, since the requested data is not there. Instead, the controller 60 proceeds to the more complicated task of issuing a request for the data from some other location, such as tape, a hard drive, or wherever it might be, further up in the memory structure hierarchy.
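The three outcomes described above can be summarized in a small decision sketch. It models only the controller's choice, not the timing of the parallel searches; the names Action and decide, the use of a set for the L2 tags, and the representation of DRAM region descriptors as inclusive (start, end) address ranges are all assumptions made for illustration.

```python
from enum import Enum


class Action(Enum):
    SERVE_FROM_L2 = "serve from L2 SRAM; abort any speculative DRAM access"
    ACCESS_L3_DRAM = "access L3 DRAM"
    FETCH_FROM_HIGHER_LEVEL = "request data from elsewhere in the hierarchy"


def decide(address, l2_tags, l3_regions):
    """Pick the controller's action for one requested address."""
    if address in l2_tags:
        # L2 hit: the DRAM access is unnecessary and should be inhibited.
        return Action.SERVE_FROM_L2
    if any(start <= address <= end for start, end in l3_regions):
        # L2 miss, but a DRAM region descriptor covers the address.
        return Action.ACCESS_L3_DRAM
    # Miss in both: the data must come from some other location.
    return Action.FETCH_FROM_HIGHER_LEVEL


# Assumed example contents of the L2 tags and the DRAM region descriptors.
l2_tags = {0x100, 0x104}
l3_regions = [(0x000, 0x0FF), (0x200, 0x2FF)]
for addr in (0x100, 0x210, 0x900):
    print(hex(addr), decide(addr, l2_tags, l3_regions).name)
```

In the conventional arrangement discussed here, the first outcome is the costly one when a speculative DRAM access has already started, since the DRAM remains unavailable for a full cycle.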
Thus, it is desirable to provide a DRAM for use in a high speed cache DRAM/logic application that has an improved DRAM data access time and that enables the high speed cache application to operate at a highest optimal frequency.