1. Field of the Invention
The present invention relates to cache memory in a computer system, and more particularly to a method and apparatus for refreshing dynamic random access memory (DRAM) cache memory without refresh penalty.
2. Description of the Related Art
In a general computer system, a memory hierarchy exists to support a central processing unit (CPU) with data storage capability. The type of memory device used most frequently as main memory in computers is dynamic random access memory (DRAM). This is because DRAM is comparatively low in cost and high in density, facilitating storage of large quantities of data in a small area of a computer system. However, data stored in DRAM must be refreshed periodically, or otherwise the data will be lost due to charge leakage from DRAM cells. The trade-off is the system overhead needed to refresh data in DRAMs and the speed penalty caused thereby.
When it is necessary to interface a fast processor with a slower memory, the overall system speed is reduced because the processor cycle time is shorter than the data access time of the memory. One way to handle such a situation is to use higher-speed embedded cache memory to decrease the effective time to access the slower memory (i.e., main memory of a computer system).
Cache memories are buffers that hold selected data from larger and usually slower main memory so that processors can fetch needed data from the selected data more rapidly. Cache memory is based on the principle that certain data has a higher probability at any given time of being selected next by a processor than other data. If the data of higher probability to be used by the processor is stored in cache memory with a faster access time, then the average speed to access the data will be increased. Thus, cache memory, which is usually smaller and faster than main memory, is used to store the most frequently accessed data of the main memory.
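The effect of cache memory on average access time can be illustrated with a short calculation. The following is a minimal sketch under a simple assumed model, in which a hit costs the cache access time and a miss costs the main memory access time. The function name and all numeric values are hypothetical and are chosen only for illustration.

```python
def average_access_time(hit_rate, cache_time_ns, main_time_ns):
    """Expected time per access under the assumed hit/miss model."""
    if not 0.0 <= hit_rate <= 1.0:
        raise ValueError("hit_rate must be between 0 and 1")
    # Hits are served by the cache; misses fall through to main memory.
    return hit_rate * cache_time_ns + (1.0 - hit_rate) * main_time_ns


# Hypothetical figures: 95% of requests hit a 2 ns cache,
# and the remainder go to a 60 ns main memory.
print(average_access_time(0.95, 2.0, 60.0))
```

With these assumed figures the average access time is about 4.9 ns, far closer to the cache access time than to the main memory access time, and it improves further as the hit rate rises.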
In the memory hierarchy of modern computer processing systems, cache memory is generally located immediately below the highest level processor, i.e., central processing unit (CPU). Typically, the cache memory is divided into multiple levels, for example, level one (L1) cache and level two (L2) cache. Some of the levels (e.g., level one (L1) cache) may be on the same chip as the CPU, and are thus referred to as "on-chip". Such an on-chip cache memory is generally configured with static random access memory (SRAM), which has lower density but higher speed. Other levels of cache memory (e.g., level two (L2) cache) may be located separately from the CPU and configured with DRAM, which has higher density but lower speed. A larger cache memory can be provided by using high density DRAM as cache memory for processors in a computer system, so that the 'cache miss rate' (this is explained in detail below) can be reduced.
Since data stored in DRAMs are destroyed after being idle for a period of time, DRAMs require refresh cycles to restore the data. Memory cells in DRAMs must be periodically refreshed within a certain period of time. This time period is referred to as "retention time". Depending on the technology and chip temperature, the retention time may range from a few milliseconds to hundreds of milliseconds. Data refresh is accomplished by accessing each row in the memory arrays, one row at a time. When the memory arrays are accessed to be refreshed, data stored in memory cells of the arrays are read to sense-amplifiers and immediately written back to the memory cells. A capacitor corresponding to each memory cell is thus recharged to its initial value. Such required refresh cycles in DRAMs cause significant delay because data in the DRAM is inaccessible while rows of the memory arrays are in the middle of refresh cycles. In other words, during the time that memory cells of DRAM cache memory are refreshed, no other operation can take place in that area of the memory. This "dead time" slows down the effective speed of accessing data in memories in a computer system. Although data refresh time may be reduced by using higher quality capacitors for memory cells in cache memory, the requirement of data refresh is still a significant factor causing delay in the timing of a computer system.
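The size of the dead time can be estimated with a rough calculation. The following is a minimal sketch, assuming that every row must be refreshed once per retention period, one row at a time, and that the array is unavailable for the whole of each row refresh. The function name and the numeric values are hypothetical and do not describe any particular device.

```python
def refresh_dead_time_fraction(num_rows, row_refresh_ns, retention_ms):
    """Fraction of each retention period spent refreshing rows one at a time."""
    retention_ns = retention_ms * 1_000_000
    busy_ns = num_rows * row_refresh_ns
    if busy_ns > retention_ns:
        raise ValueError("rows cannot all be refreshed within the retention time")
    return busy_ns / retention_ns


# Hypothetical array: 4096 rows, 50 ns per row refresh, 4 ms retention time.
print(refresh_dead_time_fraction(4096, 50, 4))
```

Under these assumptions roughly 5% of the time is lost to refresh, and the fraction grows as the retention time shortens, for example at higher chip temperature.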
U.S. Pat. No. 6,028,804 to Leung proposes a solution to the above problem. Leung describes a method for operating DRAM cache memory at a peak frequency higher than the peak external access frequency of the CPU. The refresh cycles for the DRAM refresh operation may thereby be hidden, so that the memory system appears to the outside world like SRAM. However, a drawback of such a conventional system is that the DRAM cache memory employed must be able to operate faster than the CPU, which is very difficult in practice and, where feasible, costly.
Therefore, an improved method of refreshing data stored in DRAM cache memory is needed for eliminating excess delay.
It is an object of the present invention to provide a method and apparatus for refreshing data in a cache memory in a computer system without refresh penalty.
To achieve the above and other objects, the present invention provides a method for refreshing data in a cache memory which is accessed by a processor in a computer system, wherein the method comprises the steps of: (a) verifying a request address made by the processor with TAG addresses stored in the cache memory; (b) accessing a wordline corresponding to the request address in one of a plurality of sub-arrays of the cache memory, when the request address is verified in step (a); (c) generating refresh addresses to refresh data in the plurality of sub-arrays other than the one sub-array in the cache memory, wherein each of the refresh addresses addresses a corresponding wordline in each of the plurality of sub-arrays other than the one sub-array; and (d) performing read/write operation on the wordline accessed by the request address in the one sub-array, while data in the plurality of sub-arrays other than the one sub-array are refreshed. The read/write operation with respect to the one sub-array and refreshing data in the plurality of sub-arrays other than the one sub-array are preferably performed simultaneously. The step (a) of the above method may include the steps of storing TAG addresses in a TAG memory, each TAG address being associated with a corresponding wordline data stored in the cache memory; comparing the request address with the TAG addresses; and selecting a TAG address identical with the request address. The step (c) of the above method may include the steps of providing data refresh timing of the plurality of sub-arrays other than the one sub-array in the cache memory by tracking wordlines of the one sub-array, and resetting the data refresh timing of the plurality of sub-arrays other than the one sub-array when data in the one sub-array is refreshed. For example, the refresh timing may be provided by a refresh counter. The refresh counter tracks the wordline addresses for each sub-array to be refreshed. 
When the data of one wordline is refreshed, the refresh counter generates the next address of the sub-array for refresh. When all the wordlines in the sub-array have been refreshed, the refresh counter resets to zero and starts to refresh the first wordline again in the next refresh cycle.
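Steps (a) through (d) and the wrap-around behavior of the refresh counter can be sketched in software. The following is an illustrative model only, not the claimed hardware: the class names, the dictionary used as TAG memory, and the `install` helper are assumptions introduced for the example, and the refresh of the other sub-arrays is executed sequentially here although it would take place simultaneously with the read/write operation in the apparatus.

```python
class SubArray:
    """One DRAM sub-array with its own wrap-around refresh counter."""

    def __init__(self, num_wordlines):
        self.data = [0] * num_wordlines
        self.refresh_counter = 0   # next wordline to refresh
        self.refreshed = []        # log of refreshed wordlines, for inspection

    def refresh_next(self):
        """Refresh one wordline, then advance the counter, wrapping to zero."""
        row = self.refresh_counter
        self.data[row] = self.data[row]  # stands in for sense-amp read and write-back
        self.refreshed.append(row)
        self.refresh_counter = (row + 1) % len(self.data)
        return row


class DramCache:
    def __init__(self, num_subarrays, num_wordlines):
        self.subarrays = [SubArray(num_wordlines) for _ in range(num_subarrays)]
        self.tags = {}  # assumed TAG memory: address -> (sub-array, wordline)

    def install(self, address, subarray, wordline, value):
        """Hypothetical helper that places data and records its TAG address."""
        self.tags[address] = (subarray, wordline)
        self.subarrays[subarray].data[wordline] = value

    def access(self, address, write_value=None):
        # (a) compare the request address with the stored TAG addresses
        hit = self.tags.get(address)
        if hit is None:
            return None  # miss handling is outside the scope of this sketch
        target, wordline = hit
        # (c) each other sub-array refreshes the wordline its counter points at
        for index, sub in enumerate(self.subarrays):
            if index != target:
                sub.refresh_next()
        # (b), (d) read or write the requested wordline in the target sub-array
        if write_value is not None:
            self.subarrays[target].data[wordline] = write_value
        return self.subarrays[target].data[wordline]


cache = DramCache(num_subarrays=4, num_wordlines=3)
cache.install(address=0x1A, subarray=0, wordline=2, value=7)
for _ in range(4):
    cache.access(0x1A)
print(cache.subarrays[1].refreshed)  # [0, 1, 2, 0]
```

In this run, four consecutive hits in sub-array 0 cause each of the other sub-arrays to refresh wordlines 0, 1, 2 and then 0 again, which shows the counter resetting to zero after the last wordline. Sub-array 0 itself is not refreshed by these accesses.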
The present invention also provides an apparatus for refreshing data in a cache memory accessible by a processor in a computer system, wherein the apparatus comprises a memory controller for detecting a request address made by the processor; a comparator for comparing the request address detected by the memory controller with addresses associated with data stored in the cache memory; and a refresh controller for generating refresh addresses to refresh data stored in the cache memory when the comparator finds an address identical with the request address, wherein the request address accesses a wordline in a first sub-array of the cache memory and each of the refresh addresses accesses a corresponding wordline in each of sub-arrays other than the first sub-array. The comparator preferably receives TAG addresses from a TAG memory in the cache memory, and each of the TAG addresses is associated with a portion of data stored in a data memory of the cache memory. The refresh controller of the present invention preferably includes refresh address generators for generating the refresh addresses. Each refresh address generator may be associated with each of the sub-arrays other than the first sub-array to generate a refresh address to refresh data stored in a corresponding one of the sub-arrays other than the first sub-array, and may also include a refresh counter for providing refresh timing for a corresponding one of the sub-arrays other than the first sub-array. The refresh counter may reset the refresh timing to an initial value when data in a sub-array associated with the refresh counter is refreshed.