1. Field of the Invention
The present invention is in the field of computer systems, and, more particularly, is in the field of memory control systems for main memory used in combination with a copy back cache system.
2. Description of the Related Art
Cache systems are utilized in computers to provide a relatively small amount of high speed memory to hold data transferred from a main memory to a processor. Basically, the cache system maintains a copy of the data from selected portions of the main memory so that the processor can access the high speed memory in the cache system rather than the relatively slow main memory. When the processor accesses a location in the main memory that is not currently copied in the cache memory, new data must be stored in the cache memory. Generally, the new data is written to a cache memory location previously occupied by data from a different main memory location. Thus, the previous data must be discarded.
There are two types of cache memory systems, which are distinguished by the method of handling data written to the cache from the processor. One type is a "copy through" cache system in which data written to the cache is also immediately written to the main memory so that the cache memory locations and the corresponding main memory locations always contain the same data. Thus, when data in the copy through cache system needs to be discarded, it does not need to be written to the main memory. However, when the processor is frequently writing to the same memory locations, a substantial time penalty is incurred since the slower main memory is written each time.
The present invention relates to the second type of cache system, in which data from the processor is initially written only to the cache memory. This type is frequently referred to as a "copy back" or "write back" cache system and will be referred to herein as a copy back cache system. In a copy back cache system, data is copied back or written back to the main memory only when new data needs to be stored in a cache memory location whose contents have changed since they were first stored in the cache memory. This is an advantage over the copy through cache system when the same locations remain in the cache memory and are frequently changed by the processor. One of the disadvantages of a copy back cache system is the penalty that must be paid on a cache miss when the cache location to be filled is "dirty", that is, the data in the location has changed since the line was filled from memory. When a dirty miss occurs, the changed line must be rewritten to main memory and then the new line of data must be retrieved from main memory to refill the cache line. Thus, for example, in a cache having 128-bit cache lines (i.e., four 32-bit double words per line), four 32-bit double words must be written to the main memory and four 32-bit double words must be retrieved from main memory. The penalty that must be paid is therefore the time required to write the dirty line back to main memory and retrieve the new line.
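By way of illustration only, the dirty miss behavior described above may be sketched as a toy direct-mapped copy back cache that counts the double words moved to and from main memory. The class name `CopyBackCache`, the direct-mapped organization, and the default of eight lines are assumptions made for this sketch and are not taken from any particular cache system.

```python
WORDS_PER_LINE = 4  # four 32-bit double words per 128-bit line, as in the text


class CopyBackCache:
    """Toy direct-mapped copy back cache (illustrative assumption only)."""

    def __init__(self, num_lines=8):
        self.num_lines = num_lines
        self.tags = [None] * num_lines     # which memory line each slot holds
        self.dirty = [False] * num_lines   # True once the processor has written the slot
        self.words_written = 0             # double words copied back to main memory
        self.words_read = 0                # double words fetched from main memory

    def access(self, line_address, is_write):
        """Access one line; return 'hit', 'clean miss', or 'dirty miss'."""
        index = line_address % self.num_lines
        tag = line_address // self.num_lines
        if self.tags[index] == tag:
            result = "hit"
        else:
            if self.tags[index] is not None and self.dirty[index]:
                # Dirty miss: the changed line is written back first.
                self.words_written += WORDS_PER_LINE
                result = "dirty miss"
            else:
                result = "clean miss"
            # In either miss case the new line is fetched from main memory.
            self.words_read += WORDS_PER_LINE
            self.tags[index] = tag
            self.dirty[index] = False
        if is_write:
            # Writes go only to the cache; main memory is updated on eviction.
            self.dirty[index] = True
        return result


cache = CopyBackCache()
print(cache.access(0, is_write=True))    # clean miss
print(cache.access(0, is_write=True))    # hit
print(cache.access(8, is_write=False))   # dirty miss
print(cache.words_written, cache.words_read)  # 4 8
```

In this sketch, repeated writes to a resident line cause no main memory traffic, while the dirty miss costs four double words written plus four double words read.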
Part of the time penalty results from the use of dynamic memory circuits in the main memory. Such dynamic memory circuits have the advantage that a greater amount of memory can be provided in a given integrated circuit size. However, one of the disadvantages of dynamic memory circuits is that the circuits are organized as an array of cell locations defined by rows and columns. For example, a 256K×1 dynamic random access memory (RAM) includes nine address lines that are time multiplexed to provide the eighteen address inputs needed to address the 262,144 address locations in the RAM.
During a first portion of a memory read cycle, a first nine-bit portion of the address is applied to the nine address lines and a row address select (RAS) signal is activated to gate the nine bits into an internal row address buffer in the RAM and to initiate an access to one row of 512 rows of the memory array. Thereafter, a second nine-bit portion of the address is applied to the nine address lines and a column address select (CAS) signal is activated to gate the nine bits into an internal column address buffer. Each row of the memory array accessed by the RAS signal includes 512 storage cells, each of which provides an output to a column decoder. The outputs of the column address buffer are decoded by the column decoder to select one of the 512 outputs of the column outputs to be the data output of the dynamic RAM. The operation of the dynamic RAM for a write cycle is similar to a read cycle, except that external data is applied to the RAM and the column decoder routes the data to one of the columns of the selected row to replace the data previously in the selected cell in that row.
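By way of illustration only, the multiplexing of an eighteen-bit address onto nine address lines may be sketched as follows. The function name `split_address` is hypothetical, and the choice of the upper nine bits as the row portion is an assumption of this sketch; the description above does not state which address bits form the row portion.

```python
ADDRESS_LINES = 9  # multiplexed address lines on a 256K x 1 dynamic RAM


def split_address(address, address_lines=ADDRESS_LINES):
    """Split a full cell address into (row, column) portions.

    Assumption: the upper bits are presented with RAS (row) and the
    lower bits with CAS (column).
    """
    total_cells = 1 << (2 * address_lines)
    if not 0 <= address < total_cells:
        raise ValueError("address out of range for this RAM")
    row = address >> address_lines                  # gated in by the RAS signal
    column = address & ((1 << address_lines) - 1)   # gated in by the CAS signal
    return row, column


print(split_address(0))        # (0, 0)
print(split_address(513))      # (1, 1)
print(split_address(262143))   # (511, 511)
```

Under this assumed mapping, consecutive addresses that differ only in the lower nine bits fall in the same row of 512 cells.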
Larger dynamic RAMs can be provided by increasing the number of rows in the memory array, increasing the number of storage cells per row, or both. For example, 256K×4 dynamic RAMs are available which provide 1 Megabit (1,048,576 bits) of data storage where each row includes 2,048 storage cells. Four storage cells are selected by each column address to provide output data or to store input data. The same storage capacity can be provided in a 1M×1 dynamic RAM which has 1,024 rows with 1,024 storage cells per row. In the case of a 1M×1 RAM, an additional address input line is provided to provide ten address lines that are multiplexed as before to select one of the 1,024 rows and to select one of the 1,024 columns (i.e., storage cells) in each row.
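By way of illustration only, the array geometries given above may be checked with a short calculation. The function name `array_geometry` is hypothetical, and the figure of 512 rows for the 256K×4 part is inferred here from the stated capacity and the stated 2,048 cells per row rather than quoted from the description.

```python
def array_geometry(words, bits_per_word, rows):
    """Return (total_bits, cells_per_row, column_addresses) for a RAM array."""
    total_bits = words * bits_per_word
    if total_bits % rows:
        raise ValueError("rows must divide the total number of cells")
    cells_per_row = total_bits // rows
    # Each column address selects bits_per_word cells at once.
    column_addresses = cells_per_row // bits_per_word
    return total_bits, cells_per_row, column_addresses


K = 1024
print(array_geometry(256 * K, 4, 512))     # (1048576, 2048, 512)
print(array_geometry(K * K, 1, 1024))      # (1048576, 1024, 1024)
```

Both organizations hold 1,048,576 cells; they differ in how many cells each row holds and how many cells each column address selects.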
Each time a new row of the memory array is accessed, the RAS signal must first be returned to an inactive level to precondition the memory array for the next access. The minimum amount of time that the RAS signal must remain inactive before being activated for the next access is referred to as the RAS precharge time of the memory array. The precharge time varies with respect to the speed of the dynamic RAM, and for an exemplary high speed dynamic RAM may be in the 70-100 nanosecond range. In addition, when the RAS signal is activated to begin an access, the greatest portion of the total access time is the RAS access time, that is, the amount of time from the activation of the RAS signal until valid data is available on the output of the RAM. The row access time may be as much as 80 nanoseconds in the exemplary high speed dynamic RAM discussed herein. In contrast, the column access time, that is, the amount of time from the activation of the CAS signal until the data is valid, is typically less than half the row access time (e.g., approximately 25-30 nanoseconds in an exemplary high speed dynamic RAM).
Because of the penalty of the large row access times, many dynamic RAMs operate in the so-called page mode. That is, when a RAS signal is applied to the RAM to access a particular row of the memory array, more than one cell location in the accessed row can be read from or written to without applying a new RAS signal for each access. This is accomplished by changing only the CAS signal for each new access. The CAS signal must be deactivated for a minimum CAS precharge time between accesses. The CAS precharge time is approximately 25 nanoseconds in the exemplary high speed dynamic RAM discussed herein. Thus, the total time between accesses in the page mode is less than 60 nanoseconds as compared to the approximately 150 nanosecond access time when the RAS signal is changed for each access. This saves a substantial amount of time in accessing data from the same row of a memory, for example, when accessing a set of sequential instructions or data, or executing a loop in the same general memory locations, or the like.
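By way of illustration only, the timing comparison above may be sketched with a simplified model in which a full access costs the RAS precharge time plus the RAS access time, and each further page mode access costs the CAS access time plus the CAS precharge time. The specific values of 70 and 30 nanoseconds are assumptions picked from the ranges given above, and the function names are hypothetical; actual device timing has additional constraints that this model ignores.

```python
RAS_PRECHARGE_NS = 70   # assumed: low end of the 70-100 ns range in the text
RAS_ACCESS_NS = 80      # row access time from the text
CAS_ACCESS_NS = 30      # assumed: high end of the 25-30 ns range in the text
CAS_PRECHARGE_NS = 25   # CAS precharge time from the text


def full_cycle_ns():
    """Time per access when the RAS signal is changed for each access."""
    return RAS_PRECHARGE_NS + RAS_ACCESS_NS


def page_mode_cycle_ns():
    """Time per access when only the CAS signal is changed."""
    return CAS_ACCESS_NS + CAS_PRECHARGE_NS


def burst_time_ns(accesses, page_mode):
    """Total time for a run of accesses that all fall within one row."""
    if accesses < 1:
        raise ValueError("accesses must be at least 1")
    if page_mode:
        # One row access, then CAS-only accesses for the rest of the run.
        return full_cycle_ns() + (accesses - 1) * page_mode_cycle_ns()
    return accesses * full_cycle_ns()


print(full_cycle_ns())            # 150
print(page_mode_cycle_ns())       # 55
print(burst_time_ns(4, True))     # 315
print(burst_time_ns(4, False))    # 600
```

Under these assumed values, four accesses to the same row take roughly half as long in the page mode as with a new RAS cycle for each access.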
The page mode of dynamic RAMs has been found to be useful when the dynamic RAMs are included in main memory connected to a cache memory such as that found, for example, in the Intel® 80486 microprocessor. When a line of data is read from the main memory to be stored in the cache memory, each byte of the line of data is generally located within the same row of the memory array of the RAM. Thus, all the bytes of the line of data can be transferred from the main memory to the cache memory with only a single row access. However, heretofore, the page mode of dynamic RAMs has not been used to its fullest extent with respect to copy back cache memory systems wherein a dirty line of data in the cache memory is swapped for a new line of data from the main memory. In particular, the line of data to be copied back to the main memory is generally located in a different page of the main memory than the line of data to be retrieved. Thus, it is necessary to activate a first row of the memory array for the copy back portion of the data swap operation and then activate a second row of the memory array for the line fill portion of the swap operation. As a result, an additional 120 nanoseconds is included in the swap operation to accommodate the RAS precharge time and the greater RAS access time when the row access changes. The 120-nanosecond increase in the swap time adds up to a substantial amount of time overhead in operations where there are frequent copy back and line fill operations.
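By way of illustration only, one way to arrive at a figure of about 120 nanoseconds is to add the RAS precharge time to the difference between the RAS access time and the CAS access time. The description does not show this arithmetic, so the formula, the values of 70 and 30 nanoseconds picked from the stated ranges, and the function names below are all assumptions of this sketch.

```python
RAS_PRECHARGE_NS = 70   # assumed: low end of the 70-100 ns range
RAS_ACCESS_NS = 80      # row access time given for the exemplary RAM
CAS_ACCESS_NS = 30      # assumed: high end of the 25-30 ns range


def row_change_penalty_ns():
    """Extra time when the line fill is in a different row than the copy back.

    Assumed model: the RAS precharge time, plus the amount by which a row
    access exceeds a column access.
    """
    return RAS_PRECHARGE_NS + (RAS_ACCESS_NS - CAS_ACCESS_NS)


def swap_overhead_ms(swaps):
    """Accumulated row-change overhead, in milliseconds, over many swaps."""
    return swaps * row_change_penalty_ns() / 1_000_000


print(row_change_penalty_ns())        # 120
print(swap_overhead_ms(1_000_000))    # 120.0
```

Under this assumed model, one million copy back and line fill swaps accumulate about 120 milliseconds of overhead attributable solely to the row change.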
Some cache systems have responded to this problem by temporarily transferring the dirty data to a register, retrieving the new data from memory and then storing the dirty data to the memory from the register. This increases the amount of circuitry involved and does not solve the problem if a processor immediately accesses the cache system for new data that results in a dirty hit. Dirty data from a previous dirty hit that is already in the register must be transferred from the register to the memory before new dirty data from the current dirty hit can be transferred to the register and thus before beginning the read access to the memory. Thus, an appropriate solution is to reduce the amount of time for storing the dirty data and retrieving the new data from memory.