The present invention relates generally to digital processing systems which have cache memories, and more particularly to digital processing systems and cache memories which employ writeback cache techniques.
Digital processing systems often employ a cache memory which is designed to operate faster than another memory device which may be referred to as "main memory". FIG. 1 shows a generalized example of a digital processing system which includes a processor, embodied in the form of a microprocessor 11, and a bus 12 along with a cache memory 14 and a main memory, shown as the DRAM 16. The microprocessor 11, the cache 14, and the DRAM 16 are all coupled to the bus 12 in order to communicate information back and forth between the various components of this system. Also coupled to the bus 12 is an I/O controller 18 which, as shown in FIG. 1, supports two input/output devices 20 and 22. The example shown in FIG. 1 may be a general purpose computer system or a dedicated processing system in, for example, a printer or a video game system. The microprocessor 11 obtains information from memory, performs arithmetic and logical operations on the fetched information, and writes the results back into memory. It has been observed that, over a given period of time, the microprocessor often operates on only a very small subset of the information resident in memory. The cache 14, which is sometimes referred to as a level 2 cache, and the cache 24, which is sometimes referred to as a level 1 cache, may be used to store a small subset of the information resident in the DRAM 16 in order to speed up the operation of the system. It will be appreciated that the level 1 cache is typically located on the same single semiconductor substrate which includes the microprocessor 11. It will also be appreciated that the level 1 cache may instead be a separate integrated circuit housed within the same packaging which houses the integrated circuit that forms the microprocessor 11.
FIG. 2 shows a typical example of a cache memory. The cache memory includes two arrays, a data array 32 for data and a tag array 34 for tag addresses. The data array 32 of the cache holds a copy of a portion of information (referred to as a line of information) of the main memory, such as the DRAM 16 shown in FIG. 1. The tag information stored in the tag array 34 specifies the main memory location associated with that data. Typically, each cache line in the tag array 34 is associated with a corresponding cache line in the data array 32. As an example, FIG. 2 shows two cache lines 41a and 42a in the tag array 34 which correspond to cache lines 41b and 42b respectively of the data array 32. Each cache line includes a plurality of memory cells along a row. Each memory cell along a row may be considered to form a column with the corresponding memory cells in the other rows of the array. Thus, for example, the column 44 of the tag array 34 contains a plurality of dirty bit memory cells for storing dirty bits, which will be discussed below.
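The two-array organization described above can be pictured with a minimal sketch. It is an illustration only and not taken from the figures: the names `TagEntry`, `Cache`, `dirty_column`, and the line size `LINE_BYTES` are hypothetical, and real caches carry further status bits that are omitted here.

```python
from dataclasses import dataclass, field
from typing import List

LINE_BYTES = 32  # assumed line size; the text does not specify one


@dataclass
class TagEntry:
    """One row (cache line) of the tag array."""
    tag: int = 0          # identifies the main memory location of the line
    dirty: bool = False   # the dirty bit memory cell of this row


@dataclass
class Cache:
    """A tag array and a data array with one row per cache line."""
    num_lines: int
    tags: List[TagEntry] = field(init=False)
    data: List[bytearray] = field(init=False)

    def __post_init__(self):
        # Row i of the tag array corresponds to row i of the data array.
        self.tags = [TagEntry() for _ in range(self.num_lines)]
        self.data = [bytearray(LINE_BYTES) for _ in range(self.num_lines)]

    def dirty_column(self) -> List[bool]:
        # The dirty bits of all rows, viewed together as one column.
        return [entry.dirty for entry in self.tags]
```

In this sketch the pairing of a tag row with its data row is expressed simply by a shared index, which mirrors the correspondence between lines 41a and 41b in FIG. 2.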
When information is to be written into the memory (e.g. written from the microprocessor 11 through the bus 12 into the SRAM cache 14 and the DRAM 16), it can be written either into the main memory as well as the caches, or only into the caches. When the data is written only into the caches, the caches hold the latest (updated) piece of information. The caches that support this mechanism of writing data are called writeback caches. The advantage of writeback caches is that the data traffic is confined to the path between the processor and the caches, leaving the main memory free for other operations (e.g. a DMA operation). The main memory, such as the DRAM 16, then holds old or stale information, which has to be updated when the processor needs to use the cache line holding the updated copy of the data for other purposes. Each cache line has a special indicator which shows whether the information resident in that line is an updated version of the corresponding information resident in the main memory. This special indicator is referred to as a "dirty bit". If a cache line has the latest copy of a piece of information, then this indicator, stored in the dirty bit memory cell of that cache line, will indicate that the cache line contains the latest copy of the information. As shown in FIG. 2, cache line 41a, which corresponds to cache line 41b in the data array, contains a memory cell 39 which is used to store a dirty bit which specifies the status of the information in the cache line relative to the corresponding information in main memory.
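The writeback behavior described above can be illustrated with a small behavioral model. It is a sketch under simplifying assumptions that are not in the text: the cache is direct-mapped, each line holds a single word, and main memory is a Python dictionary. The class name `WriteBackCache` and the counter `memory_writes` are hypothetical.

```python
class WriteBackCache:
    """Direct-mapped, one word per line (both are simplifying assumptions)."""

    def __init__(self, num_lines, main_memory):
        self.num_lines = num_lines
        self.memory = main_memory          # dict: address -> value
        self.tag = [None] * num_lines      # None means the line is empty
        self.data = [0] * num_lines
        self.dirty = [False] * num_lines
        self.memory_writes = 0             # counts traffic to main memory

    def _evict(self, index):
        # Copy a dirty line back to main memory before the line is reused.
        if self.tag[index] is not None and self.dirty[index]:
            old_address = self.tag[index] * self.num_lines + index
            self.memory[old_address] = self.data[index]
            self.memory_writes += 1
        self.dirty[index] = False

    def write(self, address, value):
        index = address % self.num_lines
        tag = address // self.num_lines
        if self.tag[index] != tag:
            # The line is needed for another address: write back if dirty.
            self._evict(index)
            self.tag[index] = tag
        self.data[index] = value
        self.dirty[index] = True           # the cache now holds the latest copy


# Example: two writes to address 0 stay in the cache; main memory is only
# updated when the same line is needed for address 4.
memory = {0: 10, 4: 20}
cache = WriteBackCache(num_lines=4, main_memory=memory)
cache.write(0, 11)
cache.write(0, 12)
stale_value = memory[0]    # still 10: main memory holds stale information
cache.write(4, 21)         # reuses line 0, forcing the writeback of address 0
```

After the last call, main memory holds 12 at address 0, while address 4 is now the stale entry because its updated value 21 exists only in the cache.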
FIG. 3 shows an example of a prior art memory cell 39 which is used to store the dirty bit for a particular cache line. The memory cell 39 includes two data line electrodes 51 and 52 which are referred to as bit and bit bar respectively. A word line 53 controls writing to and reading from the memory cell 39, which is formed by the cross coupled inverters 56 and 57. These inverters form a bistable static memory cell as is known in the art. Pass transistors 54 and 55 couple respective outputs from the inverters 56 and 57 to the data lines 51 and 52.
If the processor, such as the microprocessor 11 of FIG. 1, needs to write to three different cache lines in the cache memory, such as cache 14, in a writeback cache manner, then the microprocessor 11 must perform six operations, two for each cache line. In particular, for each cache line the microprocessor must first read the dirty bit to determine whether it needs to be set and, if the bit is not set (indicating that the line is not dirty), it must then set the dirty bit of that cache line. It is assumed that the writing operations will be setting the dirty bits of the respective cache lines because new or updated data is being stored in each cache line. An alternative approach in the prior art, which seeks to reduce the number of steps required to perform this operation, involves accessing each cache line to determine if the dirty bit is to be set and storing the result (which shows the state of the cache line's dirty bit) in a buffer. The processor must then look for empty cycles in which it is not using the cache, and during those cycles it must update the dirty bit in each cache line for which the buffer indicates that the dirty bit is to be set. This approach requires extra pathways and requires managing the buffer to avoid overflow situations and otherwise maintaining the buffer. Yet another approach to setting the dirty bit is to associate the dirty bit with the data in the data array (e.g. array 32 in FIG. 2) rather than the tag array (e.g. array 34 of FIG. 2). The dirty bit is updated when the data is written into the cache, freeing up the tag for a subsequent access. A major disadvantage of this scheme is that the data of the entire cache line must be accessible in one cycle. In most modern microprocessors the line sizes are large, so the scheme is impractical to implement.
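The count of six operations for three cache lines in the first approach can be checked with a short sketch. The class `TagArray` and the function `write_lines` are hypothetical names used only for this illustration, and each read or set of a dirty bit is counted as one access to the tag array.

```python
class TagArray:
    """Holds only the dirty bits and counts every access made to them."""

    def __init__(self, num_lines):
        self.dirty = [False] * num_lines
        self.accesses = 0

    def read_dirty(self, line):
        self.accesses += 1
        return self.dirty[line]

    def set_dirty(self, line):
        self.accesses += 1
        self.dirty[line] = True


def write_lines(tag_array, lines):
    """Read-then-set sequence performed for each cache line written."""
    for line in lines:
        if not tag_array.read_dirty(line):   # operation 1: read the dirty bit
            tag_array.set_dirty(line)        # operation 2: set it if clean


# Three clean lines are written: two operations each, six in total.
tags = TagArray(num_lines=8)
write_lines(tags, [1, 3, 5])
first_pass_accesses = tags.accesses
```

Under these assumptions `first_pass_accesses` is 6. Writing the same three lines again would add only three accesses, since the lines are already dirty and only the read is performed.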
It is desirable to provide a simple and elegant solution to the problem of setting the dirty bit as discussed above while at the same time providing improved performance.