1. Field of the Invention
The present invention relates to cache memories in a multiprocessor computer system and more specifically to cache tag addresses and their controllers in such a multiprocessor system.
2. Art Background
In a computer system, caches are used for storing selected memory data in a random-access memory ("RAM") that is readily accessible to the processor. A cache typically includes a RAM for storing the memory data, a tag RAM for storing the indices, i.e. tags, to this memory data, and associated control logic for this tag RAM.
In a multiprocessor system in which each processor has its own cache, the caches must be kept consistent, because one processor may leave another processor's cache holding invalid data by updating a memory location. Consistency thus requires that if one processor updates memory data that is cached by a different processor, the other processor must know that the data has been altered. In a bus-based system, all caches can be kept consistent by "snooping": if all caches are connected to the same system bus, each cache monitors the system bus and reads its own tag RAM to determine whether an address that is being updated is cached locally. If the referenced memory location is cached, the tag RAM may have to be updated. Thus, the tag data must be read out, possibly modified, and written back to the tag RAM. If each memory transaction takes several cycles, this read-modify-write sequence can take place over several cycles as well.
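The read-modify-write sequence described above can be sketched in software as follows. This is only an illustrative model, not the circuit of the invention: it assumes a direct-mapped cache, a write-invalidate policy, and hypothetical names and sizes (`SnoopingTagRam`, `num_lines`, `line_bytes`), none of which are specified in the text.

```python
from dataclasses import dataclass


@dataclass
class TagEntry:
    tag: int = 0
    valid: bool = False


class SnoopingTagRam:
    """Toy direct-mapped tag RAM (sizes and policy are assumptions)."""

    def __init__(self, num_lines=4, line_bytes=16):
        self.num_lines = num_lines
        self.line_bytes = line_bytes
        self.entries = [TagEntry() for _ in range(num_lines)]

    def split(self, address):
        """Return (index, tag) for a byte address."""
        line = address // self.line_bytes
        return line % self.num_lines, line // self.num_lines

    def fill(self, address):
        """Record that the local processor has cached this address."""
        index, tag = self.split(address)
        self.entries[index] = TagEntry(tag, True)

    def snoop_write(self, address):
        """Handle a write by another processor seen on the system bus."""
        index, tag = self.split(address)
        entry = self.entries[index]                 # read the tag
        hit = entry.valid and entry.tag == tag      # is the address cached?
        if hit:
            # Modify and write back: assumed write-invalidate policy.
            self.entries[index] = TagEntry(entry.tag, False)
        return hit


ram = SnoopingTagRam()
ram.fill(0x40)
print(ram.snoop_write(0x40))  # first snoop hits and invalidates the entry
print(ram.snoop_write(0x40))  # second snoop finds no valid entry
```

In hardware, the read, the comparison, and the write-back in `snoop_write` are the steps that must be fitted into one or more clock cycles.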
Reference is now made to FIG. 1, where a prior art two-cycle tag RAM update scheme is shown. An update address is transmitted from system bus 100 to address register 110, which is used to read the tag out of tag RAM 120. Within the same cycle, the tag is updated by unclocked update circuit 130, a technique well known in the art. During the second cycle, the new tag is written back to tag RAM 120 through write register 140. As shown in FIG. 1, the maximum rate at which these memory transactions can take place is one every two cycles, because it is not possible to do a read and a write in one cycle with single-port RAMs. Also, within the two cycles, the modified data has to be computed in the same cycle in which the tag is read or written. RAM access time, however, does not scale as logic does. Thus, while cycle time continues to decrease, RAM access time decreases at a lower rate. With the two-cycle implementation, a slower clock rate is typically required because more work has to be performed in each cycle. With the cycle time gradually approaching the RAM access time, it becomes impractical to read the tag and compute the next state in one cycle and then write in the following cycle. It is equally impractical to do the read in one cycle and then compute the next state and perform the write in the following cycle. Further, when two consecutive transactions are intended for the same address, a two-cycle model risks having the second transaction access an invalid tag before the first transaction has written back its updated tag. As more multiprocessor systems are connected through packet-switched buses, consecutive updates to the same tag address become more frequent.
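The hazard with consecutive transactions to the same address can be illustrated with a toy timing model. The sketch below is an assumption-laden simplification rather than a model of FIG. 1: the tag state is reduced to a counter, each transaction increments it, and `write_delay` (a hypothetical parameter) is the number of cycles between a transaction's read and the moment its write-back becomes visible.

```python
def run_updates(read_cycles, write_delay):
    """Apply 'increment the tag state' transactions to one tag entry.

    read_cycles: cycle at which each transaction reads the tag.
    write_delay: cycles from a transaction's read until its write-back
                 is visible in the tag RAM (illustrative value only).
    Returns the final tag state.
    """
    tag_state = 0
    pending = []  # list of (cycle_visible, new_value)
    for cycle in range(max(read_cycles) + write_delay + 1):
        # Write-backs that land in this cycle become visible first.
        for visible_at, value in pending:
            if visible_at == cycle:
                tag_state = value
        # A transaction reading in this cycle computes its next state
        # from whatever the tag RAM currently holds.
        if cycle in read_cycles:
            pending.append((cycle + write_delay, tag_state + 1))
    return tag_state


# Transactions spaced two cycles apart: both updates are applied.
print(run_updates([0, 2], write_delay=2))
# Back-to-back transactions: the second reads the tag before the first
# has written it back, so one update is lost.
print(run_updates([0, 1], write_delay=2))
```

Under these assumptions the second call ends with a tag state of 1 instead of 2, which is the stale-read problem described above.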
FIG. 2 illustrates a three-cycle tag RAM update scheme: one cycle for reading tag RAM 220 at the address held in address register 210, one cycle for modifying the tag from read register 250 through update circuit 230, and one cycle for writing the tag back to tag RAM 220 through write register 240. In contrast to its two-cycle counterpart, a three-cycle transaction can run at a faster clock rate, but at a cost in latency. It is therefore desirable to have a cache tag controller with high throughput and yet low latency.
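The trade-off between clock rate and latency can be made concrete with a small calculation. The delays below (8 ns RAM access, 3 ns update logic) are hypothetical figures chosen only for illustration, and the model ignores register setup and other overheads.

```python
def two_cycle(ram_ns, logic_ns):
    """Two-cycle scheme: the tag read and the next-state computation
    share one cycle, so the clock period must cover both."""
    period = ram_ns + logic_ns
    return period, 2 * period  # (clock period, update latency)


def three_cycle(ram_ns, logic_ns):
    """Three-cycle scheme: read, modify, and write each get a cycle,
    so the period is set by the slowest single step."""
    period = max(ram_ns, logic_ns)
    return period, 3 * period  # (clock period, update latency)


# Hypothetical delays, not taken from the figures.
print(two_cycle(8, 3))    # (11, 22): slower clock, two cycles
print(three_cycle(8, 3))  # (8, 24): faster clock, but longer latency
```

With these assumed numbers the three-cycle scheme shortens the clock period from 11 ns to 8 ns while lengthening the update latency from 22 ns to 24 ns. Other delay values would shift the balance.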