The limiting performance factor in most multiple processor systems is processor bus bandwidth. Since most multiple processor systems use processor caches, a significant percentage of the processor bus bandwidth is consumed in snooping these caches during I/O operations. Because these snoop operations occupy bus bandwidth that would otherwise be available to the processors, they have a negative effect on overall system performance.
During I/O operations, a controller uses snoops to determine whether the most recent copy of a data item in memory also resides in a processor cache. Systems which snoop do so on a processor cache-line basis. To snoop a 4K byte page in a system which has a 16 byte cache line requires 256 snoop cycles (4096 / 16 = 256). In systems that snoop, the I/O and processor buses must also be synchronized for each of the snoop cycles.
Consider the mechanism referred to as synchronization. In a computer system with two buses, i.e., the processor bus and the I/O bus, the buses can run independently of each other as long as no information is transferred from one bus to the other. When information is to be transferred from one bus to the other, however, a mechanism must be provided to allow this transfer, and this can be done through synchronization. Synchronization can be implemented in a number of ways, of which the following two are typical.
In a first example, the buses are run in lock step. This allows the transfer of information to occur at any time. There are a number of disadvantages to the lock step design. The key disadvantage is that as processor buses become faster, due to improvements in processor technology, the improvement may not be implemented or taken advantage of, because the processor bus must be synchronized to the I/O bus (run at the same speed or at a multiple, such as 2×, 3×, etc., of that speed). A second typical example is a latch interface. This is a storage device that is placed between the two buses. When data is to be transferred, one bus places the data into the latch and signals the other bus. The other bus can then access the storage device, and it signals the first bus that it has accessed the information. This need for synchronization can detrimentally lengthen the time required for the snoops, thereby increasing the load that snooping places on the multiple processor bus bandwidth.
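The latch interface and the per-line snoop count described above can be sketched together as a small model. This is only an illustrative sketch: the names `Latch`, `snoop_addresses`, and `snoop_page` are hypothetical, the latch is assumed to hold a single entry, and the handshake is modeled as one `put` followed by one `take` per snoop rather than as real bus signalling.

```python
class Latch:
    """Hypothetical one-entry storage device placed between two buses."""

    def __init__(self):
        self.data = None
        self.full = False      # set by the sending bus: "data is ready"
        self.handshakes = 0    # number of completed transfers

    def put(self, value):
        # The sending bus places data in the latch and signals the other bus.
        if self.full:
            raise RuntimeError("latch still holds unread data")
        self.data = value
        self.full = True

    def take(self):
        # The receiving bus reads the latch and signals that it has done so.
        if not self.full:
            raise RuntimeError("latch is empty")
        value = self.data
        self.full = False
        self.handshakes += 1
        return value


def snoop_addresses(page_base, page_size=4096, line_size=16):
    """Return one address per cache line in the page."""
    return range(page_base, page_base + page_size, line_size)


def snoop_page(page_base, cached_lines, page_size=4096, line_size=16):
    """Pass every line address of a page through the latch.

    Returns (number of snoops, number of snoops that hit a cached line).
    """
    latch = Latch()
    hits = 0
    for addr in snoop_addresses(page_base, page_size, line_size):
        latch.put(addr)            # I/O bus side
        snooped = latch.take()     # processor bus side
        if snooped in cached_lines:
            hits += 1
    return latch.handshakes, hits
```

With the default 4K byte page and 16 byte line, `snoop_page` performs 256 handshakes, one per cache line, which is the per-page synchronization cost that the text attributes to snooping.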
One possible way of eliminating the snoop cycles on the processor bus would be to use processor caches which are store-thru. Unfortunately, any bandwidth saved by eliminating the snoops would be more than lost to the increased write-to-memory traffic. Such a solution is therefore not practicable for multiple processor designs.
In a store-thru strategy, also called write-through, every memory write from the microprocessor is passed along immediately by a cache controller so that the main system memory is updated as well. The result is that the main system memory always contains valid data. Any location in the cache can be overwritten, i.e., updated, immediately without data loss. Further discussion of other related cache operations, upon which the specification relies, can be found in a booklet entitled "Cache Tutorial" available from Intel Corporation, Literature Sales, Mt. Prospect, Ill. The booklet is dated 1991 and the order number is 296543-002.
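The store-thru behavior, and the write-traffic cost noted for it, can be illustrated with a minimal sketch. The class and function names below are hypothetical, memory is modeled as a plain dictionary, and the write-back cache is included only as an assumed point of comparison; it is not described in the text above.

```python
class WriteThroughCache:
    """Store-thru cache: every write also updates main memory immediately."""

    def __init__(self, memory):
        self.memory = memory       # dict mapping address -> value
        self.lines = {}
        self.memory_writes = 0     # writes that reached main memory

    def write(self, addr, value):
        self.lines[addr] = value
        self.memory[addr] = value  # memory is always kept valid
        self.memory_writes += 1

    def evict(self, addr):
        # Nothing to write back: memory already holds the latest value.
        self.lines.pop(addr, None)


class WriteBackCache:
    """Assumed comparison case: memory is updated only when a line is evicted."""

    def __init__(self, memory):
        self.memory = memory
        self.lines = {}
        self.dirty = set()
        self.memory_writes = 0

    def write(self, addr, value):
        self.lines[addr] = value
        self.dirty.add(addr)       # memory is stale until eviction

    def evict(self, addr):
        if addr in self.dirty:
            self.memory[addr] = self.lines[addr]
            self.memory_writes += 1
            self.dirty.discard(addr)
        self.lines.pop(addr, None)


def repeated_writes(cache, addr, count):
    """Write to one address `count` times, evict it, and return memory writes."""
    for i in range(count):
        cache.write(addr, i)
    cache.evict(addr)
    return cache.memory_writes
```

In this model, 100 writes to a single address produce 100 memory writes with the store-thru cache and one memory write with the write-back cache, while both leave the same final value in memory. The extra memory writes are the additional traffic that makes store-thru unattractive as a way of avoiding snoops.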
It would be advantageous to provide for snooping in a more efficient manner, particularly for multiple processor systems, without hindering access to the bus or unduly limiting bus bandwidth.