1. Field of the Invention
The present invention relates to multiprocessor computer systems using an in-order multiprocessing bus, and in particular, to multiprocessing systems including central processing units (CPUs) with write-back cache memories having snoop capability.
2. Description of the Related Art
Multiprocessor computer systems using in-order protocol multiprocessor buses are well known in the art. In such systems, multiple central processing units (CPUs) communicate with one another and with other devices, such as memory units, via a 32-bit or 64-bit communication bus. A first CPU module can issue a read data request onto the bus and, while the first CPU module is waiting for a response from the addressed memory unit, a second CPU module can issue a second read data request. However, the in-order protocol requires that the first CPU module receive its reply from the accessed memory unit before the second CPU module receives a reply from the memory unit which the second CPU module accessed. Furthermore, in-order protocol buses are characterized in that a new address cycle is able to begin immediately after the previous address cycle is complete.
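The ordering rule described above can be illustrated with a minimal sketch. The class `InOrderBus` and its method names are hypothetical and are not taken from any actual bus specification; the sketch only models the constraint that replies are delivered in the order in which the requests were issued, even when a later request is serviced first.

```python
from collections import deque
from itertools import count


class InOrderBus:
    """Toy model: requests may overlap, but replies are delivered in issue order."""

    def __init__(self):
        self._ids = count()
        self._pending = deque()  # (request_id, cpu) in the order issued
        self._ready = {}         # request_id -> data returned by a memory unit

    def issue_read(self, cpu):
        request_id = next(self._ids)
        self._pending.append((request_id, cpu))
        return request_id

    def memory_done(self, request_id, data):
        # A memory unit has finished servicing this request.
        self._ready[request_id] = data

    def deliver(self):
        # Deliver replies strictly in issue order; stop at the first
        # request whose data is not yet available.
        delivered = []
        while self._pending and self._pending[0][0] in self._ready:
            request_id, cpu = self._pending.popleft()
            delivered.append((cpu, self._ready.pop(request_id)))
        return delivered


bus = InOrderBus()
first = bus.issue_read("CPU1")
second = bus.issue_read("CPU2")  # issued before CPU1 has received a reply
bus.memory_done(second, 0xBB)    # the second memory unit happens to finish first
early = bus.deliver()            # nothing is delivered: CPU1's reply is outstanding
bus.memory_done(first, 0xAA)
late = bus.deliver()             # CPU1's reply is delivered, then CPU2's
```

In this sketch the reply for the second CPU module is held back until the reply for the first CPU module has been delivered.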
In such multiprocessor systems, each of the CPU modules typically includes a cache memory. As is well known in the art, a cache memory is a fast memory storage unit which is under the control of the CPU within each of the CPU modules. A cache memory may be either a write-through cache memory or a write-back (sometimes called copy-back) cache memory.
A write-through cache memory always immediately writes data changes within memory locations of the cache to corresponding memory locations within the main memory module. Thus, for example, if the central processing unit within the first CPU module writes new data into the cache memory within the first CPU module at a given address, the cache memory will immediately write the same data into the corresponding address in the main memory module via the multiprocessor bus.
It should be noted here that the main memory module will typically have a portion of memory which has a mapped one-to-one correspondence to memory locations within the cache memories of each of the CPU modules. The memory locations within the main memory module that can have corresponding cache memory locations are typically called cacheable memory addresses. At any time, a portion of the cacheable memory addresses is mapped into the cache memories. The mapped portions typically change over time as the CPU modules request data from different portions of the main memory. When such changes occur, the data in the cache memories are swapped out and replaced with data from the main memory.
It is important that the corresponding memory locations within each of the cache memories and the main memory unit contain the same data because, when an input/output device accesses a memory location, the memory location will typically be accessed within the main memory unit. If the main memory unit does not hold the same data as the cache memory at the corresponding address, the main memory unit has yet to be updated with the most recent data for the address sought by the input/output device. Thus, erroneous data may be retrieved by the input/output device.
Although write-through cache memories guarantee that corresponding memory locations within the cache memories and the main memory module have the same data, the necessity of immediately accessing the multiprocessor bus each time the CPU within a CPU module writes to the cache memory causes a number of interruptions on the multiprocessor bus. These interruptions often create bus inefficiencies which may severely compromise the performance of the overall multiprocessor system.
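The bus cost of the write-through policy can be shown with a minimal sketch. The class `WriteThroughCache`, the counter `bus_writes`, and the address `0x100` are hypothetical names and values chosen for illustration; main memory is modeled as a plain dictionary.

```python
class WriteThroughCache:
    """Toy write-through cache: every CPU write also goes to main memory."""

    def __init__(self, main_memory):
        self.lines = {}               # address -> cached value
        self.main_memory = main_memory
        self.bus_writes = 0           # number of bus transactions caused by writes

    def write(self, address, value):
        self.lines[address] = value
        self.main_memory[address] = value  # immediate write over the bus
        self.bus_writes += 1


main_memory = {0x100: 0}
cache = WriteThroughCache(main_memory)
for value in range(1, 6):
    cache.write(0x100, value)  # five successive CPU writes to one address
```

Under these assumptions, five writes to the same address cause five bus transactions, although only the final value is ever needed in main memory.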
To overcome the difficulties associated with write-through cache memories, write-back cache memories have been used. Write-back cache memories write data from the cache memory to a corresponding memory location within the main memory unit only at a later time, when the data is requested by another device (such as an input/output device or another CPU module). Of course, because there is a possibility that an input/output unit will address a memory location within the main memory unit that has not yet been updated by the cache memory, systems which employ write-back cache memories typically include a snoop feature. The snoop feature determines whether a cacheable memory location within the main memory unit holds the same data as the corresponding memory location within a cache memory of one of the CPU modules.
Within a typical snoop-equipped system, an extra bit is appended to the end of the address field within a cache memory when data at that address has been modified by the local CPU and has not yet been transferred to the corresponding memory location within the main memory unit. This extra bit, commonly referred to as a "dirty" bit, allows a memory controller within the main memory unit to determine if data at a given address has been modified by the local CPU. Thus, if an input/output unit is accessing a memory location within the main memory unit that has a corresponding memory location within the cache, and the dirty bit of the cache memory location is set, then the memory controller will cause the cache memory having the modified data at the requested address to immediately write the data to the memory within the main memory unit. In this manner, the input/output device can access the desired memory address and be assured that the data within the memory address is current.
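The dirty-bit and snoop behavior described above can be sketched as follows. The names `WriteBackCache`, `snoop`, and `io_read`, as well as the address `0x200`, are hypothetical; the sketch assumes that main memory is a plain dictionary and that a snoop hit on a dirty line forces an immediate write to main memory before the input/output access completes.

```python
class WriteBackCache:
    """Toy write-back cache: writes stay local until a snoop forces them out."""

    def __init__(self):
        self.lines = {}  # address -> [value, dirty]

    def write(self, address, value):
        # The local CPU modifies the line; main memory is not updated yet.
        self.lines[address] = [value, True]

    def snoop(self, address, main_memory):
        # If the line is dirty, write it to main memory and clear the dirty bit.
        line = self.lines.get(address)
        if line is not None and line[1]:
            main_memory[address] = line[0]
            line[1] = False
            return True
        return False


def io_read(address, main_memory, caches):
    # The memory controller snoops every cache before servicing the access.
    for cache in caches:
        cache.snoop(address, main_memory)
    return main_memory[address]


main_memory = {0x200: 1}
cache = WriteBackCache()
cache.write(0x200, 7)
stale = main_memory[0x200]                    # main memory still holds the old value
fresh = io_read(0x200, main_memory, [cache])  # the snoop forces the write-back first
```

Without the snoop step, the input/output access in this sketch would return the stale value held in main memory.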
Multiprocessor systems which are highly concerned with bus efficiency typically include a write-back buffer within the main memory unit. The write-back buffer is used to quickly latch a sequence of data blocks from the multiprocessor bus. Ordinarily, the memory controller within the main memory unit individually retrieves data from the multiprocessor bus and stores this data within the dynamic random access memory (DRAM) in the main memory unit. However, if the internal buffer within the memory controller already contains data, this process is generally quite time-consuming. By providing a write-back buffer within the main memory module which is capable of quickly latching data from the multiprocessor bus, the CPU modules on the multiprocessor bus do not have to wait for the memory controller.
As is well known in the art, memory transactions for cacheable memory addresses in the INTEL® Pentium P5 microprocessor provide four blocks of 64-bit data on the multiprocessor bus in consecutive clock cycles. These transfers are referred to as bursts, and are sometimes referred to as INTEL bursts. By sending data in this burst fashion, a high bus efficiency is achieved. However, some memory controllers, such as may typically be implemented within the main memory unit, are not sufficiently fast to latch each of the sequentially output data blocks and transfer these into the DRAM. Further, even the faster memory controllers are unable to quickly latch data from the system bus if the internal buffer of the memory controller is full. Thus, if a CPU module outputting data had to wait for a ready signal from the main memory unit each time a new data block was output to the memory unit, then a transfer of four data blocks might require a time delay of several clock cycles between each data block.
By providing a write-back buffer, the main memory unit can simply assert a memory ready signal for four consecutive clock cycles so that each of the four data blocks is immediately latched into the write-back buffer. The write-back buffer then writes the latched data to the memory controller within the main memory module at a rate sufficiently slow to ensure that a complete data transmission is made from the write-back buffer to the DRAM via the memory controller.
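The effect of the write-back buffer on bus occupancy can be estimated with a short calculation. The function `burst_cycles` is a hypothetical helper, and the figure of three wait cycles between data blocks is an assumed value chosen only for illustration, since the delay without a buffer is characterized above only as several clock cycles.

```python
def burst_cycles(blocks, wait_cycles_between_blocks):
    """Bus clock cycles needed to transfer `blocks` data blocks.

    One cycle per block, plus the wait cycles inserted between
    consecutive blocks while the sender waits for a ready signal.
    """
    return blocks + wait_cycles_between_blocks * (blocks - 1)


BLOCKS_PER_BURST = 4   # four 64-bit blocks per burst, as described above
ASSUMED_WAIT = 3       # assumption: wait cycles per block without a buffer

with_buffer = burst_cycles(BLOCKS_PER_BURST, 0)                # ready held for 4 cycles
without_buffer = burst_cycles(BLOCKS_PER_BURST, ASSUMED_WAIT)  # sender stalls per block
```

Under the assumed wait of three cycles, the same burst would occupy the bus for 13 clock cycles instead of 4.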
In multiprocessor systems including write-back buffers, snoop circuitry typically includes a way to first determine whether there is data within a write-back buffer having the address specified by the input/output device. If it is determined that there is no data within the write-back buffer at the same address as the data requested by the input/output device, then the memory controller within the main memory unit snoops the cache memories within each of the CPU modules. If it is determined that one of the cache memories within the CPU modules has new data at the memory address accessed by the input/output unit, then this data must be written immediately to the DRAM within the main memory unit.
If the write-back buffer already includes data at some different address than that accessed by the input/output unit, then the data from the cache memory simply overwrites the data within the write-back buffer. Thus, the data in the write-back buffer is lost during the write-back sequence of the snooped cache address. This phenomenon is commonly referred to as deadlock and may compromise the performance of the multiprocessor system.
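The overwrite hazard described above can be reproduced in a minimal sketch. The class `WriteBackBuffer` and its methods are hypothetical, and the sketch assumes a buffer that holds a single pending block; the addresses `0x300` and `0x400` and the data strings are illustrative values only.

```python
class WriteBackBuffer:
    """Toy single-entry write-back buffer (the single entry is an assumption)."""

    def __init__(self):
        self.entry = None  # (address, data) awaiting transfer to DRAM, or None

    def holds(self, address):
        # First snoop step: is data for this address already in the buffer?
        return self.entry is not None and self.entry[0] == address

    def latch(self, address, data):
        # Latch a new block; report any pending block for a different
        # address that is overwritten before it reached DRAM.
        lost = None
        if self.entry is not None and self.entry[0] != address:
            lost = self.entry
        self.entry = (address, data)
        return lost


buffer = WriteBackBuffer()
buffer.latch(0x300, "pending block")      # earlier write-back, not yet in DRAM

# An input/output unit accesses 0x400; the buffer holds a different address,
# so the caches are snooped and a dirty line for 0x400 is written back.
hit = buffer.holds(0x400)
lost = buffer.latch(0x400, "snooped block")
```

In this sketch the pending block for address `0x300` is overwritten by the snooped write-back and never reaches the DRAM.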