1. Technical Field
The present invention relates in general to a method of maintaining cache coherence and in particular to a method of maintaining cache coherence in a computer system having unordered communication. Still more particularly, the present invention relates to a method of maintaining cache coherence in a multiprocessor system having unordered communication.
2. Description of the Related Art
A wide variety of different interconnect structures are used to couple processors, memory, and input/output (I/O) in modern computer systems. For example, these interconnects include buses, switches, and interlocking rings. Ordered transport describes an interconnect's ability to deliver encapsulated information units, such as messages, packets, or bus cycles, in the same order in which they are enqueued or programmed between a distinct sender unit on the interconnect and a distinct receiver unit on the interconnect. For example, if an interconnect maintains ordered transport, transmission of packet A followed by packet B from the same memory device leads to receipt of packet A prior to receipt of packet B at the destination. If an interconnect does not ensure ordered transport, packet B may be received either before or after packet A. For example, an interconnect may provide multiple pathways between the same source and destination. If more than one pathway is active at the same time and equivalent pathways have different transmission latencies, the delivery order may differ from the transmission order. If an ordered transport methodology is not imposed on the interconnect, the interconnect may often be designed in a way that reduces latency, increases bandwidth, and increases reliability.
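The multiple-pathway case described above can be illustrated with a minimal sketch. It assumes a simplified model, not taken from any particular interconnect, in which one packet is sent per time unit and each pathway has a fixed latency; the function name `delivery_order`, the pathway names, and the latency values are hypothetical.

```python
# Illustrative model only: one packet is sent per time unit, and each
# packet travels on a pathway with a fixed (assumed) latency.
def delivery_order(packets, latency_of_path, path_for_packet):
    """Return packet names in the order they arrive at the destination.

    packets: packet names in transmission order.
    latency_of_path: mapping from pathway name to its latency.
    path_for_packet: mapping from packet name to the pathway it uses.
    """
    arrivals = []
    for send_time, name in enumerate(packets):
        arrival_time = send_time + latency_of_path[path_for_packet[name]]
        # send_time breaks ties so equal arrival times keep send order.
        arrivals.append((arrival_time, send_time, name))
    arrivals.sort()
    return [name for _, _, name in arrivals]


latency = {"fast": 1, "slow": 5}  # hypothetical latencies

# Both packets on the same pathway: delivery order matches send order.
single_path = delivery_order(["A", "B"], latency, {"A": "fast", "B": "fast"})

# Packet A on the slower pathway: packet B overtakes it.
multi_path = delivery_order(["A", "B"], latency, {"A": "slow", "B": "fast"})

print(single_path)  # ['A', 'B']
print(multi_path)   # ['B', 'A']
```

In this sketch the reordering arises purely from the latency difference between two simultaneously active pathways, which is the situation the paragraph above describes.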
While an interconnect characterized by such unordered transport may have lower latency, increased bandwidth, or increased reliability, the potential for reordered transport greatly complicates the support hardware and software for coherent shared memory across the interconnect. The problem becomes particularly complicated when cache memory is utilized in a multiprocessor system. Caches in multiprocessors must operate in concert with each other. Specifically, any data that can be updated simultaneously by two or more processors must be treated in a special way so that its value can be updated successfully regardless of the instantaneous location of the most recent version of the datum. This is normally accomplished by assigning ownership of the datum to the memory device (cache or shared memory) that currently holds its present value and allowing only the present owner of the datum to change its value.
The problem of noncoherence in shared memory resulting from unordered transport can arise in various ways. For example, a memory device or memory unit may receive a request directing that a memory coherence operation, such as purge or invalidate, be performed on a specific interval or block of memory. The state of this block may be unknown to the memory device or memory unit because other information packets are still traveling through the interconnect or awaiting service in another memory device or memory unit. Responding to such a request while the memory block is in an unknown state can lead to violations of the computer's consistent shared memory architecture.
As a more particular example, a memory control unit may direct that a cache return ownership of a memory block to the memory control unit. The cache notes that it has a store-thru operation outstanding for this block. The cache does not know if the store has updated the data value recorded by the memory controller because it has not yet received a confirmation. Thus, the state of the memory block is unknown to the cache on receipt of the memory control unit's purge request. If the cache were to return ownership, and the return packet overtakes the store-thru packet within the interconnect because of the unordered transport, the memory control unit may reissue ownership of the memory block to another cache along with an outdated value of memory.
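The race in this example can be sketched in a few lines. The model below is a deliberately simplified assumption: the class `MemoryController`, the packet tuples, and the cache names are hypothetical, and the controller is assumed to reissue ownership immediately on receiving a return packet, handing out whatever value it currently records.

```python
class MemoryController:
    """Toy memory control unit tracking one memory block (illustrative)."""

    def __init__(self, value):
        self.value = value      # value currently recorded for the block
        self.reissued = None    # (new owner, value handed out), once reissued

    def receive(self, packet):
        kind, payload = packet
        if kind == "store_thru":
            # The store-thru updates the recorded value.
            self.value = payload
        elif kind == "return_ownership":
            # Assumption: ownership is reissued at once to another cache,
            # together with the value recorded at this moment.
            self.reissued = (payload, self.value)


def run(arrival_order):
    """Deliver packets in the given order; return what was reissued."""
    controller = MemoryController(value="old")
    for packet in arrival_order:
        controller.receive(packet)
    return controller.reissued


store = ("store_thru", "new")               # sent first by the cache
give_back = ("return_ownership", "cache1")  # sent second by the cache

in_order = run([store, give_back])    # store arrives before the return
overtaken = run([give_back, store])   # return overtakes the store

print(in_order)   # ('cache1', 'new')
print(overtaken)  # ('cache1', 'old')
```

When the return packet overtakes the store-thru packet, the second cache receives ownership together with the outdated value, which is the coherence violation described above.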
The problem of noncoherence created by the potential of unordered interconnect transport is commonly corrected in one of three ways. First, the system architecture may require that the interconnect provide ordered transport. This solution is inefficient due to the decreased bandwidth and higher latency introduced. Second, sequence numbers may be assigned to packets transported through the interconnect which allow the unordered packets to be reassembled in order at the end points of their transports. Such a scheme requires substantial buffering of the packets at the end points, thereby increasing costs and complicating the design. Third, ownership may be retained at the memory control unit to prevent stale copies from being retained by a cache, or a memory device from reading a noncurrent value in shared memory. The problem with this scheme is that it severely reduces performance by virtually eliminating the advantages of cache memory systems.
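The buffering cost of the second approach, sequence numbering, can be seen in a minimal sketch of an end-point reorder buffer. This is an assumed, generic design rather than a description of any specific prior-art system; the class name `ReorderBuffer` and its fields are hypothetical.

```python
class ReorderBuffer:
    """Releases packets in sequence-number order at a receiving end point."""

    def __init__(self):
        self.next_seq = 0       # next sequence number eligible for release
        self.pending = {}       # packets held until their turn arrives
        self.max_buffered = 0   # peak packets held, counting the newest one

    def receive(self, seq, payload):
        """Accept one packet; return the payloads that can now be released."""
        self.pending[seq] = payload
        self.max_buffered = max(self.max_buffered, len(self.pending))
        released = []
        # Drain every packet that is now contiguous with the last release.
        while self.next_seq in self.pending:
            released.append(self.pending.pop(self.next_seq))
            self.next_seq += 1
        return released


buffer = ReorderBuffer()
delivered = []
# Packets sent as 0, 1, 2, 3 but arriving out of order.
for seq, payload in [(2, "C"), (1, "B"), (0, "A"), (3, "D")]:
    delivered.extend(buffer.receive(seq, payload))

print(delivered)            # ['A', 'B', 'C', 'D']
print(buffer.max_buffered)  # 3
```

The order is restored, but only by holding later packets until the earliest missing one arrives; the storage needed grows with how far packets can be reordered in flight, which is the buffering cost noted above.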
The present invention provides a unique solution to the problem of noncoherence caused by unordered interconnect transport. This solution eliminates requirements for transport order on the interconnect, thus retaining the advantages of unordered transport. It eliminates the need for buffering of packets at end points on the interconnect, thus reducing cost. It also eliminates the need for fixed memory control ownership of memory blocks, thus retaining the performance advantages of read/write caching. Moreover, the present invention achieves this at little additional cost or complexity.