1. Technical Field
The present invention relates in general to an improved method and system for data processing and in particular to an improved method and system for performing write-through stores in a multiprocessor data processing system. Still more particularly, the present invention relates to a method and system for maintaining cache coherency for write-through store operations in a multiprocessor system where the stores are of varying sizes.
2. Description of the Related Art
In a conventional symmetric multiprocessor (SMP) data processing system, all of the processors are generally identical; that is, the processors all utilize common instruction sets and communication protocols, have similar hardware architectures, and are generally provided with similar memory hierarchies. For example, a conventional SMP data processing system may comprise a system memory, a plurality of processing elements that each include a processor and one or more levels of cache memory, and a system bus coupling the processing elements to each other and to the system memory. To obtain valid execution results in an SMP data processing system, it is important to maintain a coherent memory hierarchy, that is, to provide a single view of the contents of memory to all of the processors.
A coherent memory hierarchy is maintained through the use of a selected memory coherency protocol, such as the MESI protocol. In the MESI protocol, an indication of a coherency state is stored in association with each coherency granule (e.g., cache line or sector) of at least all upper level (cache) memories. Each coherency granule can have one of four states, modified (M), exclusive (E), shared (S), or invalid (I), which is indicated by two bits in the cache directory. The modified state indicates that a coherency granule is valid only in the cache storing the modified coherency granule and that the value of the modified coherency granule has not been written to system memory. When a coherency granule is indicated as exclusive, the coherency granule is resident, of all the caches at that level of the memory hierarchy, only in the cache having the coherency granule in the exclusive state. The data in the exclusive state is consistent with system memory, however. If a coherency granule is marked as shared in a cache directory, the coherency granule is resident in the associated cache and in at least one other cache at the same level of the memory hierarchy, all of the copies of the coherency granule being consistent with system memory. Finally, the invalid state indicates that the data and address tag associated with a coherency granule are both invalid.
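The four states described above can be summarized in a short sketch. The two-bit encodings and the helper names (`consistent_with_memory`, `only_copy_at_this_level`, `is_valid`) are illustrative assumptions; the text states only that two bits in the cache directory indicate the state.

```python
from enum import Enum


class Mesi(Enum):
    # Assumed encodings: the text says two bits are used but does not assign values.
    INVALID = 0b00
    SHARED = 0b01
    EXCLUSIVE = 0b10
    MODIFIED = 0b11


def consistent_with_memory(state):
    """True when the cached copy matches system memory (E or S)."""
    return state in (Mesi.EXCLUSIVE, Mesi.SHARED)


def only_copy_at_this_level(state):
    """True when no other cache at the same level holds the granule (M or E)."""
    return state in (Mesi.MODIFIED, Mesi.EXCLUSIVE)


def is_valid(state):
    """True for every state except invalid."""
    return state is not Mesi.INVALID
```

Under this sketch, a modified granule is valid and held by a single cache but is not consistent with system memory, whereas a shared granule is valid and consistent but not unique.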
A write-through store updates data at each valid level in the cache hierarchy as well as main memory which corresponds to the address accessed. In particular, a write-through or store-through cache operates to provide a write operation to both the cache memory and the main memory during processor write operations, thus ensuring consistency between the data in the cache memory and the main memory. To maintain cache coherency, a coherent write-through store must either flush valid matching cache lines on a processor or modify the data associated with a particular line to reflect the update caused by the write-through store. This ensures that subsequent loads from all processors obtain the newly updated data. Typically, a bus "snooping" technique is utilized to initiate invalidation of cache lines.
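The behavior described above may be illustrated with a minimal software model. This is a sketch only: the `Cache` class, the `write_through_store` and `load` functions, and the choice to invalidate (rather than update) matching lines in other caches are assumptions made for illustration, not a description of any particular hardware.

```python
class Cache:
    """Toy cache holding only valid lines, keyed by address (hypothetical model)."""

    def __init__(self):
        self.lines = {}

    def snoop_invalidate(self, address):
        # Drop a matching line when another processor's store is snooped.
        self.lines.pop(address, None)


def write_through_store(writer, others, memory, address, value):
    # Update the writer's cache only where the line is already valid (a hit).
    if address in writer.lines:
        writer.lines[address] = value
    # A write-through store always updates main memory.
    memory[address] = value
    # Other caches snoop the store and invalidate their matching lines.
    for cache in others:
        cache.snoop_invalidate(address)


def load(cache, memory, address):
    # On a miss, fetch the line from memory so later loads hit.
    if address not in cache.lines:
        cache.lines[address] = memory[address]
    return cache.lines[address]
```

For example, if two caches both hold address 0x40 and the first performs a write-through store of 7, the second cache loses its copy and its next load returns 7 from memory.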
Preferably, a write-through store can carry valid data of varying sizes, from one byte up to the largest data type for a particular processor. In certain caches, handling write-through stores of varying size poses problems. In particular, some caches do not provide a capability to write valid data less than a particular size. Therefore, to write through data in these caches, a cache line must first be flushed and invalidated, which consumes processor cycles. As a result, the write-through operation bypasses the cache and proceeds to the next level of memory without being written into the cache. However, it is preferable to maintain data in the caches for reduced access latency. Because flushing for each write-through store is inefficient, it is desirable to provide for the write-through of varying sizes of valid data to caches which do not provide the capability to write valid data less than a particular width.
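The drawback of the flush-and-invalidate approach can be seen in a small model. The sketch below is illustrative only: the 8-byte minimum write width (`MIN_WRITE_BYTES`), the function name `store_without_merge`, and the assumption that a store never crosses a granule boundary are all hypothetical.

```python
MIN_WRITE_BYTES = 8  # assumed minimum width the cache can write


def store_without_merge(cache_lines, next_level, address, data):
    """Apply a write-through store; return True if the cache still holds the line.

    cache_lines maps a granule base address to a bytearray of MIN_WRITE_BYTES.
    next_level is a bytearray standing in for the next level of memory.
    """
    base = address - address % MIN_WRITE_BYTES
    full_width = address == base and len(data) == MIN_WRITE_BYTES
    if base in cache_lines:
        if full_width:
            # The cache can accept the store as-is.
            cache_lines[base][:] = data
        else:
            # The cache cannot write a partial granule, so the line is dropped.
            del cache_lines[base]
    # The store always proceeds to the next level of memory.
    next_level[address:address + len(data)] = data
    return base in cache_lines
```

In this model a one-byte store to a cached granule leaves the cache without a copy, so the next load of that granule misses, which is the latency cost the passage above describes.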
It is therefore one object of the present invention to provide an improved method and system for data processing.
It is another object of the present invention to provide an improved method and system for performing write-through stores in a multiprocessor data processing system.
It is yet another object of the present invention to provide an improved method and system for maintaining cache coherency for write-through store operations in a multiprocessor system where the stores are of varying sizes.
The foregoing objects are achieved as is now described. A method and system for performing write-through store operations of valid data of varying sizes in a data processing system are provided, where the data processing system includes multiple processors that are coupled to an interconnect through a memory hierarchy, where the memory hierarchy includes multiple levels of cache, and where at least one lower level of cache of the multiple levels of cache requires store operations of all valid data of at least a predetermined size. First, it is determined whether or not a write-through store operation is a cache hit in a higher level of cache of the multiple levels of cache. In response to a determination that a cache hit has occurred in the higher level of cache, the write-through store operation is merged with data read from the higher level of cache to provide a merged write-through operation of all valid data of at least the predetermined size to a lower level of cache. The merged write-through operation is performed in the lower level of cache, such that write-through operations of varying sizes to a lower level of cache which requires write operations of all valid data of at least a predetermined size are performed with data merged from a higher level of cache.
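The merge described above may be sketched in software as follows. This is a simplified illustration under stated assumptions, not the claimed implementation: the 8-byte `PREDETERMINED_SIZE`, the dictionary-based caches, the function name `merged_write_through`, the update of the higher level of cache with the store data, and the assumption that the lower level also holds the granule are all hypothetical. Handling of a miss in the higher level of cache is not covered by this passage, so the sketch simply returns `None` in that case.

```python
PREDETERMINED_SIZE = 8  # assumed minimum store width of the lower-level cache


def merged_write_through(higher, lower, address, data):
    """Merge a narrow store with data from the higher-level cache.

    higher and lower map a granule base address to bytes of PREDETERMINED_SIZE.
    Returns the merged bytes on a higher-level hit, or None on a miss.
    """
    base = address - address % PREDETERMINED_SIZE
    offset = address - base
    if base not in higher:
        return None  # miss path is not described here

    # Read the full granule from the higher level and overlay the store data.
    merged = bytearray(higher[base])
    merged[offset:offset + len(data)] = data
    result = bytes(merged)

    # Assumption: the higher level, being valid, is also updated by the store.
    higher[base] = result
    # The lower level receives a store of all valid data at full width.
    assert len(result) == PREDETERMINED_SIZE
    lower[base] = result
    return result
```

For instance, a two-byte store at offset 2 of a granule yields an eight-byte merged store in which only bytes 2 and 3 differ from the data previously held in the higher level of cache, so the lower level never sees a write narrower than its required width.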