In data processing systems where a number of processors share a main memory, each processor is often provided with its own cache memory. Such cache memories improve processor speed and also reduce traffic load on main memory. The fetch traffic created by the processors is largely handled by their individual caches. The store traffic is handled generally in either of two ways. One way is to direct all stores to main memory more or less immediately, so that main memory stays current on all changes. This is a "store-through" cache. The other method is to direct the stores to cache, without immediately updating main memory. The update is done only when a changed cache line is replaced. This type of cache operation is called a "store-in" cache.
In a store-through cache, when a processor encounters a store operation, the data to be stored is sent directly to main memory and also to the processor's cache. If the line containing the store address is already in the cache, that line is updated with the new data. If the line is not in the cache, no cache action occurs. To maintain cache consistency, all other caches must invalidate the modified line, and the storing processor may not fetch from the modified portion of the line until this invalidation operation has been accomplished. In this way, main memory is always updated and no cache contains a line of data that is not updated. The main disadvantage arising from the use of a store-through cache is the amount of store traffic that the main memory must handle. This can create serious traffic bottlenecks when there are a number of processors attempting to access main memory.
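By way of illustration only, the store-through behavior described above may be sketched as follows. The class name `StoreThroughCache`, its methods, and the use of one dictionary entry per whole line are assumptions made for this sketch and are not taken from any referenced design.

```python
class StoreThroughCache:
    """Toy store-through cache: one dict entry per line address (hypothetical model)."""

    def __init__(self, memory, peers):
        self.memory = memory      # shared main memory: line address -> data
        self.peers = peers        # shared list of every cache in the system
        self.lines = {}           # lines currently held by this cache
        peers.append(self)

    def fetch(self, addr):
        # A miss brings the line in from main memory.
        if addr not in self.lines:
            self.lines[addr] = self.memory[addr]
        return self.lines[addr]

    def store(self, addr, data):
        # Every store goes to main memory immediately.
        self.memory[addr] = data
        # The local copy is updated only if the line is already cached.
        if addr in self.lines:
            self.lines[addr] = data
        # All other caches invalidate their copy of the modified line.
        for cache in self.peers:
            if cache is not self:
                cache.lines.pop(addr, None)


memory = {0x10: "old"}
peers = []
a = StoreThroughCache(memory, peers)
b = StoreThroughCache(memory, peers)
b.fetch(0x10)             # b now holds the line
a.store(0x10, "new")      # a does not hold the line, so no local cache action
```

In this sketch every call to `store` writes main memory, which is the source of the store traffic noted above.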
A store-in cache has the advantage that it limits the store traffic to main memory. However, with such a cache configuration, it is necessary to ensure that a given cache line is being changed by only one processor at a time. This is accomplished by assuring that the line is held exclusively by the processor that is going to change it. If the line is in any other cache, it is invalidated. Subsequently, if another processor wishes to store data at the same line address in its own cache, that processor must obtain the changed line from the first processor, hold it on an exclusive basis, and invalidate the line address in all other caches. This transferring of a line with exclusive status from one processor to another can affect performance if a line is being frequently changed by several processors, since moving the line back and forth can be quite time-consuming. This operational situation may be prolonged, since a line that is actively being fetched from will tend to stay in the cache and will retain exclusive status indefinitely.
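The cost of moving an exclusively held line back and forth may be illustrated with a minimal sketch. The `StoreInSystem` class, its counters, and the `replace` method are hypothetical names introduced here; the model tracks only which processor holds each line and ignores timing.

```python
class StoreInSystem:
    """Toy store-in model: tracks which processor holds each line exclusively."""

    def __init__(self, memory):
        self.memory = memory    # main memory: line address -> data
        self.owner = {}         # line address -> id of the exclusive holder
        self.dirty = {}         # line address -> changed data not yet in memory
        self.transfers = 0      # exclusive-line moves between processors
        self.memory_writes = 0  # stores that actually reached main memory

    def store(self, cpu, addr, data):
        holder = self.owner.get(addr)
        if holder is not None and holder != cpu:
            # The changed line moves to the storing processor and the
            # previous holder's copy is invalidated.
            self.transfers += 1
        self.owner[addr] = cpu
        self.dirty[addr] = data

    def replace(self, addr):
        # Main memory is updated only when the changed line is replaced.
        if addr in self.dirty:
            self.memory[addr] = self.dirty.pop(addr)
            self.memory_writes += 1
        self.owner.pop(addr, None)


system = StoreInSystem({0x20: 0})
for i in range(6):
    system.store(cpu=i % 2, addr=0x20, data=i)   # two processors alternate
system.replace(0x20)
```

Under these assumptions, six alternating stores cause five line transfers between the two processors but only one write to main memory.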
A further description of store-in and store-through cache memories may be found in a tutorial article entitled "Cache Memories", A. J. Smith, Computing Surveys, Volume 14, No. 3, pages 473-530, September 1982.
Current high performance multi-processors (having up to six processors) use store-in caches despite their disadvantages. They do this because the store traffic of store-through caches would overload the access capability of main memory.
Some processor designs provide a "store stack" which allows store instructions to be signaled as completed when the store data is placed in the stack, rather than only after the store data goes to a cache. The stack holds the result of only one store instruction per entry, and entries from the stack are stored in the cache without further modification or processing. No processor can access data in the stack; the data becomes accessible only after it has been stored in the cache. Stores that are pending in the stack must be monitored to ensure that no subsequent data or instruction fetch would have been affected by a pending store. The store stack is only a buffer that eases the timing considerations between the pipeline and the cache.
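A store stack of this kind may be sketched, under stated assumptions, as a simple first-in, first-out buffer. The `StoreStack` class and its method names are hypothetical. The policy of draining pending entries before a conflicting fetch is an assumption made for the sketch; the description above requires only that such conflicts be monitored.

```python
from collections import deque


class StoreStack:
    """Toy store stack: a FIFO of pending stores between pipeline and cache."""

    def __init__(self, cache):
        self.cache = cache        # address -> data
        self.pending = deque()    # one (address, data) entry per store instruction

    def store(self, addr, data):
        # The store is signaled complete once it is placed in the stack.
        self.pending.append((addr, data))

    def drain_one(self):
        # The oldest entry is written to the cache unmodified.
        addr, data = self.pending.popleft()
        self.cache[addr] = data

    def fetch(self, addr):
        # Data in the stack is not readable. Assumed policy: a fetch that a
        # pending store would affect is held until that store reaches the cache.
        while any(a == addr for a, _ in self.pending):
            self.drain_one()
        return self.cache[addr]


stack = StoreStack({0x30: 1, 0x34: 2})
stack.store(0x30, 10)
stack.store(0x34, 20)
value = stack.fetch(0x30)   # forces the pending store to 0x30 into the cache
```

Here the fetch returns the newly stored value, while the unrelated store to address 0x34 remains pending in the stack.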
While the prior art is replete with references to cache memories, the following are representative. In U.S. Pat. No. 4,167,782 to Joyce et al, a data processing system is disclosed which includes a plurality of system units (including a main memory and a high speed buffer or cache store), all connected in common to a system bus. The cache store monitors each communication between system units to determine if it is a communication from a system unit to main memory which will cause the updating of a word location in main memory. If that word location is also stored in the cache, then the word location in cache is updated in addition to the word location in main memory.
In U.S. Pat. No. 4,195,340 to Joyce, a first in, first out activity queue for a cache store is described. This consists of a buffer memory that receives all information transferred over a system bus. The system bus connects main memory to a processor, its cache and an input/output multiplexor. The buffer memory has two related uses. The first use is to keep the cache current with main memory. In this use, when main memory is being updated, i.e. when the processor is storing to main memory, the updated information received by the buffer is used to update the cache. This is done only if the cache contains the item in main memory that is being updated. Otherwise, if the item is not in the cache, there is no need to update the cache and the information in the buffer is discarded.
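The first use of such a buffer may be illustrated with a short sketch. The function `apply_bus_updates` and the representation of bus transfers as (address, data) pairs are assumptions made for illustration and are not drawn from the referenced patent.

```python
from collections import deque


def apply_bus_updates(buffer, cache):
    """Drain a FIFO of (address, data) main-memory updates into the cache."""
    while buffer:
        addr, data = buffer.popleft()   # oldest bus transfer first
        if addr in cache:
            cache[addr] = data          # cached item is brought up to date
        # Otherwise the entry is simply discarded.


buffer = deque([(0x40, "a"), (0x44, "b"), (0x40, "c")])
cache = {0x40: "stale"}
apply_bus_updates(buffer, cache)
```

Because the entries are applied in arrival order, the cached item ends up holding the most recent update, and the update for the uncached address leaves the cache unchanged.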
The other use is to assist in the handling of a cache miss, for which it is necessary to obtain the missed information from the main memory and bring it into the cache. The information from main memory is written into the buffer and then it is written into the cache. Also, depending on the way that main memory is configured, additional information adjoining the data being accessed is brought into the cache, in anticipation that such data may be needed later.
U.S. Pat. No. 4,415,970 to Swenson describes a system which is controllable by a number of processors or storage control units, rather than being controlled by one such unit. A cache memory is provided which stores a command queue (or queues), all of which are accessible to a plurality of storage control units. A storage control unit which is free is then able to access that queue to provide a command to the input/output device.
None of the prior art known to the inventors hereof enables a multiprocessor system to employ store-through caches while simultaneously keeping store traffic to main memory at the reduced level commonly associated with store-in caches.
Accordingly, it is an object of this invention to provide a queue mechanism which enables a multiprocessor system to employ store-through caches.
It is a further object of this invention to provide a queue mechanism which enables reduced traffic to main memory where the multiprocessor system employs store-in caches.