Typically, in multiprocessor systems, the processing units, main memory, peripheral devices and other devices are all tied together by a multidrop system bus which carries messages, instructions, data, addresses, protocol signals, etc. to and from all devices. In multiprocessor systems having large memory capacity, the main memory is constructed of inexpensive, slow memory components. Each access to main memory to obtain data requires use of the data and address portions of the system bus and a substantial amount of memory access time, thus seriously inhibiting the performance of the computer system.
It is found that, in computer systems having large main memory capacity, a substantial portion of the memory is used very infrequently. On the other hand, certain data blocks (i.e., memory locations) are used very frequently by the processing units (or processors). Because of this characteristic, many large memory capacity computer systems employ small-capacity, high-speed memory buffers, or caches, for storing copies of the frequently used data locally to the processing units. By placing the frequently used data in a faster, local memory cache, processing time and use of the system bus can be significantly reduced. When a processing unit needs to read data from an address, it first accesses the high-speed data cache. Only if the requested data is not in the data cache is an access to slower main memory over the system bus executed. To make the addition of a cache memory efficient, the cache memory should be about five to fifteen times faster than the main memory.
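The benefit of a faster cache can be illustrated with a simple average access time estimate. The following is a minimal sketch only; the function name, the hit ratio parameter and the assumption that a miss costs one cache check plus one main memory access are illustrative and do not come from the description above.

```python
def average_access_time(hit_ratio: float, cache_time: float, main_time: float) -> float:
    """Estimate the mean time per memory access (hypothetical model).

    Assumption: a hit costs cache_time; a miss costs cache_time (the failed
    lookup) plus main_time (the access over the system bus).
    """
    if not 0.0 <= hit_ratio <= 1.0:
        raise ValueError("hit_ratio must be between 0 and 1")
    hit_cost = cache_time
    miss_cost = cache_time + main_time
    return hit_ratio * hit_cost + (1.0 - hit_ratio) * miss_cost


if __name__ == "__main__":
    # Illustrative numbers only: a cache ten times faster than main memory.
    print(average_access_time(hit_ratio=0.9, cache_time=100.0, main_time=1000.0))
```

Under these assumed figures the estimate is roughly 200 time units per access, compared with 1000 for a system with no cache.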
It is further found that the most recently accessed data is more likely to be accessed again by the processing units than data which has not been accessed recently.
The data cache is inserted between the processing unit and the system bus, which couples the processing units, the main memory and all other devices in the system. Typically, some form of virtual addressing is used to address the space in the data cache. Tag bits are normally appended to the data in the cache in order to identify the actual physical address of the data held in the cache. Various types of mapping functions are known in the prior art for addressing memory in the data cache and determining whether a cache location corresponds to the main memory location desired by the processing unit. The tag bits (or tags) are compared to some of the higher order bits of the requested address. If the tags match, indicating that the cached data is the desired data, the processing unit simply reads the data out of the cache. If, however, the tags do not match, then memory control hardware in the cache obtains the data from the main memory and replaces a location in the cache with the newly obtained data. The method of selecting the data which is to be replaced in the cache is called a replacement algorithm.
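The tag comparison described above can be sketched as follows. This is an illustrative example only: it assumes a direct-mapped organization (one of several known mapping functions) holding one word per line, and the class and attribute names are hypothetical. With direct mapping the replacement choice is trivial, since each address can occupy only one line.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class CacheLine:
    valid: bool = False  # False until the line has been filled
    tag: int = 0         # higher order address bits of the cached word
    data: int = 0


class DirectMappedCache:
    """Hypothetical direct-mapped cache, one word per line."""

    def __init__(self, num_lines: int, main_memory: Dict[int, int]) -> None:
        self.num_lines = num_lines
        self.lines: List[CacheLine] = [CacheLine() for _ in range(num_lines)]
        self.main_memory = main_memory
        self.hits = 0
        self.misses = 0

    def read(self, address: int) -> int:
        # When num_lines is a power of two, these are the low and high
        # order bits of the address respectively.
        index = address % self.num_lines
        tag = address // self.num_lines
        line = self.lines[index]
        if line.valid and line.tag == tag:
            self.hits += 1
            return line.data
        # Tags do not match: fetch from main memory and replace the line.
        self.misses += 1
        line.valid = True
        line.tag = tag
        line.data = self.main_memory.get(address, 0)
        return line.data


if __name__ == "__main__":
    memory = {0: 10, 4: 40}
    cache = DirectMappedCache(num_lines=4, main_memory=memory)
    cache.read(0)  # miss, line 0 filled
    cache.read(0)  # hit
    cache.read(4)  # miss, address 4 maps to line 0 and replaces address 0
    print(cache.hits, cache.misses)
```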
Normally, the processing unit simply issues ordinary read and write instructions using physical main memory addresses, and the processing unit need not even know that a data cache is being used. Control circuitry in the cache determines if the requested data is in the cache.
The operation of the data cache during a read instruction from the processing unit is straightforward. If the requested data is in the cache, it is forwarded directly to the processing unit. If, however, the data is not in the cache, the cache obtains the data from main memory, stores it in the cache and also forwards it to the processing unit. In the case of a write to memory, there are various possible implementations. In one possible method, data can be written to the cache and the main memory simultaneously. However, a substantial amount of processor time is saved by only writing to the cache memory during the write operation and providing a flag for the location in the cache which indicates that the data in that location has been updated and is no longer consistent with the data in main memory. The main memory, then, can be updated only when this block of memory is to be removed from the cache to make way for a new block. This method saves unnecessary write operations to the main memory when a given cache word is updated a number of times before it is replaced in the cache.
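The second write method described above, in which main memory is updated only when a modified location is replaced, can be sketched as follows. This is a minimal illustration under stated assumptions: a direct-mapped cache with one word per line, a write miss that first loads the line, and hypothetical names throughout (`dirty` is the flag referred to above, `memory_writes` is a counter added only for demonstration).

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Line:
    valid: bool = False
    dirty: bool = False  # set when the cached copy differs from main memory
    tag: int = 0
    data: int = 0


class WriteBackCache:
    """Hypothetical cache that defers main memory updates until replacement."""

    def __init__(self, num_lines: int, main_memory: Dict[int, int]) -> None:
        self.num_lines = num_lines
        self.lines: List[Line] = [Line() for _ in range(num_lines)]
        self.main_memory = main_memory
        self.memory_writes = 0  # number of writes that reached main memory

    def _line_for(self, address: int) -> Line:
        index = address % self.num_lines
        tag = address // self.num_lines
        line = self.lines[index]
        if line.valid and line.tag == tag:
            return line
        # Miss: write the old contents back only if they were modified.
        if line.valid and line.dirty:
            old_address = line.tag * self.num_lines + index
            self.main_memory[old_address] = line.data
            self.memory_writes += 1
        line.valid = True
        line.dirty = False
        line.tag = tag
        line.data = self.main_memory.get(address, 0)
        return line

    def read(self, address: int) -> int:
        return self._line_for(address).data

    def write(self, address: int, value: int) -> None:
        line = self._line_for(address)
        line.data = value
        line.dirty = True  # main memory is now out of date for this line


if __name__ == "__main__":
    memory: Dict[int, int] = {}
    cache = WriteBackCache(num_lines=4, main_memory=memory)
    for value in (1, 2, 3):
        cache.write(0, value)   # three updates, none reaches main memory
    cache.read(4)               # address 4 replaces address 0: one write-back
    print(cache.memory_writes, memory[0])
```

In this sketch three successive updates to the same word cost a single main memory write, whereas writing to the cache and main memory simultaneously would cost three.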
Today there are numerous multiprocessor systems, i.e., computer systems having multiple processing units, available on the market. In multiprocessor systems using memory caches, each processor may have its own dedicated memory cache or all processors may share a single memory cache. The use of data caches in these systems is more complex than that previously described due to the need to service multiple processors and, if each processor has its own data cache, to maintain consistency of data between multiple caches. Furthermore, some manufacturers now produce computer systems having two levels of cache memory: a small, very fast primary cache and a larger and frequently slightly slower secondary cache.
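The consistency problem that arises when each processor has its own cache can be illustrated with a deliberately naive model. The sketch below is only a demonstration of the problem, not of any consistency scheme; the class name and its behavior (writes kept local, no communication between caches) are assumptions made for illustration.

```python
from typing import Dict


class PrivateCache:
    """Hypothetical per-processor cache with no consistency mechanism."""

    def __init__(self, main_memory: Dict[int, int]) -> None:
        self.main_memory = main_memory
        self.copies: Dict[int, int] = {}  # address -> locally cached value

    def read(self, address: int) -> int:
        # Load from main memory only on the first access to an address.
        if address not in self.copies:
            self.copies[address] = self.main_memory[address]
        return self.copies[address]

    def write(self, address: int, value: int) -> None:
        # The update stays local; neither main memory nor other caches see it.
        self.copies[address] = value


if __name__ == "__main__":
    memory = {0x10: 1}
    cache_a = PrivateCache(memory)
    cache_b = PrivateCache(memory)
    cache_a.read(0x10)
    cache_b.read(0x10)
    cache_a.write(0x10, 2)
    # Processor A now sees 2 while processor B still sees the stale value 1.
    print(cache_a.read(0x10), cache_b.read(0x10))
```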
As the number of caches and/or the number of processors in a system increases, the memory access method and apparatus and the system bus protocol become extremely complicated. Some of the problems which must be addressed in designing the system bus and the associated protocol include: (1) providing means for maintaining consistency between the various caches, levels of caches and main memory, (2) providing an adequate mapping function, (3) developing an adequate replacement algorithm, (4) keeping the cost and size reasonable and (5) minimizing use of the system bus.
Various prior art approaches to the design of system buses and system bus protocols in multiprocessor systems are discussed in J. Archibald & J. Baer, Cache Coherence Protocols: Evaluation Using A Multiprocessor Simulation Model, ACM Transactions on Computer Systems, Vol. 4, No. 4 (November 1986).
It is an object of the present invention to provide a multiprocessor system having multiple levels of cache memory.
It is a further object of the present invention to minimize the use of the system bus for accessing memory locations in a multiprocessor system.
It is yet another object of the present invention to provide a method and apparatus for maintaining cache consistency in a multiprocessor system having multiple levels of cache memory.
It is one more object of the present invention to provide an efficient addressing protocol in a multiprocessor system having multiple levels of cache data storage.
It is yet another object of the present invention to provide decentralized cache consistency control so as to avoid catastrophic failure in the event of a failure at a single point.
It is another object of the present invention to provide a cache consistency scheme in which errors can be detected before the entire system is corrupted.
It is one more object of the present invention to provide a multiprocessor system with multiple data buffers wherein data may be read from a buffer and invalidated with a single instruction.
It is yet another object of the present invention to provide a novel system bus protocol scheme for a multiprocessor system having a plurality of memory caches.
It is a further object of the present invention to provide an improved system bus structure and method for a multiprocessor system.