1. Field of the Invention
The invention relates to data processing systems and more particularly to multi-processor systems wherein each processor employs a high-speed buffer or private store in combination with a main storage device that is shared by all processors.
2. Prior Art
An article by C. J. Conti entitled "Concepts For Buffer Storage" published in the IEEE Computer Group News, March 1969, describes a hierarchical memory in which a large slow speed three dimensional core storage (backing store) operates in conjunction with a relatively small high-speed buffer storage (or cache) manufactured using integrated circuit technology. By using the buffer/backing store arrangement, the central processing unit (CPU) is able to access data at a high rate from the high-speed buffer which is matched more closely to the machine cycle of the CPU. When the CPU provides the address of desired information to the memory system, a control circuit determines whether or not the addressed data has been moved from the backing store to the buffer store. If the data is located in the buffer store, high speed access is possible from the buffer store to the CPU. If the data is not in the buffer store, controls move the data from the backing store to the high-speed buffer. A use algorithm is provided to insure that the most frequently used data is stored in the high-speed buffer. If the use algorithm is efficient, most accesses will be for data that is in the higher speed buffer store. This should result in a combined system having effective speeds approaching that of the fastest memory at a cost approaching that of the slowest and least expensive memory.
In the prior art, buffer/backing storage apparatus are transparent to the user and the buffer operation is under fixed hardware control. When a CPU initiates a fetch operation, the main storage address is presented to the memory hierarchy. Controls access a search mechanism or directory of the high-speed buffer to determine if the requested address currently resides in the high-speed buffer. If the requested information is in the buffer, it is immediately made available to the CPU. If the requested information is not currently in the buffer, a fetch operation is initiated to the main storage backing store. The buffer location to receive the information from main storage is determined by replacement logic which, in accordance with some predetermined algorithm, determines which data in the buffer store is to be replaced with new data unit from the backing store. When the fetch is initiated at the main storage, the exact word requested is first accessed and sent directly to the CPU and the buffer. The remaining words in the same "block" of data (as determined by the particular block size of the system) are also accessed from main storage and stored into the buffer.
There are currently three methods in the prior art for handling store operations. The "store through" method is used on most existing systems. In this method data is always stored immediately in the main storage and the buffer address mechanism is checked to determine if the addressed block is currently in the buffer. If the block is in the buffer, the data may also be stored in the buffer. On some systems, where I/O operations store only into main storage, the related buffer block is made invalid by the resetting of an associated "valid" bit so that any subsequent fetches to the same block will require accessing the main storage to fetch the data to the buffer.
A second method is called "store wherever." In this method, the buffer address mechanism is checked to determine if the address block is currently in the buffer. If the block is in the buffer, the data is stored directly into the buffer without further action. If the block is not in the buffer, the data is stored in the main storage.
The third method, "store-in buffer" or "store-in-cache", always stores into the buffer. If a store instruction addresses data that is not in the buffer, the system brings the block from main storage to the buffer and then stores the new data into the block in the buffer.
The above-mentioned Conti article discusses various techniques for organizing data and for accessing that data in the high-speed buffer. One such technique, which is preferred when using the present invention, is known as the "set associative" technique. An example of this technique can be found in U.S. Pat. No. 3,588,829, Ser. No. 776,858, filed Nov. 14, 1968 and which is assigned to the same assignee as this application. In this technique, the address information is broken down into books, pages and words. Depending on the memory size, there can be some predetermined number of books, having a predetermined number of pages, and each page containing a predetermined number of words. As an example, it might be determined that each book should contain 128 pages and that each page should contain some predetermined number of words. When the determination is made, it specifies that the high-speed buffer will have 128 storage sections, each section containing the number of words in a page. Associated with each of the 128 sections of high-speed storage will be a directory, or address index array, containing 128 registers. In the set associative technique of access, the corresponding page number from any of the predetermined number of books will always be found in the same storage section of the high-speed buffer. That is, page 10 from any book in the backing store will always be found in location 10 in the high-speed buffer. The associated register in the directory will be provided with an entry which identifies the particular book to which this particular page 10 belongs. The method of determining if requested data is in the high-speed buffer is to utilize the address bits specifying pages to access the directory, and simultaneously therewith, access the high-speed buffer. The entry in register 10 of the directory is compared with the applied address to determine whether or not the book value of the applied address matches the book value contained in the register. If they do compare, this indicates that the requested page 10 from the requested book is the data contained in the high-speed buffer. If the data is not from the requested book, the page 10 from the requested book is transferred from the backing store to the buffer and inserted in storage section 10 of the high-speed buffer and the identity of the requested book is then inserted in the associated register of the directory.
Another technique for which consideration has already been given in multi-processor systems, is known as the "fully associative" technique. In this technique, the high-speed buffer may be, for example, provided with 16 storage sections. Associated with each storage section will be a register. The size of each storage section may be sufficient for storing an entire book. The particular book stored in a particular one of the storage sections will be identified in the associated register. As each address is applied, the book address portion is compared with the entries in all of the registers and if a match is found, the data is specified as being in the section associated with that register. In the fully associative technique, data transferred from the backing store to the buffer store can be placed in any of the locations. When new data must be inserted, a replacement algorithm determines which of the sections should be replaced, new data is inserted in that section and the identity of the book is inserted in the associated register of the directory.
One prior art technique has been shown in a multi-processor environment which utilizes a fully associative configuration and the store-through concept. A storage protect memory, which is utilized to protect a predetermined fixed amount of data in the backing store, is provided with additional binary bits for reflecting which of the several processors has accessed data from the backing store to its associated private store. Whenever a processor stores data into the backing store, utilizing the store-through concept, the storage protect memory is interrogated and if it is determined that the data block is in another processor's private storage, the mechanism will be utilized to invalidate the data in the other processor requiring that processor to fetch the data from the backing store the next time it is utilized. This prior art technique is limited to a buffer storage configuration in which each storage section must contain the same amount of data as specified in the storage protect memory, and does not address itself to a set associative buffer configuration nor does it consider the problems arising when utilizing the store-in-buffer concept.
U.S. Pat. No. 3,735,360 describes a system wherein high-speed buffer operation is accomplished in a multi-processor system which includes a shared storage and a plurality of processing units. Each processing unit includes a private, high-speed buffer storage, an associated directory for providing an indication of the data transferred from the shared storage to the high-speed buffer, and a storage control unit which accepts signals from the associated processor, including the shared storage address of data to be operated on, and an accessing control signal which indicates that the data is to be fetched for transfer to the processor or that the processor is to store data into the operand location.
The system contains means interconnecting all the processors to perform an interlocking function. The interlocking function is accomplished by broadcasting, under certain specified conditions, address information to all other processors from a particular one of the processors in addition to the access control signal to indicate whether or not the operation is to be a fetch or a store. Utilizing the broadcasted address and access control signals, the storage control mechanism of all other processors is operated to determine further action in connection with the data requested by the particular processor.
The private high-speed buffers have a predetermined number of storage sections and an associated directory register for identifying the address of the shared storage data presently stored in the high-speed buffer. By providing various combinations of additional binary bits in the registers of the directory or index array, various forms of storage organization and access methods can be controlled by the broadcasted address and access control information to insure that each processor accesses the most current value of the operand identified in the shared storage.
If, for each directory register, only one additional control bit is provided which signifies the validity of the data in the associated storage section, each processor must broadcast address and access control information whenever data is to be stored by a particular processor. The directory of each other processor is searched to determine whether or not the same data is contained in the associated private storage. If so, the validity bit is reset to reflect that the data is no longer valid in the associated private storage. Another bit which can be provided in the directory registers is a bit called a fetch-only bit. This bit is set and reset to reflect whether or not the data in the particular one of the private storages is the only copy of the data stored in a private storage. That is, if a particular processor has fetched data from the shared storage into the private storage, and it is known that this is the only copy of the data in the private storages, the fetch-only bit will reflect this. The need for broadcasting the address and access control signals for a store operation would not exist.
Another binary bit which can be provided in the registers of the directory or index array, is a bit known as a store bit. This bit is set and reset to reflect a condition wherein the data in the high-speed buffer of a particular processor differs from the data in the shared storage. That is, when utilizing the store-in-buffer concept, all accesses to data by particular processors are made in the high-speed buffer including accesses for the storage of data. When the data has been transferred to the high-speed buffer of a particular processor, and that data is subsequently stored into the buffer, the store bit is set. Whenever a particular processor request requires transfer of new data from the shared storage to the high-speed buffer for either storing or fetching, the address and access control signals will be broadcast to the other processors. The address information of the requested data is utilized to search the directories of all other processors to determine whether or not the requested data resides in one of the other private high-speed storages and whether or not that data has been stored into. If the data has been stored into by another processor, that data must first be transferred back to the shared storage in its modified form so that the processor requesting the data will receive from the shared storage the most current value. This requirement is not necessary if the determination is made that the data in the other processor has not been stored into and therefore has the same values as the operands in the shared storge.
In a multi-processor system which utilizes the store-in-cache method, it is of great importance that fetch requests to the backing store be allowed only when the backing store contains an updated version of the requested data. In order to prevent a processor from obtaining "stale" data (i.e., data that may have been changed within the cache of another processor) fetch requests to backing store will not be serviced until the current version of the requested data is put into backing store. This may be accomplished through the use of a "spin lock" which causes the processor to periodically reissue its fetch requests until such time as appropriate system mechanisms indicate that the requested data has been updated in backing store. Typically, the original fetch request which resulted in system recognition that the requested data was resident in the cache of another processor would be used to initiate generation of a signal to the processor which owns the data, asking that processor to "cast out" the data and return the updated version to backing store. However, it is possible that two or more processors might, at substantially the same time, all be attempting to fetch the same data while that data is within the cache of yet another processor. The processor which owns the data will have received a signal telling it to cast out but, in attempting to store the updated data into the backing store, it must contend with the other processors that are trying to fetch the same data. This contention could result in a significant delay in updating the backing store, and could even conceivably result in a situation where storage contention permanently prevents backing store from being updated and processing ceases.
The primary object of this invention is to eliminate system deadlocks caused by contention for memory between a processor that is trying to update the backing store and two or more other processors that are trying to fetch data from the backing store.