This invention relates to the structure and operation of the cache memories in a distributed data processing system.
In the prior art, a typical distributed data processing system consists of a single bus, a main memory module coupled to the bus, and multiple digital computers which are coupled to the bus through respective cache memories. One such system, for example, is the Pentium Pro system that was recently announced by Intel in which from one to four digital computers are coupled to a host bus through respective cache memories. See page 1 of Electronic Engineering Times, for Oct. 30, 1995.
Each cache memory in the above distributed data processing system operates faster than the main memory; and thus, the effect of the cache memories is that they provide a performance increase. But, each cache memory has a smaller storage capacity than the main memory; and thus, at any one time instant, each cache memory stores only a subset of all of the data words which are stored in the main memory.
In order to keep track of which data words are in a particular cache memory, each data word is stored in the cache memory with an accompanying compare address and tag bits. This compare address identifies the address of the corresponding data word in the main memory; and the tag bits identify the state of the stored data word. In the above Pentium Pro system, there are four tag bits, E, S, M, and I.
Tag bit E is true when the corresponding data word is stored in just a single cache memory. Tag bit S is true when the corresponding data word is stored in more than one cache memory. Tag bit M is true when the corresponding data word has been modified by the respective computer to which the cache memory is coupled. And, tag bit I is true when the data word cannot be used.
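By way of illustration only, the stored compare address and tag bits described above can be sketched as follows. The names `Tag`, `CacheEntry`, and `Cache` are hypothetical and do not appear in the system described; the sketch also assumes that exactly one of the four tag bits is true for a stored data word at any time, and that the cache is indexed directly by the compare address.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Dict, Optional


class Tag(Enum):
    """The four tag bits; modeled as mutually exclusive (an assumption)."""
    E = "stored in just a single cache memory"
    S = "stored in more than one cache memory"
    M = "modified by the computer coupled to this cache memory"
    I = "cannot be used"


@dataclass
class CacheEntry:
    compare_address: int  # address of the data word in the main memory
    data_word: int
    tag: Tag


class Cache:
    """Hypothetical cache memory holding a subset of the main memory."""

    def __init__(self) -> None:
        self.entries: Dict[int, CacheEntry] = {}

    def store(self, address: int, data_word: int, tag: Tag) -> None:
        self.entries[address] = CacheEntry(address, data_word, tag)

    def lookup(self, address: int) -> Optional[CacheEntry]:
        # A data word is usable only if its compare address matches
        # and its tag bit I is not true.
        entry = self.entries.get(address)
        if entry is None or entry.tag is Tag.I:
            return None
        return entry
```

In this sketch, a lookup for an address that was never stored and a lookup for an address whose tag bit I is true are treated alike: neither yields a usable data word.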
Now, an inherent limitation which the above Pentium Pro data processing system has is that only a limited number of digital computers with their respective cache memories can be connected to the host bus. This limitation occurs because the physical length of the bus must be restricted in order to transfer signals on the bus at some predetermined speed. If the bus length is increased to accommodate more connections by additional digital computers and their respective cache memories, then the speed at which the bus operates must be decreased.
By comparison, in accordance with the present invention, a multi-level distributed data processing system is disclosed which has the following architecture: a single system bus with a main memory coupled thereto; multiple high level cache memories, each of which has a first port coupled to the system bus and a second port coupled to a respective processor bus; and, each processor bus being coupled through respective low level cache memories to respective digital computers. With this multi-level distributed data processing system, each processor bus can be restricted in length and thus operate at a high speed; and at the same time, the maximum number of digital computers on each processor bus can equal the maximum number of computers in the entire Pentium Pro system.
However, a problem which needs to be addressed in the above multi-level distributed data processing system is that each high level cache memory preferably should be able to respond quickly and simultaneously to two different READ commands, one of which occurs on a processor bus and the other of which occurs on the system bus. If the READ command on the processor bus is for a data word which is stored in the high level cache memory, then the high level cache memory preferably should present that data word on the processor bus quickly in order to enhance system performance. At the same time, if the READ command on the system bus is for a data word which is stored in both the main memory and the high level cache memory, then the high level cache memory also should respond quickly on the system bus with a control signal which indicates to the sender of the READ command that the data word is shared, as opposed to being exclusive. Likewise, if the READ command on the system bus is for a data word that is in the high level cache memory and which has there been modified by a digital computer on the processor bus, then the high level cache memory preferably should respond quickly on the system bus with a control signal which indicates to the sender of the READ command that the requested data word will be deferred. Then the high level cache memory can fetch the modified data word and send it on the system bus.
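The desired responses of a high level cache memory to the two READ commands described above can be sketched, by way of illustration only, as follows. The names `BusResponse`, `respond_to_system_bus_read`, and `respond_to_processor_bus_read` are hypothetical, the tag bits are represented by the letters E, S, M, and I, and the `NO_HIT` response for a data word that is not in the cache is an assumption, since that case is not addressed above. The sketch shows only the decision that is made for each READ command; it does not model the two commands occurring simultaneously.

```python
from enum import Enum
from typing import Dict, Optional


class BusResponse(Enum):
    SHARED = "data word is shared, as opposed to exclusive"
    DEFERRED = "requested data word will be deferred"
    NO_HIT = "data word is not in this cache"  # assumed case, not in the text


def respond_to_system_bus_read(tag_bits: Dict[int, str], address: int) -> BusResponse:
    """Control signal sent on the system bus for a READ command."""
    tag = tag_bits.get(address, "I")
    if tag == "M":
        # Modified by a digital computer on the processor bus: defer, then
        # fetch the modified data word and send it on the system bus.
        return BusResponse.DEFERRED
    if tag in ("E", "S"):
        # Stored in both the main memory and this high level cache memory.
        return BusResponse.SHARED
    return BusResponse.NO_HIT


def respond_to_processor_bus_read(
    data_words: Dict[int, int], tag_bits: Dict[int, str], address: int
) -> Optional[int]:
    """Data word presented on the processor bus, or None if it is not usable."""
    if tag_bits.get(address, "I") == "I":
        return None
    return data_words.get(address)
```

Both functions consult the same tag bits, which is why a change to the tag bits by one READ command while the other READ command is using them must be avoided.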
In the prior art, U.S. Pat. No. 5,513,335 describes a two port cache in which each port has its own set of compare addresses. Thus, this cache is able to make address comparisons quickly for two different READ commands which occur simultaneously on the two ports. However, during the execution of a READ command, the tag bits for the compare address at which the READ command occurs may have to be changed. And, if a READ command on one port causes the tag bits to change on the other port while those tag bits are being used by the other port, a race condition which causes errors will occur. Such a race occurs in the two port cache of U.S. Pat. No. 5,513,335.
Accordingly, a primary object of the invention is to provide a multi-level distributed data processing system in which the above problems are overcome.