FIG. 5 shows the basic diagram of a central processing module 5 having a processor 10 which interfaces with a system memory module 40 by means of a system bus 22. Other system modules 30 may share the system bus, and these may include not only additional central processing modules but also input/output processors and a maintenance subsystem. Additionally, there can be a second, identical system bus 22, shown by the dotted bus lines of FIG. 5, which is used to increase the system bandpass. This type of system is typical of the Unisys A-11 computer system described in the aforementioned U.S. Ser. No. 08/019,003 entitled "Synchronous Dual Bus System for Store-Through and Non-Store-Through Cache Memories in Multi Processor Networks".
Each of the features described herein applies equally to a single bus system or a dual bus system. In a dual bus system, the accessing modules described herein would be able to access either bus whenever that bus is available. However, for simplicity of discussion, the following disclosure considers only the single bus situation.
Typical of the Unisys A-11 system is an "E-Mode" designated protocol which transfers data on the basis of blocks of four related data words. For example, each word may be a 60-bit word in which 52 bits are information data, 7 bits are parity data and 1 bit indicates data corruption.
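As a rough illustration of the example word format just described, the following sketch packs and unpacks a 60-bit word from its three fields. Only the field widths come from the text above. The ordering of the fields within the word and the names `pack_word` and `unpack_word` are assumptions made purely for illustration.

```python
# Field widths taken from the example in the text.
INFO_BITS = 52       # information data
PARITY_BITS = 7      # parity data
CORRUPT_BITS = 1     # data-corruption indicator
WORD_BITS = INFO_BITS + PARITY_BITS + CORRUPT_BITS   # 60 bits in total
WORDS_PER_BLOCK = 4  # data moves in blocks of four related words


def pack_word(info, parity, corrupt):
    """Pack the three fields into one integer.

    Assumed (hypothetical) layout, most to least significant:
    corruption bit | parity bits | information bits.
    """
    if not 0 <= info < (1 << INFO_BITS):
        raise ValueError("info does not fit in 52 bits")
    if not 0 <= parity < (1 << PARITY_BITS):
        raise ValueError("parity does not fit in 7 bits")
    corrupt_bit = 1 if corrupt else 0
    return (corrupt_bit << (INFO_BITS + PARITY_BITS)) | (parity << INFO_BITS) | info


def unpack_word(word):
    """Split a packed word back into (info, parity, corrupt)."""
    info = word & ((1 << INFO_BITS) - 1)
    parity = (word >> INFO_BITS) & ((1 << PARITY_BITS) - 1)
    corrupt = bool((word >> (INFO_BITS + PARITY_BITS)) & 1)
    return info, parity, corrupt
```

Under this assumed layout, a block transfer would carry four such packed words.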
In a typical situation, with a mix of typical system operating software, the processor 10 of FIG. 5 may require a memory access, on average, about every five clock times of processor operation. That is to say, the processor will on average execute processing operations for five clocks, then access memory, then execute processing operations for another five clocks, and then access memory again. During the "memory access" times, the processor 10 must, of course, wait until the main system memory 40 has provided the next data word or the next instruction code word before the processor can continue its processing operations.
One recurring problem in digital system design is how to speed up throughput and reduce the delay in processor access to memory data and instructions. FIG. 6 is a drawing which indicates the "processor/access cycle" time, which is designated as T.sub.ap. This time period T.sub.ap is made up of the processing time, t.sub.p, plus the access time, t.sub.a.
The performance of the system, of course, depends on the speed of access to memory data and thus is an "inverse function" of the processor/access time, T.sub.ap. Thus if this time period, or either of its sub-elements t.sub.p or t.sub.a, can be reduced, then the system performance is increased accordingly.
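The relationship between the processor/access cycle time and performance can be sketched numerically as follows. The function names and the sample access times are assumptions chosen for illustration. Only the sum T.sub.ap = t.sub.p + t.sub.a and the inverse dependence of performance on T.sub.ap come from the text.

```python
def cycle_time(t_p, t_a):
    """Processor/access cycle time T_ap = t_p + t_a, in clocks."""
    return t_p + t_a


def speedup(t_p, t_a_old, t_a_new):
    """Performance gain from shortening the access time.

    Performance is modelled as 1 / T_ap, so the gain is the ratio
    of the old cycle time to the new cycle time.
    """
    return cycle_time(t_p, t_a_old) / cycle_time(t_p, t_a_new)


# t_p = 5 clocks follows the average figure given in the text;
# the access times of 10 and 5 clocks are assumed sample values.
gain = speedup(t_p=5, t_a_old=10, t_a_new=5)
```

With these assumed figures, halving the access time shortens the cycle from 15 clocks to 10 clocks, a gain of 1.5 rather than 2, because the processing time t.sub.p is unchanged.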
The presently described auxiliary mini-cache module provides a throughput enhancement system with methods and architecture for reducing the average memory access time, t.sub.a. It may be noted that the access time may involve several elements which include:
(i) the time required for bus arbitration and access grant;
(ii) the system bus protocols used; and
(iii) the memory module 40 Read cycle time.
The memory module Read-cycle time, item (iii), is the most significant element and is the focus of the presently described system.
One technique to reduce the memory cycle time is the standard technique of using a general cache memory module 14 as seen in FIG. 4. Here the general cache memory module 14 is inserted between the central processing module, with its processor 10, and the system bus 22, which provides a channel to and from the main memory 40.
A cache memory module such as the general cache 14 of FIG. 4 provides a very fast memory cycle and a very fast data access cycle, but it also involves very expensive types of storage units such as RAMs.
Cache units such as cache memory 14 are generally much smaller in addressable size than the main system memory 40. However, since processing is most often sequential or repetitive in nature, algorithms for cache design have been derived which fill the cache memory with the data words that the processor 10 is most "likely" to need on its next operation or within the next few operations.
The "access time" between the processor 10 and the cache memory 14 is much shorter than that which would be required to access main memory. Often, cache cycles are as short as a single clock time period.
By using appropriate cache hardware design algorithms, cache structure and cache size, the average "cache hit" rate may be as high as 80% to 90%. That is to say, 80% to 90% of the time the cache memory 14 already contains the data word which is needed by the processor 10.
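The effect of the hit rate on the average access time can be estimated with a simple weighted average, sketched below. This is a simplified model assumed for illustration. It charges a hit the cache access time and a miss the main memory access time, and it ignores any extra miss-handling overhead. The function name and the sample main memory time of 10 clocks are assumptions. The single-clock cache cycle and the 80% to 90% hit rates are the figures given in the text.

```python
def average_access_time(hit_rate, t_cache, t_memory):
    """Average access time in clocks under a simple hit/miss model.

    hit_rate  -- fraction of accesses satisfied by the cache (0.0 to 1.0)
    t_cache   -- access time on a cache hit
    t_memory  -- access time on a cache miss (main memory access)
    """
    if not 0.0 <= hit_rate <= 1.0:
        raise ValueError("hit_rate must lie between 0 and 1")
    return hit_rate * t_cache + (1.0 - hit_rate) * t_memory


# Compare the low and high ends of the hit-rate range from the text,
# using an assumed main memory access time of 10 clocks.
at_80_percent = average_access_time(0.80, t_cache=1, t_memory=10)
at_90_percent = average_access_time(0.90, t_cache=1, t_memory=10)
```

Under these assumptions the average access time falls from 10 clocks with no cache to roughly 2.8 clocks at an 80% hit rate and roughly 1.9 clocks at a 90% hit rate.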
FIG. 4 shows a generalized system where a central processing module 5 having a processor 10 communicates through an internal bus 12 to a general cache memory 14 which provides communication on a system bus 22 to the main system memory 40. This arrangement is commonly used to shorten the data access time for the processor, which in turn enhances the throughput of the system.
FIG. 5 shows a generalized system diagram where a central processing module 5 having a processor 10 can communicate over a system bus 22 to the main system memory 40. Additionally, other system modules such as input/output modules, other central processing modules, and other digital units designated by the block 30 may also communicate over the system bus to access main memory. Thus, at times, there will be contention for main memory between the processor 10 and the other system modules 30. Further, if one of the other system modules 30 is writing data over the system bus into main memory, then it is possible that the data in a cache memory unit could be invalid and not usable by the processor 10 in certain situations. Thus the presently described mini-cache has an Invalidation Block unit which prevents the processor from accessing data in the mini-cache when it is discovered that new data is being written over the system bus into the main system memory 40. This will later be described in connection with the mini-cache of FIG. 1.
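The general idea of invalidating cached data when another module writes to main memory can be sketched as follows. This is a generic software illustration only and is not the Invalidation Block of FIG. 1, whose structure is described later. The class name, the method names and the per-address invalidation policy are all assumptions.

```python
class SnoopingCacheSketch:
    """Hypothetical cache model that drops entries on observed bus writes."""

    def __init__(self):
        # Maps a memory address to the cached data word.
        self._lines = {}

    def fill(self, address, data):
        """Store a word fetched from main memory."""
        self._lines[address] = data

    def snoop_bus_write(self, address):
        """Another module was seen writing `address` into main memory.

        The cached copy, if any, may now be stale, so it is discarded.
        """
        self._lines.pop(address, None)

    def read(self, address):
        """Return (hit, data); a miss means main memory must be accessed."""
        if address in self._lines:
            return True, self._lines[address]
        return False, None
```

In this sketch, a read of an address that was filled and then written by another module reports a miss, so the processor is forced to fetch the fresh word from main memory rather than use the stale copy.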
While better system performance results from the addition of a cache memory module such as cache memory 14, this gain comes at a considerable system cost in money, hardware and PC board real estate. Thus it is also possible, in certain instances, to eliminate the general cache and substitute a less costly mini-cache.
The presently described mini-cache module enhances the processor function to provide a considerable improvement in access time and throughput by use of the described mini-cache alone or the mini-cache in combination with a general cache memory unit.