1. Field of the Invention
This invention relates generally to data processing systems and, more particularly, to data processing systems in which the central processing unit includes a cache memory unit. The present invention permits a write operation into the cache memory unit within the limits of a shortened system clock cycle.
2. Description of the Related Art
Referring to FIG. 1, a typical data processing system configuration is shown. The data processing system includes at least one central processing unit 10 (and 11), at least one input/output unit 13 (and 14), a main memory unit 15 and a system bus 19 coupling the plurality of units of the data processing system. The central processing unit processes groups of logic signals according to software and/or firmware instructions. The logic signal groups to be processed are typically stored in the main memory unit 15. A console unit 12, which can be coupled to the central processing unit(s) 10 (and 11), can include the apparatus and stored instructions to initialize the data processing system and can act as a terminal during the operation of the data processing system. The input/output unit(s) 13 (and 14) can provide an interface for the exchange of logic signal groups between the data processing system and terminal units, mass storage units, communication units, and any other units to be coupled to the data processing system.
Although the system shown in FIG. 1 can execute the procedures determined by the system programs, this system suffers from the separation of the main memory unit 15 and the central processing unit(s) 10 (and 11). This separation causes the logic signal groups required by the central processing unit(s) 10 (and 11) to be delayed in the transfer, thereby resulting in a negative impact on the system performance. In addition, the size of the main memory unit 15 required by programs typically used by the data processing system generally causes the main memory unit 15 to be implemented in a slower technology (i.e. for reasons of cost) and the consequential detrimental impact on performance can result even when the main memory unit 15 is closely associated with the central processing unit.
The solution typically used to resolve the conflict resulting from the need for a large memory unit and the need for rapid access to logic signal groups at reasonable cost is the use of the cache or buffer memory unit associated with each central processing unit. Referring to FIG. 2, the central processing unit 10 includes a cache memory unit 24 associated with the processing components of the central processing unit 10. The processing components include an instruction subunit 21 and an execution subunit 23. Also included in FIG. 2 is a control unit 22. The control unit 22 can be employed advantageously when the execution of an instruction by the central processing unit 10 is divided into a plurality of instruction segments permitting overlapping execution of instructions in a technique typically referred to as "pipelining" the execution of the instruction sequence. The advantage of this technique is that, even though a segmented instruction can take a longer time for its execution, consecutive instructions can be initiated after a period of time equal to the time assigned for execution of each instruction segment. Therefore the processing speed of the central processing unit 10 can be increased by assuming the penalty of increased complexity in the central processing unit 10. However, the time interval required for execution of each instruction segment must be chosen to accommodate the instruction segment requiring the longest time for execution. Because the cache memory unit 24 is a part of the central processing unit 10, the operation of this unit must be completed within the allotted time or else the time interval must be lengthened.
The cache memory unit 24 serves as an intermediate storage facility (between the main memory unit 15 and the execution portions of the central processing unit 10). The cache memory unit 24 stores the logic signal groups of most immediate importance to the execution portions of the central processing unit 10 to avoid the delays incurred in retrieving these logic signal groups from the main memory unit 15.
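The timing trade-off of the pipelining technique described above can be illustrated with a short calculation. The following is a minimal sketch only; the function name, the segment times, and the assumption that every instruction passes through every segment without stalls are hypothetical and are not taken from the related art.

```python
def pipeline_timing(segment_times_ns, n_instructions):
    """Compare unsegmented and pipelined execution under idealized assumptions.

    Assumption (hypothetical): every instruction uses every segment once and
    no stalls occur, so a new instruction can start every clock interval.
    """
    # The clock interval must accommodate the slowest instruction segment.
    cycle = max(segment_times_ns)
    n_segments = len(segment_times_ns)
    # One segmented instruction takes n_segments full intervals, which can
    # be longer than the sum of the individual segment times.
    latency = cycle * n_segments
    # Without overlap, each instruction runs to completion before the next.
    unpipelined_total = sum(segment_times_ns) * n_instructions
    # With overlap, the first instruction fills the pipeline and each
    # following instruction completes one interval later.
    pipelined_total = cycle * (n_segments + n_instructions - 1)
    return cycle, latency, unpipelined_total, pipelined_total


# Hypothetical segment times; the 15 ns segment could be a cache operation.
cycle, latency, unpipelined, pipelined = pipeline_timing([10, 15, 12], 100)
print(cycle, latency, unpipelined, pipelined)
```

In this sketch, shortening any segment other than the slowest one does not shorten the clock interval, which is why a slow cache operation constrains the entire central processing unit.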
Referring next to FIG. 3a, a typical implementation of a cache memory unit 24 according to the related art is shown. Groups of logic signals representing data to be manipulated by the central processing unit 10 are applied to data-in storage unit 31. Mask signals can also be applied to data-in storage unit 31, the mask signals identifying a selected portion of an associated data logic signal group. Groups of logic signals representing addresses of associated data signal groups are applied to address-in storage unit 32. Storage units 31 and 32 can be implemented by latch-type circuits, flip-flop type circuits, register circuits, portions of another circuit or any circuit that can provide a buffering signal storage function for the remainder of the cache memory unit 24. A first portion of the output signals from the address-in storage unit 32 is applied to the address-in terminals of the tag storage unit 33 and to the address-in terminals of the data storage unit 34. A second portion of the output signals from the address-in storage unit 32 is applied to the data-in terminals of the tag storage unit 33. The output signals from data-in storage unit 31 are applied to the data-in terminals of the data storage unit 34. The tag storage unit 33 and the data storage unit 34 are comprised of groups of storage cells, the number and electrical coupling of the storage cell groups permitting the first portion of the address signals from address-in storage unit 32 to address the designated storage cell group in units 33 and 34. The number of storage cells in each group in the tag storage unit 33 must be sufficient to accommodate the second portion of the address signal group (less any address signals accommodated by storing a plurality of data signal groups at a given address) plus any status logic signals that can be associated with each data signal group to be stored. The number of storage cells in each group in the data storage unit 34 must be sufficient to store the number of logic signals associated with each address. The output terminals of the data storage unit 34 are coupled to the data-out storage unit 36. The output terminals of the tag storage unit 33 and the output signals from the second portion of the address signals in the address-in storage unit 32 are coupled to comparator unit 35.
Referring now to FIG. 3b, the division of an address signal group is illustrated. The first portion of the address signal group is the index address field and addresses associated storage cell groups in both the tag storage unit 33 and the data storage unit 34. The second portion of the address signal group, referred to in FIG. 3b as the tag or comparison address field, is typically the remainder of the address associated with a data signal group and is stored in the tag storage unit 33 at the same index address as the data signal group associated with the complete address. The dashed line cells at the end of the address signal group in FIG. 3b illustrate that when a storage cell group in the data storage unit 34 stores a plurality of the smallest addressable signal groups, then the retention of the least significant address bits is redundant. It will also be clear to those familiar with the functioning of cache memory units that the index field in the address signal group need not be the least significant address bits in the address signal group as shown in FIG. 3b, but can be selected to implement any of a number of data signal group storage strategies.
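The division of an address signal group into fields can be sketched in software as follows. This is a minimal illustration under stated assumptions: the names `AddressFields` and `split_address`, the field widths, and the placement of the index field immediately above the redundant least significant bits are hypothetical choices made for the example, and, as noted above, other placements of the index field are possible.

```python
from typing import NamedTuple


class AddressFields(NamedTuple):
    tag: int     # comparison address field, kept in the tag storage unit
    index: int   # index address field, selects a storage cell group
    offset: int  # least significant bits, redundant when a group holds
                 # several of the smallest addressable signal groups


def split_address(address: int, index_bits: int, offset_bits: int) -> AddressFields:
    """Split an address into tag, index, and offset fields.

    Assumption (hypothetical layout): tag | index | offset, most
    significant bits first.
    """
    offset = address & ((1 << offset_bits) - 1)
    index = (address >> offset_bits) & ((1 << index_bits) - 1)
    tag = address >> (offset_bits + index_bits)
    return AddressFields(tag, index, offset)


# Hypothetical widths: 8 index bits (256 groups) and 2 offset bits.
fields = split_address(0x12345, index_bits=8, offset_bits=2)
print(hex(fields.tag), hex(fields.index), fields.offset)
```

With these widths, two addresses that differ only in their tag field select the same storage cell group, which is why the tag must be stored and compared.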
A "read" operation, retrieving information from the cache memory unit 24, can be understood as follows. The address signal group, associated with the data signal group selected for retrieval, is entered in the address-in storage unit 32. The index address field of the address signal group is applied to the address-in terminals of the tag storage unit 33 and to the address-in terminals of the data storage unit 34. Because the "read" signal is applied to the tag storage unit 33 and the data storage unit 34, the group of signals stored at the index address of the tag storage unit 33, i.e. the comparison address field, is entered in the comparator unit 35, while the group of signals stored at the index address of the data storage unit 34, i.e. the data signal group associated with the index plus comparison address fields, is entered in data-out storage unit 36. Simultaneously, the tag or comparison address field of the address signal group stored in the address-in storage unit 32 is entered in the comparator unit 35 and compared with the comparison address field retrieved from the tag storage unit 33. When the comparison is positive, then the data signal group in the data-out storage unit 36 is the selected signal group. This result is communicated to the data-out storage unit 36 by means of a signal generally referred to as a "hit" signal. In order to minimize the effect of the extra time required to perform the comparison, the data signal group can be transferred to other apparatus and the "hit" signal (or the absence of a "hit" signal) can be used to control the transfer of the selected data signal group at a different location of the central processing unit 10. When the comparison is negative, i.e. the selected data signal group is not in the data storage unit 34 (and consequently not in data-out storage unit 36), then the selected data signal group must be retrieved from the main memory unit 15.
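The read operation described above can be modeled with a short sketch. The model is an assumption-laden illustration, not the circuit of FIG. 3a: the names `make_cache` and `cache_read`, the 4-bit index width, and the use of Python lists for the tag and data storage units are all hypothetical.

```python
INDEX_BITS = 4  # hypothetical: 16 storage cell groups


def make_cache():
    """Create empty tag and data storage, one entry per index address."""
    size = 1 << INDEX_BITS
    return {"tag": [None] * size, "data": [None] * size}


def cache_read(cache, address):
    """Return (data_out, hit) for a read of the given address.

    The tag and data storage are both addressed by the index field, so
    the data signal group is available without waiting for the
    comparison; the hit flag only says whether that group is the
    selected one.
    """
    index = address & ((1 << INDEX_BITS) - 1)
    tag = address >> INDEX_BITS
    stored_tag = cache["tag"][index]   # goes to the comparator
    data_out = cache["data"][index]    # goes to data-out storage
    hit = stored_tag == tag            # comparator result
    return data_out, hit


cache = make_cache()
cache["tag"][3] = 0x2
cache["data"][3] = "payload"
print(cache_read(cache, 0x23))  # same index, matching tag
print(cache_read(cache, 0x33))  # same index, different tag
```

In the second call the data signal group is still delivered, but the absence of a "hit" indicates that it must be discarded and the selected group retrieved from main memory.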
A "write" operation, in which a data signal group is stored in the cache memory unit 24, is implemented as follows. An address signal group is entered in address-in storage unit 32, while an associated data signal group is entered in the data-in storage unit 31. The index portion of the address signal group is applied to the address terminals of the tag storage unit 33 and the comparison portion of the address signal group is entered in comparator unit 35. The tag address portion stored in the storage cell group of tag storage unit 33, identified by the index portion, is compared in comparator unit 35 with the comparison portion of the address signal group that was entered in the comparator unit 35. If the result of this comparison is that the tag address groups are the same, then the "hit" signal activates the write terminal of the data storage unit 34, to which a "write" signal has been applied, and the data signal group stored in the data-in storage unit 31 is entered in the data storage unit 34 at the location defined by the index signal group stored in the address-in storage unit 32. If a "hit" signal is not generated, then the data signal group is stored in the main memory unit 15 (see FIG. 1) at the address specified by the associated address signal group. It will be clear that the write operation takes a substantially longer time to perform than the read operation because the storage of the data signal group can take place only after the comparison of the tag signal groups has been completed. An attempt to increase the speed of the operation of the cache memory unit 24 in a central processing unit 10 having a control unit 22, such as is shown in FIG. 2, is limited by the sequential nature of the write operation.
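The sequential nature of the write operation can be sketched as follows. This is a minimal model under stated assumptions: the function names, the 4-bit index width, the dictionary standing in for main memory, and the simple additive timing model are hypothetical and serve only to show why the write path is longer than the read path.

```python
def cache_write(tag_store, data_store, main_memory, address, value, index_bits=4):
    """Write a value, storing it in the cache only after a tag match."""
    index = address & ((1 << index_bits) - 1)
    tag = address >> index_bits
    # Step 1: the tag comparison must finish before anything is stored.
    hit = tag_store[index] == tag
    if hit:
        # Step 2: only now is the data storage write enabled.
        data_store[index] = value
    else:
        # No "hit": the data signal group goes to main memory instead.
        main_memory[address] = value
    return hit


def read_time(t_tag, t_compare, t_data):
    """Assumed model: tag lookup plus compare overlaps the data access."""
    return max(t_tag + t_compare, t_data)


def write_time(t_tag, t_compare, t_data):
    """Assumed model: the data access waits for the compare to finish."""
    return t_tag + t_compare + t_data


tag_store, data_store, main_memory = [None] * 16, [None] * 16, {}
tag_store[5] = 0x7
print(cache_write(tag_store, data_store, main_memory, 0x75, "new"))
print(read_time(5, 2, 5), write_time(5, 2, 5))  # hypothetical times in ns
```

Under this assumed model the read completes in the longer of the two parallel paths, whereas the write must add the data storage access to the tag lookup and comparison, which is the limitation described above.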
A need has therefore been felt for apparatus and method for a cache memory unit capable of operation consistent with the system clock cycles of a central processing unit having a segmented or pipelined execution of an instruction sequence.