The present invention relates to a method of referring at high speed to cache memories among a plurality of processors, and to a data processing system using this method. In particular, the present invention relates to a cache memory reference control system and a data processing system suitable for a configuration in which a plurality of processors share a memory system.
In a data processing system including a plurality of processors each having a cache memory, it is necessary to conduct cache coherency control so that the contents of the cache memories do not contradict one another among the processors.
As heretofore well known, cache coherency control methods are broadly classified into store-through (referred to as write-through as well) protocols and store-in (referred to as write-back or copy-back as well) protocols.
The store-through protocol is a method in which, when a processor writes data into its cache memory, a lower order hierarchy memory (such as a main storage) is updated as well.
The store-in protocol is a method in which, even when a processor has written data into its cache memory, control is conducted so that the lower order hierarchy memory is not updated until an access request from another processor occurs.
The store-through protocol is simpler in control and easier in implementation than the store-in protocol. In the store-through protocol, however, the frequency of access to the lower order hierarchy memory increases. If the number of processors is increased, therefore, performance degradation caused by contention poses a problem.
In the case of the store-in protocol, the control becomes complicated, but the frequency of write access to the lower order hierarchy memory can be decreased. If the store path throughputs are the same, therefore, the store-in protocol becomes higher in performance than the store-through protocol.
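The difference in write traffic between the two protocols can be illustrated with a minimal sketch. The class names (`Memory`, `StoreThroughCache`, `StoreInCache`) and the `write_back` trigger are assumptions made for illustration only and do not appear in the prior art described here.

```python
class Memory:
    """Lower order hierarchy memory (e.g. main storage) that counts writes."""

    def __init__(self):
        self.data = {}
        self.writes = 0

    def write(self, addr, value):
        self.data[addr] = value
        self.writes += 1


class StoreThroughCache:
    """Hypothetical cache: every store also updates the lower memory."""

    def __init__(self, memory):
        self.memory = memory
        self.lines = {}

    def store(self, addr, value):
        self.lines[addr] = value
        self.memory.write(addr, value)  # memory updated on every store


class StoreInCache:
    """Hypothetical cache: stores stay in the cache until written back."""

    def __init__(self, memory):
        self.memory = memory
        self.lines = {}
        self.dirty = set()

    def store(self, addr, value):
        self.lines[addr] = value
        self.dirty.add(addr)  # memory update is deferred

    def write_back(self, addr):
        """Assumed to be called when another processor requests the line."""
        if addr in self.dirty:
            self.memory.write(addr, self.lines[addr])
            self.dirty.discard(addr)


# Ten stores to the same address under each protocol.
mem_st, mem_si = Memory(), Memory()
st, si = StoreThroughCache(mem_st), StoreInCache(mem_si)
for v in range(10):
    st.store(0x100, v)
    si.store(0x100, v)
si.write_back(0x100)  # a single deferred update carries the final value
```

In this sketch the store-through memory receives ten writes, whereas the store-in memory receives one write carrying the final value.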
The control complexity of the store-in protocol is caused by the coherency control conducted when writing data into a cache. Where coherency control is not required, therefore, store-in control becomes equal in complexity to store-through control.
Therefore, the following system has been contrived. In this system, a memory space is divided into a common region shared by a plurality of processors and dedicated regions respectively dedicated to single processors. In the common region, cache coherency control is required. In the dedicated regions, cache coherency control is not required. For the common region, the store-through protocol is used. For the dedicated regions, the store-in protocol is used.
For example, in a cache coherency control scheme disclosed in JP-A-1-26643, accessed regions are divided into a common region accessed in common from all processors and individual regions accessed individually from respective processors. For the common region, the write-through (store-through) system is used. For the individual regions, the write-back (store-in) system is used.
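A fixed regional division of this kind can be sketched as an address range check. The boundary values `COMMON_BASE` and `COMMON_LIMIT` and the function name `write_policy_for` are hypothetical and chosen only for illustration; the cited schemes do not specify them here.

```python
from enum import Enum


class WritePolicy(Enum):
    STORE_THROUGH = "store-through"
    STORE_IN = "store-in"


# Hypothetical fixed partition: one address range is the common region,
# and every other address belongs to a dedicated region.
COMMON_BASE = 0x8000_0000
COMMON_LIMIT = 0x9000_0000


def write_policy_for(addr: int) -> WritePolicy:
    """Select the write policy from the address alone (static division)."""
    if COMMON_BASE <= addr < COMMON_LIMIT:
        return WritePolicy.STORE_THROUGH  # common region
    return WritePolicy.STORE_IN  # dedicated region
```

Because the decision depends only on the address, it cannot reflect how the data is actually used while a program runs.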
Other related processors of this kind are described in U.S. Pat. No. 5,140,681, JP-A-2-226449, U.S. Pat. No. 5,353,428, JP-A-4-123151, JP-A-4-296950, JP-A-5-73416, and U.S. Pat. No. 5,572,667.
Apart from the classification of the store-in protocol and store-through protocol, cache coherency control can also be classified into two types: invalidation type protocol and update type protocol.
When data is written into a certain cache memory, a copy of the data subjected to the alteration is in some cases present in another cache memory as well. In such a case, the invalidation type protocol resolves the contradiction by invalidating the copy.
In the same situation, the update type protocol resolves the contradiction by updating the copy with the altered data.
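The two ways of handling a copy held in another cache can be sketched as follows. The `Cache` class and the two helper functions are illustrative assumptions; a line absent from `lines` is treated as invalid.

```python
class Cache:
    """Hypothetical cache holding address -> value; absent means invalid."""

    def __init__(self, name):
        self.name = name
        self.lines = {}


def write_invalidate(writer, peers, addr, value):
    """Invalidation type: copies in other caches are discarded."""
    writer.lines[addr] = value
    for peer in peers:
        peer.lines.pop(addr, None)


def write_update(writer, peers, addr, value):
    """Update type: existing copies in other caches receive the new value."""
    writer.lines[addr] = value
    for peer in peers:
        if addr in peer.lines:
            peer.lines[addr] = value


# Invalidation type: cache B loses its copy after cache A writes.
a1, b1 = Cache("A"), Cache("B")
a1.lines[0x40] = b1.lines[0x40] = 1
write_invalidate(a1, [b1], 0x40, 2)

# Update type: cache B keeps a copy that now holds the altered data.
a2, b2 = Cache("A"), Cache("B")
a2.lines[0x40] = b2.lines[0x40] = 1
write_update(a2, [b2], 0x40, 2)
```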
The invalidation type protocol is simpler in control and easier in implementation than the update type protocol.
Performance depends on the memory access pattern and other factors, so it cannot be said unconditionally which protocol is advantageous. For bringing the contents of the cache memories of the processors into the same state, however, the update type protocol is faster because it does not pass through a lower order hierarchy memory.
In general, the invalidation type protocol is adopted in many cases. However, a system in which both protocols are used together has also been contrived.
For example, in a system disclosed in JP-A-8-95862, a memory space is divided into a high-reliability area and a normal area. Control is conducted by using the update type protocol for the high-reliability area and the invalidation type protocol for the normal area.
In the conventional technique in which switchover between the store-in protocol and the store-through protocol is conducted, specification of the common region and the dedicated regions serving as the criterion of the switchover is based on a memory address range or on discrimination of processors. The division into the two kinds of regions is fixed.
Furthermore, in the conventional technique in which switchover between the invalidation type protocol and the update type protocol is conducted, the switchover is based on a fixed regional partitioning that is in turn based on an address range.
When the cache coherency control protocol is switched on the basis of these fixed divisions, the switchover cannot function sufficiently if the regional partitioning cannot be determined until a program is executed.
For example, suppose it cannot be determined by static analysis alone, before program execution, whether certain data belongs to a dedicated region or to the common region. The data might actually be used exclusively by one processor, in which case it should be processed with the store-in protocol. Even in this case, the data must be handled as common data with the store-through protocol. As a result, performance degradation occurs.
Furthermore, for an existing program in which no definite partition into dedicated regions and the common region has been made, it is difficult in practice to conduct the division statically.
In the conventional technique, regional division serving as the criterion of switchover of cache coherency control is conducted fixedly. This results in a problem that it is impossible to cope sufficiently with the dynamic behavior of a program, or with an existing program written without regard to the above described regional division.
In the conventional technique, the invalidation type protocol is widely used because it can be implemented easily. In the case where the invalidation type protocol is applied to the common region, however, performance degradation due to mutual invalidation becomes pronounced. This is another problem.
This problem is caused by the following phenomenon. A plurality of processors conduct writing into the common region. As a result, data invalidation is conducted mutually. Resultant frequent access to the lower order hierarchy memory causes the above described performance degradation.
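The phenomenon can be reproduced with a small simulation in which two processors alternately increment one shared line. The function `count_memory_fetches`, the dictionary based caches, and the assumption that every write also updates the lower memory are simplifications introduced for illustration only.

```python
def count_memory_fetches(protocol: str, rounds: int) -> int:
    """Two processors alternately increment one shared line.

    Returns how many times a processor had to fetch the line from the
    lower order hierarchy memory because it held no valid copy.
    """
    memory = {"line": 0}
    caches = [{}, {}]  # one dictionary per processor; absent means invalid
    fetches = 0
    for i in range(rounds):
        me, other = i % 2, 1 - (i % 2)
        if "line" not in caches[me]:  # miss: invalidated or never loaded
            caches[me]["line"] = memory["line"]
            fetches += 1
        caches[me]["line"] += 1
        # Assumption: memory is updated on every write to keep the sketch short.
        memory["line"] = caches[me]["line"]
        if protocol == "invalidate":
            caches[other].pop("line", None)  # discard the other copy
        elif protocol == "update":
            if "line" in caches[other]:
                caches[other]["line"] = caches[me]["line"]  # refresh the copy
        else:
            raise ValueError(f"unknown protocol: {protocol}")
    return fetches


invalidate_fetches = count_memory_fetches("invalidate", rounds=100)
update_fetches = count_memory_fetches("update", rounds=100)
```

Under these assumptions the invalidation type protocol causes a fetch on every one of the 100 accesses, while the update type protocol needs only the two initial fetches.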
In the conventional technique, the reference pattern to the common region, in which access to the same region is repeated, is disregarded, and the invalidation type protocol is used for both the common region and the dedicated regions. This results in a problem that performance degradation is caused by mutual invalidation in the common region.
An object of the present invention is to provide a method of dynamically specifying the dedicated regions and the common region, thereby always conducting optimum cache coherency control, and to solve the problems caused by fixed regional division in the conventional technique.
Another object of the present invention is to solve the problems concerning mutual invalidation in the conventional technique by applying an optimum control system to the common region.
A representative mode of the present invention has the following configuration.
In a data processing system including a plurality of cache memories, such as a data processing system including a plurality of processors each having a cache memory, a cache history manager connected to respective caches and a coherency manager for conducting coherency control of respective caches are provided.
Each cache includes a cache data array and a cache directory. In the cache directory, line (or block or the like) status information is stored.
In response to a cache access request, a cache outputs the cache access request and cache status information to the cache history manager. Thereupon, the cache history manager generates new cache status information for the accessed line on the basis of the input from the cache, returns the new cache status information to the cache, judges an attribute (a dedicated region or a common region) of the line on the basis of the generated cache status information, and delivers the judged attribute to the coherency manager.
The attribute is judged to be the common region only in the case where a line shared by a plurality of L2 caches in the past has once been invalidated by the invalidation type protocol and is then accessed again. Otherwise, the attribute is judged to be a dedicated region.
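The judgment rule above can be sketched as a small state tracker. This is a minimal illustration under stated assumptions, not the encoding used in the embodiments: the `LineHistory` fields, the method names `on_access` and `on_invalidate`, and the choice to keep the history in one table rather than in each cache directory are all hypothetical. The sketch also assumes that an access by any cache after the invalidation counts as "accessed again", and that a line judged common remains common.

```python
from dataclasses import dataclass, field
from enum import Enum


class Attribute(Enum):
    DEDICATED = "dedicated"
    COMMON = "common"


@dataclass
class LineHistory:
    holders: set = field(default_factory=set)  # caches currently holding the line
    was_shared: bool = False  # held by two or more caches at some time
    invalidated_after_sharing: bool = False  # a copy was invalidated after sharing


class CacheHistoryManager:
    """Hypothetical history tracker keyed by line address."""

    def __init__(self):
        self.history = {}

    def on_invalidate(self, line, cache_id):
        """Record that cache_id lost its copy through invalidation."""
        h = self.history.setdefault(line, LineHistory())
        if cache_id in h.holders:
            h.holders.discard(cache_id)
            if h.was_shared:
                h.invalidated_after_sharing = True

    def on_access(self, line, cache_id) -> Attribute:
        """Judge the attribute from past history, then record this access."""
        h = self.history.setdefault(line, LineHistory())
        if h.invalidated_after_sharing:
            attribute = Attribute.COMMON
        else:
            attribute = Attribute.DEDICATED
        h.holders.add(cache_id)
        if len(h.holders) >= 2:
            h.was_shared = True
        return attribute


mgr = CacheHistoryManager()
first = mgr.on_access(0x80, "cache0")    # no history yet
second = mgr.on_access(0x80, "cache1")   # line becomes shared by two caches
mgr.on_invalidate(0x80, "cache1")        # copy in cache1 is invalidated
third = mgr.on_access(0x80, "cache1")    # shared, invalidated, accessed again
private = mgr.on_access(0xC0, "cache0")  # a line touched by one cache only
```

In this usage only the third access to line `0x80` is judged to be the common region; the other accesses are judged to be a dedicated region.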
In the case where the attribute is a common region, the coherency manager uses an update type protocol or a store-through protocol. In the case where the attribute is a dedicated region, the coherency manager uses an invalidation type protocol or a store-in protocol.
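The selection made by the coherency manager amounts to a lookup from attribute to protocol. The table below is a sketch only; the names `PROTOCOL_OPTIONS` and `protocol_options` are hypothetical, and each entry lists the two alternatives named in the text without choosing between them.

```python
from enum import Enum


class Attribute(Enum):
    DEDICATED = "dedicated"
    COMMON = "common"


# For each attribute, the alternative protocols named in the text.
PROTOCOL_OPTIONS = {
    Attribute.COMMON: ("update", "store-through"),
    Attribute.DEDICATED: ("invalidation", "store-in"),
}


def protocol_options(attribute: Attribute) -> tuple:
    """Return the protocols the coherency manager may use for a line."""
    return PROTOCOL_OPTIONS[attribute]
```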
Other modes of the present invention will be made clear in the description of preferred embodiments.