The present invention relates to a data processor including cache memories, and particularly to a technique for easily changing the structure or function of a cache memory according to the requested specification, for example, a technique that is effectively applied to a RISC (Reduced Instruction Set Computer) processor for embedded device control.
Almost all data processors having the RISC architecture employ pipelining to execute virtually one instruction per clock cycle (one pipeline stage) to realize high-speed data processing, and moreover incorporate cache memories to realize high-speed access to operands and instructions. Such a RISC processor is described, for example, on pages 79 to 92 of Nikkei Electronics, No. 601 (issued on Feb. 14, 1994 by Nikkei BP).
As described in the above reference, the cache miss rate of a cache memory is correlated with the cache memory size and the cache line size.
Moreover, cache memories are classified into the direct map structure and the set associative structure. The direct map structure selects a cache line using part of the given address signal as an index address and compares the cache tag included in the selected cache line with the remaining upper bits of the address signal. When these match, the data included in the selected cache line is used. In the set associative structure, direct map structures are arranged in parallel so that a plurality of cache lines are provided for one index address, and for this reason its cache hit rate is considered higher than that of the direct map structure.
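The two structures can be illustrated with a minimal Python sketch. The geometry (16-byte lines, 64 lines per array), the class names and the round-robin replacement are assumptions chosen only for illustration; they do not come from the reference cited above.

```python
# Hypothetical geometry: 16-byte lines, 64 lines per array (1 KB per array).
LINE_BYTES = 16
NUM_LINES = 64

def split_address(addr):
    """Split an address into (tag, index, offset)."""
    offset = addr % LINE_BYTES
    index = (addr // LINE_BYTES) % NUM_LINES
    tag = addr // (LINE_BYTES * NUM_LINES)
    return tag, index, offset

class DirectMappedCache:
    def __init__(self):
        # One (valid, tag) entry per index address.
        self.lines = [(False, 0)] * NUM_LINES

    def lookup(self, addr):
        tag, index, _ = split_address(addr)
        valid, stored = self.lines[index]
        return valid and stored == tag

    def fill(self, addr):
        tag, index, _ = split_address(addr)
        self.lines[index] = (True, tag)

class SetAssociativeCache:
    """Direct map arrays arranged in parallel, one per way."""
    def __init__(self, ways):
        self.ways = [DirectMappedCache() for _ in range(ways)]
        self.next_way = 0  # round-robin replacement (an assumption)

    def lookup(self, addr):
        return any(way.lookup(addr) for way in self.ways)

    def fill(self, addr):
        self.ways[self.next_way].fill(addr)
        self.next_way = (self.next_way + 1) % len(self.ways)
```

In this sketch, two addresses that share an index address but differ in tag evict each other in the direct map structure, whereas a 2-way set associative structure can hold both, which is the property behind its higher hit rate.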
Whether the direct map or the set associative structure should be used for a cache memory, and what capacity the cache memory should have, are often determined by the application field of the data processor comprising the cache memories. In particular, in a field such as embedded device control, particular peripheral circuit modules may also have to be incorporated depending on the requested specifications, so it is preferable to minimize the chip area occupied by the cache memories. Moreover, the power consumption of the cache memories must sometimes be considered.
According to the inventors' study of cache memories built into data processors, when a product is derived from a data processor incorporating 4-way set associative cache memories of 4K bytes in total, it is substantially impossible, for the purpose of reducing chip size, to separate some of the ways from the existing design data (module library data) of those cache memories and use them independently as a direct map cache memory; a complete redesign of the cache memory is therefore inevitably required. In addition, since a set associative cache memory operates a plurality of ways simultaneously, it consumes more power than a direct map cache memory.
Moreover, a direct map cache memory whose capacity corresponds to one way of the 4-way set associative cache memory naturally occupies less chip area and consumes less power than the set associative type, but its cache hit rate is lower. Even when an attempt is made to increase the cache capacity in a data processor product comprising such a direct map cache memory, the storage capacity cannot simply be increased, and a complete redesign of the cache memory is again required.
The inventors have found that, when a product of a data processor comprising cache memory is offered, increasing or decreasing the capacity of the original cache memory, or changing its structure from the direct map type to the set associative type or vice versa, requires redesigning the cache line array of the original cache memory, so that such a product cannot easily be provided.
Moreover, regarding cache memories, techniques for assigning cache memory to logical pages of virtual memory or to the attributes of tasks in a multi-user or multi-task system are described, for example, in Japanese Published Unexamined Patent Application Nos. SHO 55-8628(1980), SHO 62-276644(1987), HEI 4-49446(1992) and SHO 62-145341(1987). These techniques are not intended to reuse the design resources of the cache memory as much as possible when the function or structure of the built-in cache memory is changed in offering a product of the data processor as explained above. The cache memory assigned to logical pages, tasks or the like is a single, fixed cache memory and has no means for easily changing its function or structure.
It is therefore an object of the present invention to provide a data processor in which the functions of the built-in cache memories can be changed while the data processor is mounted in a system.
It is another object of the present invention to provide a technique for easily changing the function or structure of a cache memory in order to provide a product of a data processor comprising cache memory.
It is a further object of the present invention to provide a data processor and a data processing system in which data processing capability and power consumption can easily be optimized from the point of view of the function and structure of the cache memory.
The aforementioned and further objects and novel features of the present invention will become apparent from the following description of the present specification.
The data processor of the present invention comprises a central processing unit; a plurality of area designating means for variably designating the location and size of an address area in the memory space managed by the central processing unit; detecting means, one provided for each area designating means, for detecting access by the central processing unit to the address area designated by the corresponding area designating means; a plurality of cache memories, each provided for a respective one of the detecting means and coupled with the central processing unit via an internal bus; and a cache control means for controlling the respective cache memories on the basis of the cache hit/miss determination result of each cache memory and the detection result of the detecting means. For example, this data processor is formed on one semiconductor substrate.
The number of cache memories employed in the data processor of the present invention can be determined freely within the range allowed by the chip area. The logic of the cache controller changes only slightly with the number of cache memories employed. Therefore, when the data processor is developed into a product (built-in circuits are added, or the functions of built-in circuits are changed, according to the specifications requested by users), the address array and data array forming the cache lines of the cache memory need not be designed anew from the beginning.
Moreover, since the address range in which each cache memory functions can be varied, a cache memory can be made to function for a separate address area, for example, for each task or each group of tasks. Therefore, a situation in which cache misses occur continuously at the time of task switching can be eliminated, and data processing efficiency can be improved by utilizing the capability of the cache memory to the fullest extent. Such assignment is not fixed by hardware. In other words, the cache memory used for the program area of a task can be varied by the setting of the area designating means. Accordingly, the assignment of cache memory can easily be optimized for the relevant system when the data processor is developed into a product for embedded device control. Such assignment of areas can also be applied to the data area or the like used by each task.
The area designating means is provided with a register means for designating the location and size of the address area, and the central processing unit can set the register means according to its operation program. The tasks and data blocks to which cache memory is assigned can therefore be varied while the data processor is mounted in the system, by software, namely by the operation program executed by the CPU.
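The relationship between the register means and the detecting means can be sketched as follows. This is a minimal Python model, assuming the area is expressed as a base address plus a size in bytes; the names AreaRegister, set and detects are hypothetical and the actual register format is not specified here.

```python
class AreaRegister:
    """Hypothetical area designating register: location (base) and size."""
    def __init__(self, base=0, size=0):
        self.base = base
        self.size = size

    def set(self, base, size):
        # Models the CPU rewriting the register from its operation program.
        self.base = base
        self.size = size

def detects(reg, addr):
    """Detecting means: True when addr lies inside the designated area."""
    return reg.base <= addr < reg.base + reg.size

# Usage: assign a cache to one task's program area, then reassign it.
reg = AreaRegister()
reg.set(0x1000, 0x0800)      # area 0x1000 .. 0x17FF
in_first_area = detects(reg, 0x1400)
reg.set(0x8000, 0x0400)      # same cache now serves a different area
in_second_area = detects(reg, 0x1400)
```

Because the area is held in a writable register rather than fixed in logic, the same cache hardware can serve a different task or data block after a single register write.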
Unless otherwise restricted, the plurality of area designating means can designate address areas that overlap one another. When the individual cache memories are of the direct map type, the plurality of cache memories for which overlapped address areas are set function substantially in the same manner as a set associative cache memory in the area where the address areas overlap. When the individual cache memories are of the n-way set associative type, the m cache memories for which overlapped address areas are set function substantially in the same manner as an (m×n)-way set associative cache memory in the area where the address areas overlap. As explained above, when the address areas are designated so as to overlap partially, the functions of the plurality of cache memories can be varied to improve the cache hit rate. Moreover, such functions can also be determined by software as explained above. When it is determined in advance which processing routine is located in which address area and at what processing rate each routine must run to obtain the necessary data processing capability, the cache object areas can be assigned so that a plurality of cache memories are combined to operate as a set associative cache for the task or data area that particularly requires a higher processing rate. Thereby, the system can be optimized by improving the cache hit rate of the necessary areas.
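The effect of overlapping areas on associativity can be expressed as a small Python sketch. It assumes each area is a (base, size) pair and that every cache memory has the same number of ways n; the function name effective_ways is hypothetical.

```python
def effective_ways(areas, n_ways, addr):
    """Ways available at addr: m caches cover it, each n-way, giving m * n."""
    m = sum(1 for base, size in areas if base <= addr < base + size)
    return m * n_ways

# Two caches whose areas overlap in 0x1000 .. 0x1FFF (illustrative values).
areas = [(0x0000, 0x2000), (0x1000, 0x2000)]
```

With direct map caches (n = 1), an address in the overlapped range sees 2 ways while an address covered by only one area sees 1; with 2-way caches the overlapped range behaves like a 4-way set associative cache.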
Each cache memory explained above outputs the hit data to the internal bus when a cache hit occurs. When a cache miss occurs, the cache control means explained above performs, on one cache memory only, the cache filling operation that adds the data related to the miss to a cache line as a new entry. Therefore, even when the designated areas of a plurality of cache memories overlap, a cache hit is obtained exclusively in one cache memory; a cache hit is never determined in a plurality of cache memories at the same time.
When one detecting means has detected access by the central processing unit to the designated address area and the cache memory corresponding to that detecting means has generated a cache miss, the cache control means performs the cache filling operation on the cache memory corresponding to that detecting means.
When a plurality of detecting means have detected access by the central processing unit to the designated address areas and all the cache memories corresponding to those detecting means have generated a cache miss, the cache control means performs the cache filling operation on any one of those cache memories.
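The hit and fill rules above can be modelled with a short Python sketch. The direct map geometry, the round-robin choice among candidate cache memories, and the 'bypass' result for an access outside every designated area are assumptions of this sketch; the text only requires that the fill goes to one of the cache memories whose detecting means detected the access.

```python
LINE_BYTES = 16   # hypothetical line size
NUM_LINES = 64    # hypothetical number of lines per cache

class Cache:
    """Direct map cache with its own designated address area."""
    def __init__(self, base, size):
        self.base, self.size = base, size
        self.tags = [None] * NUM_LINES

    def detects(self, addr):
        return self.base <= addr < self.base + self.size

    def _split(self, addr):
        line = addr // LINE_BYTES
        return line // NUM_LINES, line % NUM_LINES  # (tag, index)

    def hit(self, addr):
        tag, index = self._split(addr)
        return self.tags[index] == tag

    def fill(self, addr):
        tag, index = self._split(addr)
        self.tags[index] = tag

class CacheController:
    def __init__(self, caches):
        self.caches = caches
        self.turn = 0  # round-robin pointer (selection rule is an assumption)

    def access(self, addr):
        """Return ('hit', i), ('fill', i) or ('bypass', None)."""
        selected = [i for i, c in enumerate(self.caches) if c.detects(addr)]
        if not selected:
            return ('bypass', None)
        hits = [i for i in selected if self.caches[i].hit(addr)]
        if hits:
            return ('hit', hits[0])
        # All selected caches missed: fill exactly one of them.
        target = selected[self.turn % len(selected)]
        self.turn += 1
        self.caches[target].fill(addr)
        return ('fill', target)
```

Since a line is filled into only one cache memory, a later access to the same address can hit in at most one of them, which is why the hit outputs never collide on the internal bus in this model.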
If the index operation and the cache hit/miss determining operation of a cache memory are enabled only when the corresponding detecting means has detected access to the designated address area, the power consumed by the cache line selecting operation can be reduced for the plurality of cache memories as a whole.
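The saving can be made concrete by counting how many cache memories perform an index operation over an address trace. The following Python sketch is only a rough proxy: it counts activations, not actual power, and the areas and trace values are invented for illustration.

```python
def count_activations(areas, trace, gated=True):
    """Count cache index operations performed over an address trace.

    gated=True  : a cache is indexed only if its area contains the address.
    gated=False : every cache is indexed on every access.
    """
    total = 0
    for addr in trace:
        for base, size in areas:
            if not gated or base <= addr < base + size:
                total += 1
    return total

# Four caches with non-overlapping 4 KB areas (illustrative values).
areas = [(0x0000, 0x1000), (0x1000, 0x1000),
         (0x2000, 0x1000), (0x3000, 0x1000)]
trace = [0x0010, 0x1010, 0x2010, 0x3010, 0x5000]
gated_count = count_activations(areas, trace, gated=True)
ungated_count = count_activations(areas, trace, gated=False)
```

In this example the gated count is 4 and the ungated count is 20, because each access enables at most one cache memory and the access to 0x5000 enables none.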
When a cache memory is enabled to operate because the corresponding detecting means has detected access by the central processing unit to the designated address area, and the cache hit/miss determination result is a cache hit, a buffer means can output data to the internal bus from the data area of the cache line related to the cache hit.