In computer systems, as processors increase in speed, the growing gap between high-speed processors and low-speed main memory becomes a serious obstacle to improving the processing speed of the system as a whole. To address this problem, a mechanism known as a cache, or caching, is used.
Caching bridges this speed gap by placing a high-speed cache memory between the high-speed processor and the low-speed main memory. In many cases, the term “cache” is used to mean the cache memory itself. When the processor accesses data, it first searches for the data in the cache memory, and only if the data is not found there does the processor access the main memory. Cache memory is usually composed of SRAM, which is fast enough to operate in synchronization with the processor, but because SRAM is expensive, it is undesirable to provide a large-capacity cache memory. Recently, caches have often adopted a multilevel structure to keep pace with ever faster processors. Recent CPUs use very fast, expensive memory of some tens of kilobytes for the primary cache closest to the processor, and slower but less expensive memory of some hundreds of kilobytes for a secondary cache placed between the primary cache and the main memory. Some computer systems use a third cache between the secondary cache and the main memory.
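The lookup order described above (primary cache, then secondary cache, then main memory) can be sketched as a small simulation. The latencies, capacities, and eviction policy below are illustrative assumptions for this sketch, not figures from any particular CPU.

```python
# Minimal sketch of the cache lookup order described above:
# primary cache -> secondary cache -> main memory.
# Latencies (in cycles) and capacities are illustrative assumptions.

class Level:
    def __init__(self, name, capacity, latency, backing):
        self.name, self.capacity, self.latency = name, capacity, latency
        self.backing = backing          # next slower level
        self.lines = {}                 # address -> data

    def read(self, addr):
        """Return (data, total_cycles) for a read of addr."""
        if addr in self.lines:
            return self.lines[addr], self.latency      # hit
        # Miss: fetch from the next level and keep a copy here.
        data, cycles = self.backing.read(addr)
        if len(self.lines) >= self.capacity:           # naive eviction
            self.lines.pop(next(iter(self.lines)))
        self.lines[addr] = data
        return data, self.latency + cycles

class MainMemory:
    latency = 100
    def read(self, addr):
        return f"data@{addr}", self.latency

l2 = Level("L2", capacity=8, latency=10, backing=MainMemory())
l1 = Level("L1", capacity=2, latency=1, backing=l2)

_, cold = l1.read(0x40)    # miss in L1 and L2: 1 + 10 + 100 cycles
_, warm = l1.read(0x40)    # hit in L1: 1 cycle
print(cold, warm)          # 111 1
```

The second read of the same address hits the primary cache and completes in a single cycle, illustrating why keeping frequently used data cached matters.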
In many cases, two primary caches—an instruction cache and a data cache—are provided in recent processors. As the names suggest, the instruction cache holds instruction data, and the data cache holds other data. When, for example, a calculation such as “1+3” is performed, the instruction cache handles the “+”, which is an addition instruction, and the data cache handles the “1” and “3”. Furthermore, in order to improve the cache hit rate, a method called set-associative mapping is used, in which the cache memory is divided into multiple blocks so that multiple pieces of data having the same index can be stored. For detailed information about set-associative mapping, see Japanese Unexamined Patent Application Publication H05-225053 (the Japanese patent family member of U.S. Pat. No. 5,353,424).
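As a sketch of set-associative mapping, the following splits an address into a tag and an index and allows up to two entries (ways) per index, so that two addresses sharing an index need not evict each other. The line size, set count, way count, and FIFO eviction are illustrative assumptions.

```python
# Illustrative 2-way set-associative lookup: addresses with the same
# index can coexist in different ways instead of evicting each other.
LINE_SIZE = 64      # bytes per cache line (assumed)
NUM_SETS  = 4       # number of indices (assumed)
NUM_WAYS  = 2       # entries allowed per index

cache = {s: [] for s in range(NUM_SETS)}   # index -> list of tags (ways)

def access(addr):
    """Return True on hit, False on miss; fill on miss (FIFO eviction)."""
    line  = addr // LINE_SIZE
    index = line % NUM_SETS
    tag   = line // NUM_SETS
    ways  = cache[index]
    if tag in ways:
        return True
    if len(ways) >= NUM_WAYS:
        ways.pop(0)                        # evict the oldest way
    ways.append(tag)
    return False

# Two addresses whose cache lines map to the same index (lines 0 and 4):
a, b = 0 * LINE_SIZE, 4 * LINE_SIZE
access(a); access(b)                       # both miss and fill set 0
print(access(a), access(b))                # True True: both still cached
```

In a direct-mapped (1-way) cache, the second address would have evicted the first; the extra ways are what raise the hit rate for same-index data.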
Next, operating systems (OSes) used in recent computer systems have a kernel mode and a user mode, which correspond to different working modes of the processor. The OS executes programs for kernel mode in the privileged mode of the processor, and executes programs for user mode in the non-privileged mode of the processor.
Access rights to hardware resources vary between the privileged mode and the non-privileged mode. Processors allow programs working in privileged mode to access any hardware resource, such as main memory, hard disks, networks, and printers. Conversely, processors restrict access to hardware resources for programs working in non-privileged mode. The OS can protect itself from crashes and invalid programs by properly selecting whether programs are executed in kernel mode or user mode.
Recent processors are equipped with a plurality of working modes, each having a different privilege level. If the privilege levels differ, the accessible memory areas differ, and the available instructions and access rights to hardware resources also vary. Many operating systems use two of the available privilege levels, one for kernel mode and one for user mode. The working mode of the processor is switched by instructions issued by the OS.
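As an illustrative sketch of privilege-level checks, the following follows the common convention that a lower level number means a higher privilege; the resource names and required levels are assumptions made for this sketch only.

```python
# Illustrative privilege-level check: lower level number = more privilege.
# The resources and their required levels are assumptions for this sketch.
REQUIRED_LEVEL = {
    "main_memory": 3,   # accessible even at the lowest privilege
    "hard_disk":   0,   # privileged mode only
    "network":     0,
    "printer":     0,
}

def can_access(resource, current_level):
    """A program may touch a resource only if its privilege suffices."""
    return current_level <= REQUIRED_LEVEL[resource]

KERNEL_MODE, USER_MODE = 0, 3   # two of the available levels

print(can_access("hard_disk", KERNEL_MODE))  # True
print(can_access("hard_disk", USER_MODE))    # False
```

A kernel-mode program passes every check, while a user-mode program is refused direct access to protected hardware, which is how the OS shields itself from invalid programs.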
It will be appreciated that the group of frequently accessed data in a given working mode (e.g. kernel mode) differs from the group of frequently accessed data in another working mode with a different privilege level (e.g. user mode). Therefore, immediately after the processor changes working modes, the data used in the new working mode is often not stored in the cache, and the processor cannot obtain the required data unless it accesses the main memory. Requiring the processor to access the slow main memory increases its waiting time and slows processing. In addition, because accessing the main memory consumes more electrical power than accessing the cache, extra electrical power is consumed. Furthermore, because the processor consumes electrical power even while waiting for data from the main memory, this too wastes power. Moreover, for processors equipped with pipelines, missing data may cause a pipeline stall in which the pipeline processing is obstructed, leading to further wasted power consumption. Such wasted power consumption shortens the operating time of battery-driven electronic devices, and is especially undesirable for mobile electronic devices such as laptop computers and mobile phones, which often run on batteries.
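The time and power costs described above can be made concrete with simple totals over a run of accesses. All figures below (hit cost, miss penalty, per-access energies, and miss counts) are illustrative assumptions, not measured values.

```python
# Illustrative cost of cache misses in time and energy.
# All constants are assumptions for this sketch, not measured values.
HIT_CYCLES   = 1     # cycles per cache hit
MISS_PENALTY = 100   # extra cycles per main-memory access
E_CACHE      = 1     # energy units per cache access
E_MEMORY     = 30    # extra energy units per main-memory access

def total_cycles(accesses, misses):
    return accesses * HIT_CYCLES + misses * MISS_PENALTY

def total_energy(accesses, misses):
    return accesses * E_CACHE + misses * E_MEMORY

# Warm cache (20 misses per 1000 accesses) vs. the cold cache right
# after a working-mode switch (say 500 misses per 1000 accesses):
print(total_cycles(1000, 20), total_cycles(1000, 500))  # 3000 51000
print(total_energy(1000, 20), total_energy(1000, 500))  # 1600 16000
```

Under these assumed figures, the cold cache after a mode switch costs roughly an order of magnitude more in both cycles and energy for the same thousand accesses.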
The prior art is described using FIG. 3, which is a block diagram for describing a conventional computer system. A computer system 1008 has a CPU 108 and a main memory 27. The CPU 108 is one in which a core part for operation and control, caches, and an external interface are integrated into a single LSI chip, called the central processing unit (CPU). In addition, some DSPs have a structure in which an operation core, a cache, and an external interface are integrated into a single LSI chip, similar to the CPU 108. In FIG. 3, the core part for operation and control is represented by a processor 11, and the processor 11 contains a data decoder, a data control unit, a scheduler, an arithmetic logical operation unit, and a floating-point operation unit therein. In some cases, the processor 11 itself is called the CPU. The CPU 108 is equipped with a two-level cache, and furthermore, the primary cache is divided into an instruction cache and a data cache. In FIG. 3, the primary instruction cache controller is represented by 13, and the primary instruction cache memory is represented by 15. The primary data cache controller is represented by 17, and the primary data cache memory is represented by 19. The primary data cache controller 17 and the primary data cache memory 19 are connected to a secondary cache 21. The CPU 108 has a bus interface 23 to connect to external devices, and through it the CPU 108 is connected to the main memory 27, which is the main memory of the computer system 1008. In many cases, high-speed, expensive SRAM modules are used for the cache memories 15, 19, and 21, and DRAM modules, which are slower and less expensive than SRAM, are used for the main memory 27. The CPU 108 is mounted on a lead frame, enclosed in molded resin, and sold as a product. When the computer system 1008 is turned on, an OS 29 is loaded into the main memory 27 to make the CPU 108 work as an information processing device.
For example, the processor 11 has four working modes with differing privilege levels, called working modes 0 through 3, respectively. Working mode 0 is the highest privilege level, allowing access to the hardware without any restriction. Working mode 3 is the lowest privilege level, significantly restricting access to the hardware. The OS 29 changes the working mode of the processor 11 to 0 when executing programs for kernel mode, and changes it to 3 when executing programs for user mode. The processor 11 has CPU instructions for switching working modes, and the OS 29 uses these instructions to switch the working mode of the processor 11.
When the OS 29 executes programs for kernel mode, data frequently used in working mode 0 is stored in the cache memory 15. Here, the OS 29 may switch the working mode of the processor 11 to 3 in order to switch to user mode. In many cases, however, the memory area frequently used in working mode 3 differs completely from the memory area frequently used in working mode 0. Therefore, immediately after the working mode is switched to 3, the possibility that the data required in working mode 3 is not stored in the cache memory 15 is very high. The processor 11 must retrieve the required data from the main memory 27 immediately after the working mode is switched to 3. Once the data is retrieved, it is stored in the cache memory 15, and when the same data is required the next time, it can be obtained from the cache memory 15.
However, in this process, the data stored in working mode 0 is removed from the cache memory 15 by being overwritten with the data used in working mode 3. Therefore, when a switch from user mode to kernel mode is performed and the processor 11 changes the working mode to 0, the processor 11 must again retrieve the required data from the main memory 27, because the data it requires is no longer stored in the cache memory 15. In this manner, in the conventional computer system 1008, access to the main memory 27 by the processor 11 occurs every time a switch is performed between kernel mode and user mode. As described above, accessing the main memory 27 takes a longer time than accessing the cache memory 15 and requires extra electrical power. Because some applications switch between kernel mode and user mode hundreds or thousands of times per second, the effect on processing delay and electrical power consumption is significant.
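The thrashing behavior described above can be sketched with a small direct-mapped cache model in which the two working modes use disjoint working sets that map to the same cache lines. The cache size, address sets, and number of switches are illustrative assumptions.

```python
# Sketch of cache thrashing across mode switches: kernel mode and user
# mode touch disjoint working sets that map to the same cache lines,
# so each switch evicts the other mode's data. Sizes are assumptions.
NUM_LINES = 4                              # tiny direct-mapped cache

cache = [None] * NUM_LINES                 # line index -> stored tag

def access(addr):
    """Return True on hit; on miss, fill the line (evicting old tag)."""
    index, tag = addr % NUM_LINES, addr // NUM_LINES
    if cache[index] == tag:
        return True
    cache[index] = tag
    return False

KERNEL_SET = [0, 1, 2, 3]                  # maps to lines 0..3, tag 0
USER_SET   = [4, 5, 6, 7]                  # same lines 0..3, tag 1

misses = 0
for _ in range(10):                        # ten kernel<->user switches
    for addr in KERNEL_SET + USER_SET:
        if not access(addr):
            misses += 1
print(misses)                              # 80: every access misses
```

Because each mode's working set overwrites the other's lines, every access misses and must go to main memory, mirroring the repeated main-memory traffic of the conventional computer system 1008 when modes switch frequently.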