Cache memories can be used to reduce memory access time by bridging the speed difference between microprocessors and memories. A cache memory operates based on two principles: temporal locality and spatial locality. Temporal locality means that a previously accessed memory address is likely to be accessed again within a short time. Spatial locality means that an address adjacent to a recently accessed memory address is likely to be accessed soon. To improve processor performance, large-capacity cache memories have recently been integrated on a single chip together with the processor core. As a result, the area occupied by on-chip cache memory has gradually increased. For example, the caches in the StrongARM 110 processor occupy about 70% of the total chip area.
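The effect of the two kinds of locality on a cache can be illustrated with a small simulation. The following is a minimal sketch, not part of the described device: it assumes a direct-mapped cache, and the function name `hit_ratio`, the 64-line capacity, and the 16-byte line size are hypothetical values chosen only for illustration.

```python
def hit_ratio(addresses, num_lines=64, line_bytes=16):
    """Simulate a direct-mapped cache and return the fraction of hits.

    num_lines and line_bytes are illustrative assumptions, not values
    taken from any particular processor.
    """
    tags = [None] * num_lines            # one stored tag per cache line
    hits = 0
    for addr in addresses:
        block = addr // line_bytes       # memory block containing addr
        index = block % num_lines        # cache line the block maps to
        tag = block // num_lines         # remaining high-order bits
        if tags[index] == tag:
            hits += 1                    # block is already cached
        else:
            tags[index] = tag            # miss: fill the line
    return hits / len(addresses)


# Temporal locality: the same 8 blocks are revisited 100 times.
temporal = [block * 16 for block in range(8)] * 100

# Spatial locality: 1024 consecutive byte addresses, 16 per line.
spatial = list(range(1024))

# No locality: every access touches a block that is never reused.
no_reuse = list(range(0, 1024 * 16, 16))

print(hit_ratio(temporal), hit_ratio(spatial), hit_ratio(no_reuse))
```

In this sketch the repeated accesses miss only on the first pass, the sequential accesses miss once per 16-byte line, and the stream without reuse never hits, which is why a cache is effective only when a program exhibits locality.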
U.S. Pat. No. 5,455,925 discloses a device for maintaining coherency of data stored in external and internal cache memories, and U.S. Pat. No. 5,809,531 discloses a computer system for executing a program using an internal cache without accessing an external RAM.
FIG. 1 shows a block diagram of a microprocessor unit (hereinafter referred to as “MPU”) in which level two (hereinafter denoted “L2”) caches and a tag are embedded. If the L2 caches are embedded in the MPU, reduced line loading, lower power dissipation, and a higher operating speed are expected. Because a multi-way set scheme increases the hit ratio, an efficient and simple MPU structure can be achieved. The MPU in FIG. 1 includes eight L2 caches each composed of SRAM, one TAG, and an MPU core. An address and data are transferred between the L2 caches and the MPU core, and the address is transferred between the TAG and the MPU core. The TAG sends a set selection signal to the L2 caches. When data is read, the TAG compares an address generated by the MPU core with a previous address and provides the set selection signal to the respective L2 caches. In each L2 cache, the word lines of all sets are selected by the address, and then the data corresponding to the set selection signal from the TAG is read out.
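The read sequence described for FIG. 1 can be sketched in software as follows. This is a simplified model under stated assumptions, not the circuit itself: it treats the eight L2 caches as eight ways, and the class name `MultiWayCache`, the `fill` helper, the four-row array size, and the `way_activations` counter are hypothetical names introduced only for illustration.

```python
class MultiWayCache:
    """Toy model of a multi-way cache read with a shared TAG.

    Assumption: each of the eight L2 caches acts as one way. The row
    count is arbitrary and chosen only to keep the example small.
    """

    def __init__(self, num_ways=8, num_rows=4):
        self.num_ways = num_ways
        self.num_rows = num_rows
        self.tags = [[None] * num_rows for _ in range(num_ways)]
        self.data = [[None] * num_rows for _ in range(num_ways)]
        self.way_activations = 0         # counts SRAM arrays driven

    def fill(self, way, address, value):
        """Hypothetical helper that places a value in a chosen way."""
        row = address % self.num_rows
        self.tags[way][row] = address // self.num_rows
        self.data[way][row] = value

    def read(self, address):
        """Return (hit, value) for the given address."""
        row = address % self.num_rows
        tag = address // self.num_rows
        # The address selects the same row in every way at once.
        candidates = [self.data[w][row] for w in range(self.num_ways)]
        self.way_activations += self.num_ways
        # The TAG comparison yields the set selection signal.
        select = [self.tags[w][row] == tag for w in range(self.num_ways)]
        for way, selected in enumerate(select):
            if selected:
                return True, candidates[way]
        return False, None


cache = MultiWayCache()
cache.fill(way=3, address=0x25, value="payload")
hit_result = cache.read(0x25)    # tag matches in way 3
miss_result = cache.read(0x26)   # no way holds this address
print(hit_result, miss_result, cache.way_activations)
```

In this model every read drives all eight arrays although at most one of them supplies the data, so the activation counter grows by eight per access regardless of whether the access hits.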
As illustrated in FIG. 1, if a cache memory is integrated on a chip together with a processor core, the cache memory often consumes a significant portion of the power dissipated by the processor. For this reason, reducing the power dissipated by an on-chip cache helps reduce the total power dissipated by the processor. In particular, an instruction cache is accessed every cycle while a program is being executed, so the power dissipated by its memory accesses is significant. Accordingly, there is a need to reduce the power dissipated by a cache memory integrated on a single chip together with the MPU.