1. Technical Field
The present invention relates generally to cache memories and, more particularly, to an associative cache memory capable of decreasing power consumption through the reconfiguration of a K-way, N-set cache memory into an M-unit, K-way, N/M-set cache memory.
2. Description of Related Art
The majority of recent microprocessors employ a large internal cache memory to improve data access performance. The cache memory has a tag field composed of content addressable memory (CAM) cells and a data field composed of random access memory (RAM) cells. The tag field is accessed to determine whether a required command or data is stored in the cache memory. This determination is performed whenever the processor fetches a command, or reads or writes data, by comparing an address held in the tag field with an input address. If the two addresses are the same, the cache memory reads the command or data from the data field, or writes the data to the data field. Because these tag field operations significantly affect the overall performance of the cache memory, considerable development effort has been directed at the tag field to improve system performance. However, in an embedded system such as a hand-held telephone, decreasing power consumption is also very important.
FIG. 1 is a block diagram illustrating a general data processing system employing a cache memory, according to the prior art. The system of FIG. 1 is disclosed in U.S. Pat. No. 5,367,653, entitled "Reconfigurable Multi-Way Associative Cache Memory".
The data processing system includes a central processing unit (CPU) 100 which controls a main memory 200, and a multi-way associative cache memory 300. The main memory 200 and the cache memory 300 are usually a dynamic random access memory (DRAM) and a static random access memory (SRAM), respectively. In a processing system, the SRAM cache memory 300 has a smaller storage capacity and a higher data access speed than the DRAM main memory 200. Further, the cost per byte of the cache memory 300 is higher than that of the main memory 200. As is known, the CPU 100 also includes operational elements for data communications between an arithmetic logic unit (ALU), components of the CPU 100, and other circuit units.
The data and/or program command (represented as "data" hereinafter) can be stored in the cache memory 300. The data and an associative tag are stored in the cache memory 300. The address of the main memory 200 is stored in a main memory address register 110 located in the CPU 100.
The main memory address held in the main memory address register 110 is divided into several segments. That is, the main memory address includes byte selection address bits ADDR 0-1, used as a signal for selecting a single byte from a plurality of main memory bytes stored at the provided cache memory address, and word selection address bits ADDR 2-3, used as a signal for selecting a single word from a plurality of main memory words stored at the provided cache memory address. In addition, set selection address bits ADDR 4-9 are used as a cache address for accessing a set of the cache memory 300. Tag address bits ADDR 10-31, represented as TAG, are stored in a tag array of the cache memory 300. A cache memory controller 120 controls signal transmission between the CPU 100 and the cache memory 300. The associative cache memory 300 is composed of tag and data arrays 320 and 340, respectively.
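By way of illustration only, the address segmentation described above can be sketched in Python as follows. The function name split_address and the order of the returned fields are assumptions introduced for this sketch; only the bit ranges are taken from the description.

```python
def split_address(addr):
    """Split a 32-bit main memory address into its four fields.

    Bit ranges follow the description; the function name and the
    return order are hypothetical.
    """
    byte_sel = addr & 0x3             # ADDR 0-1: byte selection (2 bits)
    word_sel = (addr >> 2) & 0x3      # ADDR 2-3: word selection (2 bits)
    set_sel = (addr >> 4) & 0x3F      # ADDR 4-9: set selection (6 bits, 64 sets)
    tag = (addr >> 10) & 0x3FFFFF     # ADDR 10-31: tag (22 bits)
    return tag, set_sel, word_sel, byte_sel
```

The six set selection bits address one of sixty-four sets, and the remaining twenty-two upper bits form the tag that is compared in the tag array.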
FIG. 2 is a diagram illustrating the associative cache memory 300 of FIG. 1 in further detail, according to the prior art. The associative cache memory 300 is a four-way set associative cache memory, including way_0 302, way_1 304, way_2 306, and way_3 308. Each way includes sixty-four sets. Since each way has the same circuit structure, only the structure of way_0 302 will be described.
The way_0 302 is formed of a buffer register 310, a tag array 320, a set selection unit 330, a data array 340, a set decoder 350, and a multiplexer 360.
The buffer register 310 latches the tag address bits ADDR 10-31 of the main memory address, provided by way of the cache memory controller 120. The latched address is supplied as the bit line signal of the tag array 320.
The tag array 320 is composed of 64 lines corresponding to SET_0 through SET_63, and each line stores 22 tag bits. Identical lines of the four ways 302, 304, 306, and 308 construct a single "set". That is, the first lines of the ways 302, 304, 306, and 308 are "set_0", the second lines are "set_1", and so forth. Each line arranged in the tag array in a single way is referred to hereinafter as a "set".
Match lines ML0~ML63 are connected to the sets SET_0~SET_63 of the tag array 320, respectively. The match line corresponding to the set of the tag array 320 that stores a tag equal to the address bits ADDR 10-31 latched in the buffer register 310 is set to the supply voltage level; the remaining match lines are set to the ground voltage level.
The set decoder 350 generates set enable signals SEN0 through SEN63 by decoding the set selection address bits ADDR 4-9.
The set selection unit 330 is formed of sixty-four transistors 330_0 through 330_63 connected between the match lines ML0~ML63 and the word lines DWL0~DWL63 of the data array 340. The transistors 330_0 through 330_63 selectively connect the match lines ML0~ML63 with the word lines DWL0~DWL63 in response to the set enable signals SEN0~SEN63 provided from the set decoder 350.
The data array 340 is composed of sixty-four sets, as is the tag array 320. One set is composed of four words WORD0 through WORD3. Sets 340_0 through 340_63 of the data array 340 are connected to the sets of the tag array 320 through the word lines DWL0~DWL63, the transistors 330_0~330_63, and the match lines ML0~ML63, respectively. The data array 340 provides the data stored in the set associated with the activated word line of the word lines DWL0~DWL63 to the multiplexer 360.
The multiplexer 360 selectively outputs one word (out of four words) provided from the data array 340 in response to the word selection address bits ADDR 2-3.
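The read path through a single way, as described above, can be modeled behaviorally as follows. This is a minimal sketch rather than a circuit description: the function name way_lookup, the list-based representation of the tag and data arrays, and the use of None to signal a miss are all assumptions made for illustration.

```python
NUM_SETS = 64        # lines per way, SET_0 through SET_63
WORDS_PER_SET = 4    # WORD0 through WORD3

def way_lookup(tags, data, tag, set_sel, word_sel):
    """Behavioral model of one way (hypothetical helper).

    tags: list of 64 stored tags; data: list of 64 rows of 4 words.
    Returns the selected word on a hit, or None on a miss.
    """
    # Tag array 320: every line is compared, so every match line is evaluated.
    match_lines = [stored == tag for stored in tags]
    # Set decoder 350: one-hot set enable signals decoded from ADDR 4-9.
    set_enable = [i == set_sel for i in range(NUM_SETS)]
    # Set selection unit 330: a word line is driven only when both are active.
    word_lines = [ml and sen for ml, sen in zip(match_lines, set_enable)]
    if not any(word_lines):
        return None
    row = data[word_lines.index(True)]
    # Multiplexer 360: ADDR 2-3 selects one of the four words.
    return row[word_sel]
```

Note that the model compares the tag against all sixty-four lines even though only the line selected by set_sel can drive a word line.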
FIG. 3 is a diagram illustrating the tag array 320 of FIG. 2 in further detail, according to the prior art. The tag array 320 is constructed of a plurality of CAM cells 322 arranged in 64 rows and 22 columns. The word lines WL0~WL63 are arranged horizontally across pairs of bit lines BL0/BLB0~BL21/BLB21. The match lines ML0~ML63 are arranged parallel with the word lines WL0~WL63.
The pairs of bit lines BL0/BLB0~BL21/BLB21 transfer the tag address bits ADDR 10-31 stored in the buffer register 310 to the CAM cells 322 as pairs of complementary data bits. The CAM cells 322 store single-bit data and perform a single-bit comparison (logical exclusive NOR (XNOR)) operation. The CAM cells 322 output the result of the comparison operation to the connected match line. Each of pre-charge transistors 324_0 through 324_63 is composed of a P-channel metal oxide semiconductor (PMOS) transistor, and includes a current path formed between the supply voltage and an end of the match line MLi (i = 0, 1, ..., or 63), and a gate controlled by a pre-charge signal PRE provided from the cache memory controller 120.
FIG. 4 is a diagram illustrating the CAM cell 322 of FIG. 3 in further detail, according to the prior art. Referring to FIG. 4, the CAM cell 322 includes an N-channel metal oxide semiconductor (NMOS) transistor 402, NMOS transistors 410 through 416, and a latch 404. During a pre-charge mode, the pre-charge transistors 324_0 through 324_63 are turned on in response to the pre-charge signal PRE, and the match line ML is pre-charged to a high level. During an evaluation mode, it is evaluated whether or not the data bits impressed on the pair of bit lines BL/BLB are identical with the data bits L1 and L2 stored in the latch 404. If the data bits impressed on the pair of bit lines BL/BLB are identical with the data bits L1 and L2 stored in the latch 404, then the transistor 416 is turned off so that the match line ML keeps the pre-charged high level. In contrast, if the data bits impressed on the pair of bit lines BL/BLB are not identical with the data bits L1 and L2 stored in the latch 404, then the transistor 416 is turned on so that the match line ML is discharged to the ground voltage level. In this manner, the tag address bits ADDR 10-31 provided through the pairs of bit lines BL0/BLB0~BL21/BLB21 are compared with the data bits stored in the CAM cells 322; the match line associated with the completely identical set maintains the supply voltage level, and the rest of the match lines associated with the non-identical sets are discharged to the ground voltage level.
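The pre-charge and evaluation behavior of one match line described above can be approximated with the following logical model. It is a sketch only: the function name evaluate_match_line and the representation of the stored tag and the search tag as lists of bits are assumptions, and analog effects such as timing and charge sharing are ignored.

```python
def evaluate_match_line(stored_bits, search_bits):
    """Logical model of one parallel-connected match line (hypothetical helper).

    Returns True if the line stays at the pre-charged high level (match),
    or False if any cell discharges it to ground (mismatch).
    """
    match_line = True  # pre-charge mode: line is pulled high
    for stored, search in zip(stored_bits, search_bits):
        # A cell whose stored bit differs from the bit-line value turns on
        # its pull-down transistor (416 in the description) and discharges
        # the shared line.
        if stored != search:
            match_line = False
    return match_line
```

Because the cells are connected to the line in parallel, a single mismatching bit is sufficient to discharge it, so every non-matching line consumes pre-charge energy on each access.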
The cache memory 300 as described above performs the comparison operation in all the tag arrays 320 of the four ways 302, 304, 306, and 308 by providing the main memory address bits ADDR 10-31 and ADDR 4-9 to all four ways 302, 304, 306, and 308. In addition, in a single tag array 320, although only one set needs to be compared with the main memory address bits ADDR 10-31, all the match lines ML0~ML63 of the 64 sets are pre-charged or discharged.
That is, the determination of a HIT/MISS is possible by performing the comparison operation in a single set of the ways corresponding to the set selection address bits ADDR 4-9; however, the comparison operation is performed in all 64 sets. Therefore, in total, 256 match lines (4 ways × 64 lines) perform the comparison operation, thereby causing unnecessary power dissipation.
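The count given above can be reproduced with a short calculation. The function names are hypothetical, and the figure of one line per way as the minimum needed is an inference from the statement that comparing a single set of the ways would suffice.

```python
def match_lines_evaluated(ways=4, sets_per_way=64):
    """Match lines that take part in the comparison on each access."""
    return ways * sets_per_way

def match_lines_needed(ways=4):
    """Assumed minimum: only the addressed set, i.e. one line per way."""
    return ways
```

With the parameters of the described cache, match_lines_evaluated() gives 256 and match_lines_needed() gives 4.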
To solve the aforementioned problems, U.S. Pat. No. 5,453,948, entitled "Associative Memory", and issued to Yoneda on Sep. 26, 1995, U.S. Pat. No. 5,469,378 entitled "Content Addressable Memory Having Match Line Transistors Connected in Series and Coupled to Current Sensing Circuit" issued to Albon et al. on Nov. 21, 1995, and U.S. Pat. No. 5,859,791 entitled "Content Addressable Memory" granted to Schultz et al. on Jan. 12, 1999 disclose arrangements in which the transistors connected to a match line are connected in series, not in parallel, and one end of the match line is connected to the ground voltage. Further, each transistor is turned on when the data stored in the latch is equal to the data provided through the bit line (HIT), and is turned off when the two data are unequal (MISS). Thus, when all of the transistors connected to a match line are turned on, the other end of the match line, which is connected to the data array, decreases to the low level. Further, when even a single transistor is turned off, the match line maintains the pre-charged high level. However, connecting the transistors to the match line in series lowers the operating speed. To mitigate this speed deterioration, the patent of Albon et al. uses current sensing instead of voltage sensing, and the patent of Schultz et al. divides the series transistors into several blocks and combines the results of the blocks. Nevertheless, connecting the transistors to the match line in series has deficiencies such as an intricate circuit structure as well as an operating speed limit.
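For contrast with the parallel arrangement, the series-connected match line described above can be modeled logically as follows. This is a sketch under the stated description only; the function name series_match_line and the bit-list representation are assumptions, and the cited patents should be consulted for the actual circuits.

```python
def series_match_line(stored_bits, search_bits):
    """Logical model of a series-connected match line (hypothetical helper).

    Returns the level at the data-array end of the line:
    False (low) when every bit matches, True (high) otherwise.
    """
    # Each series transistor conducts only when its stored bit equals
    # the bit-line value.
    conducting = [s == q for s, q in zip(stored_bits, search_bits)]
    # The line is pulled to ground only if the whole chain conducts;
    # a single non-conducting transistor leaves it at the pre-charged level.
    return not all(conducting)
```

In this arrangement only a fully matching line is discharged, which is the reverse of the parallel case, where every mismatching line is discharged.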