1. Field of the Invention
This invention relates to a microprocessor, and more particularly to a cache DataRam that improves the area efficiency and performance of the microprocessor and reduces its power consumption.
2. Description of the Related Art
Generally, a high performance microprocessor requires a superscalar structure and an accompanying cache structure that supports multiple simultaneous data accesses. The cache occupies about 50% of the microprocessor chip area, so a compact and efficient cache that can still support superscalar operation is required.
In a conventional superscalar microprocessor, the two pipelines of the microprocessor check for dependencies on each other during operation. If no dependency is found, each pipeline simultaneously requests data from the cache.
FIG. 1 is a schematic diagram of a conventional cache DataRam 100 structure, in which an index address of the X pipeline (XA [11:5]) 20 and an index address of the Y pipeline (YA [11:5]) 21 are inputted to the corresponding pipeline address decoders 22 and 23. The index addresses 20 and 21 are respectively decoded by the address decoders 22 and 23. Memory cell arrays 24 and 24' of two port configuration are independently driven by the corresponding decoded address signals. The cache lines holding the data of the memory cell arrays are then read simultaneously by the driven sense amplifiers. After the data is read, valid way data is selected in response to hitway signals, i.e. the X pipeline hitway [3:0] signal 25 and the Y pipeline hitway [3:0] signal 25' generated by a cache tag block (not shown). The selected way data is then sent to an aligner block 26. In the aligner block 26, the read data of each pipeline is aligned by an X pipeline shifter 28 or a Y pipeline shifter 28' of a read path block 27 and sent to a 4 byte X pipeline data bus or a 4 byte Y pipeline data bus. Finally, the data is outputted to an execution unit or a command decoder unit.
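The read sequence described above for one pipeline can be illustrated with a minimal behavioral sketch. It is not the circuit of FIG. 1. The function names, the nested-list layout of `data_ram`, the one-hot encoding of hitway [3:0], the use of address bits [4:0] as the byte offset within the line, and the rule that an access does not cross a cache line are all assumptions made for illustration.

```python
LINE_BYTES = 32   # cache line size stated in the text
NUM_WAYS = 4      # 4 way set associative, matching hitway [3:0]
NUM_SETS = 128    # 7 index bits, address bits [11:5]


def index_of(addr):
    """Return the set index held in address bits [11:5]."""
    return (addr >> 5) & 0x7F


def select_way(lines, hitway):
    """Pick the line whose bit is set in the hitway mask (assumed one-hot)."""
    if hitway not in (1, 2, 4, 8):
        raise ValueError("hitway must be one-hot over 4 ways")
    return lines[hitway.bit_length() - 1]


def align(line, offset, width=4):
    """Extract `width` bytes at `offset` (assumed: no line crossing)."""
    if offset + width > LINE_BYTES:
        raise ValueError("access crosses the cache line")
    return line[offset:offset + width]


def read(data_ram, addr, hitway):
    """Model one pipeline's read: decode, sense all ways, select, align."""
    ways = data_ram[index_of(addr)]      # all four ways of the set are read
    line = select_way(ways, hitway)      # hitway picks the valid way
    return align(line, addr & 0x1F)      # shifter output for the 4 byte bus
```

Here `data_ram` is assumed to be a list of 128 sets, each a list of four 32 byte `bytes` objects. The sketch makes visible that four full lines are read for every 4 byte result.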
As described above, the conventional cache DataRam has a two port ram cell configuration 24 and 24', and includes decoders 22 and 23 dedicated to each pipeline (i.e. the X and Y pipelines) and an aligner block 26 in order to handle simultaneous access by multiple pipelines. In designing a 16 KB cache with a 4 way set associative structure and a 32 byte DataRam cache line entry, the block pitch of the conventional two port ram cell configuration is thousands of micrometers (μm) larger than that of a one port ram cell configuration. Also, given that the data granularity required by each pipeline is generally 4 bytes, the conventional DataRam structure described above drives cache lines unnecessarily. Further, although data sensing is performed independently for each pipeline so that different cache lines can be accessed, whole cache lines are sensed on every access, so the power consumption is very large.
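The figures quoted above can be cross-checked with a short calculation. It is a sketch under one assumption: that every access by a pipeline senses the full 32 byte line in all four ways before way selection, which is how the read sequence is described. The variable names are illustrative only.

```python
CACHE_BYTES = 16 * 1024   # 16 KB cache
NUM_WAYS = 4              # 4 way set associative
LINE_BYTES = 32           # 32 byte cache line entry
WORD_BYTES = 4            # data granularity required by each pipeline

# Number of sets and the address bits needed to select one of them.
num_sets = CACHE_BYTES // (NUM_WAYS * LINE_BYTES)
offset_bits = LINE_BYTES.bit_length() - 1   # byte offset within a line
index_bits = num_sets.bit_length() - 1      # set index width

# Bytes sensed per pipeline per access (assumed: all ways, full line)
# compared with the bytes the pipeline actually uses.
bytes_sensed = NUM_WAYS * LINE_BYTES
useful_fraction = WORD_BYTES / bytes_sensed

print(num_sets, offset_bits, index_bits, bytes_sensed, useful_fraction)
```

The result of 128 sets with a 5 bit offset and a 7 bit index agrees with the index address bits [11:5] used for XA and YA. Under the stated assumption, only 4 of the 128 bytes sensed per access are delivered to the data bus.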