1. Field of the Invention
The present invention relates to a microprocessor having a plurality of external buses, each having a different bus width, and also having a plurality of cache memory units with plural ways. More particularly, the invention relates to a microprocessor which fetches instructions and data destined to specific cache memory units via external buses having different bus widths by switching either the number of ways or the number of entries in accordance with the set bus mode.
2. Description of Related Art
Normally, in order to process a program including instructions and data stored in external memory units, a conventional microprocessor needs to fetch the instructions and data from these external memory units via external buses. If the speed of accessing any of these external memory units is slower than the program-processing speed of the microprocessor, or if it is desired to make the slow access appear faster and to reduce the demands placed on the microprocessor for fetching instructions and data, the microprocessor in many cases uses built-in cache memory units.
The cache memory unit is essentially a memory circuit composed of memory cells operating at a relatively fast speed. If the microprocessor has built-in cache memory units for storing the fetched instructions and data, the microprocessor does not need to access the external memory from the second fetch of the same instructions and data onward; instead, the microprocessor can obtain the needed instructions and data merely by accessing those cache memory units. Nevertheless, when fetching instructions and data for the first time, or when all the memory regions of the cache memory units are occupied so that the stored instructions and data must be rewritten, the microprocessor is obliged to fetch instructions and data from the external memory via an external bus.
In this case, the instructions and data of a specific external memory are delivered to the microprocessor by means of "block transfer". The term "block" represents the minimal unit into which the region of the external memory storing instructions and data is divided. Normally, the size of a block is four times the width of the external bus, namely 16 bytes or 32 bytes. The term "block transfer" designates the technique of fetching both instructions and data of a specific external memory into the built-in cache memory units one block at a time.
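The relationship between bus width and block size stated above can be sketched as follows. This is only a minimal illustration in Python, assuming four bus-width transfers per block as the text describes; the names `TRANSFERS_PER_BLOCK` and `block_size_bytes` are hypothetical and do not appear in the original.

```python
# Assumption taken from the text: a block is four times the external bus width.
TRANSFERS_PER_BLOCK = 4


def block_size_bytes(bus_width_bytes: int) -> int:
    """Return the block size, in bytes, for a given external bus width."""
    if bus_width_bytes <= 0:
        raise ValueError("bus width must be positive")
    return TRANSFERS_PER_BLOCK * bus_width_bytes


if __name__ == "__main__":
    # The two bus widths discussed in the text.
    for width in (4, 8):
        print(f"{width}-byte bus: {block_size_bytes(width)}-byte block")
```

With the two bus widths discussed here, the sketch reproduces the 16-byte and 32-byte block sizes named in the text.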
FIG. 1 is a timing chart of the "block transfer" operation. The reference numerals C1 through C7 each designate one clock cycle. Concretely, the timing chart indicates that seven clock cycles are needed for executing the "block transfer" operation, which transfers either 4 bytes or 8 bytes every clock cycle for a given address.
First, during clock C1, the microprocessor compares the address with a tag of a specific cache memory unit. Simultaneously with the output of a cache miss signal indicating the result of the comparison, the block transfer operation starts. At the same time, the addresses of the instructions and data in the external memory are output to an external address bus, and a bus acknowledge signal indicating that the bus access has been activated is fed back. Next, access to the external memory starts from clock C2, so that a bus-end signal and the instructions and data are input by the time clock C4 rises two clocks later. Then, instructions or data continue to be fetched into the microprocessor by the time the following clocks C5, C6, and C7 respectively rise. The instructions or data are set in a line buffer by four consecutive memory accesses and are then written into the cache memory units during the clock C7 cycle. In this way, if the external bus is 4 bytes wide, 16 bytes are fetched by one block transfer operation. On the other hand, if the external bus is 8 bytes wide, 32 bytes are fetched.
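The sequence described for FIG. 1 can be modeled roughly as below. This is a sketch under stated assumptions, not a cycle-accurate model: it takes from the text only that the four transfers become available by the rising edges of C4, C5, C6, and C7. The function name `bytes_fetched_by` and the constants are hypothetical.

```python
# Assumptions from the FIG. 1 description: the first of four transfers is
# available by the rise of C4, the last by the rise of C7.
FIRST_DATA_CLOCK = 4
LAST_DATA_CLOCK = 7


def bytes_fetched_by(clock: int, bus_width_bytes: int) -> int:
    """Bytes accumulated in the line buffer by the rising edge of clock C<clock>."""
    if clock < FIRST_DATA_CLOCK:
        # C1 to C3: tag compare, address output, external access in progress.
        return 0
    # Count completed transfers, capped at the four that make up one block.
    transfers = min(clock, LAST_DATA_CLOCK) - FIRST_DATA_CLOCK + 1
    return transfers * bus_width_bytes


if __name__ == "__main__":
    for clock in range(1, 8):
        print(f"C{clock}: {bytes_fetched_by(clock, 4)} bytes (4-byte bus), "
              f"{bytes_fetched_by(clock, 8)} bytes (8-byte bus)")
```

Under these assumptions the line buffer holds a complete 16-byte or 32-byte block by C7, which is the cycle in which the text says the cache write takes place.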
FIG. 2 is a schematic block diagram of a conventional microprocessor having a plurality of external buses, each having a different bus width, and a pair of built-in cache memory units. The reference numeral 11 shown in FIG. 2 designates a microprocessor which has a pair of two-way cache memory units 12a and 12b, each having a line size of 32 bytes. These cache memory units 12a and 12b are each composed of a tag region storing the upper 24-bit tag of a 30-bit address signal and a 16-byte data storing region storing either instructions or data held at specific addresses of an external memory 40 specified by the address signal. When the 4-byte bus mode is underway, these cache memory units 12a and 12b operate in the two-way set associative form, in which the cache memory units 12a and 12b are each accessed by applying an identical entry number.
The microprocessor 11 fetches instructions and data of a specific external memory (not shown) via an external bus 8 or 9 having a width of either 4 bytes or 8 bytes. The external bus 8 is connected to the upper 4-byte data bus 10a of an internal data bus 10 having 8-byte width, whereas the other external bus 9 is connected both to the upper 4-byte data bus 10a and to the lower 4-byte data bus 10b. The upper 4-byte data bus 10a of the internal data bus 10 is connected to a line buffer 13 having 32-byte width via a route 14, whereas the lower 4-byte data bus 10b of the internal data bus 10 is also connected to the line buffer 13 via a route 15. The line buffer 13 has a line size identical to that of the cache memory units 12a and 12b, and it receives instructions or data 4 bytes or 8 bytes at a time until the number of bytes for one block transfer has been set. The line buffer 13 writes the fetched 16-byte instructions and data into the cache memory unit 12a or 12b via a route 16a or 16b. The line buffer 13 likewise writes the 32-byte instructions and data into the cache memory unit 12a or 12b via a route 17a or 17b.
Next, the functional operation of this conventional microprocessor when fetching instructions and data into the built-in cache memory units 12a and 12b is described. The description below refers, for example, to the case in which both the instructions and the data are fetched from the 8-byte external bus 9 (this is hereinafter called the 8-byte bus mode).
When a cache miss signal is output from a tag comparator (not shown) of the cache memory units 12a and 12b, the sequence of the block transfer starts. While the block transfer operation is underway, the instructions and data of an external memory are delivered from the external bus 9 having 8-byte width to the upper 4-byte data bus 10a and to the lower 4-byte data bus 10b of the internal data bus 10. After arriving at the internal data bus 10, the 8-byte instructions and data are delivered to the line buffer 13 via the routes 14 and 15. By applying the block transfer operation, 32 bytes of instructions and data, four times the 8-byte bus width, are fetched in 4 clock cycles and are secured in the line buffer 13 during the 4th clock cycle. These instructions and data are then registered, via the route 17a or 17b, in the data storing region of an entry 12, for example, of the way 0 cache memory unit 12a or in the data storing region of an entry 12, for example, of the way 1 cache memory unit 12b. Since 32 bytes of instructions and data are registered in every round of the block transfer operation, the line size of one way of these cache memory units 12a and 12b corresponds to 32 bytes. Since each item of data is independent, tag B of the entry 12 of the way 0 and tag D of the entry 12 of the way 1 are registered with values which are different from each other. The cache memory units having the above structure are operated in the two-way set associative form in the 8-byte bus mode.
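The two-way set associative behavior described above, in which both ways are accessed with an identical entry number and each way holds its own tag, can be illustrated with a toy model. This is a hedged sketch only: the class `TwoWayCache`, the choice of 16 entries, and the integer values standing in for "tag B" and "tag D" are assumptions made for illustration, and the way to be filled is passed in explicitly because the text does not describe a replacement policy.

```python
class TwoWayCache:
    """Toy model of a two-way set associative cache (way 0 and way 1)."""

    def __init__(self, num_entries: int, line_size: int = 32):
        self.line_size = line_size
        # ways[w][entry] holds None (nothing registered) or a (tag, data) pair.
        self.ways = [[None] * num_entries for _ in range(2)]

    def lookup(self, entry: int, tag: int):
        """Compare `tag` with both ways at `entry`; return (way, data) or None on a miss."""
        for way, lines in enumerate(self.ways):
            line = lines[entry]
            if line is not None and line[0] == tag:
                return way, line[1]
        return None

    def fill(self, way: int, entry: int, tag: int, line_buffer: bytes) -> None:
        """Register one full line from the line buffer after a block transfer."""
        if len(line_buffer) != self.line_size:
            raise ValueError("line buffer must hold exactly one line")
        self.ways[way][entry] = (tag, bytes(line_buffer))


if __name__ == "__main__":
    cache = TwoWayCache(num_entries=16)          # entry count is an assumption
    cache.fill(way=0, entry=12, tag=0xB, line_buffer=bytes(32))  # stands in for tag B
    cache.fill(way=1, entry=12, tag=0xD, line_buffer=bytes(32))  # stands in for tag D
    print(cache.lookup(entry=12, tag=0xD))       # hit in one of the two ways
    print(cache.lookup(entry=12, tag=0xA))       # no matching tag: cache miss
```

A miss in this model corresponds to the cache miss signal that starts the block transfer sequence; the subsequent `fill` corresponds to registering the 32 bytes held in the line buffer.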
Next, the functional operation of the microprocessor when fetching instructions and data from the external 4-byte bus 8 is described below (this is hereinafter called the 4-byte bus mode). While the block transfer operation is underway, the instructions and data of the external memory are transmitted from the external bus 8 having 4-byte bus width solely to the upper 4-byte data bus 10a of the internal data bus 10. As in the 8-byte bus mode, in every round of the block transfer operation, 16 bytes of instructions and data, four times the width of the upper 4-byte data bus 10a of the internal data bus 10, are secured in the line buffer 13 via the route 14. In this case, the line buffer 13 does not need to accommodate 32 bytes; it uses only 16 bytes. Concretely, the 16-byte portions shown in FIG. 2 with slash lines remain unused on the external 4-byte bus route. The instructions and data are registered, via the route 16a or 16b, in the data storing region of the entry 10, for example, of the way 0 cache memory unit 12a or in the data storing region of the entry 10, for example, of the way 1 cache memory unit 12b. Since the received instructions and data amount to 16 bytes, the line of the cache memory units 12a and 12b does not need to accommodate 32 bytes; it uses only 16 bytes. Likewise, the 16-byte portions shown in FIG. 2 with slash lines are not needed when the 4-byte bus mode is underway. Since each item of data is independent, tag A of the entry 10 of the way 0 and tag C of the entry 10 of the way 1 are registered with values which are different from each other.
When the entire 32-byte line size is to be used while the 4-byte bus mode is underway, the microprocessor needs to execute the block transfer operation twice. In other words, by fetching 16 bytes of instructions and data twice, the entire 32-byte line is written. This in turn means that a longer time is needed for fetching these instructions and data, corresponding to the execution of an additional block transfer operation. As in the 8-byte bus mode, even when the 4-byte bus mode is on, the conventional microprocessor operates in the two-way set associative form.
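The trade-off described above can be checked with a small calculation. The following is a minimal sketch assuming the 32-byte line size and four transfers per block given in the text; the function names are hypothetical.

```python
LINE_SIZE_BYTES = 32     # line size of cache memory units 12a and 12b in the text
TRANSFERS_PER_BLOCK = 4  # one block transfer consists of four bus-width transfers


def bytes_per_block_transfer(bus_width_bytes: int) -> int:
    """Bytes delivered by one block transfer over a bus of the given width."""
    return TRANSFERS_PER_BLOCK * bus_width_bytes


def block_transfers_to_fill_line(bus_width_bytes: int) -> int:
    """Number of block transfers needed to write a whole line (ceiling division)."""
    per_transfer = bytes_per_block_transfer(bus_width_bytes)
    return -(-LINE_SIZE_BYTES // per_transfer)


def unused_bytes_after_one_transfer(bus_width_bytes: int) -> int:
    """Bytes of a line left unwritten if only one block transfer is executed."""
    return max(0, LINE_SIZE_BYTES - bytes_per_block_transfer(bus_width_bytes))


if __name__ == "__main__":
    for width in (4, 8):
        print(f"{width}-byte bus mode: "
              f"{block_transfers_to_fill_line(width)} block transfer(s) per line, "
              f"{unused_bytes_after_one_transfer(width)} byte(s) unused after one")
```

In the 4-byte bus mode this gives either two block transfers per line or 16 unused bytes per line, which are the two costs the text attributes to the conventional arrangement.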
As mentioned above, a conventional microprocessor which has a plurality of external buses needs to set the line size of its cache memory units to four times the width of the wider external bus (8 bytes, for example) when fetching instructions and data into the cache memory units by applying the block transfer operation. Consequently, when instructions and data are fetched from another external bus having a narrower bus width, a certain region of each cache memory unit remains unused. This in turn generates useless space in each cache memory unit or, if that space is to be used, obliges the system to execute plural cycles of the block transfer operation, thus extending the duration of the block transfer operation.