This invention relates to a high performance data processor capable of efficiently using a high speed access mode of an external memory device.
A dynamic random access memory (DRAM) is frequently used as an external memory device of a data processor. Some kinds of DRAM have a high speed access mode and a normal access mode. A page mode is well known as one of the high speed access modes. In the page mode, pulses of a column address strobe signal (/CAS) are supplied sequentially while a row address strobe signal (/RAS) is held at low level, which makes read/write operations faster than in the normal access mode, in which both /RAS and /CAS are activated on every access. In other words, when the DRAM is accessed again with the same row address as that of the previous access, i.e. in the case of a page hit, the access is faster than in the case of a page miss.
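The page-hit condition described above can be sketched in a few lines of Python. This is a minimal illustration, not the circuit of any particular DRAM controller; the 10-bit column width (`COLUMN_BITS`), the class name `PageModeTracker` and the sample addresses are assumptions chosen for the example.

```python
COLUMN_BITS = 10  # assumed column-address width; real devices vary


class PageModeTracker:
    """Tracks the row currently held open by /RAS (hypothetical model)."""

    def __init__(self):
        self.open_row = None  # no row is open before the first access

    def access(self, address):
        """Return True on a page hit, False on a page miss."""
        row = address >> COLUMN_BITS   # upper address bits select the row
        hit = (row == self.open_row)   # same row as the previous access?
        self.open_row = row            # on a miss, the new row becomes the open row
        return hit


tracker = PageModeTracker()
results = [tracker.access(a) for a in (0x0400, 0x0404, 0x0408, 0x8000, 0x8004)]
print(results)  # [False, True, True, False, True]
```

The first access and the access that changes the row are page misses; the accesses that stay within the same row are page hits.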
In general, in an external memory device, an instruction address space and a data address space are separated from each other. In the DRAM having the high speed access mode and the normal access mode, a data access subsequent to an instruction access and an instruction access subsequent to a data access are executed at normal speed with page miss. On the other hand, an instruction access subsequent to another instruction access and a data access subsequent to another data access are usually executed at high speed with page hit.
Some conventional processors have a central processing unit (CPU), an instruction cache and a data cache. In a pipelined processor with such an architecture, the instruction cache and the data cache may miss in the same CPU cycle, so that the instruction cache issues an external instruction access requirement and the data cache issues an external data access requirement. Conventionally, when the external instruction access requirement and the external data access requirement are issued concurrently, the external data access requirement is processed with priority without exception. Therefore, the high speed access mode of the external memory device cannot be used efficiently. A concrete example thereof is explained with reference to FIG. 5.
FIG. 5 is a time chart showing an example of a pipeline operation of a conventional processor having a CPU, an instruction cache and a data cache. Suppose that the processor of this example is a RISC processor which basically executes one instruction in one CPU cycle, and that the pipeline has four stages: an instruction fetch stage, a load stage, an execution stage and a store stage. In the pipeline operation, when a preceding instruction proceeds to the next stage, each subsequent instruction also proceeds to its next stage unless it cannot proceed owing to a cache miss or the like. Also, suppose that, counted from the cache miss, it takes three cycles for a DRAM access in the normal mode (page miss) and one cycle for a DRAM access in the high speed mode (page hit) to supply an instruction or data to the CPU.
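For reference, the stall-free timing of such a four-stage pipeline is easy to tabulate. The sketch below is illustrative only: the function name `ideal_cycle` and the stage labels are assumptions, and it models only the case in which every cache access hits.

```python
STAGES = ["fetch", "load", "execute", "store"]  # the four stages named above


def ideal_cycle(issue_index, stage):
    """Cycle (1-based) in which an instruction occupies a stage with no stalls.

    issue_index is 0 for the first instruction, 1 for the second, and so on.
    One instruction enters the pipeline per cycle and advances one stage per cycle.
    """
    return issue_index + STAGES.index(stage) + 1


# Five consecutive instructions, one row per instruction.
for idx in range(5):
    print(idx, [ideal_cycle(idx, s) for s in STAGES])
```

For example, `ideal_cycle(3, "load")` evaluates to 5: the fourth instruction reaches the load stage in cycle 5 when nothing stalls.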
The pipeline operation and the external memory access operation are explained, referring to FIG. 5.
The instructions A, B, C, D and E are executed in this order in the pipeline operation. The instruction A and the instruction B are instructions for reading data into the CPU. At cycle 1 and cycle 2, the instruction fetches for the instruction A and the instruction B, respectively, are executed with instruction cache hits. A data cache access for the instruction A and an instruction cache access for the instruction C are executed in the same cycle, cycle 3.
When both the instruction cache and the data cache miss, an external instruction access requirement and an external data access requirement are issued concurrently from the instruction cache and the data cache, respectively, at cycle 3. In this case, the access requirement for the preceding instruction, i.e. the external data access requirement of the instruction A, is processed first, and then the external instruction access requirement of the instruction C is processed.
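The fixed-priority rule applied here can be expressed as a small function. This is a hedged sketch of the conventional behavior only; the function name `conventional_arbiter` and the string labels are hypothetical.

```python
def conventional_arbiter(instr_request, data_request):
    """Pick which external access requirement is processed next.

    Models the conventional rule: when both caches issue a requirement
    in the same cycle, the data access is always chosen.
    Returns "data", "instruction", or None when nothing is requested.
    """
    if data_request:
        return "data"          # data has unconditional priority
    if instr_request:
        return "instruction"   # chosen only when no data requirement is present
    return None


# Cycle 3 of the example: both caches miss at once.
print(conventional_arbiter(instr_request=True, data_request=True))  # data
```

Because the choice never considers which kind of access was performed last, it cannot favor the access that would be a page hit.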
Accordingly, the external data access that the instruction A requires starts at cycle 4. Assuming that the previous external access was an instruction access, this external access is executed in the normal mode with a page miss and is completed in the three cycles 4, 5 and 6. At cycle 7, the data is supplied to the CPU, the instruction A proceeds to the store stage and the subsequent instruction B proceeds to the execution stage. The instruction C stays in the instruction fetch stage since its instruction fetch is not complete.
The external instruction access for the instruction C starts after the external data access for the instruction A is completed, namely at cycle 7. In this case, the external instruction access for the instruction C is executed in the normal mode with a page miss since the previous external access is the data access for the instruction A. Accordingly, the external instruction access for the instruction C is executed at cycles 7, 8 and 9. The instruction C is supplied to the CPU at cycle 10, so that the instruction C proceeds to the load stage and the subsequent instruction D proceeds to the instruction fetch stage at cycle 10.
When the data cache access for the preceding instruction B at cycle 7 misses, the external data access for the instruction B starts at cycle 10, after the fetch of the instruction C from the external memory is completed. Since the previous external access is the instruction fetch, the external memory access is executed in the normal mode with a page miss at cycles 10, 11 and 12. The data is supplied to the CPU at cycle 13. The instruction B proceeds to the store stage and the subsequent instruction C proceeds to the execution stage at cycle 13. Meanwhile, since the instruction fetch for the instruction D is not completed owing to an instruction cache miss at cycle 10, the instruction D stays in the instruction fetch stage.
After the data fetch for the instruction B is completed, the external instruction access for the instruction D starts at cycle 13. In this case, since the previous external access is the data fetch, the external memory access is executed in the normal mode with a page miss at cycles 13, 14 and 15, so that the instruction D is supplied to the CPU at cycle 16. At cycle 16, the instruction D proceeds to the load stage and the subsequent instruction E proceeds to the instruction fetch stage.
In this example, the instruction D could proceed to the load stage at cycle 5 if both the instruction cache and the data cache always hit. In fact, however, as described above, the instruction D proceeds to the load stage at cycle 16 owing to the cache misses, which is an eleven-cycle delay.
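The cycle counts in this walkthrough can be reproduced with a short simulation. This is a sketch under the assumptions stated in the text (three cycles per page-miss access, one per page-hit access, instruction and data accesses falling in different DRAM pages, and the external access before cycle 4 being an instruction access); the function and variable names are hypothetical.

```python
MISS_CYCLES = 3  # normal mode (page miss)
HIT_CYCLES = 1   # high speed mode (page hit)


def schedule(accesses, first_cycle, previous_kind):
    """Serialize external accesses and return (label, kind, start, supplied) tuples.

    An access is treated as a page hit only when it has the same kind
    ("instruction" or "data") as the access before it, which is the
    simplifying assumption made in the text.
    """
    cycle, last, timeline = first_cycle, previous_kind, []
    for label, kind in accesses:
        length = HIT_CYCLES if kind == last else MISS_CYCLES
        supplied = cycle + length       # item reaches the CPU the cycle after the access ends
        timeline.append((label, kind, cycle, supplied))
        cycle, last = supplied, kind    # the next access starts in that same cycle
    return timeline


# Order produced by the conventional priority rule in the example.
order = [("A", "data"), ("C", "instruction"), ("B", "data"), ("D", "instruction")]
timeline = schedule(order, first_cycle=4, previous_kind="instruction")
for entry in timeline:
    print(entry)

ideal_load_cycle_d = 5                 # D reaches the load stage here if every access hits
actual_load_cycle_d = timeline[-1][3]  # cycle in which D is supplied and moves on
print("delay:", actual_load_cycle_d - ideal_load_cycle_d)  # delay: 11
```

Every access in this order alternates between data and instruction, so all four are page misses: they occupy cycles 4 through 15, and the instruction D is supplied at cycle 16, eleven cycles later than the ideal cycle 5.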