The present invention relates to the internal configuration of a microprocessor that can read and write data more quickly than is possible with external memories.
Because of advances in processor speed, the speed difference between the processor and the main memory has increased. In order to minimize the effect of this speed difference, a high-speed cache memory with a small memory capacity may be arranged between the processor and the main memory. If data required by the processor exists in the cache memory, the data read out from the cache memory is delivered to the processor. Therefore, the main memory is accessed less frequently, and the processor can perform processes at higher speeds.
However, when the capacity of the cache memory is large, it takes a long time to determine whether particular data exists in the cache memory and to read or write data from or to the large memory array; accordingly, memory access performance deteriorates. Therefore, it is inefficient to enlarge the memory capacity too much. Furthermore, in order to process a large amount of data using the cache memory, it is necessary to refill the cache memory frequently; accordingly, the performance penalty of cache misses is not negligible.
Furthermore, when the same address in the cache memory is accessed frequently, the cache hit rate improves; as a result, processes can be executed at high speed. On the other hand, when the same memory address is accessed infrequently, the cache miss rate becomes high; as a result, memory access performance deteriorates.
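The relationship between hit rate and access performance described above can be illustrated with the standard average-access-time model (hit time plus miss rate times miss penalty). The following is a minimal sketch; the function name and all cycle counts are illustrative assumptions and do not come from the present specification.

```python
def average_access_cycles(hit_rate, hit_cycles, miss_penalty_cycles):
    """Average cycles per access = hit time + miss rate * miss penalty."""
    if not 0.0 <= hit_rate <= 1.0:
        raise ValueError("hit_rate must be between 0 and 1")
    return hit_cycles + (1.0 - hit_rate) * miss_penalty_cycles


# Assumed figures: a 1-cycle hit and a 20-cycle refill from main memory.
reused = average_access_cycles(0.95, 1, 20)    # same addresses accessed often
streamed = average_access_cycles(0.10, 1, 20)  # each address accessed about once

print(f"frequently reused data: {reused:.2f} cycles per access")
print(f"streamed data:          {streamed:.2f} cycles per access")
```

Under these assumed figures, data that is touched only about once costs close to the full miss penalty on almost every access, which is the situation the following paragraph describes for image data.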
For example, to display a moving image in three dimensions, it is necessary to transfer the image data between the memory and the processor at high speed. Accordingly, it is desirable to store the image data in a memory that can be accessed at almost the same speed as the cache memory. However, because the amount of image data is large and the same memory address is accessed infrequently, it is not desirable to store the image data in the cache memory.
An object of the present invention is to provide a microprocessor that can read and write data with almost the same latency as that of the cache memory and that includes a RAM available for purposes different from those of the cache memory.
In order to achieve the foregoing object, a microprocessor according to the present invention comprises:
a load/store instruction executing block for executing a load/store instruction; and
a RAM (Random Access Memory), from and to which said load/store instruction executing block is able to read and write data, said RAM exchanging data with an external memory through a DMA (Direct Memory Access) transfer.
Because the RAM according to the present invention can be read and written by the load/store instruction executing block and can exchange data with an external memory through a DMA transfer, the RAM is available as a temporary work area for processing a large amount of data, such as image data.
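The use of the RAM as a temporary work area can be sketched in software as follows. This is a simplified model under stated assumptions: the class `ScratchpadRam`, its methods, and the `brighten` routine are hypothetical names introduced only for illustration, and the DMA transfers are modeled as sequential copies, whereas real hardware may overlap them with processing.

```python
class ScratchpadRam:
    """Hypothetical model of an on-chip RAM that is separate from the cache."""

    def __init__(self, size):
        self.cells = [0] * size

    def load(self, addr):
        # Stands in for a load instruction that targets the RAM.
        return self.cells[addr]

    def store(self, addr, value):
        # Stands in for a store instruction that targets the RAM.
        self.cells[addr] = value

    def dma_in(self, external, src, dst, count):
        # Copy a block from external memory into the RAM.
        self.cells[dst:dst + count] = external[src:src + count]

    def dma_out(self, external, src, dst, count):
        # Copy a block from the RAM back to external memory.
        external[dst:dst + count] = self.cells[src:src + count]


def brighten(external, ram, block_size, delta):
    """Process external data block by block through the work area."""
    for base in range(0, len(external), block_size):
        count = min(block_size, len(external) - base)
        ram.dma_in(external, base, 0, count)
        for i in range(count):
            ram.store(i, min(255, ram.load(i) + delta))
        ram.dma_out(external, 0, base, count)


# Assumed sample data standing in for image pixels in external memory.
pixels = [10, 200, 250, 0, 128]
brighten(pixels, ScratchpadRam(2), block_size=2, delta=10)
print(pixels)
```

In this sketch each block is brought in once, processed with ordinary loads and stores, and written back once, so the data never passes through the cache memory.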
Furthermore, when a processor having another instruction set is emulated, the RAM according to the present invention is available as a temporary work area in which the emulated instructions are read, converted into the native instruction set, and assembled into native code. The native code generated in the RAM may be edited in the RAM to improve performance, for example, by reordering instructions to resolve read-after-write hazards on general-purpose registers.
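One possible form of such reordering is sketched below. It is not the method of the present specification; it is a simple illustrative pass that, for each instruction immediately followed by a consumer of its result, moves one later independent instruction between the two. The `Instr` tuple, the register names, and the function names are assumptions, and only register dependencies are modeled (memory dependencies are ignored).

```python
from collections import namedtuple

# Hypothetical instruction record: operation name, destination register,
# and a tuple of source registers.
Instr = namedtuple("Instr", "op dest srcs")


def depends(earlier, later):
    """True if `later` must stay after `earlier` because of a register."""
    return (earlier.dest in later.srcs       # read after write
            or later.dest in earlier.srcs    # write after read
            or later.dest == earlier.dest)   # write after write


def fill_raw_gaps(code):
    """Move one independent later instruction into each back-to-back RAW pair."""
    code = list(code)
    i = 0
    while i + 1 < len(code):
        producer, consumer = code[i], code[i + 1]
        if producer.dest in consumer.srcs:
            for k in range(i + 2, len(code)):
                candidate = code[k]
                # The candidate may move up only if it is independent of
                # every instruction it would jump over.
                if not any(depends(code[j], candidate) for j in range(i + 1, k)):
                    code.insert(i + 1, code.pop(k))
                    break
        i += 1
    return code


generated = [
    Instr("load", "r1", ("r10",)),
    Instr("add", "r2", ("r1", "r3")),   # reads r1 right after it is loaded
    Instr("load", "r4", ("r11",)),
    Instr("add", "r5", ("r4", "r6")),   # reads r4 right after it is loaded
]
scheduled = fill_raw_gaps(generated)
for ins in scheduled:
    print(ins.op, ins.dest, ins.srcs)
```

In this example the second load is hoisted above the first add, so neither add immediately follows the load that produces its operand.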
Furthermore, if a store buffer is provided, a pipeline stall does not occur even if an access to the RAM by the load/store instruction executing block conflicts with an access to the RAM by a DMA transfer.
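The effect of the store buffer can be illustrated with a small cycle-level model. The sketch below rests on several assumptions that are not stated in the present specification: the RAM accepts one write per cycle, the DMA transfer has priority, the buffer has a fixed capacity, and loads are not modeled. The class name `BufferedRam` and its interface are hypothetical.

```python
from collections import deque


class BufferedRam:
    """Hypothetical single-ported RAM with a store buffer on the CPU side."""

    def __init__(self, size, capacity):
        self.cells = [0] * size
        self.capacity = capacity
        self.pending = deque()  # CPU stores waiting for a free RAM cycle

    def cycle(self, dma_write=None, cpu_store=None):
        """Advance one cycle; each argument is an (addr, value) pair or None.

        Returns True only if the CPU store could not be accepted,
        that is, if the pipeline would have to stall.
        """
        stalled = False
        if cpu_store is not None:
            if len(self.pending) < self.capacity:
                self.pending.append(cpu_store)  # accepted without waiting
            else:
                stalled = True
        if dma_write is not None:
            addr, value = dma_write             # DMA uses the RAM this cycle
            self.cells[addr] = value
        elif self.pending:
            addr, value = self.pending.popleft()  # drain one buffered store
            self.cells[addr] = value
        return stalled


ram = BufferedRam(size=4, capacity=2)
# Cycle 1: the DMA transfer and a CPU store arrive in the same cycle.
stall_on_conflict = ram.cycle(dma_write=(0, 11), cpu_store=(1, 22))
after_conflict = list(ram.cells)
# Cycle 2: the RAM is free, so the buffered store is written.
ram.cycle()
after_drain = list(ram.cells)
print(stall_on_conflict, after_conflict, after_drain)
```

In this model the conflicting store is held in the buffer and written one cycle later, so the pipeline continues; a stall would arise only if the assumed buffer capacity were exhausted.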
Furthermore, if the RAM has a snoop function, it is possible to retrieve data stored in the memory as necessary, and program design is simplified.