1. Field of the Invention
The present invention relates to a semiconductor memory device used together with a CPU (Central Processing Unit), and more particularly to a semiconductor memory device storing part of a program designated by a programmer and outputting an instruction code when the CPU fetches the instruction, and a software development apparatus for a system using the same.
2. Description of the Background Art
Recently, as CPU processing speeds have continued to increase, CPUs equipped with a cache memory have been actively developed. Generally, a cache memory is a small, high-speed memory device which is connected between the CPU and a main memory and which temporarily holds the content of the main memory recently used by the CPU. Caches are classified into three types: an “instruction cache”, which holds instructions; a “data cache”, which holds data; and a “unified cache”, which holds both instructions and data.
The cache exploits the fact that memory accesses by the CPU tend to be localized, and it is well known that a cache generally improves the processing performance of the CPU. The effect, however, depends on the program executed by the CPU. For example, if the program executed by the CPU has the characteristic that its memory accesses have no regularity, the cache hit rate falls and the cache cannot improve the processing performance of the CPU.
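By way of illustration only, the dependence of the hit rate on the access pattern can be sketched with a small simulation. The direct-mapped organization, the line count, the line size, and the two address traces below are assumptions chosen for the example; none of them is taken from the cited art.

```python
import random


def hit_rate(addresses, num_lines=64, line_bytes=16):
    """Return the hit rate of a hypothetical direct-mapped cache for a trace."""
    tags = [None] * num_lines          # one stored tag per cache line
    hits = 0
    for addr in addresses:
        block = addr // line_bytes     # memory block containing the address
        index = block % num_lines      # cache line the block maps to
        tag = block // num_lines       # remaining upper bits identify the block
        if tags[index] == tag:
            hits += 1                  # hit: the line already holds this block
        else:
            tags[index] = tag          # miss: refill the line
    return hits / len(addresses)


# A loop that repeatedly walks the same 512-byte region (localized access).
loop_trace = [a for _ in range(100) for a in range(0, 512, 4)]

# Word accesses scattered over a 1 MiB region (no regularity).
rng = random.Random(0)
scattered_trace = [rng.randrange(0, 1 << 20, 4) for _ in range(len(loop_trace))]

print(f"loop trace hit rate:      {hit_rate(loop_trace):.4f}")
print(f"scattered trace hit rate: {hit_rate(scattered_trace):.4f}")
```

Under these assumptions the loop trace misses only on its first pass, whereas the scattered trace rarely finds its block in the cache, so the same cache yields very different hit rates for the two programs.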
In addition, since the cache inherently includes a mechanism for detecting whether a reference address output from the CPU is consistent with (i.e., matches) the content of the cache, the power consumption of the cache tends to increase. Consequently, even if a cache is added to the CPU, CPU power efficiency (processing performance/power consumption) does not always improve.
Furthermore, the content of the cache depends on the program execution history of the CPU. For this reason, even when the CPU accesses the same address, the access sometimes hits the cache and sometimes misses. The number of access cycles for instructions and data therefore cannot be guaranteed. As a result, it is difficult to optimize software with strict real-time requirements for a system which uses a cache.
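The history dependence described above can be sketched as follows. The four-line direct-mapped cache, the latencies `HIT_CYCLES` and `MISS_CYCLES`, and the addresses are hypothetical values assumed for the example only.

```python
HIT_CYCLES = 1     # assumed latency of a cache hit
MISS_CYCLES = 10   # assumed latency of a miss (main-memory access)


class DirectMappedCache:
    """Hypothetical direct-mapped cache that reports cycles per access."""

    def __init__(self, num_lines=4, line_bytes=16):
        self.num_lines = num_lines
        self.line_bytes = line_bytes
        self.tags = [None] * num_lines

    def access(self, addr):
        """Return the cycles taken by this access and update the cache."""
        block = addr // self.line_bytes
        index = block % self.num_lines
        tag = block // self.num_lines
        if self.tags[index] == tag:
            return HIT_CYCLES
        self.tags[index] = tag         # miss: the previous block is evicted
        return MISS_CYCLES


def cycles_after(history, addr):
    """Replay an access history on a fresh cache, then time one access."""
    cache = DirectMappedCache()
    for a in history:
        cache.access(a)
    return cache.access(addr)


target = 0x100
history_a = [0x100, 0x110]   # 0x110 maps to a different line than the target
history_b = [0x100, 0x140]   # 0x140 maps to the same line and evicts the target

print(cycles_after(history_a, target))
print(cycles_after(history_b, target))
```

In this sketch the final access goes to the same address in both cases, yet it completes in `HIT_CYCLES` after the first history and in `MISS_CYCLES` after the second, which is why a worst-case access time is hard to bound.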
In order to solve these problems, the following techniques have been disclosed.
1. Japanese Patent Laying-Open No. 10-340226
2. Japanese Patent Laying-Open No. 9-319657
3. U.S. Pat. No. 5,381,533
4. P. R. Panda et al., “Efficient Utilization of Scratch-Pad Memory in Embedded Processor Applications”, European Design and Test Conference, March 1997
According to Prior Art 1, a tag memory is divided into a first tag memory, which holds a bit group of the address tag that is common to the respective ways, and a second tag memory, which holds the individual bit groups for the respective ways. An address supplied from a data processor is divided, and each divided part is compared with the content of the corresponding one of the first and second tag memories, thereby decreasing the power consumption of a microprocessor without lowering the hit rate. Although Prior Art 1 contributes to improving power efficiency, it does not contribute to improving the performance of the CPU when the CPU executes a program.
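One possible reading of the divided tag comparison is sketched below. This is not the implementation of Prior Art 1: the bit width `WAY_BITS`, the function names, and the assumption that the common bit group is the upper part of the tag are all hypothetical and serve only to illustrate comparing a divided address against two tag memories.

```python
WAY_BITS = 4   # assumed width of the tag part stored individually per way


def split_tag(tag):
    """Split an address tag into (common part, per-way part)."""
    common = tag >> WAY_BITS                    # upper bits, assumed shared
    individual = tag & ((1 << WAY_BITS) - 1)    # lower bits, stored per way
    return common, individual


def lookup(tag, first_tag_entry, second_tag_entries):
    """Return the number of the hitting way, or None on a miss.

    first_tag_entry: common bit group, stored once for all ways.
    second_tag_entries: individual bit group per way (None = invalid way).
    """
    common, individual = split_tag(tag)
    if common != first_tag_entry:
        return None                    # common part differs: no way can hit
    for way, stored in enumerate(second_tag_entries):
        if stored == individual:
            return way                 # both parts match: hit in this way
    return None


print(lookup(0xAB3, 0xAB, [0x1, 0x3]))
print(lookup(0xAC3, 0xAB, [0x1, 0x3]))
```

In this sketch the common part is compared once rather than once per way, so each way compares only its narrower individual bit group.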
According to Prior Art 2, an instruction cache is constituted of one line and one instruction, and an instruction stream buffer is provided between the instruction cache and a main memory. The size of the instruction stream buffer is an integer multiple of the cache line size, and consecutive variable-length instructions read from the main memory can be written to the instruction stream buffer. Output to the instruction cache can be performed in units of one cache line. Although Prior Art 2 can eliminate unnecessary instruction reads, it cannot improve power efficiency and cannot guarantee an instruction access cycle.
According to Prior Art 3, each instruction trace segment includes instruction blocks, the first instruction in each block is the instruction that follows a branch instruction, and the blocks are arranged so that the last instruction of each block is a branch instruction. It is thereby possible to improve the hit rate and power efficiency. However, as with Prior Art 2, neither an instruction access cycle nor a data access cycle can be guaranteed.
According to Prior Art 4, a small, high-speed scratch-pad memory (hereinafter abbreviated as “SPM”) is arranged in an address space different from that of a main memory. A programmer designates instructions or data to be frequently accessed, and the designated instructions or data are held in the SPM. As a result, the number of access cycles to the SPM is guaranteed, and the consistency detection function normally included in a cache becomes unnecessary, making it possible to decrease power consumption. Generally, however, the SPM is used as a data memory, i.e., in place of a data cache, and is difficult to implement as an instruction memory.
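The guaranteed access cycle of an SPM can be sketched as a simple address-range decode. The base address `SPM_BASE`, the capacity `SPM_SIZE`, and the two latencies are hypothetical values assumed for the example; they do not come from Prior Art 4.

```python
SPM_BASE = 0x0000_0000   # hypothetical base address of the SPM region
SPM_SIZE = 4 * 1024      # hypothetical SPM capacity in bytes
SPM_CYCLES = 1           # assumed fixed latency of an SPM access
MAIN_CYCLES = 10         # assumed latency of a main-memory access


def access_cycles(addr):
    """Return the access latency, decided by the address range alone.

    No stored tag is compared and no access history is consulted, so the
    same address always takes the same number of cycles.
    """
    if SPM_BASE <= addr < SPM_BASE + SPM_SIZE:
        return SPM_CYCLES
    return MAIN_CYCLES


# An address the programmer is assumed to have placed in the SPM region,
# and one that is assumed to remain in main memory.
print(access_cycles(0x0010))
print(access_cycles(0x2000))
```

Because the outcome depends only on the address, the latency of a designated item is known in advance, in contrast to a cache whose hit or miss depends on earlier accesses.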