1. Field of the Invention
The present invention relates to a cache memory observation device for a processor and a method of analyzing the processor.
2. Description of the Related Art
In a conventional method of analyzing a processor, the address, data and the like at the time of each data access are output unchanged by means of a hardware tracing function, and measures for improving the performance are then examined based on that output. Further, as recited in No. H09-198275 of the Japanese Patent Applications Laid-Open, for example, there is conventionally available a computer system for performance improvement comprising a penalty analyzing tool, which analyzes the tracing data obtained by means of the hardware tracing function to obtain a distribution of penalties during execution, and an improvement item recommending tool, which proposes measures to be taken for the improvement based on a result of the analysis.
FIG. 10 is a block diagram showing a constitution of a semiconductor integrated circuit and an analyzing device according to a conventional technology. A semiconductor integrated circuit 10 comprises a processor 1, a memory 4, a tracing circuit 5 and the like. The processor 1 comprises an instruction executor 2 and a cache memory 3. An analyzing device 20 for analyzing the semiconductor integrated circuit 10 comprises a processing unit 11 comprising a microcomputer provided with a CPU, a memory (ROM and RAM) and the like, a keyboard 12 which is provided with character keys, numeric keys, function instructing keys and the like and accepts various key inputs and inputs such as instructions for debugging and performance analysis, a mouse 13 for accepting inputs of position data indicated by a mouse cursor, a display device 14 such as CRT, LCD or the like for displaying tracing results, analyzing results and various messages and the like, and a tracing unit 15.
The processing unit 11 controls the system and, in addition, executes computations for debugging on the information output from the semiconductor integrated circuit 10, processes the tracing information, and stores the tracing result in an HDD serving as a storage device. Execution information relating to instructions executed by the processor 1, memory access information and cache miss information are output to the analyzing device 20 via the tracing circuit 5. The tracing information input to the analyzing device 20 is used for debugging, performance analysis and the like of the processor 1.
FIG. 11 is a block diagram showing a conventional constitution of the cache memory 3. The cache memory 3 is constituted according to the 4-way set associative method. The cache memory 3 comprises a pair of memories which are a tag memory 22 and a data memory 23. The pair of memories is called a cache line 24. In the example recited here, eight words constitute the cache line 24, and [4:2] of an address 21 is used for the selection of a word within the cache line 24. [1:0] of the address 21 is used for the selection of bytes. A particular cache line 24 is selected for each address. The part of the address used for the selection of the cache line is called an index, and the bit width of the index is determined by the number of cache entries. In the present example, as the number of the cache entries is 2^8 = 256, the index is [12:5] of the address 21.
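The address division described above can be illustrated with a minimal sketch. It assumes a 32-bit address and reads [4:2] as the selector of a word within the eight-word cache line; the function name split_address and the returned field names are hypothetical and do not appear in the drawings.

```python
def split_address(addr: int) -> dict:
    """Split a 32-bit address into the fields used by the cache of FIG. 11.

    Field boundaries follow the example in the text:
      [1:0]   byte within a word
      [4:2]   word within the eight-word cache line (assumed reading)
      [12:5]  index selecting one of 2^8 = 256 cache entries
      [31:13] tag compared against the tag memory
    """
    addr &= 0xFFFFFFFF  # keep only the low 32 bits
    return {
        "byte": addr & 0x3,            # 2 bits
        "word": (addr >> 2) & 0x7,     # 3 bits
        "index": (addr >> 5) & 0xFF,   # 8 bits
        "tag": (addr >> 13) & 0x7FFFF, # 19 bits
    }


def join_address(fields: dict) -> int:
    """Inverse of split_address, useful for checking the field widths."""
    return (
        (fields["tag"] << 13)
        | (fields["index"] << 5)
        | (fields["word"] << 2)
        | fields["byte"]
    )
```

For instance, the address 0x12345678 divides into tag 0x91A2, index 0xB3, word 6 and byte 0, and join_address restores the original value from those four fields.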
When the processor 1 makes a memory access to an address, the particular cache line is determined by the index, and data is read from the tag memory 22 and the data memory 23 corresponding to the determined cache line. Then, the contents read from the tag memory 22 (hereinafter referred to as the tag section) are compared by a comparator 26 to [31:13], which are the high-order bits of the address 21. In the case where the compared values coincide in any of the four ways, the data in the relevant cache line is judged to be valid, which is called a cache hit. Valid data 28 is then selected from the data memory 23 by a selector 27 and delivered to the processor 1. Conversely, in the case where the contents of the tag section differ from the high-order bits in all of the four ways, the data in the relevant cache line is judged to be invalid, which is called a cache miss.
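The hit and miss judgment described above can be modeled in software as a minimal sketch. The class name Cache, the use of None to mark an empty tag entry, and the round-robin choice of the way to refill are all assumptions made for illustration; the text does not specify a valid flag or a replacement method.

```python
NUM_WAYS = 4          # 4-way set associative
NUM_ENTRIES = 256     # 2^8 entries selected by index [12:5]
WORDS_PER_LINE = 8    # eight words per cache line


class Cache:
    def __init__(self):
        # tags[way][index] holds the stored tag, or None while empty (assumption).
        self.tags = [[None] * NUM_ENTRIES for _ in range(NUM_WAYS)]
        # data[way][index] holds the eight words of the cache line.
        self.data = [[[0] * WORDS_PER_LINE for _ in range(NUM_ENTRIES)]
                     for _ in range(NUM_WAYS)]
        # Hypothetical round-robin pointer choosing the way to refill.
        self.next_way = [0] * NUM_ENTRIES

    @staticmethod
    def _fields(addr):
        index = (addr >> 5) & 0xFF     # address bits [12:5]
        tag = (addr >> 13) & 0x7FFFF   # address bits [31:13]
        word = (addr >> 2) & 0x7       # address bits [4:2]
        return index, tag, word

    def lookup(self, addr):
        """Return (True, word) on a cache hit and (False, None) on a cache miss."""
        index, tag, word = self._fields(addr)
        for way in range(NUM_WAYS):
            # Role of the comparator: stored tag against the high-order bits.
            if self.tags[way][index] == tag:
                # Role of the selector: pick the data of the matching way.
                return True, self.data[way][index][word]
        return False, None

    def fill(self, addr, line_words):
        """Store a full cache line for addr in the next way of its entry."""
        index, tag, _ = self._fields(addr)
        way = self.next_way[index]
        self.tags[way][index] = tag
        self.data[way][index] = list(line_words)
        self.next_way[index] = (way + 1) % NUM_WAYS
```

In this model, a lookup on an empty cache is a miss; after fill is called for the same address, a second lookup is a hit and returns the word selected by [4:2].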
In order to improve the performance of the processor 1, the address and the cache miss information of each memory access are effective, because they make it possible to detect where in the program a cache miss was generated. The program in the relevant section can then be optimized so that the performance of the processor is improved. There are various methods of optimizing the program. Generally, optimization of the algorithm and optimization of the memory map are known.
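One way such information could be used is sketched below. It assumes that the tracing information is available as a sequence of (address, cache miss flag) pairs, which is a hypothetical record format, and it groups the misses by the 32-byte cache line of the example so that the most frequently missed regions stand out. Relating those regions back to sections of the program would additionally require the memory map, which is not modeled here.

```python
from collections import Counter

LINE_BYTES = 32  # eight 4-byte words per cache line, as in the example


def miss_hotspots(trace, top=3):
    """Count cache misses per cache-line base address.

    trace: iterable of (address, is_miss) pairs (hypothetical record format).
    Returns up to `top` (line_base_address, miss_count) pairs, most missed first.
    """
    misses = Counter()
    for addr, is_miss in trace:
        if is_miss:
            # Clear the low 5 bits to obtain the base address of the line.
            misses[addr & ~(LINE_BYTES - 1)] += 1
    return misses.most_common(top)
```

Grouping by cache line rather than by exact address is a design choice of this sketch: accesses that fall in the same line share one refill, so they are counted together.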
In recent years, processors have become faster, which has widened the gap between the processor speed and the transfer rate to the outside of the chip. Accordingly, even if an attempt is made to output the tracing information at the clock cycle of the processor, the transfer rate at the tracing terminal is not fast enough, and the data cannot be output accurately. As a conventional measure for dealing with the problem, the information to be output was stored in a memory (tracing memory) or a buffer inside the chip and output after the program was executed. However, cache accesses occur extremely often, and the capacity required for the tracing memory and the like becomes significantly large in the case where information relating to all of the cache accesses is to be output as the tracing information.