1. Field of the Invention
The present invention relates to a technology for cycle simulation for a large scale integration (LSI).
2. Description of the Related Art
In recent years, embedded CPUs, like the general-purpose central processing units (CPUs) used in personal computers, have been shifting to multi-core designs. To shorten the development time of a system LSI, which is becoming increasingly complicated, it is important to conduct coordinated design of hardware and software from an early stage of design. Because the simulation speed of existing simulators is not sufficiently high, development of a high-speed software/hardware coordinated simulator has been demanded.
FIG. 1 is a schematic of an LSI model 1200 to be simulated. As shown in FIG. 1, the LSI model 1200 includes an execution block consisting of processor core models PE# (# represents a number) and a peripheral block model PB. A cycle model is obtained by executing the LSI model shown in FIG. 1 with an instruction set simulator (ISS).
FIG. 2 is a schematic of a cycle model when the LSI model shown in FIG. 1 is executed by the ISS. PE# shown in FIG. 2 indicates an instruction execution time (the number of cycles) of the processor core model PE#. Similarly, PB indicates an instruction execution time (the number of cycles) of the peripheral block model PB.
C# shown in FIG. 2 represents the number of necessary cycles in the corresponding instruction execution time. For example, the number of cycles C0 represents the number of cycles necessary when the processor core model PE0 is executed. In the conventional cycle model shown in FIG. 2, the cycle count is calculated by adding cycles and by acquiring the difference in a cycle term. Such conventional technologies are disclosed in, for example, Japanese Patent Application Laid-Open Publication No. H5-35534, Japanese Patent Application Laid-Open Publication No. H4-352262, and Japanese Patent Application Laid-Open Publication No. 2001-256267.
However, in the conventional cycle model shown in FIG. 2, instructions are executed serially. Therefore, the simulation time increases as the number of executed instructions increases. On the other hand, when the execution blocks are executed in parallel to reduce the simulation time, data are not correctly reflected in a memory model or a register model. As a result, the simulation is incorrect.
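The cost of serial execution can be illustrated with a minimal sketch. The Python below is illustrative only: the function names `serial_cycles` and `ideal_parallel_cycles` are hypothetical, the cycle count for PB is an assumed value, and the parallel figure assumes an idealized case in which the blocks overlap completely and no data hazard occurs.

```python
def serial_cycles(block_cycles):
    """Cycles processed when execution blocks run one after another (cycles are added)."""
    return sum(block_cycles.values())


def ideal_parallel_cycles(block_cycles):
    """Cycles processed if all blocks could overlap completely (idealized assumption)."""
    return max(block_cycles.values())


# C0 = 100 and C1 = 150 follow the example values used with FIG. 3;
# the value for PB is an assumption made only for this sketch.
block_cycles = {"PE0": 100, "PE1": 150, "PB": 80}

serial_total = serial_cycles(block_cycles)
parallel_total = ideal_parallel_cycles(block_cycles)
print(serial_total, parallel_total)
```

The serial total grows with every added execution block, whereas the idealized parallel total is bounded by the longest block. The parallel case is only usable if memory and register accesses remain correctly ordered.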
FIG. 3 is a schematic for illustrating parallel simulation. The simulation will be described using only the processor core models PE0 and PE1 to simplify the description. The processor core model PE0 is first executed as an execution block. The number of necessary cycles C0 in this case is assumed to be 100. At this point, data D1 is written at an address RegA of a memory model.
After the execution of the processor core model PE0 is completed, the processor core model PE1 is executed as shown in diagram (B) of FIG. 3. The number of necessary cycles C1 in this case is assumed to be 150. During its execution, the processor core model PE1 writes data D2, in the 120th cycle, at the address RegA of the memory model, which stores the data D1.
After the execution of the processor core model PE1 is completed, the processor core model PE0 is selected again and, as shown in diagram (C) of FIG. 3, the processor core model PE0 is executed. The number of necessary cycles in this case is represented as C3. It is assumed that, during the term of C3, the data written at the address RegA of the memory model is read in, for example, the 110th cycle.
Because the 110th cycle precedes the 120th cycle, the processor core model PE0 is supposed to read the data D1 written at the address RegA in the memory model. However, the processor core model PE0 reads the data D2 written by the processor core model PE1, which has been executed earlier. Thus, the simulation is incorrect.
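The hazard described with FIG. 3 can be reproduced with a minimal sketch. The Python below is illustrative only: the class `MemoryModel`, the function `run_block`, and the event tuples are hypothetical names, not part of any simulator described here. It assumes that each execution block is simulated to completion before the next one starts, that the memory model keeps only the most recently written value, and that PE0 writes D1 in the 100th cycle (the exact write cycle is not stated above).

```python
class MemoryModel:
    """Hypothetical memory model that keeps only the latest written value."""

    def __init__(self):
        self.cells = {}

    def write(self, addr, value):
        self.cells[addr] = value

    def read(self, addr):
        return self.cells.get(addr)


def run_block(memory, events):
    """Simulate one execution block; each event is (cycle, op, addr, value)."""
    reads = []
    for cycle, op, addr, value in sorted(events, key=lambda e: e[0]):
        if op == "write":
            memory.write(addr, value)
        else:
            reads.append((cycle, memory.read(addr)))
    return reads


memory = MemoryModel()

# (A) PE0 runs for C0 = 100 cycles and writes D1 to RegA (write cycle assumed).
run_block(memory, [(100, "write", "RegA", "D1")])

# (B) PE1 runs for C1 = 150 cycles and writes D2 to RegA in the 120th cycle.
run_block(memory, [(120, "write", "RegA", "D2")])

# (C) PE0 is selected again and reads RegA in the 110th cycle.
observed = run_block(memory, [(110, "read", "RegA", None)])

# In simulated time, cycle 110 precedes the write of D2 at cycle 120,
# so a correct simulation would return D1.
expected = "D1"
print(observed, "expected:", expected)
```

In this sketch the read returns D2 rather than D1, because the memory model has no record of the cycle in which each value was written.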