1. Field
The following description relates to a processor simulation technology that simulates the performance of a processor executing an application program, and that may be used to improve the processing of the application program.
2. Description of the Related Art
The concept of reconfigurable computing is based on the arrangement of a processor along with an array of reconfigurable hardware elements. The behavior of such reconfigurable hardware elements, including the data flow between them, may be tailored to perform a specific task. A reconfigurable processor may achieve a processing performance comparable to that of dedicated hardware.
A reconfigurable array includes a plurality of processing elements or functional units. The size or complexity of a functional unit, for example, the number of ALUs or registers included in the functional unit, is referred to as granularity. A processor with relatively large granularity may be referred to as a Coarse-Grained Reconfigurable Architecture (CGRA), and a processor with relatively small granularity may be referred to as a Fine-Grained Reconfigurable Architecture. The connections between individual functional units may be set dynamically, based on configuration information, when a specific task is performed. For example, a routing path between individual functional units may be dynamically established, for example, by a multiplexer, based on configuration information read from a configuration memory. An execution file may be executed on the architecture thus established. Instructions of the execution file may be stored in an instruction memory. The instructions may take the form of a sequence of instruction codes and operand information to be executed by the respective processing elements.
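The configuration-driven routing described above can be illustrated with a minimal sketch. The names `FunctionalUnit`, `mux`, and `run_cycle`, the two-input units, and the layout of the configuration entries are all assumptions made for illustration; they do not describe any particular CGRA.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence, Tuple


@dataclass
class FunctionalUnit:
    """Hypothetical two-input functional unit with a registered output."""
    name: str
    op: Callable[[int, int], int]
    output: int = 0


def mux(select: int, candidates: Sequence[int]) -> int:
    """Model a multiplexer: forward the candidate chosen by the select value."""
    return candidates[select]


def run_cycle(units: List[FunctionalUnit],
              config: Sequence[Tuple[int, int]],
              external: Sequence[int]) -> List[int]:
    """Advance one cycle using one configuration entry per functional unit.

    Each entry (select_a, select_b) is assumed to come from a configuration
    memory. Candidate sources are the previous outputs of all units,
    followed by the external inputs.
    """
    sources = [u.output for u in units] + list(external)
    new_outputs = [
        u.op(mux(sel_a, sources), mux(sel_b, sources))
        for u, (sel_a, sel_b) in zip(units, config)
    ]
    # Commit all outputs together so every unit sees last cycle's values.
    for u, value in zip(units, new_outputs):
        u.output = value
    return new_outputs


if __name__ == "__main__":
    units = [FunctionalUnit("add", lambda a, b: a + b),
             FunctionalUnit("mul", lambda a, b: a * b)]
    # Source indices: 0 = add output, 1 = mul output, 2 and 3 = external inputs.
    print(run_cycle(units, [(2, 3), (2, 3)], external=(3, 4)))
    print(run_cycle(units, [(0, 1), (0, 2)], external=(3, 4)))
```

Changing only the configuration entries reroutes the operands between the same functional units, which is the sense in which the routing path is established dynamically.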
A CGRA may be implemented as an accelerator that may be used to improve the execution speed of a particular section of a program, for example, an iteration loop. Compared with existing Application-Specific Integrated Circuits (ASICs), CGRA processors offer a high degree of flexibility at a similar level of performance, and are therefore being targeted as an efficient means for implementing a next-generation digital signal processor (DSP).
In order to estimate the level of performance that a given application program can obtain on a CGRA-based platform, and to modify source code so that it uses the CGRA more efficiently, a simulator that allows performance debugging may be used. A simulator for a CGRA should be fast enough that simulation does not become the bottleneck of software development. In addition, the simulator should ensure cycle accuracy for accurate performance debugging.
Because all scheduling for the CGRA is decided at compile time, as in a Very Long Instruction Word (VLIW) machine, a simulator may ensure cycle accuracy. However, unlike a general processor, which passes all operands between functional units through a register, the CGRA passes operands between functional units using significantly more complicated interconnection logic. For this reason, the simulator is burdened with having to model the operation of numerous pieces of interconnection logic. Monitoring this large number of interconnections is a main factor that limits simulation speed.