Embedded systems combine a processor executing software with dedicated logic. The embedded systems may be implemented within an integrated circuit, such as a programmable logic device. Programmable logic devices (PLDs) exist as a well-known type of integrated circuit (IC) that may be programmed by a user to perform specified logic functions. There are different types of programmable logic devices, such as programmable logic arrays (PLAs) and complex programmable logic devices (CPLDs). One type of programmable logic device, known as a field programmable gate array (FPGA), is very popular because of a superior combination of capacity, flexibility, time-to-market, and cost.
PLDs allow a designer to implement an embedded system with processors as “software” components and dedicated hardware peripherals as “hardware” components. These hardware and software components communicate with each other through specific bus interfaces. In the design phase, a target application is partitioned into a number of portions. Some portions are executed on the software components as software programs, while the other portions are executed on the hardware components.
The behavior of the hardware and software components, as well as the bus interfaces between them, can be simulated using co-simulation techniques. In some instances, multiple software simulators can be used to simulate the hardware and software components of an embedded system. In yet other instances, the PLDs themselves are used in the simulation as emulators. In this case, a portion of a design physically runs on a PLD while the rest of the design is simulated by the software simulators running on a computer. A hardware co-simulation interface controls the simulation progress of the software simulators and the emulation hardware, and exchanges simulation data between them when needed.
In particular, due to the various design trade-offs offered by implementing a functionality as either a software component or a hardware component, it is desirable to measure the performance of an embedded system running on a PLD. A performance metric may be based on the number of clock cycles the processor spent executing specified portions of the software code, or based on the number of clock cycles a dedicated hardware peripheral used to finish processing one input data sample. Techniques for measuring these performance metrics are referred to as profiling. If a specific hardware-software partitioning does not meet the necessary design requirements, a designer may choose to re-partition the system, or optimize the bottleneck software and hardware components.
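As an illustration only, the two performance metrics described above can be sketched in Python. The names `Interval`, `software_cycles`, and `cycles_per_sample` are hypothetical and do not come from any particular tool; the sketch assumes that the cycle counts have already been captured by some profiling mechanism.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass
class Interval:
    """One execution of a code region, in processor clock cycles (hypothetical type)."""
    start_cycle: int  # cycle at which the region was entered
    end_cycle: int    # cycle at which the region was exited (exclusive)


def software_cycles(intervals: Iterable[Interval]) -> int:
    """Total clock cycles the processor spent in a specified portion of code."""
    return sum(iv.end_cycle - iv.start_cycle for iv in intervals)


def cycles_per_sample(busy_cycles: int, samples_processed: int) -> float:
    """Average clock cycles a hardware peripheral used per input data sample."""
    if samples_processed <= 0:
        raise ValueError("samples_processed must be positive")
    return busy_cycles / samples_processed
```

For example, a region entered at cycle 10 and exited at cycle 25, then entered again at cycle 40 and exited at cycle 50, accounts for 25 cycles in total. A designer could compare such figures against the design requirements when deciding whether to re-partition the system.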
One kind of profiling technique is based on cycle-accurate software simulation. When using this technique, the status of the software and hardware components is recorded during the cycle-accurate software simulation. The recorded status information is then analyzed to obtain the profiling data of interest. Because cycle-accurate software simulation requires a large amount of computation, these techniques are inefficient and have limited use in practice. Other profiling techniques, such as the GNU gprof tool available from the Free Software Foundation, Inc., insert code into the software program running on the processor to generate specific interrupts. The profiling data is obtained in the service routines for these interrupts. Such tools have several limitations: (1) the profiling is intrusive, in that the inserted code and interrupts change the cycle-by-cycle behavior of the processor system; (2) the profiling precision is limited, as responding to interrupts is usually an expensive operation; and (3) only events generated by the processor can be profiled, not events generated by peripherals coupled to the processor, so such tools are not suitable for hardware-software co-design.
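The analysis step of the simulation-based technique described above can be sketched as follows. This is a minimal sketch under stated assumptions, not the method of any particular simulator: it assumes the recorded status is a list holding one program counter value per simulated clock cycle, and that the code regions of interest are given as half-open address ranges. The function name `profile_pc_trace` and the `"<other>"` label are hypothetical.

```python
from collections import Counter
from typing import Dict, Iterable, Tuple


def profile_pc_trace(
    pc_trace: Iterable[int],
    regions: Dict[str, Tuple[int, int]],
) -> Dict[str, int]:
    """Attribute each recorded cycle to the code region containing its program counter.

    pc_trace: one program counter value per simulated clock cycle (assumed format).
    regions:  region name -> (start_address, end_address), end exclusive.
    """
    counts: Counter = Counter()
    for pc in pc_trace:
        for name, (start, end) in regions.items():
            if start <= pc < end:
                counts[name] += 1
                break
        else:
            # No region matched this cycle's program counter.
            counts["<other>"] += 1
    return dict(counts)
```

The sketch also suggests why the approach is costly: the trace holds one entry per simulated cycle, so both the simulation and the analysis scale with the total number of cycles executed.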
Some on-chip profiling techniques make use of on-chip PLD resources, such as significant portions of on-chip memory, to store the profiling data. PLD resources are often scarce and are also demanded by the dedicated logic of the embedded system. Competition for PLD resources may limit the amount of profiling data that can be retrieved by the user, and thus the effectiveness of such profiling techniques.
Accordingly, there exists a need in the art for a method and apparatus for profiling a hardware/software embedded system that overcomes the aforementioned deficiencies.