1. Field of the Invention
The present invention relates to reconfigurable hardware logic, and particularly to a warp processor for dynamic hardware/software partitioning.
2. Description of the Related Art
Previously, designers have utilized dynamic optimizations to improve software performance. Such approaches are especially effective because they are transparent, requiring no extra designer effort or special tools. However, since the optimizations are restricted to software, improvements are limited.
DYNAMO is one such dynamic binary optimizer developed by HEWLETT PACKARD, and BOA is a similar optimizer by IBM for the POWERPC. Related efforts include the CRUSOE and EFFICEON processors by TRANSMETA, which are very long instruction word (VLIW) processors that dynamically translate x86 instructions into VLIW instructions.
Run-time reconfigurable systems achieve better speedups than dynamic software optimization, but require hardware regions to be pre-determined statically with designer effort.
DISC is an example of a run-time reconfigurable system that dynamically swaps hardware regions into a Field Programmable Gate Array (FPGA) when needed during software execution. CHIMAERA is a similar approach that treats the configurable logic as a cache of reconfigurable functional units.
Another example of a run-time reconfigurable system is a Dynamically Programmable Gate Array (DPGA), which is used to rapidly reconfigure the system to perform one of several pre-programmed configurations.
Hardware/software partitioning is the process of dividing an application into software running on a microprocessor and hardware co-processors. Partitioning is a well-known technique that can achieve results superior to software-only solutions. Partitioning can improve performance and even reduce energy consumption. In particular, the appearance of platforms incorporating a microprocessor and an FPGA on a single chip has recently made hardware/software partitioning even more attractive.
Such platforms yield more efficient communication between the microprocessor and FPGA than multi-chip platforms, resulting in improved performance and reduced power. In fact, such single-chip platforms encourage partitioning by designers who might have otherwise created a software-only design. By treating the FPGA as an extension of the microprocessor, a designer can move code regions from the software executed by the microprocessor onto the FPGA, resulting in improved performance and usually reduced energy consumption.
However, hardware/software partitioning has had limited commercial success due in part to tool flow problems. First, a designer must use an appropriate profiler to detect code regions that account for a large percentage of the software's execution time. Second, a designer must use a compiler with partitioning capabilities to partition the software; however, such compilers are rare and are often resisted because companies may already have compilers that they trust. Third, the designer must apply a synthesis tool to convert the partitioning compiler's hardware description output to a configuration for the FPGA.
A tool flow requiring integration of profilers, special compilers, and synthesis is far more complicated than that of typical software design, requiring extra designer effort that most designers and companies are not willing to carry out. Thus, the more transparent one can make hardware/software partitioning, the more successful hardware/software partitioning may be.
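By way of illustration only, the profiling step of the tool flow described above may be sketched as follows. The sketch assumes that a profiler has already produced a table of execution time per code region; the function name select_hot_regions, the coverage parameter, and the region names in the usage note are hypothetical and are not taken from any particular profiler or from the present invention.

```python
from typing import Dict, List, Tuple


def select_hot_regions(
    region_times: Dict[str, float], coverage: float = 0.8
) -> List[Tuple[str, float]]:
    """Pick the most time-consuming regions, largest first, until their
    combined share of total execution time reaches `coverage`.

    `region_times` maps a (hypothetical) region name to the time spent
    in that region. Returns (name, share_of_total) pairs.
    """
    total = sum(region_times.values())
    if total <= 0:
        return []  # nothing was measured, so there are no candidates

    selected: List[Tuple[str, float]] = []
    covered = 0.0
    # Visit regions in descending order of measured time.
    ranked = sorted(region_times.items(), key=lambda kv: kv[1], reverse=True)
    for name, spent in ranked:
        if covered >= coverage:
            break  # the requested share is already covered
        share = spent / total
        selected.append((name, share))
        covered += share
    return selected
```

For example, calling select_hot_regions({"fir_loop": 70.0, "crc_loop": 20.0, "init": 6.0, "io": 4.0}) selects the two loop regions, which are then the candidates a designer would pass to the partitioning compiler and synthesis tool.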
Binary-level hardware/software partitioning approaches are more transparent than traditional source-level partitioning methods. Binary partitioning has the advantage of working with any software compiler and any high-level language. In addition, binary partitioning considers assembly code and object code as hardware candidates. Software estimation is also more accurate in a binary-level approach. Previous work has shown that binary partitioning achieves speedups similar to those of source-level partitioning for numerous benchmarks.
Thus, there is a need in the art for automatic and transparent hardware/software partitioning. There is further a need in the art for automatic compilation of implementations or configurations for configurable logic. The present invention meets these needs.