1. Field of the Invention
The present invention relates to signal processing devices adapted for simultaneously processing at least two threads in a multi-processing or multi-threading manner, to methods for executing an application on such a signal processing device, to methods for compilation of application source code in order to obtain compiled code being executable on such a signal processing device, to methods for adjusting applications to be executed on such a signal processing device, to a computer program product for executing any of the methods for executing an application on such a signal processing device, to machine readable data storage devices storing such a computer program product, and to transmission of such computer program products over local or wide area telecommunications networks.
2. Description of the Related Technology
Nowadays, a typical embedded system requires high performance to perform tasks such as video encoding/decoding at run-time. It should consume little energy so as to be able to work for hours or even days using a lightweight battery. It should be flexible enough to integrate multiple applications and standards in one single device. It has to be designed and verified within a short time to market despite substantially increased complexity. Designers are struggling to meet these challenges, which call for innovations in both architectures and design methodology.
Coarse-grained reconfigurable architectures (CGRAs) are emerging as potential candidates to meet the above challenges. Many designs have been proposed in recent years. These architectures often comprise tens to hundreds of functional units (FUs), which are capable of executing word-level operations instead of the bit-level ones found in common field programmable gate arrays (FPGAs). This coarse granularity greatly reduces the delay, area, power and configuration time compared with FPGAs. On the other hand, compared with traditional “coarse-grained” programmable processors, their massive computational resources enable them to achieve high parallelism and efficiency. However, existing CGRAs have not yet been widely adopted, mainly because of the difficulty of programming such a complex architecture.