Embedded applications perform specialized tasks and are designed by taking into consideration various parameters such as speed, cost, efficiency and robustness. With an increase in the complexity of embedded applications, low power consumption and flexibility in design have also emerged as parameters which must be taken into consideration while designing an embedded application. Typically, embedded applications comprising few processing instructions are implemented on hardware whereas embedded applications comprising a large number of processing instructions are implemented on microprocessors. In instances where speed of computation is critical for performance of an embedded application, the application may partly be implemented on hardware and partly as one or more software modules running on a microprocessor.
Depending on the number and nature of processing instructions, an embedded application may be designed as a hardware implementation comprising field programmable gate arrays (FPGAs) or application specific integrated circuits (ASICs), as a microprocessor based implementation, or as a hybrid design involving both microprocessor and hardware implementations. In recent times, multi-core systems on a chip (SoCs) have been widely used for implementing hybrid designs of embedded applications. A multi-core SoC comprises more than one central processing unit (CPU) on a single chip.
There are certain limitations associated with each of the three design implementations of embedded applications. While hardware implementations may be optimized for speed, they are inflexible and are difficult to implement if the embedded application is complex in nature. Microprocessor based designs are flexible but are typically slower than hardware implementations. In order to increase the speed of computation in microprocessor based designs, clock speeds of the microprocessor have to be raised, which in turn results in greater power consumption.
The hybrid design of embedded applications is implemented in order to achieve a high speed of computation as well as flexibility in design by appropriate distribution of the processing instructions between hardware and software. Processing instructions which relate to tasks such as those requiring substantial CPU time, those that are critical to overall performance of the embedded application and those that are repeated often during processing of the embedded application, are typically implemented on hardware. However, distribution of the processing instructions between hardware and software often requires data to be moved between the hardware and the software, which reduces the speed of computation. Further, in a hybrid design the modules of the embedded application that are implemented in hardware remain inflexible. A hybrid design employing multi-core SoCs is most suited for embedded applications that are amenable to pipelining. The use of multi-core SoCs achieves greater performance levels at lower clock speeds. However, the total power dissipated in such implementations is higher due to the presence of multiple CPUs.
As the size and complexity of algorithms increase, flexibility becomes important. For portable devices, power considerations assume greater significance. Thus, from a power-flexibility standpoint, current approaches have limitations, and hence there is a need for a new approach that would achieve the flexibility typical of software implementations at, or close to, the speed of a hardware implementation.
Power consumption in an embedded application designed on a CMOS chip may be classified as static and dynamic. The dominant component of power consumption is dynamic power consumption, and a first order approximation of dynamic power consumption is represented by the formula:
$$P = A \times C \times F \times V^{2} \qquad (1)$$
where P denotes power, C denotes an effective switch capacitance, V denotes supply voltage, F denotes a frequency of operations and A denotes the number of switches per clock cycle.
For a typical design implementation of an embedded application, if the voltage and the underlying technology used in developing the required hardware are assumed to be constant, the power dissipated is directly proportional to the frequency at which the hardware is run and to the resources consumed by the hardware. The resources comprise switching transistors. A software design implementation of an embedded application comprises a sequence of instructions run on a microprocessor and therefore requires a higher value of F, whereas hardware design implementations typically run a number of operations in parallel and hence require a higher value of A. Since power is proportional to both F and A, it is a better measure for comparing hardware and software implementations than either frequency or resource usage alone.
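As a minimal sketch of how equation (1) can be used for such a comparison, the following Python snippet evaluates the first order approximation for two implementations at the same supply voltage. The function name and all numeric values are hypothetical and chosen only to illustrate the scaling; they are not measurements of any real device.

```python
def dynamic_power(a, c, f, v):
    """First order dynamic power from equation (1): P = A * C * F * V^2."""
    return a * c * f * v ** 2


# Hypothetical values, assumed purely for illustration.
C = 1e-15  # effective switch capacitance in farads (assumed)
V = 1.0    # supply voltage in volts (assumed constant for both designs)

# Software-style design: fewer switches per cycle (A), higher frequency (F).
software_power = dynamic_power(a=10_000, c=C, f=1e9, v=V)

# Hardware-style design: more parallel switches per cycle, lower frequency.
hardware_power = dynamic_power(a=100_000, c=C, f=50e6, v=V)

print(f"software: {software_power:.4f} W, hardware: {hardware_power:.4f} W")
```

With C and V held constant, the comparison depends only on the product A × F, so neither frequency nor resource count decides the outcome on its own. The same function also shows that doubling F doubles the power, while halving V reduces it to one quarter.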
Experimental evidence suggests that for a generic embedded application, a software or microprocessor based implementation leads to greater power consumption than a hardware implementation of the same application. The greater power consumption may be attributed to the flexibility offered by a microprocessor platform. Flexibility of microprocessor based implementations results from the following factors:
- Instruction sets that support generic operations.
- Microprocessors are similar to finite state machines, although the number of states transitioned is much higher than that in a typical FPGA/ASIC implementation. The state transitions in microprocessors are controlled using a “fetch and execute” model and hence are more generic.
- Highly flexible movement of data between general purpose registers, the arithmetic and logic unit (ALU) and memory.
- Software (program code running on a microprocessor) determines the timing of movement of data and also the sequence of operations.
Hence, a microprocessor provides a generic platform capable of running any kind of embedded application, supporting the design of flexible embedded applications. A hardware implementation, on the other hand, optimizes power by being specific or inflexible.
The power efficiency of a microprocessor based implementation or a software implementation may be improved by sacrificing generality of the microprocessor platform. The generality in architecture offered by a microprocessor based design platform enables the microprocessor to be used for implementing any kind of application across various domains. However, since an embedded application is typically designed for performing a specific task and requires only a degree of flexibility in design, the complete generality offered by a microprocessor may not always be required. Since the generality offered by a microprocessor platform comes at the expense of power, there is a need for a system that would trade the excess generality offered by a microprocessor platform for a gain in power efficiency.
Further, in recent times complex embedded applications are being pushed onto portable, hand-held and mobile devices. Such devices are required to perform complex computational tasks at low levels of power consumption in order to ensure that the higher processing power does not have an adverse impact on battery requirements. Hence, there is a need for an embedded application design methodology that achieves the flexibility typical of software implementations with a speed of computation similar to that achieved via a hardware implementation.