Unless otherwise indicated herein, the approaches described in this section are not prior art to the claims in this application and are not admitted to be prior art by inclusion in this section.
In keeping with Moore's Law, the number of transistors that can be practicably incorporated into an integrated circuit has doubled approximately every two years. This trend has continued for more than half a century and is expected to continue until at least 2015 or 2020. However, simply adding more transistors to a single-threaded processor no longer produces a significantly faster processor. Instead, increased system performance has been attained by integrating multiple processor cores on a single chip to create a chip multiprocessor and sharing processes among the multiple processor cores of the chip multiprocessor. But even this approach has limitations.
With each successive process generation, the percentage of a chip that can actively switch drops due to limitations on threshold voltage scaling related to power use and heat dissipation. Thus, in a few process generations, chip multiprocessors will be able to use only a small fraction of a silicon die at full frequency at any one time. This “utilization wall” will prevent massively multi-core processors from effectively employing more than a small subset of cores simultaneously, which undermines the utility of building high core-count processors. In addition, the expanded use of mobile computing devices makes the execution of complex code at minimum power highly desirable in multi-core processors.
Hardware accelerators offer the best solution to meet the demand for maximum performance using minimum power. A hardware accelerator generally includes logic circuits separate from the central processing unit of a computing device, and is used to perform certain functions faster than is possible in software running on a general-purpose central processing unit. To that end, hardware accelerators may be programmable to allow specialization to a particular task or function, and may consist of a combination of software, hardware, and firmware. Typically, hardware accelerators are designed for computationally intensive software code, and can vary from a small functional unit, such as a floating-point accelerator, to a large functional block, such as a graphics processing unit.