Microprocessors frequently include hardware acceleration circuits targeted to specific applications. Such application-targeted hardware accelerators help improve the performance-per-Watt (GOPS/Watt) of general-purpose execution cores. A microprocessor designed for general-purpose operation may be capable of performing a certain task, but will take longer and/or consume more power than a specially designed circuit performing the same task. Such accelerators are generally triggered by a special-purpose instruction included in the processor instruction set architecture (ISA). The special-purpose instruction causes the general-purpose core to “offload” execution of the task to the appropriate acceleration hardware, thereby providing performance-optimized, lower-latency, lower-power operation for the targeted workloads. One example is the so-called “POPCNT” (PopCount) instruction, an application-targeted instruction used to accelerate search operations involving large data sets. The PopCount instruction hardware counts, or otherwise determines, the number of set bits in a data object. Applications that benefit from this instruction include genome mining, handwriting recognition, digital health workloads, and fast Hamming distance computation. Because of this wide applicability, the instruction has become a critical part of the operation of modern-day search engines.
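For illustration only, the result that a PopCount operation produces can be modeled in software. The following is a minimal sketch, not a description of any acceleration circuit: the function names `popcount` and `hamming_distance` and the `width` parameter are hypothetical, and the loop uses the well-known technique of repeatedly clearing the lowest set bit rather than whatever method a given hardware implementation employs.

```python
def popcount(value: int, width: int = 64) -> int:
    """Return the number of set bits in the low `width` bits of `value`.

    Software reference model only; `width` (an assumed parameter) stands in
    for the size of the register that holds the data object.
    """
    value &= (1 << width) - 1  # keep only the bits that fit in the register
    count = 0
    while value:
        value &= value - 1  # clear the lowest set bit
        count += 1          # one loop iteration per set bit
    return count


def hamming_distance(a: int, b: int, width: int = 64) -> int:
    """Return the number of bit positions in which `a` and `b` differ."""
    # XOR leaves a 1 wherever the two words differ; count those bits.
    return popcount(a ^ b, width)
```

The Hamming distance helper shows why a fast set-bit count matters to search workloads: comparing two bit vectors reduces to one XOR followed by one count.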
Other examples include the BitScanForward (BSF) and BitScanReverse (BSR) instructions, which are used extensively in floating-point operations for rounding and normalization of floating-point numbers. These bit-scan instructions trigger bit detection operations that locate a particular bit of interest in an input word. Specifically, BSF returns the bit position of the least-significant set bit, scanning from the least significant bit (LSB) of the input word toward the most significant bit (MSB), and BSR returns the bit position of the most-significant set bit, scanning from the MSB toward the LSB.
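The results of the two bit-scan operations can likewise be modeled in software. The sketch below is illustrative only: the function names and the `width` parameter are hypothetical, and returning `None` for an all-zero input is a choice made for this model, not a statement of how any particular processor reports that case.

```python
from typing import Optional


def bit_scan_forward(value: int, width: int = 64) -> Optional[int]:
    """Return the position of the least-significant set bit, or None if no bit is set."""
    value &= (1 << width) - 1  # model a fixed-width input word
    if value == 0:
        return None  # no set bit to report (assumed convention for this sketch)
    # value & -value isolates the lowest set bit; its bit length minus one
    # is that bit's position, counted from the LSB at position 0.
    return (value & -value).bit_length() - 1


def bit_scan_reverse(value: int, width: int = 64) -> Optional[int]:
    """Return the position of the most-significant set bit, or None if no bit is set."""
    value &= (1 << width) - 1
    if value == 0:
        return None
    # bit_length() is one more than the position of the highest set bit.
    return value.bit_length() - 1
```

For the input word `0b101000`, for example, the forward scan reports position 3 and the reverse scan reports position 5.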
In current microprocessor implementations, each of these acceleration functions can be thought of as intrinsically tied to its own hardware acceleration circuit. While these instructions and their associated hardware acceleration circuits can provide improved performance in a microprocessor, each additional hardware acceleration circuit currently increases the amount of integrated circuit (IC) “real estate” necessary to manufacture the microprocessor. Increased use of semiconductor area increases the size, cost, and power consumption of the microprocessor.
Descriptions of certain details and implementations follow, including a description of the figures, which may depict some or all of the embodiments described below, as well as a discussion of other potential embodiments or implementations of the inventive concepts presented herein. An overview of embodiments of the invention is provided below, followed by a more detailed description with reference to the drawings.